CN-122002026-A - Reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification

Abstract

The invention discloses a reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification, belonging to the technical field of communication. The method comprises: collecting multimodal sensing data; extracting features from each modality separately to obtain a feature representation for each modality; fusing the feature representations to obtain fused semantic features; quantizing the fused semantic features with an updated learnable quantization step size to obtain quantized features; calculating a total code rate based on the quantized features; entropy coding the quantized features and transmitting the coded bit stream; entropy decoding the bit stream to recover the quantized features; and performing classification inference to output a classification result. The invention addresses the problems in the prior art that, when multimodal features are compressed and transmitted in a bandwidth-limited environment, decoding and reconstruction are complex and the semantic information needed for classification is easily lost.
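As an illustration only, the transmit and receive path summarized above can be sketched in Python. The uniform rounding quantizer, the fixed-width `struct` framing (which merely stands in for a real entropy coder), the toy linear classifier, and the names `quantize`, `dequantize`, `pack`, `unpack` and `classify` are all assumptions made for this sketch and are not taken from the patent.

```python
import struct

def quantize(features, step):
    """Map each fused feature value to an integer index (uniform step, assumed)."""
    return [round(v / step) for v in features]

def dequantize(indices, step):
    """Recover quantized feature values from indices and the step size."""
    return [i * step for i in indices]

def pack(indices, step):
    """Hypothetical framing: float32 step (side information), uint16 count,
    then int16 indices. A real system would entropy code the indices."""
    return struct.pack(f"<fH{len(indices)}h", step, len(indices), *indices)

def unpack(payload):
    """Inverse of pack(): returns (indices, step)."""
    step, count = struct.unpack_from("<fH", payload, 0)
    indices = struct.unpack_from(f"<{count}h", payload, 6)
    return list(indices), step

def classify(features, weights):
    """Toy linear classifier: index of the class with the largest score."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

# Sender: quantize fused features and transmit indices plus step size.
payload = pack(quantize([0.9, -0.3, 0.1, 1.6], 0.5), 0.5)

# Receiver: recover the quantized features and classify them directly,
# without reconstructing the original sensor data.
rx_indices, rx_step = unpack(payload)
label = classify(dequantize(rx_indices, rx_step), [[1, 0, 0, 1], [0, 1, 1, 0]])
```

The point of the sketch is that the receiver only needs the indices and the step size to obtain the quantized features on which the classifier operates; no image or signal decoder appears on the receive path.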

Inventors

WANG LEI
WANG JIE
Dou Haie
JIANG XUE
QI TING
XIA ZHIJIE

Assignees

Nanjing University of Posts and Telecommunications (南京邮电大学)

Dates

Publication Date: 20260508
Application Date: 20260409

Claims (10)

1. A reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification, characterized by comprising the following steps: collecting multimodal sensing data; extracting features from the multimodal sensing data separately to obtain a feature representation for each modality; fusing the feature representations of the modalities to obtain fused semantic features; constructing a quantizer with a learnable quantization step size; in a training stage, making the gradient of the quantizer propagable by means of a straight-through estimator, and updating the learnable quantization step size by gradient descent; quantizing the fused semantic features with the updated learnable quantization step size to obtain quantized features; calculating a total code rate based on the quantized features, constructing a joint loss function comprising a classification task loss function and the total code rate, and performing end-to-end joint optimization of the joint loss function, so that the learnable quantization step size is adaptively adjusted according to a target code rate constraint; entropy coding the quantized features, and multiplexing the learnable quantization step size, as side information, with the coded bit stream for transmission; at a receiving end, entropy decoding the received bit stream and recovering the quantized features based on the side information; and performing classification inference based on the recovered quantized features and outputting a classification result.
2. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification of claim 1, wherein the multimodal sensing data comprises at least visual modality data and tactile modality data, and wherein extracting features from the multimodal sensing data separately to obtain a feature representation for each modality comprises: performing time synchronization and sample-level alignment on the visual modality data and the tactile modality data, so that they form a group of multimodal inputs corresponding to the same moment or the same detected object; preprocessing the multimodal inputs to form a multimodal input pair; processing the visual modality data in the multimodal input pair through an independent visual feature extraction network to obtain visual features; and processing the tactile modality data in the multimodal input pair through an independent tactile feature extraction network to obtain tactile features.
3. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 2, wherein the multimodal input pair is expressed as: $(x_v, x_t)$; in the formula, $x_v$ represents the visual modality input and $x_t$ represents the tactile modality input; the visual features are expressed as: $f_v = E_v(x_v)$; in the formula, $f_v$ represents the visual features and $E_v(\cdot)$ represents the visual feature extraction network; the tactile features are expressed as: $f_t = E_t(x_t)$; in the formula, $f_t$ represents the tactile features and $E_t(\cdot)$ represents the tactile feature extraction network.
4. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification according to claim 3, wherein fusing the feature representations of the modalities to obtain fused semantic features comprises: fusing the visual feature representation and the tactile feature representation by at least one of linear mapping after feature concatenation, attention-weighted fusion, or gated fusion, to obtain the fused semantic features.
5. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 4, wherein the fused semantic features are expressed as: $y = F(f_v, f_t)$; in the formula, $y$ represents the fused semantic features and $F(\cdot)$ represents the feature fusion operation.
6. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 5, wherein the gradient of the quantizer is expressed as: $\frac{\partial L}{\partial \Delta} = \lambda_1 \frac{\partial L_{cls}}{\partial \Delta} + \lambda_2 \frac{\partial R}{\partial \Delta}$; in the formula, $\frac{\partial L}{\partial \Delta}$ represents the gradient of the quantizer, that is, the gradient of the joint loss function $L$ with respect to the learnable quantization step size $\Delta$; $\frac{\partial L_{cls}}{\partial \Delta}$ represents the gradient of the classification task loss function $L_{cls}$ with respect to the learnable quantization step size $\Delta$; $\lambda_1$ represents the preset coefficient of the classification task loss function $L_{cls}$; $\frac{\partial R}{\partial \Delta}$ represents the gradient of the total code rate $R$ with respect to the learnable quantization step size $\Delta$; $\lambda_2$ represents the preset coefficient of the total code rate $R$; $\partial$ represents the partial differential operator; $\frac{\partial L}{\partial \hat{y}_i}$ represents the gradient of the joint loss function $L$ with respect to the $i$-th element $\hat{y}_i$ of the quantized features; $y_i$ represents the $i$-th element of the fused semantic features; $N$ represents the number of elements of the quantized features; and $M$ represents the number of elements of the fused semantic features; the updated learnable quantization step size is: $\Delta^{(t+1)} = \Delta^{(t)} - \eta \frac{\partial L}{\partial \Delta^{(t)}}$; in the formula, $\Delta^{(t+1)}$ represents the learnable quantization step size after the $(t+1)$-th update, $\Delta^{(t)}$ represents the learnable quantization step size after the $t$-th update, $\eta$ represents the learning rate, and $\frac{\partial L}{\partial \Delta^{(t)}}$ represents the gradient of the joint loss function $L$ with respect to the learnable quantization step size after the $t$-th update; the quantized features are expressed as: $\hat{y} = Q(y)$; in the formula, $\hat{y}$ represents the quantized features and $Q(\cdot)$ represents the quantization operation.
7. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 6, wherein the total code rate is expressed as: $R = -\sum_{i=1}^{N} \log_2 P(\hat{y}_i)$; in the formula, $P(\hat{y}_i)$ represents the probability of occurrence of the $i$-th element of the quantized features $\hat{y}$ over its corresponding quantization interval, and $\log_2$ represents the base-2 logarithm; the classification task loss function $L_{cls}$ is expressed as: $L_{cls} = -\sum_{j=1}^{C} c_j \ln \hat{p}_j$; in the formula, $c_j$ represents the true label of the $j$-th defect category, $\hat{p}_j$ represents the predicted probability of the $j$-th defect category, $C$ represents the number of defect categories, and $\ln$ represents the natural logarithm; the joint loss function $L$ is expressed as: $L = \lambda_1 L_{cls} + \lambda_2 R$; in the formula, $\lambda_1$ represents the preset coefficient of the classification task loss function $L_{cls}$, and $\lambda_2$ represents the preset coefficient of the total code rate $R$.
8. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 7, wherein entropy coding the quantized features and transmitting the coded bit stream comprises: modeling the probability distribution of the fused semantic features with a Gaussian mixture model to obtain a probability density function of the fused semantic features; calculating, from the probability density function, the probability of occurrence of each element of the quantized features over its quantization interval; entropy coding the elements of the quantized features according to the probabilities of occurrence to generate a compressed bit stream; and transmitting the bit stream to the receiving end through a communication link.
9. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 8, wherein the probability density function of the fused semantic features is expressed as: $p(y) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y; \mu_k, \sigma_k^2)$; in the formula, $p(y)$ represents the probability density function of the fused semantic features $y$, $\pi_k$ represents the weight of the $k$-th Gaussian component, $K$ represents the number of Gaussian components, $\mu_k$ represents the mean of the $k$-th Gaussian component, $\sigma_k^2$ represents the variance of the $k$-th Gaussian component, and $\mathcal{N}(\cdot)$ represents the Gaussian probability density function.
10. The reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification as claimed in claim 9, wherein the probability of occurrence of an element of the quantized features over its quantization interval is expressed as: $P(\hat{y}_i) = \int_{\hat{y}_i - \Delta/2}^{\hat{y}_i + \Delta/2} p(y)\,dy$; in the formula, $P(\hat{y}_i)$ represents the probability of occurrence of the element of the quantized features over the quantization interval, $\Delta$ is used to control the interval size of the quantized elements, $\hat{y}_i + \Delta/2$ represents the upper limit of the quantization interval corresponding to the element of the quantized features, and $\hat{y}_i - \Delta/2$ represents the lower limit of the quantization interval corresponding to the element of the quantized features.
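The training-side quantities recited in claims 6 to 10 can be illustrated with a short Python sketch. It is not the patented implementation. The following are assumptions made here: a uniform quantizer of the form step * round(y_i / step); a quantization interval of half a step on either side of each quantized value; and the straight-through derivative round(y_i / step) - y_i / step, which is borrowed from learned step size quantization. All function names are hypothetical.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def quantize(y, step):
    """Assumed uniform quantizer: snap each value to a multiple of step."""
    return [step * round(v / step) for v in y]

def interval_probability(q, step, weights, means, stds):
    """Gaussian-mixture probability mass on [q - step/2, q + step/2]."""
    lower, upper = q - step / 2.0, q + step / 2.0
    return sum(w * (normal_cdf(upper, m, s) - normal_cdf(lower, m, s))
               for w, m, s in zip(weights, means, stds))

def total_code_rate(y_hat, step, weights, means, stds):
    """Estimated bits: minus the sum of log2 interval probabilities.
    Assumes every interval has non-zero probability mass."""
    return -sum(math.log2(interval_probability(q, step, weights, means, stds))
                for q in y_hat)

def cross_entropy(labels, probs):
    """Classification loss with natural logarithm; zero labels are skipped."""
    return -sum(c * math.log(p) for c, p in zip(labels, probs) if c > 0)

def joint_loss(cls_loss, rate, coeff_cls, coeff_rate):
    """Weighted sum of the classification loss and the total code rate."""
    return coeff_cls * cls_loss + coeff_rate * rate

def step_gradient(y, step, grad_wrt_y_hat):
    """Chain rule through the quantizer with a straight-through estimator:
    d(y_hat_i)/d(step) is approximated by round(y_i/step) - y_i/step."""
    return sum(g * (round(v / step) - v / step)
               for v, g in zip(y, grad_wrt_y_hat))

def update_step(step, grad, learning_rate, min_step=1e-6):
    """One gradient-descent update, kept strictly positive (assumed guard)."""
    return max(step - learning_rate * grad, min_step)

# Example: a single standard Gaussian as the mixture, two candidate steps.
y = [0.9, -0.3, 0.1, 1.6]
rate_fine = total_code_rate(quantize(y, 0.5), 0.5, [1.0], [0.0], [1.0])
rate_coarse = total_code_rate(quantize(y, 1.0), 1.0, [1.0], [0.0], [1.0])
```

In this example the coarser step gives the lower estimated rate, which is the trade-off the joint loss balances against the classification loss when the step size is learned.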

Description

Reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification

Technical Field

The invention relates to a reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification, and belongs to the technical field of communication.

Background

With the rapid development of the Industrial Internet of Things (IIoT) and intelligent manufacturing technology, applications such as industrial defect detection and equipment state identification place higher demands on real-time processing of multimodal sensing data. In such application scenarios, the system typically needs to collect sensing data from multiple sources, such as visual, tactile and vibration data, simultaneously, and to perform feature extraction, data transmission and classification decisions on edge devices or near-end nodes. However, because field devices are constrained in computing power, power consumption and communication bandwidth, conventional processing approaches that rely on high-bandwidth transmission and complex computation have difficulty meeting practical deployment requirements.

In the prior art, sensing data is generally compressed and classified with a pipeline of compression, decoding and reconstruction, and then classification. That is, the original signal or image is compressed and transmitted using traditional coding standards such as JPEG and JPEG 2000 or a compression method based on deep learning, and the subsequent recognition or classification task is executed only after the receiving end has reconstructed the original data through a decoder. Such methods aim to minimize reconstruction error, and typically compress the data with a fixed quantization step size or a quantization strategy based on rate-distortion optimization.
When bandwidth conditions change, these methods usually adapt to the channel state by adjusting quantization parameters, sacrificing reconstruction quality to reduce the code rate.

However, the prior art has the following defects.

First, the decoding and reconstruction process has high computational complexity and introduces large system delay. Particularly in industrial field environments with low bandwidth and low computing power, the serial processing flow of compression, decoding and classification easily leads to insufficient response speed and excessive system load, and it is difficult to meet the requirements of real-time industrial detection.

Second, traditional compression methods mostly take the reconstruction error of the signal or image as the optimization target. Under low-code-rate compression, a fixed or coarse-grained quantization strategy easily causes irreversible damage to the high-frequency details and key discriminative features of the original data. As a result, even though the data reconstructed at the receiving end retains a certain fidelity in the pixel domain, high-level semantic information is seriously lost and the performance of the subsequent classifier drops significantly. For industrial detection tasks that rely on high-level semantic features to make decisions, such reconstruction-quality-centered compression cannot serve the classification target effectively.

Third, the above problems are further exacerbated in multimodal sensing scenarios.
Because different modalities differ markedly in data distribution, noise characteristics and discriminative contribution, compressing them with a uniform fixed quantization strategy or simple feature concatenation easily causes key modality information with a high contribution to be weakened excessively during compression, while redundant modality information occupies too much of the code stream, so that the overall recognition accuracy and robustness of the system are reduced.

Therefore, how to compress and transmit multimodal features efficiently in a bandwidth-limited environment, avoid a complex decoding and reconstruction process, and effectively retain the semantic information required by classification tasks has become a technical problem to be solved in the art.

Disclosure of Invention

The invention aims to provide a reconstruction-free decoding self-adaptive quantization method for low-code-rate feature compression classification, in which a quantizer with a learnable quantization step size is constructed; in the training stage, a straight-through estimator is used to make the quantization gradient propagable and the learnable quantization step size is updated by gradient descent; after the fused semantic features are quantized, a joint loss function comprising the classification loss and the total code rate is constructed based on the quantized features for end-to-end joint optimization, so that the learnable quantization step len