CN-121985125-A - Synonymous variational image transmitting and receiving method and related model training method
Abstract
Embodiments of the disclosure provide a synonymous variational image sending method, a receiving method, and a related model training method. The sending method comprises: performing synonymous analysis transformation on an original image, subject to a semantic equivalence constraint, to obtain a continuous hidden feature map; determining the probability distribution of the elements of the continuous hidden feature map; calculating, according to the probability distribution, the channel overhead corresponding to the feature vectors in the continuous hidden feature map to generate a channel overhead sequence; determining a target synonymous level according to the current channel state parameters of a channel; determining, according to the channel overhead sequence, a synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level; and transmitting the synonymous signal sequence into the channel as the target transmission signal sequence.
Inventors
- NIU KAI
- LIANG ZIJIAN
- WANG CHANGSHUO
- XU JIN
- ZHANG PING
Assignees
- Beijing University of Posts and Telecommunications (北京邮电大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20260206
Claims (10)
- 1. A synonymous variational image sending method, characterized by comprising: performing synonymous analysis transformation on an original image, subject to a semantic equivalence constraint, to obtain a continuous hidden feature map; determining the probability distribution of each element of the continuous hidden feature map; calculating, according to the probability distribution, the channel overhead corresponding to each feature vector in the continuous hidden feature map, and generating a channel overhead sequence; determining a target synonymous level according to the current channel state parameters of a channel, wherein a plurality of synonymous levels identified by sequence numbers are preset, the sequence number of each synonymous level corresponds to a preset synonymous level division rule, the synonymous level division rule represents how the hidden feature map is divided into a synonymous representation part and a detail representation part at the current synonymous level, and the sequence numbers of the synonymous levels are ordered according to the semantic importance and/or the semantic granularity of the hidden feature map; determining, according to the channel overhead sequence, a synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level; and transmitting the synonymous signal sequence into the channel as a target transmission signal sequence.
- 2. The method of claim 1, wherein determining, according to the channel overhead sequence, a synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level comprises: performing unequal-length nonlinear conversion on the continuous hidden feature map to obtain an intermediate-state feature representation of the continuous hidden feature map; mapping the intermediate-state feature representation, based on the synonymous level division rule, into a plurality of groups of increment signals corresponding to the synonymous levels, wherein each group of increment signals is an estimated difference between the synonymous signal of the current synonymous level and the synonymous signal of the preceding synonymous level, and, for the same hidden feature map, the increment signals corresponding to the respective synonymous levels have the same channel overhead; and superposing, in the order of the sequence numbers of the synonymous levels, the increment signals corresponding to the target synonymous level and the increment signals corresponding to all synonymous levels preceding the target synonymous level to obtain the synonymous signal sequence; or taking the continuous hidden feature map as the input of a synonymous encoder in a pre-trained synonymous variational image transmission model, so that the synonymous encoder maps the continuous hidden feature map from the hidden space to the signal space based on unequal-length nonlinear conversion to obtain a plurality of sub-signal sequences, concatenating the plurality of sub-signal sequences to form a signal sequence, wherein the channel overhead of each sub-signal sequence is determined based on the channel overhead sequence, and separating, from the signal sequence, the synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level.
- 3. The method of claim 2, wherein the channel state parameters comprise a signal-to-noise ratio, and determining the target synonymous level according to the current channel state parameters of the channel comprises: determining, according to the signal-to-noise ratio and based on a preset linear relation or a preset nonlinear relation, the sequence number of the target synonymous level or a soft value sequence representing the sequence number, and determining a weight coefficient sequence corresponding to the features of the target synonymous level; and superposing, in the order of the sequence numbers of the plurality of synonymous levels, the increment signals corresponding to the target synonymous level and the increment signals corresponding to all synonymous levels preceding the target synonymous level, either according to the sequence number and the weight coefficient sequence or according to the soft value sequence and the weight coefficient sequence, to obtain the synonymous signal sequence; the method further comprising: after determining the sequence number of the target synonymous level according to the signal-to-noise ratio based on the preset linear relation or the preset nonlinear relation, sending the sequence number over a side information transmission link to the receiving-end device corresponding to the original image.
- 4. The method of claim 1, wherein the channel state parameters comprise a signal-to-noise ratio, the method further comprising: after determining the probability distribution of each element of the continuous hidden feature map, dividing the continuous hidden feature map, based on the sequence numbers of the plurality of synonymous levels, into a plurality of groups of synonymous representation parts and detail representation parts corresponding to the respective synonymous levels, wherein the synonymous representation part at each synonymous level is the shared space features of all image samples in the synonymous set of that synonymous level corresponding to the original image; the average distortion between any image sample in the synonymous set and the original image satisfies the distortion metric corresponding to that synonymous level, and the distortion metrics of the synonymous levels increase or decrease sequentially with the sequence numbers; the perceptual loss between any image sample in the synonymous set and the original image satisfies the perceptual loss metric corresponding to that synonymous level, and the perceptual loss metrics of the synonymous levels increase or decrease sequentially with the sequence numbers; and estimating the coding rate of the synonymous representation part corresponding to each synonymous level based on the probability distribution of each element of the continuous hidden feature map; wherein determining the target synonymous level according to the current channel state parameters of the channel comprises: determining, according to the signal-to-noise ratio and the coding rate, the sequence number of the target synonymous level or a soft value sequence representing the sequence number.
- 5. A synonymous variational image receiving method, characterized by comprising: receiving a signal sequence from a channel; determining a target synonymous level corresponding to the signal sequence, wherein a plurality of synonymous levels identified by sequence numbers are preset, the sequence number of each synonymous level corresponds to a preset synonymous level division rule, the synonymous level division rule represents how the hidden feature map is divided into a synonymous representation part and a detail representation part at the current synonymous level, and the sequence numbers of the synonymous levels are ordered according to the semantic importance and/or the semantic granularity of the hidden feature map; determining, according to the signal sequence, a continuous hidden feature map corresponding to the target synonymous level; and performing a synonymous synthesis transformation satisfying a semantic equivalence constraint on the continuous hidden feature map to obtain a reconstructed image corresponding to the signal sequence.
- 6. The method of claim 5, wherein determining the target synonymous level corresponding to the signal sequence comprises: acquiring the sequence number of the target synonymous level from a side information transmission link; or determining the sequence number of the target synonymous level from the current signal-to-noise ratio of the channel through a preset linear relation or a preset nonlinear relation; or determining, by a synonymous decoder in a pre-trained synonymous variational image transmission model and according to the signal-to-noise ratio, the sequence number of the target synonymous level or a soft value sequence representing the sequence number.
- 7. The method of claim 5, wherein determining, according to the signal sequence, a continuous hidden feature map corresponding to the target synonymous level comprises: taking the signal sequence as the input of a synonymous decoder in a pre-trained synonymous variational image transmission model, so that the synonymous decoder converts the signal sequence into a plurality of groups of sub-signal sequences based on a channel overhead sequence acquired from a side information transmission link, wherein the length of the sub-signal sequence is consistent with that of the channel overhead sequence, and the channel overhead sequence comprises the channel overhead corresponding to the feature vectors; or performing incremental feature detection on the signal sequence based on a preset unequal-length nonlinear conversion and the synonymous level division rule, so as to map the signal sequence into a plurality of incremental feature detection results corresponding to the synonymous levels; superposing, according to the sequence number of the target synonymous level or a soft value sequence representing the sequence number, the incremental feature detection results corresponding to the target synonymous level and to all synonymous levels preceding the target synonymous level to obtain an intermediate-state feature representation of the continuous hidden feature map; and obtaining the continuous hidden feature map from the intermediate-state feature representation based on unequal-length nonlinear conversion.
- 8. The method according to any one of claims 5 to 7, further comprising: receiving, over a side information link, side information containing a first quantized intermediate coding sequence, and entropy-decoding the side information to obtain the first quantized intermediate coding sequence, wherein the first quantized intermediate coding sequence is obtained by quantizing and entropy-coding the intermediate matrix of a variational entropy model at its bottleneck layer; estimating a mean matrix and a standard deviation matrix from the first quantized intermediate coding sequence using the part of the variational entropy model after the bottleneck layer; correcting the continuous hidden feature map based on the mean matrix and the standard deviation matrix to obtain a corrected continuous hidden feature map; and performing the synonymous synthesis transformation satisfying the semantic equivalence constraint on the corrected continuous hidden feature map to obtain the reconstructed image corresponding to the signal sequence.
- 9. A training method for a synonymous variational image transmission model, characterized by comprising: acquiring an image sample; performing synonymous analysis transformation on the image sample, subject to a semantic equivalence constraint, to obtain a continuous hidden feature map; estimating the probability distribution of each element in the continuous hidden feature map based on a variational entropy model to obtain probability distribution features of each element in the continuous hidden feature map; dividing the continuous hidden feature map, based on the sequence numbers of a plurality of preset synonymous levels, into a plurality of groups of synonymous representation parts and detail representation parts corresponding to the synonymous levels, wherein the synonymous levels are identified by the sequence numbers, the sequence number of each synonymous level corresponds to a preset synonymous level division rule, the synonymous level division rule represents how the hidden feature map is divided into a synonymous representation part and a detail representation part at the current synonymous level, and the sequence numbers of the plurality of synonymous levels are ordered according to the semantic importance and/or the semantic granularity of the hidden feature map; estimating the coding rate of the synonymous representation part of each synonymous level based on the probability distribution features of each element of the continuous hidden feature map; determining the channel overhead corresponding to each feature vector in the continuous hidden feature map, and generating a channel overhead sequence; estimating the signal-to-noise ratio threshold corresponding to each synonymous level according to the coding rate, a reference signal-to-noise ratio and the channel overhead sequence; taking the continuous hidden feature map as the input of a synonymous encoder in the synonymous variational image transmission model, and determining the synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level selected in the current training round; taking the target signal-to-noise ratio threshold corresponding to the target synonymous level as the signal-to-noise ratio of the channel in the current training round, and transmitting the synonymous signal sequence into the channel; inputting the synonymous signal sequence received from the channel into a synonymous decoder in the synonymous variational image transmission model, so that the synonymous decoder determines the target synonymous level and determines, based on the target signal-to-noise ratio threshold, the continuous hidden feature map corresponding to the target synonymous level; performing a synonymous synthesis transformation satisfying the semantic equivalence constraint on the continuous hidden feature map to obtain a reconstructed image; calculating a loss value based on the reconstructed image and the image sample; and updating the synonymous variational image transmission model according to the loss value to obtain the trained synonymous variational image transmission model, wherein the loss value comprises a total loss obtained by weighted summation of loss values calculated with at least two different loss functions.
- 10. A computer device comprising one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, the one or more programs comprising instructions for performing the method of any of claims 1-9.
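The superposition of increment signals recited in claims 2 and 3 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the claimed implementation: each level's increment signal is taken to be a plain list of equal length (the claims state that the increments of the same hidden feature map share the same channel overhead), and the function name `superpose_increments` and the per-level `weights` argument are hypothetical.

```python
def superpose_increments(increments, target_level, weights=None):
    """Sum the increment signals of levels 0..target_level (inclusive).

    increments   -- list of per-level increment signals, each a list of floats
                    of identical length (assumption for this sketch)
    target_level -- sequence number of the target synonymous level
    weights      -- optional per-level weight coefficients (hypothetical);
                    defaults to 1.0 for every level
    """
    if not 0 <= target_level < len(increments):
        raise ValueError("target_level is outside the preset levels")
    if weights is None:
        weights = [1.0] * len(increments)

    signal = [0.0] * len(increments[0])
    # Accumulate in sequence-number order: the target level plus every
    # level that precedes it.
    for level in range(target_level + 1):
        for i, value in enumerate(increments[level]):
            signal[i] += weights[level] * value
    return signal


# Three hypothetical levels, each contributing a two-symbol increment.
increments = [[1.0, 2.0], [0.5, 0.5], [0.25, 0.25]]
plain = superpose_increments(increments, target_level=1)
weighted = superpose_increments(increments, 1, weights=[1.0, 2.0, 1.0])
```

Because each level only adds a difference on top of the preceding levels, raising the target level refines the signal that lower levels already describe instead of replacing it.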
Description
Synonymous variational image transmitting and receiving method and related model training method
Technical Field
The disclosure relates to the technical field of communications, and in particular to a synonymous variational image sending and receiving method and a related model training method.
Background
At present, among the various services of the mobile internet, visual information transmission, chiefly of images and video, accounts for the great majority of network bandwidth. Traditional visual information transmission follows the separate coding framework supported by Shannon's source-channel coding theorem: image/video compression and data transmission are split into two mutually independent processes, with a distortion-limited image/video compression scheme designed for syntactic symbol accuracy and a transmission process designed for bit-level reliability. However, as wireless communication technology becomes more intelligent, this syntax-level mode of visual information transmission exposes a number of problems, such as poor transmission robustness, weak channel adaptivity, and a mismatch between symbol accuracy and perceived quality, and it struggles to meet the requirements of intelligent transmission.
According to semantic information theory, the same perceptual meaning or semantic content can be expressed through different syntactic symbols or data representations, and these different representations can achieve an equivalent restoration of meaning at the perceptual level; that is, the representations differ in form but are identical in meaning. This breaks through the traditional syntax-level processing logic, in which representations correspond one to one and only exact restoration is considered effective.
Against this background, the related art has developed a semantic-level communication paradigm starting from the joint source-channel coding framework, which in theory performs better. This paradigm builds an end-to-end semantic communication model on a deep neural network to realize joint source-channel coding. The channel state is introduced into the neural network processing flow through an attention mechanism, and the model is trained by traversing channel states, so that channel state adaptation is achieved implicitly. However, this approach adapts poorly to semantic granularity, so channel resources are mismatched with semantic requirements, channel adaptation accuracy is low, and resource utilization is poor.
Disclosure of Invention
The disclosure provides a synonymous variational image sending and receiving method and a related model training method, so as to solve or partially solve the above problems.
A first aspect of the disclosure provides a synonymous variational image sending method, comprising: performing synonymous analysis transformation on an original image, subject to a semantic equivalence constraint, to obtain a continuous hidden feature map; determining the probability distribution of the semantics of each element of the continuous hidden feature map; calculating, according to the probability distribution of the semantics, the channel overhead corresponding to each feature vector in the continuous hidden feature map to generate a channel overhead sequence; determining a target synonymous level according to the current channel state parameters of a channel, wherein a plurality of synonymous levels identified by sequence numbers are preset, the sequence number of each synonymous level corresponds to a preset synonymous level division rule, the synonymous level division rule represents how the hidden feature map is divided into a synonymous representation part and a detail representation part at the current synonymous level, and the sequence numbers of the synonymous levels are ordered according to the semantic importance and/or the semantic granularity of the hidden feature map; determining, according to the channel overhead sequence, the synonymous signal sequence corresponding to the synonymous representation part of the target synonymous level; and transmitting the synonymous signal sequence into the channel as the target transmission signal sequence.
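As a rough illustration of the overhead and level-selection steps of the first aspect, the Python sketch below makes two assumptions that the disclosure does not specify: the channel overhead of a feature vector is taken as its estimated information content in bits, scaled by a hypothetical factor `eta` and clipped to a range; and the target synonymous level is taken as the highest level whose hypothetical signal-to-noise ratio threshold the current signal-to-noise ratio meets. The function names, `eta`, the clipping bounds and the threshold values are illustrative only.

```python
import math


def channel_overhead_sequence(likelihoods, eta=0.5, k_min=1, k_max=32):
    """Map per-element probabilities to a per-vector channel overhead.

    likelihoods -- list of feature vectors, each a list of probabilities
                   in (0, 1] assigned to its elements
    eta         -- hypothetical scaling from bits to channel symbols
    """
    overheads = []
    for vector in likelihoods:
        # Estimated information content of the vector, in bits.
        bits = -sum(math.log2(p) for p in vector)
        symbols = int(round(eta * bits))
        overheads.append(max(k_min, min(k_max, symbols)))
    return overheads


def select_target_level(snr_db, thresholds_db):
    """Pick the highest level whose threshold is met (assumed rule).

    thresholds_db -- hypothetical ascending SNR thresholds, one per level;
                     level 0 is used when no threshold is met
    """
    level = 0
    for index, threshold in enumerate(thresholds_db):
        if snr_db >= threshold:
            level = index
    return level


# Two hypothetical feature vectors carrying 2 and 8 bits respectively.
overheads = channel_overhead_sequence([[0.5, 0.5], [0.25, 0.25, 0.25, 0.25]])
level = select_target_level(10.0, thresholds_db=[0.0, 5.0, 12.0])
```

With `eta=0.5`, the two example vectors map to overheads of 1 and 4 channel symbols, and a 10 dB channel selects level 1 under the example thresholds. Whether a higher sequence number denotes finer or coarser semantic granularity is a design choice of the preset division rules, so the direction of the threshold rule here is an assumption.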
A second aspect of the disclosure provides a synonymous variational image receiving method, comprising: receiving a signal sequence from a channel; determining a target synonymous level corresponding to the signal sequence, wherein a plurality of synonymous levels identified by sequence numbers are preset, the sequence number of each synonymous level corresponds to a preset synonymous level division rule, the synonymous level division rule is used for representing the division mode of a synonymous representation part and a detail representation pa