CN-122027796-A - Data transmission method, intelligent glasses terminal and receiving terminal
Abstract
The application discloses a data transmission method, an intelligent glasses terminal and a receiving terminal, and relates to the technical field of cross-terminal communication. The data transmission method, as applied to the intelligent glasses terminal, comprises: obtaining image data, audio data, text data and sensor data; determining a semantic compression rate of the image data according to the semantic similarity between the image data and the other modal data, wherein the semantic compression rate is positively correlated with the semantic similarity; compressing the image data according to the semantic compression rate to obtain compressed image data; separately compressing the audio data, the text data, the sensor data and the compressed image data to obtain a plurality of modal compression data; and transmitting the modal compression data to a preset receiving end. By evaluating cross-modal semantic association at the intelligent glasses terminal, the application compresses the image data adaptively, which reduces the amount of data transmitted and improves the efficiency of cross-terminal transmission of multi-modal data.
Inventors
- HU SONG
Assignees
- 深圳市歌尔泰克科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-09
Claims (10)
- 1. A data transmission method, characterized by being applied to an intelligent glasses terminal and comprising the following steps: acquiring multi-modal data, wherein the multi-modal data comprises image data, audio data, text data and sensor data; determining a semantic compression rate of the image data according to the semantic similarity between the image data and other modal data, wherein the semantic compression rate is positively correlated with the semantic similarity; compressing the image data according to the semantic compression rate of the image data to obtain compressed image data; separately compressing the audio data, the text data, the sensor data and the compressed image data to obtain a plurality of modal compression data; and sending each modal compression data to a preset receiving end, so that the receiving end decompresses each modal compression data to obtain the audio data, the text data, the sensor data and the compressed image data, and performs semantic restoration on the compressed image data according to the other modal data to obtain restored image data.
- 2. The data transmission method according to claim 1, wherein the step of determining the semantic compression rate of the image data according to the semantic similarity between the image data and the other modal data includes: determining the semantic similarity between the image data and the other modal data; determining the information importance of each modal data according to the data priority of each modal data; for any target modal data among the other modal data, determining a semantic association weight between the image data and the target modal data according to the semantic similarity between the image data and the target modal data, the information importance of the image data and the information importance of the target modal data, wherein the semantic association weight is positively correlated with the semantic similarity and negatively correlated with the information importance; and determining the semantic compression rate according to a preset basic compression rate of the image data and the semantic association weights between the image data and the other modal data, wherein the semantic compression rate is positively correlated with the semantic association weights.
- 3. The data transmission method according to claim 1, wherein the step of transmitting each of the modal compressed data to a predetermined receiving end includes: acquiring transmission parameters of a plurality of preset transmission links, wherein the transmission parameters comprise at least one of delay, packet loss rate and bandwidth; determining a target transmission link of each mode compressed data according to the transmission parameters of each transmission link and the parameter weight coefficient of each mode compressed data, wherein the parameter weight coefficient is determined according to a preset mapping strategy between the data priority of each mode compressed data and the weight coefficient; And sending the modal compressed data corresponding to the target transmission link to the receiving end through the target transmission link.
- 4. A data transmission method according to claim 3, wherein the step of determining the target transmission link of each of the modal compressed data based on the transmission parameters of each of the transmission links and the parameter weight coefficient of each of the modal compressed data comprises: For any modal compressed data, determining a state evaluation score of each transmission link according to the transmission parameters of each transmission link and the parameter weight coefficients of the modal compressed data; Judging whether a transmission link with the state evaluation score being greater than or equal to a preset score threshold exists or not; In the case that a transmission link with a state evaluation score greater than or equal to the score threshold exists, determining the transmission link with the highest state evaluation score as a target transmission link of the modal compressed data; Under the condition that the state evaluation scores of all the transmission links are smaller than the score threshold, determining the difference value between the state evaluation scores of all the transmission links, and judging whether the difference value is smaller than a preset difference value threshold or not; And when the difference value is smaller than the difference value threshold value, determining two transmission links with the difference value smaller than the difference value threshold value as transmission link pairs, selecting a target link pair from the transmission link pairs, and determining the transmission link contained in the target link pair as the target transmission link of the modal compressed data.
- 5. The data transmission method according to claim 1, wherein the modality compression data includes image compression data, audio compression data, text compression data, and sensor compression data, and the compressing the audio data, the text data, the sensor data, and the compressed image data, respectively, to obtain the plurality of modality compression data includes: dictionary coding is carried out on the text data to obtain the text compression data; performing differential encoding on the sensor data to obtain the sensor compressed data; Identifying emotion characteristics of the audio data through a preset emotion analysis model, dividing the audio data according to the emotion characteristics to obtain emotion fluctuation fragments and stable fragments, and respectively compressing the emotion fluctuation fragments and the stable fragments to obtain the audio compression data, wherein the compression rate of the emotion fluctuation fragments is smaller than that of the stable fragments; And identifying a key region and a background region of the compressed image data through a preset semantic segmentation model, and respectively compressing the key region and the background region to obtain the image compressed data, wherein the compression rate of the key region is smaller than that of the background region.
- 6. The data transmission method according to claim 5, wherein the step of identifying the key region and the background region of the compressed image data by a preset semantic segmentation model comprises: dividing the compressed image data into a plurality of target areas through the semantic division model, and determining semantic keywords of each target area; Determining semantic similarity between semantic keywords of each target area and preset target keywords, wherein the target keywords comprise audio keywords which are obtained by extracting keywords from the audio data; and determining a key area from a target area with the semantic similarity being greater than or equal to a preset similarity threshold value, and determining a background area from a target area with the semantic similarity being smaller than the similarity threshold value.
- 7. A data transmission method, characterized by being applied to a receiving end and comprising the following steps: receiving a plurality of modal compression data sent by an intelligent glasses terminal, wherein the plurality of modal compression data are obtained by the intelligent glasses terminal separately compressing acquired audio data, text data, sensor data and compressed image data, the compressed image data are obtained by compressing image data according to a semantic compression rate of the image data, the semantic compression rate is determined according to the semantic similarity between the image data and other modal data, and the semantic compression rate is positively correlated with the semantic similarity; decompressing each of the modal compression data to obtain the audio data, the text data, the sensor data and the compressed image data; and performing semantic restoration on the compressed image data according to the other modal data to obtain restored image data.
- 8. The data transmission method according to claim 7, wherein the step of performing semantic restoration on the compressed image data according to the other modal data to obtain restored image data includes: Extracting semantic information from the other modal data, and mapping the semantic information to an image feature space of the compressed image data to obtain mapping features; and repairing the compressed image data according to the mapping characteristics to obtain restored image data.
- 9. An intelligent glasses terminal, characterized in that it comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program being configured to implement the steps of the data transmission method according to any one of claims 1 to 6.
- 10. A receiving end, characterized in that it comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program being configured to implement the steps of the data transmission method according to claim 7 or 8.
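The link selection described in claims 3 and 4 can be illustrated with the following minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the scoring formula, the normalisation constants MAX_DELAY_MS and MAX_BANDWIDTH_MBPS, the rule for choosing among candidate link pairs (highest combined score) and the fallback to the single best link when no pair qualifies are hypothetical choices that the claims do not specify. The names Link, Weights, score and select_links are introduced only for this example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Link:
    name: str
    delay_ms: float
    loss_rate: float        # packet loss rate in [0, 1]
    bandwidth_mbps: float


@dataclass(frozen=True)
class Weights:
    """Parameter weight coefficients for one kind of modal compression data."""
    delay: float
    loss: float
    bandwidth: float


# Assumed normalisation bounds; the claims give no concrete values.
MAX_DELAY_MS = 500.0
MAX_BANDWIDTH_MBPS = 100.0


def score(link: Link, w: Weights) -> float:
    """State evaluation score: higher is better (assumed weighted sum)."""
    delay_term = 1.0 - min(link.delay_ms / MAX_DELAY_MS, 1.0)
    loss_term = 1.0 - min(link.loss_rate, 1.0)
    bandwidth_term = min(link.bandwidth_mbps / MAX_BANDWIDTH_MBPS, 1.0)
    return w.delay * delay_term + w.loss * loss_term + w.bandwidth * bandwidth_term


def select_links(links, w, score_threshold, diff_threshold):
    """Return the names of the target link(s) for one modal compression data."""
    scores = {link.name: score(link, w) for link in links}
    best = max(links, key=lambda link: scores[link.name])

    # Case 1: at least one link reaches the threshold, so use the best one.
    if scores[best.name] >= score_threshold:
        return [best.name]

    # Case 2: no link is good enough; pair up links with similar scores.
    pairs = [
        (a, b)
        for a, b in combinations(links, 2)
        if abs(scores[a.name] - scores[b.name]) < diff_threshold
    ]
    if pairs:
        # Assumption: the target pair is the one with the highest combined score.
        a, b = max(pairs, key=lambda p: scores[p[0].name] + scores[p[1].name])
        return [a.name, b.name]

    # Assumption: fall back to the single best link when no pair qualifies.
    return [best.name]


if __name__ == "__main__":
    weights = Weights(delay=0.4, loss=0.3, bandwidth=0.3)
    candidates = [
        Link("bluetooth", 300.0, 0.2, 10.0),
        Link("cellular", 250.0, 0.3, 20.0),
    ]
    print(select_links(candidates, weights, score_threshold=0.8, diff_threshold=0.1))
```

In this sketch, returning two link names corresponds to sending the same modal compression data over both links of the target link pair. The weights would come from the preset mapping between data priority and weight coefficients described in claim 3.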
Description
Data transmission method, intelligent glasses terminal and receiving terminal
Technical Field
The present application relates to the field of cross-terminal communication technologies, and in particular to a data transmission method, an intelligent glasses terminal and a receiving terminal.
Background
With the development of wearable device technology, AI (artificial intelligence) glasses have made remarkable progress. They can acquire multi-modal data in real time, for example images and video through a camera, audio through a microphone, text through optical character recognition, and motion posture data through sensors, and they exchange data with terminal devices over wireless communication (Bluetooth, Wi-Fi, 5G) to store, process and feed back the multi-modal data.
At present, cross-terminal data transmission between AI glasses and a terminal device is mainly handled in one of two ways: the raw multi-modal data acquired by the AI glasses is transmitted directly, or each modality is processed independently with a general-purpose compression algorithm and then transmitted (for example, images are compressed with the JPEG algorithm and audio with the MP3 algorithm). However, the multi-modal data collected by AI glasses, especially high-definition images, video, audio and sensor data, is very large. If the raw data, or data processed only by a single-modality general-purpose compression algorithm, is transmitted in a low-bandwidth scenario, such as an area with weak Wi-Fi coverage or a 5G network in a remote area, stuttering and packet loss may occur, seriously reducing the transmission efficiency of the multi-modal data.
The foregoing is provided merely to facilitate understanding of the technical solutions of the present application and is not an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the application is to provide a data transmission method, an intelligent glasses terminal and a receiving terminal, aiming to solve the technical problem of how to improve the transmission efficiency of cross-terminal transmission of multi-modal data.
To achieve the above purpose, the application provides a data transmission method applied to an intelligent glasses terminal, the method comprising: acquiring multi-modal data, wherein the multi-modal data comprises image data, audio data, text data and sensor data; determining a semantic compression rate of the image data according to the semantic similarity between the image data and other modal data, wherein the semantic compression rate is positively correlated with the semantic similarity; compressing the image data according to the semantic compression rate of the image data to obtain compressed image data; separately compressing the audio data, the text data, the sensor data and the compressed image data to obtain a plurality of modal compression data; and sending each modal compression data to a preset receiving end, so that the receiving end decompresses each modal compression data to obtain the audio data, the text data, the sensor data and the compressed image data, and performs semantic restoration on the compressed image data according to the other modal data to obtain restored image data.
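The relationship between semantic similarity and the semantic compression rate can be illustrated with the following minimal Python sketch. It only demonstrates the stated correlations: the association weight rises with semantic similarity and falls as information importance rises, and the compression rate rises with the association weight. The concrete formulas, the averaging over modalities, the upper bound max_rate and the names Modality, association_weight and semantic_compression_rate are assumptions made for this example and are not specified by the application. The compression rate is assumed here to be a fraction in which a larger value means stronger compression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    similarity: float   # semantic similarity to the image data, in [0, 1]
    importance: float   # information importance of this modality, in [0, 1]


def association_weight(similarity: float,
                       image_importance: float,
                       target_importance: float) -> float:
    """Assumed form: grows with similarity, shrinks as either importance grows."""
    return similarity / (1.0 + image_importance + target_importance)


def semantic_compression_rate(base_rate: float,
                              image_importance: float,
                              others: list,
                              max_rate: float = 0.95) -> float:
    """Scale the preset basic compression rate by the mean association weight."""
    if not others:
        return base_rate
    mean_weight = sum(
        association_weight(m.similarity, image_importance, m.importance)
        for m in others
    ) / len(others)
    # A higher association weight means the image is more redundant with the
    # other modalities, so it is compressed harder (capped at max_rate).
    return min(base_rate * (1.0 + mean_weight), max_rate)


if __name__ == "__main__":
    others = [
        Modality("audio", similarity=0.9, importance=0.5),
        Modality("text", similarity=0.6, importance=0.3),
        Modality("sensor", similarity=0.2, importance=0.2),
    ]
    print(semantic_compression_rate(0.5, image_importance=0.5, others=others))
```

The intuition is that when the other modalities already carry much of the image's meaning, the receiving end can restore more of the image from them, so the image tolerates a higher compression rate.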
In an embodiment, the step of determining the semantic compression rate of the image data according to the semantic similarity between the image data and the other modal data includes: determining the semantic similarity between the image data and the other modal data; determining the information importance of each modal data according to the data priority of each modal data; for any target modal data among the other modal data, determining a semantic association weight between the image data and the target modal data according to the semantic similarity between the image data and the target modal data, the information importance of the image data and the information importance of the target modal data, wherein the semantic association weight is positively correlated with the semantic similarity and negatively correlated with the information importance; and determining the semantic compression rate according to a preset basic compression rate of the image data and the semantic association weights between the image data and the other modal data, wherein the semantic compression rate is positively correlated with the semantic association weights.
In an embodiment, the step of sending each of the modal compression data to a preset receiving end includes: acquiring transmission parameters of a plurality of preset transmission links, wherein the transmission parameters comprise at least one of delay, packet loss rate and bandwidth; determining a target transmission link