CN-121980072-A - Multi-mode data retrieval method and device and related equipment


Abstract

An embodiment of the invention discloses a multi-modal data retrieval method, a multi-modal data retrieval apparatus, and related equipment. The method comprises: acquiring multi-modal data and preprocessing it to obtain standard data; performing multi-modal semantic annotation on the standard data to generate a semantic tag for each piece of multi-modal data; storing each semantic tag in association with its corresponding original data to obtain a retrieval database; acquiring a description text of a retrieval requirement and performing multi-dimensional semantic analysis on it to obtain intention information in multiple dimensions; and performing data matching based on the intention information of each dimension and the semantic tags in the retrieval database to obtain and output a plurality of candidate matching results. By converting data of different modalities into standardized semantic tags through unified semantic annotation, the method integrates data across types and thereby improves retrieval efficiency and accuracy.

Inventors

LIU YU

Assignees

北京泰信天成科技有限公司

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-30

Claims (10)

1. A multi-modal data retrieval method, comprising: acquiring multi-modal data, and preprocessing the multi-modal data to obtain standard data; performing multi-modal semantic annotation on the standard data to generate a semantic tag for each piece of multi-modal data; storing each semantic tag in association with the corresponding original data to obtain a retrieval database; acquiring a description text of a retrieval requirement, and performing multi-dimensional semantic analysis on the description text to obtain intention information in multiple dimensions; and performing data matching based on the intention information of each dimension and the semantic tags in the retrieval database to obtain and output a plurality of candidate matching results.
2. The multi-modal data retrieval method as claimed in claim 1, wherein the matching of the intention information of each dimension with the semantic tags in the retrieval database according to a preset matching algorithm to obtain and output a plurality of candidate matching results comprises: dividing the intention information of each dimension into key intention information and auxiliary intention information; comparing the intention information of each dimension of the key intention information with the semantic tags of the corresponding dimension to obtain corresponding preliminary matching results; calculating the similarity between the intention information of each dimension of the auxiliary intention information and the semantic tag of the corresponding dimension of each preliminary matching result, and taking each preliminary matching result whose similarity is greater than a preset similarity threshold as a candidate matching result; performing a weighted summation of the matching degrees of the candidate matching results based on preset dimension weights to obtain a comprehensive matching score for each candidate matching result; and sorting the candidate matching results in descending order according to a preset data priority and the comprehensive matching score, and displaying the candidate matching results in that order.
3. The multi-modal data retrieval method as claimed in claim 1, wherein the multi-modal data includes text data, image data and video data, and wherein the preprocessing of the multi-modal data to obtain standard data comprises: performing de-duplication and format unification on the text data; performing noise reduction and size standardization on the image data; and performing repeated-frame removal and voice denoising on the video data.
4. The multi-modal data retrieval method as claimed in claim 2, wherein the performing of multi-modal semantic annotation on the standard data to generate a semantic tag for each piece of multi-modal data comprises: performing extraction on the text data according to a preset semantic annotation algorithm to obtain a plurality of semantic entities and the entity relations among them, and combining the semantic entities according to the entity relations to obtain the semantic tag of the corresponding text data; extracting key information from the image data through image semantic recognition, and combining the key information to obtain the semantic tag of the corresponding image data; and performing image semantic annotation and time-axis annotation on key frames of the video data, and performing text recognition on the audio information, to obtain the corresponding semantic tags.
5. The multi-modal data retrieval method as claimed in claim 1, wherein the dimensions of the intention information include field, time, place, person, object, event, and requirement.
6. A multi-modal data retrieval apparatus, comprising: a preprocessing module, configured to acquire multi-modal data and preprocess the multi-modal data to obtain standard data; a labeling module, configured to perform multi-modal semantic annotation on the standard data and generate a semantic tag for each piece of multi-modal data; an association module, configured to store each semantic tag in association with the corresponding original data to obtain a retrieval database; an analysis module, configured to acquire a description text of a retrieval requirement and perform multi-dimensional semantic analysis on the description text to obtain intention information in multiple dimensions; and a matching module, configured to perform data matching based on the intention information of each dimension and the semantic tags in the retrieval database, and to obtain and output a plurality of candidate matching results.
7. The multi-modal data retrieval apparatus as claimed in claim 1, wherein the matching module comprises: a dividing unit, configured to divide the intention information of each dimension into key intention information and auxiliary intention information; a comparison unit, configured to compare the intention information of each dimension of the key intention information with the semantic tags of the corresponding dimension to obtain corresponding preliminary matching results; a similarity calculation unit, configured to calculate the similarity between the intention information of each dimension of the auxiliary intention information and the semantic tag of the corresponding dimension of each preliminary matching result, and to take each preliminary matching result whose similarity is greater than a preset similarity threshold as a candidate matching result; a weighted summation unit, configured to perform a weighted summation of the matching degrees of the candidate matching results based on preset dimension weights to obtain a comprehensive matching score for each candidate matching result; and a sorting unit, configured to sort the candidate matching results in descending order according to a preset data priority and the comprehensive matching score and to display the candidate matching results in that order.
8. The multi-modal data retrieval apparatus as claimed in claim 1, wherein the labeling module comprises: a text labeling unit, configured to extract entities from the text data according to a preset semantic labeling algorithm to obtain a plurality of semantic entities, and to combine the semantic entities to obtain the semantic tag of the corresponding text data; an image labeling unit, configured to extract key information from the image data through image semantic recognition and to obtain the semantic tag of the corresponding image data by combining the key information; and a video labeling unit, configured to perform image semantic annotation and time-axis annotation on key frames of the video data, and to perform text recognition on the audio information, to obtain the corresponding semantic tags.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the multimodal data retrieval method of any of claims 1 to 5 when the computer program is executed.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the multimodal data retrieval method according to any of claims 1 to 5.
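
The matching procedure recited in claim 2 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Record` type, the `token_similarity` function (a Jaccard word overlap standing in for whatever semantic similarity is actually used), the exact-equality test for key dimensions, the rule that every auxiliary dimension must exceed the threshold, and the choice to sort by priority first and score second are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical database entry: semantic tags keyed by dimension."""
    record_id: str
    tags: dict = field(default_factory=dict)  # dimension -> semantic tag
    priority: int = 0  # assumed "preset data priority"; higher sorts first


def token_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (an assumed stand-in metric)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def match(intent, key_dims, records, weights, threshold=0.3):
    """Return (record, score) pairs ordered by priority, then score.

    intent   : dimension -> intention text (must contain every key dimension)
    key_dims : dimensions treated as key intention information
    weights  : dimension -> preset weight used in the weighted sum
    """
    aux_dims = [d for d in intent if d not in key_dims]
    results = []
    for rec in records:
        # Preliminary match: every key dimension must equal the stored tag.
        if any(rec.tags.get(d, "").lower() != intent[d].lower() for d in key_dims):
            continue
        # Candidate filter: every auxiliary similarity must exceed the threshold.
        sims = {d: token_similarity(intent[d], rec.tags.get(d, "")) for d in aux_dims}
        if any(s <= threshold for s in sims.values()):
            continue
        # Comprehensive score: weighted sum of per-dimension matching degrees.
        degrees = {d: 1.0 for d in key_dims}
        degrees.update(sims)
        score = sum(weights.get(d, 0.0) * m for d, m in degrees.items())
        results.append((rec, score))
    # Descending order by (priority, score).
    results.sort(key=lambda pair: (pair[0].priority, pair[1]), reverse=True)
    return results
```

Calling `match` with `key_dims=["person"]` keeps only records whose `person` tag equals the query's person, then drops those whose remaining dimensions are too dissimilar, so the threshold acts as a filter while the weights only affect ordering.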

Description

Multi-mode data retrieval method and device and related equipment
Technical Field
The embodiments of the invention relate to the technical field of data processing, and in particular to a multi-modal data retrieval method, apparatus and related equipment.
Background
Multi-modal data is growing explosively, and its formats (text, image, video and so on) are heterogeneous. Traditional systems struggle to process and jointly retrieve data across types in a unified way, which limits how much value can be mined from the data. Retrieval requirements in fields such as law are complex and require accurate extraction of multi-dimensional information such as time, people and events; intent recognition in the prior art is ambiguous and is difficult to match with such fine-grained retrieval requirements. Traditional retrieval relies on matching single keywords and lacks semantic-level understanding, so relevant items are easily missed and irrelevant items are easily returned. Multi-modal retrieval also responds slowly and can hardly meet the need for efficient retrieval over millions of data items. Existing systems further suffer from non-uniform data annotation standards, a pronounced semantic gap between modalities, an inability to map data to standardized tags, and insufficient retrieval accuracy, and they have difficulty supporting requirements such as evidence integration in professional scenarios.
Disclosure of Invention
The embodiments of the invention provide a multi-modal data retrieval method, a multi-modal data retrieval apparatus and related equipment, aiming to solve the technical problem that it is difficult in the prior art to prevent sensitive data from being leaked while guaranteeing the usability of semantic retrieval.
In a first aspect, an embodiment of the present invention provides a multi-modal data retrieval method, comprising: acquiring multi-modal data, and preprocessing the multi-modal data to obtain standard data; performing multi-modal semantic annotation on the standard data to generate a semantic tag for each piece of multi-modal data; storing each semantic tag in association with the corresponding original data to obtain a retrieval database; acquiring a description text of a retrieval requirement, and performing multi-dimensional semantic analysis on the description text to obtain intention information in multiple dimensions; and performing data matching based on the intention information of each dimension and the semantic tags in the retrieval database to obtain and output a plurality of candidate matching results.
In a second aspect, an embodiment of the present invention provides a multi-modal data retrieval apparatus, comprising: a preprocessing module, configured to acquire multi-modal data and preprocess the multi-modal data to obtain standard data; a labeling module, configured to perform multi-modal semantic annotation on the standard data and generate a semantic tag for each piece of multi-modal data; an association module, configured to store each semantic tag in association with the corresponding original data to obtain a retrieval database; an analysis module, configured to acquire a description text of a retrieval requirement and perform multi-dimensional semantic analysis on the description text to obtain intention information in multiple dimensions; and a matching module, configured to perform data matching based on the intention information of each dimension and the semantic tags in the retrieval database, and to obtain and output a plurality of candidate matching results.
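
The five steps of the first aspect can be sketched end to end in Python as follows. This is a toy illustration under stated assumptions, not the disclosed system: the `VOCAB` table, the substring-based `annotate` function, and the in-memory list used as the retrieval database are hypothetical, and image or video items are assumed to arrive with a caption or transcript already produced by some upstream recognizer.

```python
import re

# Hypothetical vocabulary: dimension -> known terms. A real system would use
# entity extraction and image/video recognition models instead of a lookup.
VOCAB = {
    "person": ["zhang san", "li si"],
    "place": ["beijing", "shanghai"],
    "event": ["contract dispute", "traffic accident"],
}


def preprocess(text: str) -> str:
    """Format unification: collapse whitespace and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def annotate(standard_text: str) -> dict:
    """Return dimension -> first vocabulary term found in the text."""
    tags = {}
    for dim, terms in VOCAB.items():
        for term in terms:
            if term in standard_text:
                tags[dim] = term
                break
    return tags


def build_database(items):
    """items: iterable of (item_id, modality, text_or_transcript)."""
    db, seen = [], set()
    for item_id, modality, raw in items:
        std = preprocess(raw)
        if (modality, std) in seen:  # de-duplicate within a modality
            continue
        seen.add((modality, std))
        # Store the semantic tags in association with the original data.
        db.append({"id": item_id, "modality": modality,
                   "original": raw, "tags": annotate(std)})
    return db


def search(db, description: str):
    """Parse the query into per-dimension intent and rank entries by overlap."""
    intent = annotate(preprocess(description))
    hits = []
    for entry in db:
        matched = sum(1 for d, v in intent.items() if entry["tags"].get(d) == v)
        if matched:
            hits.append((matched, entry["id"]))
    hits.sort(key=lambda h: h[0], reverse=True)  # stable: ties keep insertion order
    return [item_id for _, item_id in hits]
```

Because queries and stored items pass through the same `preprocess` and `annotate` functions, a text record and an image caption that mention the same person receive the same tag, which is the property the unified semantic tags are meant to provide.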
In a third aspect, an embodiment of the present invention further provides a computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the multi-modal data retrieval method according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the multi-modal data retrieval method according to the first aspect.
The embodiments of the invention provide a multi-modal data retrieval method, a multi-modal data retrieval apparatus and related equipment. The method comprises: acquiring multi-modal data and preprocessing it to obtain standard data; performing multi-modal semantic annotation on the standard data to generate a semantic tag for each piece of multi-modal data; storing each semantic tag in association with its corresponding original data to obtain a retrieval database; acquiring a description text of a retrieval requirement and performing multi-dimensional semantic analysis on it to obtain intention information in multiple dimensions; and performing data matching based on the intention information of each d