CN-122019833-A - Image retrieval method, device and equipment for target image information
Abstract
An embodiment of the invention provides an image retrieval method, device, and equipment for target image information. The method comprises: obtaining target image information to be retrieved; analyzing the target image information and extracting features from it to obtain target image feature information; matching the target image feature information in a preset image data structured index library to retrieve a retrieval result containing matched target image data and the structured description information associated with that data; and outputting the retrieval result. The image data structured index library is built by: obtaining multimodal raw data comprising images, videos, and text descriptions of the same target object; performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information; and storing the structured description information, according to a preset storage format, as an index structure that supports feature-based matching retrieval. Embodiments of the invention enable efficient understanding and accurate matching of the deep semantic content of images.
Inventors
- ZHANG XINPENG
- WANG XINYONG
- LUO XINKAI
- DING ZHEN
Assignees
- 中译文娱科技(青岛)有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-12-26
Claims (10)
- 1. An image retrieval method of target image information, comprising: acquiring target image information to be retrieved; analyzing the target image information and extracting features from it to obtain target image feature information; matching the target image feature information in a preset image data structured index library, and retrieving a retrieval result containing matched target image data and structured description information associated with the target image data; and outputting the retrieval result; wherein the image data structured index library is obtained through the following steps: acquiring multimodal raw data containing images, videos, and text descriptions of the same target object; performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information; and storing the structured description information, according to a preset storage format, as an index structure supporting feature-based matching retrieval, to obtain the image data structured index library.
- 2. The image retrieval method of target image information according to claim 1, wherein performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information comprises: preprocessing the images in the multimodal raw data and extracting features from them to obtain first structured description information; performing preprocessing, key frame extraction, and feature extraction on the videos in the multimodal raw data to obtain second structured description information; performing natural language processing and information extraction on the text descriptions in the multimodal raw data to obtain third structured description information; and associating the first structured description information, the second structured description information, and the third structured description information to obtain associated structured description information.
- 3. The image retrieval method of target image information according to claim 2, wherein associating the first structured description information, the second structured description information, and the third structured description information to obtain the associated structured description information comprises: extracting target category labels from the first structured description information, the second structured description information, and the third structured description information; establishing an association relation among the cross-modal data by using a common target category label as the association key; and integrating all associated multimodal structured information for each target category to obtain the associated structured description information.
- 4. The image retrieval method of target image information according to claim 3, wherein storing the structured description information, according to a preset storage format, as an index structure supporting feature-based matching retrieval, to obtain the image data structured index library, comprises: extracting the feature vectors of each modality from the structured description information and constructing a multimodal feature vector library; establishing inverted index items with the target category label as the index key; establishing a structured description index entry for each target category label; and packaging the multimodal feature vector library, the inverted index items, and the index entries into the image data structured index library.
- 5. The image retrieval method of target image information according to claim 1, wherein analyzing the target image information and extracting features from it to obtain target image feature information comprises: determining the modality type of the target image information from the target image information; and analyzing the target image information and extracting features from it according to the modality type to obtain the target image feature information.
- 6. The image retrieval method of target image information according to claim 1, wherein matching the target image feature information in a preset image data structured index library, and retrieving a retrieval result containing the matched target image data and the structured description information associated with the target image data, comprises: calculating a comprehensive similarity between the target image feature information and content feature information in the image data structured index library; sorting and screening candidate items according to the comprehensive similarity to obtain a sorted candidate list; and retrieving and assembling the matched target image data and the structured description information from the image data structured index library to obtain the retrieval result.
- 7. The image retrieval method of target image information according to claim 1, further comprising: acquiring a secondary screening instruction triggered by a user based on the retrieval result; and filtering, reordering, and updating the retrieval result according to the secondary screening instruction.
- 8. An image retrieval device for target image information, comprising: an acquisition module configured to acquire target image information to be retrieved; and a processing module configured to: analyze the target image information and extract features from it to obtain target feature information; match the target feature information in a preset image data structured index library, and retrieve a retrieval result containing matched image data and structured description information associated with the image data; and output the retrieval result; wherein the image data structured index library is constructed by: obtaining multimodal raw data containing images, videos, and associated texts; performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information; and organizing the structured description information according to a preset specification and storing it as an index structure supporting feature-based matching retrieval, to obtain the image data structured index library.
- 9. A computing device, comprising: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1 to 7.
- 10. A computing-device-readable storage medium storing a program which, when executed by a processor, implements the method of any one of claims 1 to 7.
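As an illustration of the matching, sorting, and secondary screening steps recited in claims 6 and 7, the following is a minimal Python sketch. The claims do not define how the comprehensive similarity is computed, so this sketch assumes a weighted average of per-modality cosine similarities. The names `Candidate`, `comprehensive_similarity`, `rank`, and `secondary_screen`, the weights, and the `min_score` threshold are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Candidate:
    item_id: str
    label: str       # target category label (hypothetical field name)
    features: dict   # modality name -> feature vector (list of floats)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def comprehensive_similarity(query, candidate, weights):
    """Assumed definition: weighted mean of cosine similarities over the
    modalities present in both the query and the candidate."""
    shared = [m for m in weights if m in query and m in candidate.features]
    total = sum(weights[m] for m in shared)
    if total == 0:
        return 0.0
    return sum(weights[m] * cosine(query[m], candidate.features[m])
               for m in shared) / total


def rank(query, candidates, weights, min_score=0.5, top_k=10):
    """Score every candidate, screen by threshold, and sort best first."""
    scored = [(comprehensive_similarity(query, c, weights), c) for c in candidates]
    kept = [(score, c) for score, c in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


def secondary_screen(results, label):
    """Secondary screening: keep ranked results with the requested category label.
    The input order is preserved, so the output stays sorted by score."""
    return [(score, c) for score, c in results if c.label == label]
```

A caller would pass the query features as a dictionary keyed by modality (for example `{"image": [...]}`), call `rank` to obtain the sorted candidate list, and then call `secondary_screen` when the user narrows the results to one category.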
Description
Image retrieval method, device and equipment for target image information
Technical Field
The embodiments of the invention relate to the technical field of image retrieval, and in particular to an image retrieval method, device, and equipment for target image information.
Background
Current image retrieval techniques rely mainly on manually annotated keywords or simple metadata (e.g., shooting time, file size) and on matching low-level visual features such as color and texture. These approaches have clear drawbacks. First, manual annotation is costly, slow, and subjective, and does not scale to massive data. Second, simple metadata and low-level features cannot capture the semantic content of an image (such as specific targets, behaviors, and events), so retrieval results deviate substantially from the user's real intent, and precision and recall are low. Third, multimodal data such as images, videos, and their associated texts are often managed in isolation, without effective association or fusion, so cross-modal mutual verification and joint retrieval are not possible. Finally, such techniques struggle to support needs such as searching images by image and complex retrieval based on natural language descriptions.
Disclosure of Invention
The technical problem to be solved by the embodiments of the invention is to provide an image retrieval method, device, and equipment for target image information that, by automatically analyzing multimodal data and constructing a structured index library, enable efficient understanding and accurate matching of the deep semantic content of images.
To solve the above technical problem, the technical solution of the embodiments of the invention is as follows: An image retrieval method of target image information, comprising: acquiring target image information to be retrieved; analyzing the target image information and extracting features from it to obtain target image feature information; matching the target image feature information in a preset image data structured index library, and retrieving a retrieval result containing matched target image data and structured description information associated with the target image data; and outputting the retrieval result; wherein the image data structured index library is obtained through the following steps: obtaining multimodal raw data containing images, videos, and text descriptions of the same target object; performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information; and storing the structured description information, according to a preset storage format, as an index structure supporting feature-based matching retrieval, to obtain the image data structured index library.
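The overall flow described above (build an index from extracted features and structured descriptions, then match a query against it and return both) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: feature vectors are taken as already extracted by some upstream model, matching uses a plain dot product of unit vectors, and the names `IndexLibrary`, `add`, and `retrieve` are hypothetical rather than part of the patent.

```python
from dataclasses import dataclass, field
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


@dataclass
class IndexLibrary:
    """Toy stand-in for the image data structured index library."""
    vectors: dict = field(default_factory=dict)       # item id -> unit feature vector
    descriptions: dict = field(default_factory=dict)  # item id -> structured description

    def add(self, item_id, features, description):
        """Store one item's feature vector together with its structured description."""
        self.vectors[item_id] = normalize(features)
        self.descriptions[item_id] = description

    def retrieve(self, query_features, top_k=3):
        """Return the top_k items by cosine similarity, each with its description."""
        q = normalize(query_features)
        scored = sorted(
            ((sum(a * b for a, b in zip(q, vec)), item_id)
             for item_id, vec in self.vectors.items()),
            reverse=True,
        )
        return [
            {"id": item_id, "score": score, "description": self.descriptions[item_id]}
            for score, item_id in scored[:top_k]
        ]
```

Each result pairs the matched item with its structured description, mirroring the retrieval result described in the text. A production system would likely replace the linear scan with an approximate nearest-neighbor index, which the source does not specify.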
Optionally, performing content analysis and feature extraction on the multimodal raw data to obtain corresponding structured description information includes: preprocessing the images in the multimodal raw data and extracting features from them to obtain first structured description information; performing preprocessing, key frame extraction, and feature extraction on the videos in the multimodal raw data to obtain second structured description information; performing natural language processing and information extraction on the text descriptions in the multimodal raw data to obtain third structured description information; and associating the first structured description information, the second structured description information, and the third structured description information to obtain associated structured description information.
Optionally, associating the first structured description information, the second structured description information, and the third structured description information to obtain the associated structured description information includes: extracting target category labels from the first structured description information, the second structured description information, and the third structured description information; establishing an association relation among the cross-modal data by using a common target category label as the association key; and integrating all associated multimodal structured information for each target category to obtain the associated structured description information.
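The label-keyed association step can be illustrated with a short Python sketch. It assumes each piece of structured description information is a dictionary that already carries a `"label"` field holding its target category label; that field name and the function name `associate` are hypothetical, and the source does not specify the record layout.

```python
from collections import defaultdict


def associate(image_records, video_records, text_records):
    """Group per-modality description records under their shared category label.

    Returns a mapping: label -> {"image": [...], "video": [...], "text": [...]}.
    A label seen in only one modality still gets an entry, with empty lists
    for the modalities that did not produce it.
    """
    grouped = defaultdict(lambda: {"image": [], "video": [], "text": []})
    sources = (
        ("image", image_records),
        ("video", video_records),
        ("text", text_records),
    )
    for modality, records in sources:
        for record in records:
            # The category label serves as the association key across modalities.
            grouped[record["label"]][modality].append(record)
    return dict(grouped)
```

The merged entry for each label plays the role of the associated structured description information: everything known about one target category, across images, videos, and text, sits under a single key.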
Optionally, storing the structured description information, according to a preset storage format, as an index structure supporting feature-based matching retrieval, to obtain the image data structured index library, includes: extracting the feature vectors of each modality from the structured description information and constructing a multimodal feature vector library; establishing inverted index items with the target category label as the index key; establishing a structured description index entry for each target