CN-121996820-A - Unified multi-space index method for high-dimensional multi-mode search
Abstract
The invention discloses a unified multi-space index method for high-dimensional multi-modal search, which comprises the steps of obtaining a multi-vector object set, constructing a unified multi-space map index, integrating neighbor relations of all possible vector combination spaces into a single map structure, respectively processing objects to be inserted as different multi-vector queries in the index construction process, executing search on each vector combination space and selecting neighbor nodes, optimizing the index construction process by using a lossless acceleration technology, performing lossless compression on a constructed neighbor list to reduce storage cost, receiving the multi-vector queries, executing self-adaptive navigation search according to the vector combination and the weight of the query, and returning the most similar objects. The method can solve the problems of space mismatch and resource explosion dilemma of the existing method, obviously reduce the index construction time and the storage cost, and effectively improve the efficiency and the accuracy of multi-vector query.
Inventors
- GAO YUNJUN
- GONG ZHENG
- Zhu Diefan
- WANG MENGZHAO
- TU YAOFENG
- ZHENG KAI
Assignees
- 浙江大学
Dates
- Publication Date
- 20260508
- Application Date
- 20260409
Claims (8)
- 1. The unified multi-space index method for high-dimensional multi-mode search is characterized by comprising the following steps of: (1) Acquiring a multi-vector object set, wherein each object in the set is represented by a plurality of vectors from different semantic spaces; (2) Constructing a unified multi-space diagram index, integrating neighbor relations of all possible vector combination spaces of objects into a single diagram structure, storing vector data of each object only once, and maintaining an independent neighbor list for each vector combination space; (3) In the process of constructing the graph index, the object to be inserted is treated as different multi-vector queries respectively, and searching is carried out on each vector combination space and neighbor nodes are selected; (4) Optimizing a graph index construction process by using a lossless acceleration technology, wherein the graph index construction process comprises multiplexing of distance calculation, approximate calculation based on importance perception and accurate calculation based on inner product transformation; (5) Performing lossless compression after collecting and sequencing neighbor lists, and reducing storage overhead by adopting a difference value coding compression method based on a median value; (6) A multi-vector query is received and an adaptive navigation search is performed in the graph index based on the vector combinations and weights of the query, and top-k most similar objects are returned using the weighted aggregate distance.
- 2. The unified multi-spatial index method for high-dimensional multi-modal searching according to claim 1, wherein the specific implementation manner of step (2) is as follows: s21, initializing an empty map and setting an initial layer number by adopting a hierarchical map structure for multi-space map indexes; s22, distributing the maximum layer number in the graph for the object to be inserted; s23, storing vector data of the object and pre-calculation information in the graph; s24, executing a neighbor construction process, namely, steps S25-S27, for each layer of the object not exceeding the maximum layer number; S25, performing greedy search for each vector combination space of the object to obtain a candidate neighbor set; S26, selecting neighbors from the candidate neighbor set by using HNSW heuristic rules, and preferentially selecting neighbors which are close to the query and far from each other; And S27, integrating and compressing neighbor lists of all vector combination spaces, and storing the neighbor lists associated with the vertexes of the objects in the graph.
- 3. The unified multi-spatial index method for high-dimensional multi-modal searching according to claim 1, wherein the specific implementation manner of step (3) is as follows: S31, generating all vector combination spaces of the objects to be inserted; S32, constructing corresponding queries based on the aggregate distance definition aiming at any vector combination space; s33, executing greedy search in the graph structure of the current layer, and gradually accessing nodes closer to the query distance; S34, calculating the aggregation distance of the accessed nodes in the vector combination space in the searching process, and maintaining a candidate node set according to the aggregation distance; S35, selecting candidate neighbors of the orientation quantity combination space from the candidate node set according to the distance sequence of the nodes and the query; And S36, carrying out HNSW heuristic neighbor screening on candidate neighbors of all vector combination spaces.
- 4. The unified multi-spatial index method for high-dimensional multi-modal searching according to claim 1, wherein the specific implementation manner of step (4) is as follows: S41, carrying out aggregation multiplexing on unidirectional distance results shared by a plurality of vector combination spaces, so as to avoid repeated calculation; s42, in the vector combination space searching process, adopting the approximate calculation of importance perception to vectors with lower distance contribution to reduce the cost, and simultaneously ensuring the lossless of the final aggregation distance; s43, under the condition that the number of the candidate node set nodes is insufficient, adopting an accurate distance calculation method based on inner product transformation to judge whether the nodes are added into the set or not so as to accelerate the calculation process of the accurate aggregation distance.
- 5. The unified multi-spatial index method for high-dimensional multi-modal searching according to claim 1, wherein the specific implementation manner of step (5) is as follows: S51, for each object, collecting neighbor IDs of all vector combination spaces of the objects into a single list, and sorting the objects according to the neighbor IDs; s52, performing incremental coding compression storage on the neighbor list after the aggregation and sequencing by adopting a median ID coding method.
- 6. The unified multi-spatial index method for high-dimensional multi-modal searching according to claim 1, wherein the specific implementation manner of step (6) is as follows: s61, determining a vector combination space to which the multi-vector query belongs and selecting a corresponding neighbor list as an access object; S62, in the self-adaptive navigation searching process, evaluating the relevance between the nodes and the multi-vector query according to the weighted aggregation distance, and guiding the searching path to preferentially access the nodes with smaller distance; And S63, maintaining a candidate set and a result set in the process of accessing the node, circularly taking out the nearest point from the candidate set to the multi-vector query, expanding the neighbor of the nearest point, updating the result set according to the weighted aggregation distance, and finally returning top-k objects with the minimum weighted aggregation distance from the result set as search results.
- 7. A computer device comprising a memory and a processor, wherein the memory has a computer program stored therein, and the processor is configured to execute the computer program to implement the unified multi-spatial index method for high-dimensional multi-modal searching according to any one of claims 1 to 6.
- 8. A computer readable storage medium storing a computer program, wherein the computer program is executed by a processor to implement the unified multi-spatial index method for high-dimensional multi-modal searching according to any one of claims 1 to 6.
Description
Unified multi-space index method for high-dimensional multi-mode search Technical Field The invention belongs to the technical fields of database systems and information retrieval, and particularly relates to a unified multi-space index method for high-dimensional multi-mode search. Background Along with the rapid development of artificial intelligence technology, particularly the wide application of visual language models and large language models, high-dimensional vector embedding has become a key technology for processing unstructured data, and in application scenes such as retrieval enhancement generation and the like, vector search is used as a core engine to provide external knowledge support for a generation model. However, existing vector search techniques are almost always based on the assumption that each object is represented by a single vector, which, although widely used, faces the problem of insufficient expressive power in processing complex multi-modal data. In order to adequately capture the rich information of multi-modal objects, multi-vector representation is emerging as a more expressive and flexible alternative, where an object is described by a set of different vectors, each capturing a different aspect, perspective or modality. For example, in a multi-modal chat application, a user may provide a daytime local clock photograph and a text prompt to help me find a local clock night scene photograph taken from the same view, the back-end system encodes the text prompt into a semantic vector, fuses the reference image and text into another vector through a multi-modal encoder to form a bi-directional vector query to perform a search in a vector database, and the method achieves higher accuracy in complex search scenarios by retaining different information channels instead of being forcefully fused into a single vector. However, this superior expressive power presents a serious system challenge-the need for efficient multi-vector searching, including in particular the following: First, the existing system mainly adopts an independent index strategy to process the multi-vector search problem, a leading vector database such as Milvus and VBase constructs independent single-space HNSW (hierarchical navigable small world) indexes for each mode, and then combines search results when inquiring, and the method has a key architecture defect, namely a space mismatch problem, and the multi-vector inquiry does not exist in any single mode space but is positioned in a new composite vector space defined by a specific vector combination, and the accurate position of the multi-vector inquiry is determined by vector data and assigned weights. The pre-computed and stored neighbor relationships in the single-modality HNSW graph are poor proxies for the true neighbor relationships in the query's actual compound space, resulting in search inefficiency and limited accuracy. Second, the theoretically correct solution is for allThe possible vector combinations pre-construct the index, while this guarantees that there is a perfectly aligned index for any query combination, presents a serious practical dilemma, and in order to achieve high performance, each index must employ an optimized memory layout, storing vector data and neighbor lists consecutively to ensure data locality and avoid performance loss due to cache misses. However, this necessary optimization forces a large replication of vector data in numerous indexes, so this naive solution incurs an exponential and impractical cost in terms of construction time and memory, making it impractical in practical applications. Third, the search strategy of the existing multi-vector search method is insensitive to dynamic weights. In practical applications, users may assign different importance to different modalities (e.g. "find an image that looks like this picture but mainly about this piece of text"), these preferences are expressed by dynamic weights, however, existing single spatial indexes and their search strategies cannot effectively use these weight information to guide the search path, limiting the accuracy and efficiency of the search. Fourth, for objects containing m vector representations, the number of potential vector combinations grows exponentially, although m is typically smaller in real-world applications, as m increases, any method aimed at providing full search capability faces inherent computational and storage challenges. Finally, existing methods lack specialized optimization for the characteristics of multi-vector search problems, multi-vector distance computation involves a large number of redundant computations, but existing systems fail to make efficient use of these computation modes to accelerate index construction and query processing, and at the same time, exponentially growing neighbor list storage overhead also lacks efficient compression schemes. Disclosure of Invention In view of the above, the invention provides a unified multi