Search

CN-122019589-A - Query method, computing storage device and host

CN122019589ACN 122019589 ACN122019589 ACN 122019589ACN-122019589-A

Abstract

The present disclosure provides a query method, a computing storage device, and a host. The present disclosure provides a query method performed by a computing storage device, the computing storage device being connected to a host, and the query method comprising receiving a plurality of data entries from the host for storage in a device memory of the computing storage device, the plurality of data entries being obtained by the host based on a query vector and at least a portion of the data vectors in a database, and each of the plurality of data entries comprising a plurality of distance values, receiving a query request from the host, the query request comprising an address of the plurality of data entries in the device memory, obtaining a query result in response to the query request, the query result being based on a first distance value comprised by the plurality of data entries and the first distance value comprising the plurality of distance values, and sending the query result to the host.

Inventors

  • REN YI
  • LI XIAOZHE
  • DONG FEI
  • YANG PAN
  • Pu Kuimin
  • LI XIANGYOU
  • CAO CHENGBIAO
  • LIU YONGJIAN

Assignees

  • 三星(中国)半导体有限公司
  • 三星电子株式会社

Dates

Publication Date
20260512
Application Date
20260108

Claims (20)

  1. 1. A query method performed by a computing storage device, the computing storage device being connected to a host, and the query method comprising: Receiving a plurality of data entries from the host for storage in a device memory of the computing storage device, the plurality of data entries obtained by the host based on a query vector and at least a portion of the data vectors in a database, and each of the plurality of data entries comprising a plurality of distance values; receiving a query request from the host, the query request including addresses of the plurality of data entries in the device memory; Obtaining a query result in response to the query request, the query result being based on a first distance value included by the plurality of data entries and the first distance value including the plurality of distance values, and And sending the query result to the host.
  2. 2. The query method of claim 1, wherein obtaining query results comprises: The query result is obtained by accumulating and ordering the first distance values.
  3. 3. The query method of claim 2, wherein the computing storage device comprises at least one computing unit, and Obtaining the query result by accumulating and ordering the first distance values, including: in response to the query request, activating a target computing unit of the at least one computing unit, and The query result is obtained by a target computing unit by performing a plurality of first operations in a pipelined fashion, each of the plurality of first operations including obtaining at least one data entry of the plurality of data entries, accumulating distance values included in the at least one data entry to obtain at least one accumulated distance value, and ordering the at least one accumulated distance value.
  4. 4. The query method of claim 3, wherein obtaining the at least one data entry comprises: And reading the at least one data item from the equipment memory according to the addresses of the plurality of data items.
  5. 5. The query method of claim 4, wherein the target computing unit includes a cache and a number of data entries in the device memory are prefetched into the cache, and Wherein reading the at least one data entry comprises: And reading the at least one data item from the cache according to the addresses of the plurality of data items.
  6. 6. The query method of claim 3, wherein accumulating the distance value included in the at least one data entry comprises: Accumulating the plurality of distance values included in each of the at least one data entry through a plurality of parallel accumulation channels to obtain the at least one accumulated distance value, Wherein the accumulation of the plurality of distance values included in each of the at least one data entry is performed by a single accumulation channel of the plurality of parallel accumulation channels.
  7. 7. The query method of claim 3, wherein ordering the at least one accumulated distance value comprises: The at least one accumulated distance value is ordered in parallel by a plurality of comparators.
  8. 8. The query method of claim 3, wherein sending the query result to the host comprises: the identification of the number of data entries having the smallest accumulated distance value in response to the sorting is returned to the host.
  9. 9. The query method of claim 2, wherein the plurality of data entries are written by the host according to a memory segment of the device memory indicated by a dequeued descriptor from a queue of the computing storage device, and Wherein a plurality of descriptors indicating different memory segments of the device memory are in the queue.
  10. 10. The query method of claim 1, wherein the plurality of data entries are obtained by the host based on a query vector and at least a portion of a data index obtained by the at least a portion of the data vector through a similarity search algorithm.
  11. 11. The query method of claim 10, wherein receiving a query request comprises: By calculating the high-speed connection CXL protocol, receiving the query request from the host, Wherein the at least a portion of the data index is a data index associated with a target cluster, the target cluster being a cluster having a smallest distance from the query vector in a first cluster of target cluster areas, the target cluster area being a cluster area corresponding to the computing storage device in at least one cluster area, and a plurality of clusters obtained by clustering a plurality of data vectors in a database are aggregated into the at least one cluster area, and A first data index obtained by a first data vector in a first cluster in the target cluster region through a reverse file product quantization IVFPQ algorithm is stored into the device memory via the CXL protocol.
  12. 12. A query method performed by a host, the host being connected to at least one computing storage device and comprising at least one central processing unit, CPU, the query method comprising performing, by each of the at least one CPU, a first operation comprising: obtaining a plurality of data entries based on the query vector and at least a portion of the data vectors in the database, each of the plurality of data entries comprising a plurality of distance values; writing the plurality of data entries into a device memory included in a target computing storage device of the at least one computing storage device; sending a query request to a target computing storage device, the query request including addresses of the plurality of data entries in the device memory, and And receiving a query result returned by the target computing storage device, wherein the query result is obtained by the target computing storage device in response to the query request, the query result is based on a first distance value included by the plurality of data items, and the first distance value includes the plurality of distance values.
  13. 13. The query method of claim 12, wherein the query result is obtained by the target computing storage device by accumulating and ordering the first distance values.
  14. 14. The query method as claimed in claim 13, wherein the query result is obtained by the target computing unit by performing a plurality of second operations in a pipelined fashion, each of the plurality of second operations including obtaining at least one data entry of the plurality of data entries, accumulating distance values included in the at least one data entry to obtain at least one accumulated distance value, and ordering the at least one accumulated distance value, and The target computing unit is selected from at least one computing unit comprised by the target computing storage device.
  15. 15. The query method of claim 12, wherein a plurality of clusters obtained by clustering a plurality of data vectors in a database are aggregated into at least one cluster region, each of the at least one cluster region corresponding to one of the at least one computing storage device, and The query method further includes determining, by each of the at least one CPU, a target computing storage device based on a target cluster region of the at least one cluster region having a smallest distance from a query vector.
  16. 16. The query method of claim 12, wherein writing the plurality of data entries into the device memory comprises: acquiring descriptors dequeued from a queue of a target computing storage device, and Writing the plurality of data entries into the device memory according to the memory segment of the device memory indicated by the descriptor, and A plurality of descriptors indicating different memory segments of the device memory are in the queue.
  17. 17. The query method of claim 16, wherein the query method further comprises releasing, by each of the at least one CPU, and returning, in response to receiving the query result, the descriptor to the queue.
  18. 18. The query method of claim 12, wherein the host includes a plurality of CPUs, and each of the plurality of CPUs is configured to perform the first operation in parallel.
  19. 19. The query method of claim 15, wherein obtaining a plurality of data entries comprises: The plurality of data entries is obtained based on a query vector and at least a portion of a data index obtained from the at least a portion of the data vector by a similarity search algorithm.
  20. 20. The query method of claim 19, wherein sending the query request to the target computing storage device comprises: The query request is sent to the target computing storage device via a high speed connection CXL protocol, Wherein the at least a portion of the data index is a data index associated with a target cluster, the target cluster being a cluster having a smallest distance from the query vector in a first cluster in the target cluster region, and A first data index obtained by a first data vector in a first cluster in the target cluster region through a reverse file product quantization IVFPQ algorithm is stored into the device memory via the CXL protocol.

Description

Query method, computing storage device and host Technical Field The present disclosure relates to the field of data querying, and in particular, to a querying method, a computing storage device, and a host. Background The similarity search algorithm is used to find specified similar content from the vector database, and includes, for example, a K nearest neighbor search algorithm (KNN), a near nearest neighbor search (ANNS) algorithm, a Local Sensitive Hash (LSH) algorithm, and the like. The approximate nearest neighbor search algorithm has the advantages of low memory occupation and the like because the throughput rate is far higher than that of the K nearest neighbor search algorithm, and is widely applied to the fields of information retrieval embedding, recommendation systems driven by Artificial Intelligence (AI), large Language Models (LLM) and the like. Inverted file product quantization (IVFPQ) is one of the widely applied algorithms in the near nearest neighbor search algorithm, which reduces query scope by vector clustering and memory occupation by vector dimension and cell size compression. Various architectures based on different hardware platforms are proposed to accelerate the approximate nearest neighbor search algorithm comprising IVFPQ, but the existing architectures have the problems that main memory and video memory capacity are difficult to expand and are insufficient for caching a large number of vectors to be checked, the proportion of the read-write overhead of a Solid State Disk (SSD) to occupy the processing is high, and the cache consistency cannot be supported by a high-speed serial computer expansion bus standard (PCIe). Disclosure of Invention The present disclosure provides a query method, computing storage device, and host to address some or all of the above issues. According to an embodiment, a query method performed by a computing storage device is provided, the computing storage device being connected to a host, and the query method comprising receiving a plurality of data entries from the host for storage in a device memory of the computing storage device, the plurality of data entries being obtained by the host based on a query vector and at least a portion of the data vectors in a database, and each of the plurality of data entries comprising a plurality of distance values, receiving a query request from the host, the query request comprising an address of the plurality of data entries in the device memory, obtaining a query result in response to the query request, the query result being based on a first distance value comprised by the plurality of data entries and the first distance value comprising the plurality of distance values, and sending the query result to the host. Alternatively or additionally, obtaining the query result includes obtaining the query result by accumulating and ordering the first distance values. Alternatively or additionally, the computing storage device includes at least one computing unit and obtains the query result by accumulating and ordering first distance values, including initiating a target computing unit of the at least one computing unit in response to the query request, and obtaining the query result by the target computing unit by performing a plurality of first operations in a pipelined fashion, each of the plurality of first operations including obtaining at least one of the plurality of data entries, accumulating distance values included in the at least one data entry to obtain at least one accumulated distance value, and ordering the at least one accumulated distance value. Alternatively or additionally, retrieving the at least one data entry includes reading the at least one data entry from the device memory based on the address of the plurality of data entries. Alternatively or additionally, the target computing unit comprises a cache and a number of data entries in the device memory are prefetched into the cache, and wherein reading the at least one data entry comprises reading the at least one data entry from the cache according to the address of the plurality of data entries. Alternatively or additionally, accumulating the distance values comprised by the at least one data entry comprises accumulating the plurality of distance values comprised by each of the at least one data entry by a plurality of parallel accumulation channels to obtain the at least one accumulated distance value, wherein the accumulating of the plurality of distance values comprised by each of the at least one data entry is performed by a single accumulation channel of the plurality of parallel accumulation channels. Alternatively or additionally, ordering the at least one accumulated distance value includes ordering the at least one accumulated distance value in parallel by a plurality of comparators. Alternatively or additionally, sending the query result to the host includes returning to the host an identification of a number of data entries having