
CN-121979921-A - Edge device retrieval-augmented generation system, method, medium, program product, and terminal based on hyperdimensional computing


Abstract

The edge device retrieval-augmented generation (RAG) system, method, medium, program product, and terminal based on hyperdimensional computing (HDC) deeply integrate characteristics of HDC, such as sparse hypervector encoding and lightweight similarity computation, with a RAG system to build a RAG architecture adapted to edge devices. The RAG knowledge base is built from sparse HDC hypervector encodings, and binary bit-compressed storage reduces the storage cost, which addresses the storage constraints of edge devices. A Hamming distance algorithm replaces the conventional cosine similarity calculation, which lowers the computing power required during retrieval and shortens latency so that real-time requirements are met. Because the HDC vectors are managed in clusters, newly added knowledge does not require a full recalculation, which improves update efficiency.
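The two mechanisms named in the abstract, bit-compressed storage and Hamming distance retrieval, can be illustrated with a minimal Python sketch. It is not the patented implementation: the dimension `DIM`, the helper names, and the use of seeded random vectors in place of real encoded knowledge are all assumptions made for illustration, and the vectors here are dense binary rather than sparse.

```python
import random

DIM = 8192  # assumed hypervector dimension; the abstract does not give one


def random_hypervector(seed: int, dim: int = DIM) -> int:
    """Stand-in for an encoded knowledge item: dim pseudo-random bits in one int."""
    return random.Random(seed).getrandbits(dim)


def pack(hv: int, dim: int = DIM) -> bytes:
    """Bit-compressed storage: one bit per dimension, dim / 8 bytes per vector."""
    return hv.to_bytes((dim + 7) // 8, "little")


def unpack(blob: bytes) -> int:
    """Restore the integer form of a packed hypervector."""
    return int.from_bytes(blob, "little")


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits: one XOR followed by a population count."""
    return bin(a ^ b).count("1")


def nearest(query: int, stored: list) -> int:
    """Index of the packed vector with the smallest Hamming distance to query."""
    return min(range(len(stored)),
               key=lambda i: hamming_distance(query, unpack(stored[i])))


if __name__ == "__main__":
    knowledge = [pack(random_hypervector(seed)) for seed in range(5)]
    noisy_query = random_hypervector(3) ^ 0b1111  # item 3 with four bits flipped
    print(len(knowledge[0]), "bytes per vector")
    print("best match:", nearest(noisy_query, knowledge))
```

The comparison needs only integer XOR and bit counting, with no floating-point multiplication, which is the property the abstract relies on to reduce the computing load on the edge device.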

Inventors

  • Anonymity requested

Assignees

  • 上海光羽芯辰科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-04-07

Claims (10)

  1. An edge device retrieval-augmented generation system based on hyperdimensional computing, comprising: an HDC knowledge base module configured to acquire data related to the operation of the edge device, perform HDC encoding on that data, and sparsify and store the results of the HDC encoding in blocks to form an edge device HDC knowledge base; an HDC retrieval module, connected to the HDC knowledge base module and configured to, when a user query request is received, retrieve a retrieval result for the user query request from the edge device HDC knowledge base using a similarity retrieval method; and an LLM generation module, connected to the HDC retrieval module and configured to generate, through a lightweight large language model and based on the retrieval result of the user query request, a structured query report corresponding to the user query request.
  2. The edge device retrieval-augmented generation system based on hyperdimensional computing of claim 1, wherein the HDC knowledge base module comprises: a knowledge input unit configured to acquire the data related to the operation of the edge device and preprocess it to obtain preprocessed knowledge data; an HDC encoding unit configured to perform HDC encoding on the preprocessed knowledge data based on a random hyperdimensional space mapping rule to obtain HDC vectors corresponding to the preprocessed knowledge data; and a classification storage unit configured to sparsify and store the HDC vectors using binary bit-compressed storage, and to divide all the HDC vectors into clusters for block storage, forming the edge device HDC knowledge base.
  3. The edge device retrieval-augmented generation system based on hyperdimensional computing of claim 1, wherein the HDC retrieval module comprises: a query encoding unit configured to, after receiving the user query request, perform HDC encoding on the user query request to obtain an HDC vector of the user query request; and a similarity calculation unit configured to calculate, using a Hamming distance algorithm, the similarity between the HDC vector of the user query request and all HDC vectors of the corresponding cluster in the edge device HDC knowledge base, and to select from the edge device HDC knowledge base, according to the similarity calculation result, the HDC vectors that meet a preset distance requirement as the retrieval result of the user query request.
  4. The edge device retrieval-augmented generation system based on hyperdimensional computing of claim 1, wherein the LLM generation module comprises: a formatting unit configured to format the retrieval result of the user query request; and a report output unit configured to input the formatted retrieval result into the lightweight large language model and generate the structured query report corresponding to the user query request.
  5. The edge device retrieval-augmented generation system based on hyperdimensional computing of claim 1, further comprising an HDC knowledge base updating module connected to the HDC knowledge base module, wherein the HDC knowledge base updating module is configured to acquire update information for the edge device HDC knowledge base in real time, perform HDC encoding on the update information to obtain an HDC vector of the update information, calculate the similarity between the HDC vector of the update information and each cluster in the edge device HDC knowledge base, determine from the similarity calculation result the cluster in the edge device HDC knowledge base into which the update information is to be stored, and perform redundancy detection to delete redundant vectors.
  6. The system of claim 3, wherein the HDC vector of the user query request and the HDC vectors in the edge device HDC knowledge base have the same dimension.
  7. An edge device retrieval-augmented generation method based on hyperdimensional computing, comprising: acquiring data related to the operation of the edge device, performing HDC encoding on that data, and sparsifying and storing the results of the HDC encoding in blocks to form an edge device HDC knowledge base; when a user query request is received, retrieving a retrieval result for the user query request from the edge device HDC knowledge base using a similarity retrieval method; and generating, through a lightweight large language model and based on the retrieval result of the user query request, a structured query report corresponding to the user query request.
  8. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the edge device retrieval-augmented generation method based on hyperdimensional computing of claim 7.
  9. A computer program product comprising computer program code which, when run on a computer, causes the computer to implement the edge device retrieval-augmented generation method based on hyperdimensional computing of claim 7.
  10. An electronic terminal comprising a memory, a processor, and a computer program stored in the memory, wherein the processor executes the computer program to implement the edge device retrieval-augmented generation method based on hyperdimensional computing of claim 7.
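As a reading aid for the claims above, the following Python sketch walks through one possible arrangement of the claimed units: HDC encoding, clustered storage, Hamming distance retrieval within the matching cluster, update with redundancy detection, and formatting of results for a language model. Everything concrete in it is an assumption rather than something the claims specify: the dimension `DIM`, the hash-seeded token vectors, majority-vote bundling as the encoder, the first-member cluster representative, the threshold values, and all function and class names. Redundant items are skipped at insertion instead of being deleted afterwards, and the call to the lightweight LLM is left out.

```python
import hashlib

DIM = 2048  # assumed hypervector dimension (not given in the claims)


def token_hv(token: str) -> int:
    """Deterministic pseudo-random hypervector for one token, packed in an int."""
    digest = hashlib.shake_256(token.encode("utf-8")).digest(DIM // 8)
    return int.from_bytes(digest, "little")


def encode(text: str) -> int:
    """Bundle the token hypervectors by bitwise majority vote (ties give 0)."""
    hvs = [token_hv(t) for t in text.lower().split()]
    result = 0
    for bit in range(DIM):
        ones = sum((hv >> bit) & 1 for hv in hvs)
        if 2 * ones > len(hvs):
            result |= 1 << bit
    return result


def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit-packed hypervectors."""
    return bin(a ^ b).count("1")


class HDCKnowledgeBase:
    """Clustered store; both thresholds are illustrative assumptions."""

    def __init__(self, new_cluster_dist: int = 800, redundant_dist: int = 20):
        self.new_cluster_dist = new_cluster_dist
        self.redundant_dist = redundant_dist
        self.clusters = []  # each: {"rep": int, "items": [(hv, text), ...]}

    def _nearest(self, hv: int):
        """Cluster whose representative is closest to hv, or None if empty."""
        if not self.clusters:
            return None
        return min(self.clusters, key=lambda c: hamming(hv, c["rep"]))

    def add(self, text: str) -> bool:
        """Incremental update: touch one cluster only; skip near-duplicates."""
        hv = encode(text)
        cluster = self._nearest(hv)
        if cluster is None or hamming(hv, cluster["rep"]) > self.new_cluster_dist:
            self.clusters.append({"rep": hv, "items": [(hv, text)]})
            return True
        if any(hamming(hv, old) <= self.redundant_dist
               for old, _ in cluster["items"]):
            return False  # redundant: an equivalent vector is already stored
        cluster["items"].append((hv, text))
        return True

    def search(self, query: str, max_dist: int):
        """Return (distance, text) pairs within max_dist from the nearest cluster."""
        q = encode(query)
        cluster = self._nearest(q)
        if cluster is None:
            return []
        hits = [(hamming(q, hv), text) for hv, text in cluster["items"]]
        return sorted(hit for hit in hits if hit[0] <= max_dist)


def format_for_llm(query: str, hits) -> str:
    """Formatting step: build the text that would be passed to the LLM."""
    lines = [f"Question: {query}", "Retrieved knowledge:"]
    lines += [f"- (distance {d}) {text}" for d, text in hits]
    return "\n".join(lines)


if __name__ == "__main__":
    kb = HDCKnowledgeBase()
    kb.add("motor overheating fault bearing wear")
    kb.add("motor overheating fault coolant leak")
    kb.add("door sensor battery low warning")
    query = "motor overheating fault"
    print(format_for_llm(query, kb.search(query, max_dist=700)))
```

Adding an item compares it only with the cluster representatives and the members of a single cluster, so existing vectors are never re-encoded. That is one way to read the update behaviour described in claim 5.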

Description

Edge device retrieval-augmented generation system, method, medium, program product, and terminal based on hyperdimensional computing

Technical Field

The application relates to the technical field of retrieval-augmented generation, and in particular to an edge device retrieval-augmented generation system, method, medium, program product, and terminal based on hyperdimensional computing.

Background

Retrieval-Augmented Generation (RAG) has become a core technology for improving the accuracy and knowledge timeliness of content generated by a large language model (LLM). A typical RAG system consists of a retrieval module, which acquires related knowledge from an external knowledge base, and an LLM generation module, which generates a response based on the retrieval result. RAG systems are widely applied in edge device scenarios such as industrial edge diagnosis, smart home interaction, and in-vehicle voice assistants.
The retrieval module of a conventional RAG system generally relies on a high-dimensional dense vector database for knowledge storage and similarity retrieval. This involves three aspects: (1) knowledge encoding, which converts knowledge such as text, sensor data, and fault manuals into dense vectors of 512 to 4096 dimensions (for example, encoded by models such as BERT or Sentence-BERT); (2) retrieval computation, which matches the knowledge vectors most relevant to the user query from the vector database using computationally complex measures such as Euclidean distance or cosine similarity; and (3) edge device adaptation: edge devices (such as industrial edge gateways, Internet of Things terminals, and in-vehicle central control units) have limited hardware resources (CPU ≤ 4 cores, memory ≤ 8 GB, storage mainly flash memory, power consumption ≤ 10 W) and can hardly bear the storage and computing demands of a high-dimensional vector database, so cloud cooperation is usually required, which introduces network latency (≥ 100 ms) and data privacy risks.
Existing RAG systems have the following core defects. (1) Excessive storage cost: each knowledge entry stored as a high-dimensional dense vector (for example, a 1024-dimensional float32 vector) occupies about 4 KB, so an edge device that must store 100,000 pieces of industrial fault knowledge needs ≥ 400 MB for the vector database, far beyond the spare flash capacity normally reserved on an edge device (usually ≤ 200 MB). (2) Long retrieval latency: similarity calculation depends on floating-point operations (such as the repeated multiplications and summations of cosine similarity), so a single query takes ≥ 300 ms on a single-core edge device CPU and the total latency of the LLM response reaches ≥ 1 s, which cannot meet real-time interaction requirements (for example, industrial fault diagnosis requires ≤ 500 ms). (3) Poor edge suitability: encoding and retrieving high-dimensional vectors continuously occupies ≥ 30% of the edge device's CPU computing power, which easily stalls other services running on the device at the same time (such as sensor data acquisition), and the operating power consumption is high (more than 40% of the device's total power consumption), which shortens device endurance. (4) Low update efficiency: adding or updating knowledge requires a full recalculation, so in fast-iterating scenarios a knowledge update takes more than 10 minutes, which cannot meet the need for dynamic knowledge updates.
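The storage figures quoted above can be checked with a few lines of arithmetic. The Python sketch below reproduces the dense-vector numbers from the text and, for comparison only, estimates the footprint of bit-packed binary vectors. The 8192-bit dimension is an assumption, since this passage does not state the HDC vector size, and the estimate ignores any further saving from sparsification.

```python
ENTRIES = 100_000          # knowledge items, as in the text
DENSE_DIM = 1024           # dense vector dimension, as in the text
FLOAT32_BYTES = 4          # bytes per float32 component
ASSUMED_HDC_BITS = 8192    # assumption: binary hypervector dimension


def dense_vector_bytes(dim: int, bytes_per_value: int = FLOAT32_BYTES) -> int:
    """Size of one dense floating-point vector."""
    return dim * bytes_per_value


def packed_binary_bytes(bits: int) -> int:
    """Size of one binary vector stored at one bit per dimension."""
    return (bits + 7) // 8


dense_total = ENTRIES * dense_vector_bytes(DENSE_DIM)
binary_total = ENTRIES * packed_binary_bytes(ASSUMED_HDC_BITS)

print(f"dense:  {dense_vector_bytes(DENSE_DIM)} B per item, "
      f"{dense_total / 1e6:.1f} MB total")
print(f"binary: {packed_binary_bytes(ASSUMED_HDC_BITS)} B per item, "
      f"{binary_total / 1e6:.1f} MB total")
```

Under these assumptions the dense database comes to 409.6 MB, consistent with the ≥ 400 MB figure, while the bit-packed version comes to 102.4 MB, within the ≤ 200 MB flash budget mentioned in the text.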

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides an edge device retrieval-augmented generation system, method, medium, program product, and terminal based on hyperdimensional computing, which address the problems of existing RAG systems on edge devices, such as high storage cost, slow retrieval, poor suitability, and low update efficiency. To achieve the above and other related objects, a first aspect of the present application provides an edge device retrieval