
CN-121980551-A - Enterprise memory-oriented retrieval front access control method


Abstract

The application discloses an enterprise memory-oriented retrieval front access control method, which comprises: generating a user capability bitmap for each user and a fragment requirement bitmap for each enterprise memory fragment; receiving a user query and generating a query vector; recalling a sequence of candidate fragment identifiers from a vector index by approximate nearest neighbor search; for each candidate fragment, first reading its fragment requirement bitmap and performing a bitwise inclusion check against the user capability bitmap; sorting the candidates in a candidate pool by similarity; before final output, performing a secondary permission check on the sorted result using the latest user capability bitmap and fragment requirement bitmaps; removing candidates that fail the check; and recording an audit log. By moving the permission check ahead of vector loading and fine-grained similarity calculation, the method blocks the contact surface with unauthorized content, reduces wasted computation and latency, and uses the secondary check to keep the output compliant when permissions change dynamically.
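The bitwise inclusion check described in the abstract can be illustrated with a minimal Python sketch. The capability names and bit positions below are assumptions chosen for illustration, not values given in the application, and the bitwise-OR synthesis of role bitmaps follows claim 2.

```python
from functools import reduce
from operator import or_

# Hypothetical bit positions; the application does not assign specific bits.
BASIC_READ = 1 << 0
SENSITIVE_READ = 1 << 1
FINANCE_READ = 1 << 2
LEGAL_READ = 1 << 3


def user_capability_bitmap(role_bitmaps):
    """Combine the capability bitmaps of a user's roles with bitwise OR."""
    return reduce(or_, role_bitmaps, 0)


def permits(user_bits, required_bits):
    """True when every bit set in required_bits is also set in user_bits."""
    return (user_bits & required_bits) == required_bits


# Two illustrative roles held by the same user.
analyst = BASIC_READ | FINANCE_READ
reviewer = BASIC_READ | SENSITIVE_READ
user_bits = user_capability_bitmap([analyst, reviewer])
```

With these assumed roles, a fragment requiring `FINANCE_READ | SENSITIVE_READ` passes the check, while a fragment requiring `LEGAL_READ` is rejected without its content being read.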

Inventors

  • TONG BING
  • YIN CHAOJIE
  • ZHOU YAN
  • ZHANG CHEN
  • WU JING

Assignees

  • 浙江创邻科技有限公司

Dates

Publication Date
20260505
Application Date
20260401

Claims (10)

  1. An enterprise memory-oriented retrieval front access control method, characterized by comprising the following steps: decomposing permission capabilities into a set of minimum granularity, generating a user capability bitmap for each user, generating a fragment requirement bitmap for each enterprise memory fragment, and storing the fragment requirement bitmap in association with a vector index entry; receiving a user query and generating a query vector, and recalling a sequence of candidate fragment identifiers from the vector index by approximate nearest neighbor search; for each candidate fragment, before loading the complete content data and executing high-precision similarity calculation, first reading the fragment requirement bitmap and performing a bitwise inclusion check against the user capability bitmap; if the user capability bitmap does not contain all permission capabilities required by the fragment requirement bitmap, directly discarding the candidate so that it does not enter the subsequent scoring and sorting flow; if the user capability bitmap contains all permission capabilities required by the fragment requirement bitmap, allowing the complete content to be loaded, executing similarity calculation, and then placing the candidate in a candidate pool; and sorting the candidates in the candidate pool by similarity, performing a secondary permission check on the sorted result before final output using the latest user capability bitmap and fragment requirement bitmap, removing candidates that fail the check, and recording an audit log.
  2. The enterprise memory-oriented retrieval front access control method of claim 1, wherein the user capability bitmap is obtained by combining the role capability bitmaps of a plurality of roles of the user with a bitwise OR operation.
  3. The enterprise memory-oriented retrieval front access control method of claim 2, wherein each element in the permission capability set represents a basic access capability, and the basic access capability comprises at least one of basic enterprise memory reading, sensitive content reading, financial domain reading, legal domain reading, project reading, research and development domain reading, operation domain reading, and public domain reading.
  4. The enterprise memory-oriented retrieval front access control method of claim 1, wherein the fragment requirement bitmap is sixty-four bits, one hundred twenty-eight bits, or two hundred fifty-six bits long and, as lightweight metadata, is stored together with the vector index entry or kept resident in an in-memory cache.
  5. The enterprise memory-oriented retrieval front access control method of claim 1, wherein the secondary permission check handles dynamic permission changes that occur during retrieval, including a reduction of the user capability bitmap caused by revocation of user permissions, or a raising of the fragment requirement bitmap caused by withdrawal of fragment permissions.
  6. The enterprise memory-oriented retrieval front access control method of claim 5, wherein the secondary permission check detects dynamic permission changes through a cache timestamp comparison mechanism: the cache timestamps of the user capability bitmap and the fragment requirement bitmap are recorded at the time of the first bitwise inclusion check; at the time of the secondary permission check they are compared with the latest timestamps of the current permission metadata; and if the timestamps are inconsistent, the permission is determined to have changed, and the candidate is re-verified or marked for manual review.
  7. The enterprise memory-oriented retrieval front access control method of claim 1, wherein, when removing candidates in the secondary permission check leaves the output short, a backfill strategy is executed to keep the output quantity stable, the backfill strategy comprising preferentially filling, in sorted order, from the candidate pool of candidates that have passed the bitwise inclusion check and been scored; because the candidates in the candidate pool have already passed the front permission check and had their similarity calculated, they can be used directly to fill the output without repeated calculation.
  8. The enterprise memory-oriented retrieval front access control method of claim 7, wherein, if the remaining candidates in the candidate pool are insufficient to fill the output quantity, the recall quantity threshold of the approximate nearest neighbor search is dynamically expanded, vector index recall and the front permission check flow are re-executed, and the newly passed candidates are added to the candidate pool and subjected to fine similarity calculation, until the output quantity requirement is met or the maximum retrieval computation limit is reached.
  9. The enterprise memory-oriented retrieval front access control method of claim 1, wherein the bitwise inclusion check is executed in batches during the vector index recall stage, and the checks are performed in parallel using the compact in-memory layout of bitmap data and the single-instruction multiple-data (SIMD) instruction set of the processor, so as to reduce permission check latency in high-concurrency scenarios.
  10. The enterprise memory-oriented retrieval front access control method of claim 1, wherein the audit log record comprises the fragment identifier of each removed candidate, the removal reason type, the permission version number at verification time, and the difference in the candidate's permission state between the first bitwise inclusion check and the secondary check, and is used for tracing the time window of the permission change and for system security auditing.
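The flow of claims 1, 7, and 10 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the function name `retrieve`, the callables `get_required_bits`, `get_user_bits`, and `score_fn`, and the audit record fields are all hypothetical, and `score_fn` stands in for loading the full content and computing high-precision similarity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    fragment_id: str
    score: float


def permits(user_bits, required_bits):
    """True when user_bits contains every bit set in required_bits."""
    return (user_bits & required_bits) == required_bits


def retrieve(recalled_ids, get_required_bits, score_fn, get_user_bits, top_n):
    """Front permission check, scoring, secondary check, and backfill.

    recalled_ids: fragment ids returned by the ANN recall step.
    get_required_bits / get_user_bits: hypothetical lookups that return
    the current bitmaps each time they are called.
    """
    audit_log = []
    first_user_bits = get_user_bits()

    # Front check: only the requirement bitmap is read here. score_fn
    # (content load plus fine similarity) runs only for permitted ids.
    pool = [
        Candidate(fid, score_fn(fid))
        for fid in recalled_ids
        if permits(first_user_bits, get_required_bits(fid))
    ]
    pool.sort(key=lambda c: c.score, reverse=True)

    # Secondary check against the latest bitmaps just before output.
    latest_user_bits = get_user_bits()
    output = []
    for cand in pool:  # walking the pool in rank order backfills gaps
        if len(output) == top_n:
            break
        if permits(latest_user_bits, get_required_bits(cand.fragment_id)):
            output.append(cand)
        else:
            audit_log.append({
                "fragment_id": cand.fragment_id,
                "reason": "failed_secondary_check",
                "first_check_user_bits": first_user_bits,
                "second_check_user_bits": latest_user_bits,
            })
    return output, audit_log
```

Because the loop continues down the already scored pool until `top_n` results are accepted, a candidate removed by the secondary check is replaced by the next-ranked candidate without recomputing similarity. The recall expansion of claim 8 and the timestamp comparison of claim 6 are left out of this sketch.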

Description

Enterprise memory-oriented retrieval front access control method

Technical Field

The invention belongs to the technical field of retrieval access control for enterprise memory systems, and in particular relates to an enterprise memory-oriented retrieval front access control method.

Background

Existing enterprise memory systems (knowledge base, enterprise search, RAG) generally recall by vector search first and filter by permission afterwards: a vector is generated for the user query, approximate nearest neighbor (ANN) search recalls the TopK candidates from the vector index, permission filtering is applied to the recalled candidates, and the filtered results are finally output or passed to a large model for generation.

This post-retrieval filtering approach has the following technical defects. First, it creates an unauthorized recall contact surface. A large number of candidate objects are processed in the vector retrieval stage, so even if unauthorized results are eventually filtered out, the system has touched information about unauthorized objects during retrieval, which may bring compliance audit risk and side-channel risks through caches, logs, and timing differences. Second, computation is wasted and latency rises. ANN recall is the main time-consuming step, and recalling first and filtering afterwards spends a large amount of computation on unauthorized data that can never be returned; the effective candidates are then often too few, forcing a larger K or a repeated retrieval, which further increases cost and latency. Third, retrieval quality within the accessible set is unstable. Post-retrieval filtering takes the TopK of the whole library and then deletes the unauthorized items, so the remaining results are not equivalent to the TopK within the set the user can access, and result quality fluctuates. Fourth, user permissions may be revoked while a retrieval request is being processed. The prior art generally filters only once after recall, so when permissions change after that filtering and before output, a window can arise in which results are returned under permissions that are no longer current, because the permission state is not rechecked or updated in time.

Disclosure of Invention

To solve the above technical problems, the invention provides an enterprise memory-oriented retrieval front access control method, which adopts the following technical scheme.

An enterprise memory-oriented retrieval front access control method comprises the following steps:

Decomposing permission capabilities into a set of minimum granularity, generating a user capability bitmap for each user, generating a fragment requirement bitmap for each enterprise memory fragment, and storing the fragment requirement bitmap in association with a vector index entry;

receiving a user query and generating a query vector, and recalling a sequence of candidate fragment identifiers from the vector index by approximate nearest neighbor search;

for each candidate fragment, before loading the complete content data and executing high-precision similarity calculation, first reading the fragment requirement bitmap and performing a bitwise inclusion check against the user capability bitmap; if the user capability bitmap does not contain all permission capabilities required by the fragment requirement bitmap, directly discarding the candidate so that it does not enter the subsequent scoring and sorting flow; if the user capability bitmap contains all permission capabilities required by the fragment requirement bitmap, allowing the complete content to be loaded, executing similarity calculation, and then placing the candidate in a candidate pool;

and sorting the candidates in the candidate pool by similarity, performing a secondary permission check on the sorted result before final output using the latest user capability bitmap and fragment requirement bitmap, removing candidates that fail the check, and recording an audit log.

Further, the user capability bitmap is obtained by combining the role capability bitmaps of a plurality of roles of the user with a bitwise OR operation.

Further, each element in the permission capability set represents a basic access capability, and the basic access capability comprises at least one of basic enterprise memory reading, sensitive content reading, financial domain reading, legal domain reading, project reading, research and development domain reading, operation domain reading, and public domain reading.

Further, the fragment requires a bitmap of s