CN-117472854-B - Method based on acceleration batch file search model


Abstract

The application discloses a method based on an acceleration batch file search model. The method comprises a file index and file metadata associated with the file index. Authority control of the file index covers directory space, tenant information, user information and file authority. Creation of the file index involves the tenant, a file service and an index service; the index service creates and updates the index asynchronously and, in addition to the corresponding retry processing for data whose update failed, adds a compensation mechanism that updates missing or abnormal files at regular intervals. By aggregating files into a unified file index, the application supports file keyword queries, achieves low-latency file data updates, and supports multi-condition searches such as keyword word-segmentation search, authority control and file path search. It is decoupled from services such as file metadata query, preview and editing, and focuses on unified aggregation of data, query efficiency, high coverage and accuracy of query results, and authority control.

Inventors

KANG NINGBO
WANG HAICHAO
QIU CHEN

Assignees

苏州沙咖智能科技有限公司

Dates

Publication Date: 20260512
Application Date: 20231030

Claims (4)

1. A method based on an acceleration batch file search model, characterized by comprising a file index and file metadata associated with the file index, wherein the authority control of the file index is governed by the association relation of the file metadata, and the authority control of the file index comprises directory space, tenant information, user information and file authority; the file metadata is used for storing the name attribute, file type attribute, file size attribute and address attribute of an associated file library; the directory space is used for storing metadata of tables, indexes and other objects, including table names, column names, column data types and index names; the tenant information is used for controlling physical isolation of file metadata and logical isolation of document indexes; the user information is used for controlling the ownership authority of a document; the file authority is used for controlling the authority process, the authority information and external-link sharing of a file, and is the finest-grained authority in the file index; creation of the file index involves the tenant, a file service and an index service, wherein the index service creates and updates the index asynchronously and, in addition to the corresponding retry processing for data whose update failed, adds a compensation mechanism that updates missing or abnormal files at regular intervals; the authority control supports the multi-tenant authority isolation level of the SaaS version and comprises setting directory authorities, updating document authorities, setting document authorities, updating document authorities and modifying document authorities, wherein authority rules for hiding part of the files are established, the storage space of a file is the organization-range authority of the file, the source of a file is the authority of the file, a search limited to the organization range and the personnel to whom the files are visible can be carried out by setting the authorities of the files, and setting an organization authority updates the organization information to the index service and the information of the personnel under the organization to the authority items in the index service.
2. The method based on an acceleration batch file search model according to claim 1, wherein the creation of the file index comprises the following operations: space maintenance, in which data maintained in any space directory under the tenant information is pushed through the file service and aggregated into the index service; file uploading, in which, besides storing the file in the file storage space as usual, the metadata information and content identification information of the file are stored in the index service, this operation being performed asynchronously so that the file storage process is not affected and the services are decoupled; authority control, namely fine-grained control of the visible range set on a document or a directory; and file recycling, in which the update of the file is reflected in the index service and the recycled file is invisible.
3. The method based on an acceleration batch file search model according to claim 2, wherein the file index creation is performed by FileConsumer, which consumes file operations under different space directories asynchronously, while FileProducer sends messages identifying the tenant, the file identifier and the operation identifier, and the specific steps of file index creation comprise: S1, querying metadata information and detailed information of files under different tenants, taking the tenant ID, the file metadata ID and the file edit type as input parameters; S2, according to the edit-type strategy of the file, when a file is added during synchronization, first judging whether an index exists under the tenant, so that the index is initialized on first synchronization and then used; S3, recording in the database any exception that occurs while indexing the file information, to await a retry and be updated again.
4. The method of claim 2, wherein in the space maintenance, empty space directories are meaningless to an index service, and only space directories containing file entities can be added as attributes to an index document.
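As an illustration of the flow recited in claims 3 and 4, the following is a minimal in-memory sketch, not the patented implementation. FileProducer and FileConsumer are named in the claims; everything else (FileMessage, IndexService, the "upsert" and "recycle" edit types, the pending_retry list standing in for the retry records in the database, and the drain and compensate methods) is a hypothetical name chosen for the example.

```python
import queue
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileMessage:
    """Message carrying tenant, file metadata ID and edit type."""
    tenant_id: str
    file_id: str
    edit_type: str  # hypothetical values: "upsert" or "recycle"


@dataclass
class IndexService:
    """In-memory stand-in for the index service: one index per tenant."""
    indexes: dict = field(default_factory=dict)

    def ensure_index(self, tenant_id):
        # S2: create the tenant's index on first synchronization, then reuse it.
        return self.indexes.setdefault(tenant_id, {})


class FileProducer:
    def __init__(self, bus):
        self.bus = bus

    def send(self, tenant_id, file_id, edit_type):
        # The producer only enqueues; file storage is not blocked by indexing.
        self.bus.put(FileMessage(tenant_id, file_id, edit_type))


class FileConsumer:
    def __init__(self, bus, metadata_store, index_service):
        self.bus = bus
        self.metadata_store = metadata_store  # {(tenant_id, file_id): dict}
        self.index_service = index_service
        self.pending_retry = []  # assumption: stands in for a database table

    def handle(self, msg):
        # S1: look up file metadata by tenant ID and file metadata ID.
        meta = self.metadata_store[(msg.tenant_id, msg.file_id)]
        # S2: make sure the tenant's index exists before writing to it.
        index = self.index_service.ensure_index(msg.tenant_id)
        if msg.edit_type == "recycle":
            # Recycled files stay in the index but become invisible.
            index[msg.file_id] = {**index.get(msg.file_id, meta), "visible": False}
        else:
            index[msg.file_id] = {**meta, "visible": True}

    def drain(self):
        """Consume every queued message; failures are kept for a later retry."""
        while True:
            try:
                msg = self.bus.get_nowait()
            except queue.Empty:
                return
            try:
                self.handle(msg)
            except Exception:
                # S3: record the failed message and wait for a retry.
                self.pending_retry.append(msg)

    def compensate(self):
        """Compensation pass: re-enqueue failed messages and process them again."""
        retry, self.pending_retry = self.pending_retry, []
        for msg in retry:
            self.bus.put(msg)
        self.drain()
```

In this sketch an index document is only ever created from a file message, so a space directory that contains no files never appears in the index, which is one simple way to satisfy the condition in claim 4. A real deployment would presumably run compensate on a timer and use a message broker rather than queue.Queue.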

Description

Method based on acceleration batch file search model

Technical Field

The invention belongs to the technical field of file searching, and particularly relates to a method based on an acceleration batch file search model.

Background

In the standardized management of a company, knowledge base storage is an extremely important link. Knowledge base management records information and knowledge, which helps a team accumulate experience and share resources, supports team cooperation and security control, and forms a complete, continuously evolving knowledge system. At present, a large number of company files are stored on service terminals; the files are scattered, and the unstructured data is difficult to retrieve. Conventional file search targets structured file metadata queries, whereas Elasticsearch is currently applied mainly to log analysis and web blogs and is not commonly used for knowledge base file storage. Existing company knowledge base tools can manage company files and set access rights through authority control, but file search is mainly implemented by fuzzy matching on file names, so the searcher must know exactly which keywords a file name contains; otherwise the required file cannot be found.

Disclosure of Invention

To remedy the defects of the prior art, the invention provides a method based on an acceleration batch file search model, which aims to solve the problem that an existing corporate knowledge base cannot search for matching files through file keywords.
A method based on an acceleration batch file search model comprises a file index and file metadata associated with the file index, wherein the authority control of the file index is governed by the association relation of the file metadata, and the authority control of the file index comprises directory space, tenant information, user information and file authority. The file metadata is used for storing the name attribute, file type attribute, file size attribute and address attribute of an associated file library. The directory space is used for storing metadata of tables, indexes and other objects, including table names, column names, column data types and index names. The tenant information is used for controlling physical isolation of file metadata and logical isolation of document indexes. The user information is used for controlling the ownership authority of a document. The file authority is used for controlling the authority process, the authority information and external-link sharing of a file, and is the finest-grained authority in the file index.
Further, creation of the file index involves the tenant, a file service and an index service. The index service creates and updates the index asynchronously; for data whose update failed, in addition to the corresponding retry processing, a compensation mechanism is added that updates missing or abnormal files at regular intervals.
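The authority dimensions described above (tenant information, directory space, user information and file authority) can be pictured as successive filters applied to a keyword query. The sketch below is only an illustration under stated assumptions: the IndexDoc fields, the search function, the allowed_users set and the dir_prefix parameter are hypothetical names, and a flat Python list stands in for the real index.

```python
from dataclasses import dataclass, field


@dataclass
class IndexDoc:
    file_id: str
    tenant_id: str   # tenant information: logical isolation of index documents
    space_dir: str   # directory space the file belongs to
    owner: str       # user information: who the document belongs to
    keywords: set    # assumption: terms produced by word segmentation
    allowed_users: set = field(default_factory=set)  # file authority, finest granularity
    visible: bool = True  # recycled files are marked invisible


def search(docs, tenant_id, user, keyword, dir_prefix="/"):
    """Return IDs of documents that match the keyword and that the user may see."""
    hits = []
    for doc in docs:
        if doc.tenant_id != tenant_id:
            continue  # never cross tenant boundaries
        if not doc.visible:
            continue  # recycled files do not show up in results
        if not doc.space_dir.startswith(dir_prefix):
            continue  # file path condition
        if user != doc.owner and user not in doc.allowed_users:
            continue  # file-level authority check
        if keyword in doc.keywords:
            hits.append(doc.file_id)
    return hits
```

Applying the tenant check first means that the finer-grained checks only ever run on documents of the caller's own tenant. In a real search engine these conditions would more likely be expressed as query filters than as a Python loop.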
Further, the creation of the file index includes the following operations: space maintenance, in which data maintained in any space directory under the tenant information is pushed through the file service and aggregated into the index service; file uploading, in which, besides storing the file in the file storage space as usual, the metadata information and content identification information of the file are stored in the index service, this operation being performed asynchronously so that the file storage process is not affected and the services are decoupled; authority control, namely fine-grained control of the visible range set on a document or a directory; and file recycling, in which the update of the file is reflected in the index service and the recycled file is invisible.
Further, the file index creation is performed by FileConsumer, which consumes file operations under different space directories asynchronously, while FileProducer sends messages identifying the tenant, the file identifier, the operation identifier and the like. The specific steps of file index creation include: S1, querying metadata information and detailed information of files under different tenants, taking the tenant ID, the file metadata ID and the file edit type as input parameters; S2, according to the edit-type strategy of the file, when a file is added during synchronization, first judging whether an index exists under the tenant, so that the index is initialized on first synchronization and then used; S3, recording in the database any exception that occurs while indexing the file information, to await a retry and be updated again.
In particular, in the space maintenan