CN-122022437-A - Workflow decision system based on LLM, multi-agent and vectorization retrieval

Abstract

The invention discloses a workflow decision system based on an LLM, multiple agents and vectorized retrieval, belonging to the technical field of big data processing. The system comprises a data source layer, a data processing and filtering layer, an agent engine layer and an application layer, which cooperate in sequence to form a three-layer collaborative computing architecture of data purification, intelligent scheduling and deep mining. Specifically, the data source layer provides multi-source heterogeneous data; the data processing and filtering layer performs intelligent preprocessing and hybrid filtering on the data input by the data source layer; the agent engine layer is a multi-agent collaborative execution engine; and the application layer receives the analysis results of the agent engine layer. The system achieves incremental analysis and multi-task parallelism through a context-layered isolation mechanism for massive data processing, and adapts to the domain knowledge requirements of different business scenarios through a dynamic modular prompt mechanism.

Inventors

  • ZHUO JING
  • CAO HAO

Assignees

  • 四川盛邦润达科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-04-13

Claims (10)

  1. A workflow decision system based on an LLM, multiple agents and vectorized retrieval, characterized by comprising a data source layer, a data processing and filtering layer, an agent engine layer and an application layer, which cooperate in sequence to form a three-layer collaborative computing architecture of data purification, intelligent scheduling and deep mining, specifically as follows: the data source layer provides multi-source heterogeneous data, including unstructured and semi-structured data, and specifically including at least one of chat records, bank statements and call records; the data processing and filtering layer performs intelligent preprocessing and hybrid filtering on the data input by the data source layer and outputs highly suspicious data, the operations comprising semantic compression preprocessing based on entity recognition, score-based screening based on multi-feature fusion, and vectorized retrieval; the agent engine layer is a multi-agent collaborative execution engine comprising a three-layer reusable agent architecture of "function Agent → WorkFlow → Task", a dynamic workflow and knowledge management module, and a context storage module, wherein the dynamic workflow and knowledge management module persistently stores intermediate clues and the state of analyzed data, and performs intent parsing, dynamic workflow orchestration, multi-agent collaborative reasoning and reflective decision-making on the highly suspicious data; the application layer receives the analysis results of the agent engine layer and provides clue visualization, relationship graph display, analysis report output, and continuous follow-up query response based on session memory; the system achieves incremental analysis and multi-task parallelism through a context-layered isolation mechanism for massive data processing, comprising layered data storage and task-isolated analysis, and adapts to the domain knowledge requirements of different business scenarios through a dynamic modular prompt mechanism.
  2. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the intelligent preprocessing and hybrid filtering flow of the data processing and filtering layer comprises the following steps: S101, constructing topic knowledge bases with persons, locations, communications and funds as core dimensions, based on the multi-source data input by the data source layer, the knowledge bases being linked through unique identification keys to form a three-dimensional data network, the unique identification keys comprising ID card numbers and mobile phone numbers; S102, intelligently segmenting session-type raw data, including but not limited to chat records, by time and by conversation parties, and performing semantic compression on each segment, wherein key entities are extracted, redundant words are filtered out, and the result is deduplicated and recombined into a semantically condensed text, the key entities comprising names, places and amounts, and the redundant words comprising modal particles, adverbs and interjections; S103, hybrid feature extraction and scoring, namely embedding the semantically condensed text as a vector, computing its similarity score against a preset library of case-related sample vectors (the Faiss score), the proportion of key entities, the weighted score from the dynamic positive and negative lexicons, and the context re-ranking score, and generating a comprehensive case-relevance probability score through a preset fusion model; S104, screening out highly suspicious paragraphs according to a preset threshold, pushing them to the agent engine layer, and associating the corresponding topic knowledge base information.
  3. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 2, wherein, in the dynamic positive and negative lexicons, the negative lexicon is derived from high-frequency non-case-related paragraphs and the positive lexicon is derived from high-frequency entities in historical case-related paragraphs.
  4. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein, in the three-layer reusable agent architecture of "function Agent → WorkFlow → Task", the function Agent is a dedicated functional module, the WorkFlow is a preset standardized execution chain, and the Task layer performs secondary integration and orchestration of at least one WorkFlow so that functional modules can be reused.
  5. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the context-layered isolation mechanism for massive data processing specifically performs layered storage and isolated analysis of data at the scale of tens of millions of words, persists intermediate clues and the state of analyzed data, and marks data blocks as "processed", thereby achieving incremental analysis that avoids duplicate processing while supporting parallel multi-task execution and secure data isolation.
  6. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the dynamic modular prompt mechanism splits the prompt into basic reasoning logic and an extensible domain knowledge base, and supports dynamic loading and combination according to the business scenario; and the reflective decision mechanism is implemented through code generation and logical re-evaluation, such that when a decision is ambiguous, supplementary information is automatically retrieved to check and correct erroneous decisions.
  7. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the multi-agent collaborative reasoning process of the agent engine layer comprises: S201, intent parsing and Task routing: recognizing the intent of the user's natural-language instruction through a fine-tuned language model or a lightweight BERT classification model, and routing it to the corresponding Task; S202, dynamic workflow orchestration: calling the corresponding WorkFlows and function Agents based on the requirements of the Task to construct a customized execution flow; S203, chained execution and persistence: the function Agents execute in sequence, the highly suspicious data is fed to the LLM for analysis in batches using a divide-and-conquer strategy with persistence, and the analysis results are stored in real time so that other Agents can call them across data sources; S204, reflective verification: when the LLM output contains an uncertain or contradictory judgment, triggering the reflection mechanism and calling the domain knowledge base and associated data for a second round of reasoning and verification.
  8. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 7, wherein the WorkFlows comprise a chat log analysis WorkFlow, which consists of, in order, an information query Agent, a transaction analysis Agent and a chat deep-analysis Agent.
  9. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the clue visualization of the application layer comprises structured clue display and relationship graph drawing, and the continuous query response is based on a session memory module built into the system, supporting targeted follow-up questions and detail queries by the user on the analysis results.
  10. The workflow decision system based on an LLM, multiple agents and vectorized retrieval of claim 1, wherein the data processing and filtering layer uses the Faiss efficient vector retrieval library to compute vector similarity, thereby reducing vector computation overhead, and the semantic compression preprocessing reduces the Token consumption of large-model processing, saving at least 30% of the computation cost.
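
Steps S102 to S104 of claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the word lists, weights, vocabulary and threshold are hypothetical, a bag-of-words vector with brute-force cosine similarity stands in for a real embedding model and a Faiss index, digit tokens stand in for entity recognition, a weighted sum stands in for the preset fusion model, and the context re-ranking score is omitted.

```python
import math
import re

# Hypothetical values throughout; the source publishes no concrete lexicons or weights.
FILLER_WORDS = {"um", "uh", "oh", "well", "just", "really"}   # redundant words (S102)
POSITIVE_WORDS = {"transfer": 0.4, "cash": 0.3}   # e.g. mined from past case-related text
NEGATIVE_WORDS = {"dinner": 0.3, "weather": 0.3}  # e.g. mined from non-case-related text
VOCAB = ["transfer", "cash", "account", "dinner", "weather", "meet"]
WEIGHTS = {"similarity": 0.5, "entity_ratio": 0.2, "lexicon": 0.3}


def compress(segment):
    """S102 stand-in: drop filler words and duplicates, keeping first-seen order."""
    seen, kept = set(), []
    for tok in re.findall(r"[a-z0-9]+", segment.lower()):
        if tok in FILLER_WORDS or tok in seen:
            continue
        seen.add(tok)
        kept.append(tok)
    return kept


def embed(tokens):
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    return [float(tokens.count(word)) for word in VOCAB]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score(segment, sample_vectors):
    """S103 stand-in: fuse similarity, entity ratio and lexicon score by weighted sum."""
    tokens = compress(segment)
    if not tokens:
        return 0.0
    vec = embed(tokens)
    similarity = max((cosine(vec, s) for s in sample_vectors), default=0.0)
    # Digit-only tokens approximate "amount" entities in this sketch.
    entity_ratio = sum(tok.isdigit() for tok in tokens) / len(tokens)
    raw = sum(POSITIVE_WORDS.get(t, 0.0) for t in tokens) - sum(
        NEGATIVE_WORDS.get(t, 0.0) for t in tokens
    )
    lexicon = min(1.0, max(0.0, 0.5 + raw))  # clamp to [0, 1] around a neutral 0.5
    return (
        WEIGHTS["similarity"] * similarity
        + WEIGHTS["entity_ratio"] * entity_ratio
        + WEIGHTS["lexicon"] * lexicon
    )


def screen(segments, sample_vectors, threshold=0.6):
    """S104 stand-in: keep only segments whose fused score reaches the threshold."""
    return [s for s in segments if score(s, sample_vectors) >= threshold]
```

With a single sample vector built from `embed(["transfer", "cash", "account"])`, a segment that mentions a cash transfer and an amount passes the assumed 0.6 threshold, while small talk about weather and dinner is filtered out before any LLM call is made.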

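The "function Agent → WorkFlow → Task" layering of claims 4, 7 and 8, together with the "processed" marking of claim 5, might be sketched as follows. All class and function names are hypothetical, the three agents are trivial placeholders rather than LLM calls, and an in-memory dict stands in for persistent storage.

```python
from dataclasses import dataclass, field


@dataclass
class WorkFlow:
    """A preset chain of function agents executed in order."""
    name: str
    agents: list  # callables of the form agent(block, context) -> finding

    def run(self, block, context):
        findings = []
        for agent in self.agents:
            finding = agent(block, context)
            findings.append(finding)
            # Store each finding immediately so later agents can read it.
            context.setdefault("findings", []).append(finding)
        return findings


@dataclass
class Task:
    """Combines one or more WorkFlows and skips blocks already processed."""
    name: str
    workflows: list
    store: dict = field(default_factory=dict)  # block_id -> findings

    def run(self, blocks):
        """Analyse only unprocessed blocks; return the IDs handled in this call."""
        handled = []
        for block_id, text in blocks.items():
            if block_id in self.store:  # already marked as processed
                continue
            context = {}  # per-block context keeps analyses isolated
            results = []
            for workflow in self.workflows:
                results.extend(workflow.run(text, context))
            self.store[block_id] = results
            handled.append(block_id)
        return handled


# Placeholder agents in the order named in claim 8.
def info_query_agent(block, context):
    return f"info:{len(block.split())} words"


def transaction_agent(block, context):
    amounts = [word for word in block.split() if word.isdigit()]
    return f"transactions:{amounts}"


def chat_depth_agent(block, context):
    # Reads what the earlier agents stored in the shared context.
    return f"summary:{len(context.get('findings', []))} prior findings"


chat_log_workflow = WorkFlow(
    "chat_log_analysis", [info_query_agent, transaction_agent, chat_depth_agent]
)
```

Calling `Task.run` a second time with a superset of the earlier blocks analyses only the new ones, which is the incremental behaviour the claims describe; running several `Task` instances with separate stores would approximate the task isolation.
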
Description

Workflow decision system based on LLM, multi-agent and vectorization retrieval

Technical Field

The invention relates to the technical field of big data processing, and in particular to a workflow decision system based on an LLM, multiple agents and vectorized retrieval.

Background

In the fields of public security and judicial investigation, analysts face massive, multi-source, unstructured data such as WeChat chat records, bank statements and call records. Traditional analysis relies mainly on manual line-by-line review and rule-based keyword matching, and has the following notable shortcomings:

  • Low efficiency and high cost: manual processing of massive data is slow and fatiguing, and cannot keep up with ever-growing data volumes.
  • High risk of missed clues: keyword-based retrieval cannot understand semantic context and easily misses case-related information expressed in slang, code words or complex logical relationships.
  • Lack of deep analysis capability: traditional methods struggle to achieve automated correlation analysis across data sources (such as correlating mentions in chat records with transfer records) and reasoning about crime patterns.
  • High cost and poor results when large models are applied directly: feeding massive raw data straight into a large model exceeds its context window, causing information loss and incomplete analysis, while also incurring extremely high computation cost and Token consumption.
  • Rigid workflows: existing analysis tools have fixed processes and are difficult to adjust dynamically or to adapt to different case types (such as drug crimes and duty-related crimes) and to the user's immediate intent.

Therefore, there is a need for an analysis system that can process multi-source case-related data intelligently, automatically and in depth, with high accuracy, efficiency and flexibility. In view of this, we propose a workflow decision system based on an LLM, multiple agents and vectorized retrieval.

Disclosure of Invention

The invention aims to provide a workflow decision system based on an LLM, multiple agents and vectorized retrieval, so as to solve the problems described in the background.

To achieve the above purpose, the invention provides the following technical solution:

A workflow decision system based on an LLM, multiple agents and vectorized retrieval comprises a data source layer, a data processing and filtering layer, an agent engine layer and an application layer, which cooperate in sequence to form a three-layer collaborative computing architecture of data purification, intelligent scheduling and deep mining, specifically as follows:

The data source layer provides multi-source heterogeneous data, including unstructured and semi-structured data, and specifically including at least one of chat records, bank statements and call records;

the data processing and filtering layer performs intelligent preprocessing and hybrid filtering on the data input by the data source layer and outputs highly suspicious data, the operations comprising semantic compression preprocessing based on entity recognition, score-based screening based on multi-feature fusion, and vectorized retrieval;

the agent engine layer is a multi-agent collaborative execution engine comprising a three-layer reusable agent architecture of "function Agent → WorkFlow → Task", a dynamic workflow and knowledge management module, and a context storage module, wherein the dynamic workflow and knowledge management module persistently stores intermediate clues and the state of analyzed data, and performs intent parsing, dynamic workflow orchestration, multi-agent collaborative reasoning and reflective decision-making on the highly suspicious data;

the application layer receives the analysis results of the agent engine layer and provides clue visualization, relationship graph display, analysis report output, and continuous follow-up query response based on session memory;

the system achieves incremental analysis and multi-task parallelism through a context-layered isolation mechanism for massive data processing, comprising layered data storage and task-isolated analysis, and adapts to the domain knowledge requirements of different business scenarios through a dynamic modular prompt mechanism.

Preferably, the intelligent preprocessing and hybrid filtering flow of the data processing and filtering layer comprises the following steps: S101, constructing topic knowledge bases with persons, locations, communications and funds as core dimensions, based on the multi-source data input by the data source layer, wherein the knowledge bases are associated t