CN-121998103-A - Multi-agent brain signal autonomous understanding method based on large language model driving
Abstract
The invention discloses a multi-agent brain signal autonomous understanding method driven by a large language model. By constructing a layered collaboration framework comprising a central supervisor agent and specialized sub-agents, the method lowers the technical threshold of brain signal analysis while addressing two problems of the traditional paradigm: rigid workflows and difficulty handling long-horizon complex tasks. The central supervisor agent parses the user's natural language intention and dynamically decomposes tasks; the specialized sub-agents combine a global shared state with a context isolation mechanism to perform domain-specific full-pipeline dynamic planning and precise tool calls; and a hierarchical resource configuration together with a retrieval-augmented generation mechanism fuses quantitative computation results with qualitative clinical knowledge to produce a comprehensive analysis report with cross-domain causal logic. A three-level difficulty evaluation benchmark system is established to verify framework performance, ultimately achieving autonomy, flexibility, and clinical interpretability in the brain signal understanding process.
Inventors
- Zhao Sha
- Zhou Yangxuan
- Wang Jiquan
- Li Shijian
- Pan Gang
Assignees
- Zhejiang University (浙江大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-04-09
Claims (9)
- 1. A multi-agent brain signal autonomous understanding method driven by a large language model, characterized by comprising the following steps: (1) constructing a layered multi-agent architecture comprising a central supervisor agent, a plurality of specialized sub-agents, and a global shared state; (2) the central supervisor agent receives the user's natural language query, analyzes the complexity of the user's intention, and decomposes a complex query into a plurality of subtasks; (3) the central supervisor agent distributes the subtasks to the corresponding specialized sub-agents according to their domain attributes, and the specialized sub-agents adopt a context isolation strategy when executing the subtasks; (4) a specialized sub-agent receiving a subtask instruction performs logical reasoning and planning in combination with the current system state and workflow context to generate a tool execution sequence and tool call instructions; (5) the specialized sub-agent sequentially calls the corresponding tools through a system interface according to the generated tool execution sequence, performs the actual brain signal analysis operations, and manages the input and output flow of data; (6) the specialized sub-agent triggers a retrieval augmentation mechanism, queries a general knowledge base and a domain-specific knowledge base for knowledge related to the subtask, and combines the tool execution output with the retrieved knowledge to generate a specialized domain sub-report that is returned to the central supervisor agent; (7) the central supervisor agent aggregates the domain sub-reports returned by each sub-agent, performs cross-modal logical reasoning and causal relationship analysis, and generates a comprehensive analysis report fed back to the user.
- 2. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the central supervisor agent in step (1) is used to analyze the user's natural language intention and distribute tasks; the plurality of specialized sub-agents comprise a sleep analysis agent and an emotion analysis agent: the sleep analysis agent handles specialized sleep analysis tasks comprising sleep staging, micro-event detection, sleep structure assessment, and clinical sleep report generation, and the emotion analysis agent handles specialized emotion analysis tasks comprising emotional state identification, mental fatigue monitoring, cognitive load assessment, and emotional stability analysis; the global shared state manages the brain signal data input by users, intermediate processing results, and tool execution parameters to achieve data flow and state synchronization among the agents; the global shared state is implemented as a key-value dictionary structure, every tool reads its input parameters from the global shared state and writes its output results back when executed, and state synchronization is thereby achieved without explicit communication among the agents.
- 3. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the specialized sub-agents are configured with a hierarchical resource access mechanism that divides the tools into a general tool set and domain-specific tool sets and divides the knowledge base into a general knowledge base and domain-specific knowledge bases, so that each sub-agent can access only the subset of resources strongly related to its task; this reduces the proportion of irrelevant tool descriptions in the context window, lowers computational overhead, and supports the sub-agents in performing operations including data loading, preprocessing, feature extraction, and domain analysis.
- 4. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the central supervisor agent in step (2) first judges the semantic complexity of the query in combination with the session history and analyzes the user's intention via chain-of-thought reasoning, then decides according to that complexity whether specialized sub-agents need to be called: for a simple greeting or capability query it generates a response directly from the global shared state; for an atomic operation it distributes only a single definite instruction; and a complex task involving high-level semantics is decomposed, based on causal reasoning logic, into a sequence of subtasks with logical dependency relationships.
- 5. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the context isolation strategy in step (3) is specifically that each specialized sub-agent receives only the specific task instruction distributed by the central supervisor agent and the summaries of relevant execution results written to the global shared state by preceding sub-agents, while irrelevant global dialogue history is shielded, ensuring that the sub-agent concentrates on executing the specific analyses within its subtask domain.
- 6. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the specific implementation of step (4) is as follows: S41, before starting tool planning, the specialized sub-agent first executes direct-answer judgment logic, retrieving the global shared state and the execution results of preceding sub-agents to evaluate whether the existing information suffices to respond to the subtask instruction directly; S42, when a tool call is required, the specialized sub-agent decomposes the high-level instruction into a series of indivisible atomic sub-goals, determines the logical execution order of the sub-goals according to the data-flow dependencies among them, and finally generates a complete start-to-end tool execution sequence in a single response to ensure the continuity and efficiency of the execution flow; S43, when constructing a specific tool call instruction, the specialized sub-agent follows these parameter configuration rules: a parameter explicitly specified in the user instruction is forcibly mapped to the corresponding tool parameter; for non-critical parameters not explicitly specified, the tool's default value or an empty dictionary is used automatically; and the parameter key names are strictly checked to match the variable names in the global shared state, ensuring the correctness of the path by which tool execution parameters are read.
- 7. The multi-agent brain signal autonomous understanding method according to claim 1, wherein each tool called in step (5) performs its encapsulated processing operation on the brain signal data strictly according to the predefined algorithm logic and flow steps in its function description, reads input parameters from the global shared state, and writes the generated output back to the global shared state after execution completes, thereby updating the system's current context information for subsequent tools or sub-agents to read, and realizing seamless synchronization and zero-copy circulation of data flows.
- 8. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the retrieval augmentation mechanism in step (6) first performs coarse-grained retrieval over the general knowledge base and the domain-specific knowledge base with a dual-encoder model, then performs fine-grained reranking with a cross-encoder model to screen out highly relevant physiological definitions or clinical guidelines as domain-specific knowledge, and finally combines the quantitative output of tool execution with the retrieved domain-specific knowledge to generate an interpretable specialized domain sub-report, which is returned to the central supervisor agent.
- 9. The multi-agent brain signal autonomous understanding method according to claim 1, wherein the central supervisor agent in step (7) receives and parses the domain sub-reports returned by each sub-agent, extracts their key physiological indexes, quantitative characteristics, and domain-specific conclusions, then performs deep semantic reasoning to identify the internal logical relations among the analysis results of different domains and their connections to physiological mechanisms, thereby constructing a multidimensional physiological state view; the central supervisor agent finally integrates the discrete data insights into a coherent natural language description and generates a comprehensive analysis report with internal logical consistency, which is fed back to the user.
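The shared-state mechanism of claims 1, 2, and 7 can be illustrated with a minimal sketch: a key-value dictionary that all tools read from and write back to, domain sub-agents that execute a planned tool sequence, and a supervisor that dispatches subtasks and aggregates sub-reports. All class and function names here (`GlobalState`, `SleepAgent`, `Supervisor`, the placeholder tools) are illustrative assumptions, not identifiers from the patent, and a fixed tool list stands in for the LLM-driven planning the claims describe.

```python
class GlobalState:
    """Key-value dictionary shared by all agents and tools (claim 2)."""
    def __init__(self):
        self._store = {}

    def read(self, key, default=None):
        return self._store.get(key, default)

    def write(self, key, value):
        self._store[key] = value


def load_eeg(state: GlobalState) -> None:
    # Placeholder tool: reads its input path from the shared state and
    # writes the "loaded" signal back, so later tools need no direct wiring
    # to earlier ones (the zero-copy circulation of claim 7).
    path = state.read("eeg_path")
    state.write("raw_signal", f"signal-from-{path}")


def sleep_stage(state: GlobalState) -> None:
    # Placeholder tool: consumes the upstream output, writes staging results.
    _signal = state.read("raw_signal")
    state.write("sleep_stages", ["W", "N1", "N2", "N3", "REM"])


class SleepAgent:
    """Domain sub-agent: executes its tool sequence in order (claims 4-5).
    Context isolation is approximated by handing the agent only its task
    instruction, never the full dialogue history."""
    tools = [load_eeg, sleep_stage]

    def run(self, task: str, state: GlobalState) -> dict:
        for tool in self.tools:  # a fixed plan stands in for LLM planning
            tool(state)
        return {"domain": "sleep", "task": task,
                "stages": state.read("sleep_stages")}


class Supervisor:
    """Central supervisor: routes subtasks by domain and aggregates the
    returned sub-reports into one response (claims 1 and 9)."""
    def __init__(self):
        self.sub_agents = {"sleep": SleepAgent()}

    def handle(self, query: str, state: GlobalState) -> dict:
        # A real system would use an LLM to decompose `query`; here every
        # query is hard-routed to the sleep agent for illustration.
        report = self.sub_agents["sleep"].run(query, state)
        return {"query": query, "sub_reports": [report]}


state = GlobalState()
state.write("eeg_path", "night1.edf")
result = Supervisor().handle("Stage last night's sleep", state)
print(result["sub_reports"][0]["stages"])
```

Because both tools exchange data only through `GlobalState`, the agents never message each other directly, which is the "state synchronization without explicit communication" property claimed in claim 2.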
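The two-stage retrieval of claim 8 can likewise be sketched: a cheap coarse pass over the whole knowledge base followed by a finer reranking pass over the shortlist. Token-overlap scoring below is only a stand-in for the dual-encoder and cross-encoder models the claim names; a real system would use learned embedding and reranking models, and the sample knowledge-base sentences are invented for illustration.

```python
def coarse_score(query: str, doc: str) -> int:
    # Stage 1 stand-in for a dual encoder: query and document are scored
    # from independently computed representations (here, token sets).
    return len(set(query.lower().split()) & set(doc.lower().split()))


def rerank_score(query: str, doc: str) -> float:
    # Stage 2 stand-in for a cross encoder: jointly considers both texts;
    # here, overlap normalized by document length to reward precision.
    shared = set(query.lower().split()) & set(doc.lower().split())
    return len(shared) / max(len(doc.split()), 1)


def retrieve(query, knowledge_base, k_coarse=3, k_final=1):
    # Coarse-grained retrieval keeps a shortlist, fine-grained reranking
    # orders it, mirroring the two stages described in claim 8.
    shortlist = sorted(knowledge_base,
                       key=lambda d: coarse_score(query, d),
                       reverse=True)[:k_coarse]
    return sorted(shortlist,
                  key=lambda d: rerank_score(query, d),
                  reverse=True)[:k_final]


kb = [
    "REM sleep is characterized by rapid eye movements and low muscle tone",
    "Sleep spindles are bursts of 11-16 Hz activity in stage N2 sleep",
    "Alpha rhythm appears over occipital regions during relaxed wakefulness",
]
top = retrieve("what defines REM sleep", kb)
print(top[0])  # the REM definition wins both stages
```

The retrieved snippet would then be fused with the quantitative tool output to ground the sub-report's interpretation, as the claim describes.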
Description
Multi-agent brain signal autonomous understanding method based on large language model driving
Technical Field
The invention belongs to the technical field at the intersection of brain-computer interfaces and artificial intelligence, and particularly relates to a multi-agent brain signal autonomous understanding method driven by a large language model.
Background
Brain-computer interface (BCI) technology realizes direct interaction between the human brain and external equipment by decoding brain signals, and has important application value in fields such as clinical health monitoring, sleep disorder diagnosis, emotion assessment, and cognitive load analysis. The electroencephalogram (EEG) is the most mainstream non-invasive brain signal acquisition modality and, by virtue of its high temporal resolution and convenience, serves as the core data source of key application scenarios such as sleep staging, fatigue monitoring, and emotion recognition. The brain signal understanding paradigm has undergone remarkable evolution. Early research relied mainly on a feature engineering paradigm, decoding manually designed time-frequency domain features with traditional classifiers; this approach depends heavily on domain expert experience and has limited generalization capability. After the rise of deep learning, the representation decoding paradigm gradually became mainstream: end-to-end models can automatically learn latent representations from raw signals, achieving performance breakthroughs in multiple brain signal decoding tasks. However, the existing paradigms still have two fundamental bottlenecks that severely restrict large-scale application in real scenarios. First, the technical barrier is high, which severely restricts the popularization and clinical translation of brain signal understanding and analysis technology.
A complete brain signal understanding and analysis workflow requires a brain signal research background, programming ability, and signal processing expertise, so users without a technical background, such as clinicians, find it difficult to operate and apply effectively. Second, the existing brain signal understanding paradigms are static and limited in design and function, and struggle to support complex, long-duration, end-to-end workflows. Current paradigms are built around single tasks and static flows: they can only perform isolated signal decoding operations and cannot realize continuous, autonomous, cooperative processing across a complete diagnostic chain such as data loading, preprocessing, sleep staging, and report generation. They also lack effective planning and sequential execution capability for multi-step, strongly dependent analysis tasks, which limits the applicability and efficiency of brain signal analysis systems in dynamic, long-term application scenarios. In recent years, the rapid evolution of large language models has provided a new technical path for breaking through these bottlenecks: their powerful intention understanding, task planning, and tool calling capabilities make it possible to construct intelligent systems that understand natural language and autonomously execute complex workflows. Some efforts have introduced large language models into the brain-computer interface field, but most current research still focuses on cross-modal translation tasks, such as decoding EEG signals into text, images, or video, essentially treating the model as a passive "decoder" rather than an "agent" with autonomous decision-making and planning capabilities.
While there have been preliminary attempts to implement basic event detection and report generation with agents, their workflows typically follow predefined patterns, are inflexible, and lack both the capability to decompose complex user intentions layer by layer and a cross-domain task collaboration mechanism. In particular, the field has not yet established a systematic evaluation benchmark, making it difficult to quantitatively evaluate and compare the reliability, stability, and execution efficiency of agent systems across different complexity levels, from atomic operations and sequential reasoning to cross-domain intention understanding. In addition, at the system architecture level, a single-agent model facing an ever-growing tool library is easily distracted by descriptions of irrelevant tools, producing distraction and tool hallucination phenomena that cause the execution path to deviate from the correct track; and although multi-agent cooperation has shown advantages in fields such as chemistry and biology, no effective mechanism for specialized division of labor and resource isolation has been established in the brain signal analysis field, making it difficult to guarantee execution efficiency and system robustness simultaneously.