CN-121981686-A - Workflow optimization method for multi-agent systems based on heterogeneous graphs

Abstract

The invention discloses a workflow optimization method for multi-agent systems (MAS) based on heterogeneous graphs. The multi-agent system is modeled as a heterogeneous graph, and the MAS workflow generation problem is converted into an optimization problem over the adjacency matrix of that graph. Agents and tools serve as nodes and interaction modes serve as edges, which cover communication links among agents, tool calls, and agent self-thinking. A subgraph sampling strategy is designed using the upper confidence bound (UCB) algorithm, and a two-stage matrix training process is introduced. The first stage uses subgraph sampling and task evaluation, combined with node sparsity and low-rank sparsity penalties, to rapidly screen out efficient nodes. The second stage performs fine-grained edge optimization: for the nodes selected in the first stage, the interaction edge weights are optimized so that the weights of effective edges increase and the weights of redundant edges decrease, yielding an optimal graph matrix. The invention is applicable to fields such as scientific computing and software development, improves the adaptability and performance of MAS, and has broad application prospects and practical value.

Inventors

JIA HAO
GUAN BOWEN
HUANG YONG
ZHAO LIANG
GENG DAN
CHEN LUOBIN
LIU JUN

Assignees

Dalian Customs of the People's Republic of China (中华人民共和国大连海关)

Dates

Publication Date: 20260505
Application Date: 20260210

Claims (7)

1. A workflow optimization method for a multi-agent system based on a heterogeneous graph, characterized by comprising the following steps: step one, modeling the multi-agent system as a heterogeneous graph whose nodes comprise agent nodes and tool nodes; initializing the agent nodes, each of which is assigned a specific role and capabilities; fully connecting all agent nodes and tool nodes through interaction edges; setting multiple rounds of workflow execution for each task to form an initial heterogeneous complete graph; and assigning an initial weight to each element of the adjacency matrix; step two, sampling a subgraph from the heterogeneous graph using a subgraph sampling strategy, executing the current task with the subgraph as the workflow, and evaluating system performance; step three, continuing to sample subgraphs and evaluate system performance, updating the adjacency matrix weights through gradient backpropagation, and removing nodes whose average weight is below a set threshold, to obtain an intermediate heterogeneous subgraph and its adjacency matrix; step four, re-initializing the adjacency matrix weights based on the intermediate heterogeneous subgraph, and performing second-stage training with maximized system performance as the training objective while introducing low-rank sparsity; and step five, obtaining the optimal heterogeneous graph after the two-stage training is complete, and determining an optimal workflow from the optimal heterogeneous graph for automatic execution of tasks to be processed.
2. The method of claim 1, wherein in step one, multiple rounds of workflow execution are set for each task to enhance the self-thinking of the multi-agent system, and the maximum number of thinking rounds is defined as K; the interaction edges are divided into intra-round edges and inter-round edges, and the adjacency matrix is correspondingly divided into an intra-round edge matrix and an inter-round edge matrix; for each agent node in the k-th round, its output is determined by the task faced by the current system, the configuration of the agent node itself, and the outputs of its intra-round neighbors and inter-round neighbors: [formula not reproduced in the source text]; the intra-round neighbors and inter-round neighbors of a node are the nodes joined to it by intra-round edges and inter-round edges, respectively: [formula not reproduced in the source text]; in the formula, three kinds of adjacency matrix elements appear: the importance weight of the interaction between two nodes within the k-th round; the importance weight of a node's self-thinking; and the importance weight of the output passed from a node in one round to a node in another round.
3. The method of claim 1, wherein in step two, the subgraph sampling strategy first computes the weight of each interaction edge e by a formula with four parameters and four terms: [formula not reproduced in the source text]; the first term represents the input weight given by the interaction edge e; the second term represents the average reward of the interaction edge e, computed from the number of times e was sampled into a subgraph that ultimately produced the correct answer and the total number of times e was selected, and ensures that interaction edges with better historical performance receive higher priority; the third term, which depends on the total number of edges selected so far, is an exploration component that encourages selection of interaction edges that have been selected fewer times; the fourth term is a diversity reward that avoids over-use of particular interaction edges; after the weight of each interaction edge is obtained, sigmoid normalization is applied to obtain the probability that each interaction edge is sampled: [formula not reproduced in the source text]; subgraph sampling is then performed according to the sampling probability of each interaction edge.
4. The method of claim 1, wherein in step two, the expected system performance is approximated by the probability-weighted average performance over a finite number of samples: [formula not reproduced in the source text]; in which the quantities are: the probability that the subgraph sampling strategy samples a given sub-adjacency matrix from the adjacency matrix; the sampled subgraph; the set of all subgraphs sampled from the adjacency matrix; the task faced by the current system; and N, the total number of sampled subgraphs.
5. The method of claim 1, wherein in step three, the first-stage training objective is expressed as: [formula not reproduced in the source text]; in which the quantities are: the task faced by the current system; the sampled subgraph; the set of all subgraphs sampled from the adjacency matrix; two coefficients; the adjacency matrix of the k-th round of the workflow; the low-rank sparsity term, approximated by the nuclear norm; and the node sparsity penalty term: [formula not reproduced in the source text]; in the latter formula, one quantity represents the average weight of all intra-round edges connected to a node, and another represents a scaling factor; after each round of iterative training, the nodes are ranked by effectiveness, a larger scaling factor is assigned to inefficient nodes to amplify their loss, and a smaller scaling factor is assigned to efficient nodes so that they are retained during training.
6. The method of claim 1, wherein in step four, the second-stage training objective is expressed as: [formula not reproduced in the source text]; in which the quantities are: the intermediate adjacency matrix; the task faced by the current system; the sampled subgraph; the set of all subgraphs sampled from the intermediate adjacency matrix; a coefficient; the intermediate adjacency matrix of the k-th round of the workflow; and the low-rank sparsity term, approximated by the nuclear norm.
7. The method of claim 1, wherein the agent nodes comprise: ① a commodity classification review agent; ② a customs declaration document compliance review agent; ③ a risk assessment agent; ④ a tax calculation agent; ⑤ a process coordination and decision agent; and the tool nodes comprise: ① a commodity classification database; ② a document OCR and extraction tool; ③ an enterprise credit and history repository; ④ a risk parameter library and model; ⑤ a tax collection system; ⑥ a customs regulation knowledge base.
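
The subgraph sampling strategy of claim 3 can be illustrated with a minimal Python sketch. The formulas are not reproduced in the claim text, so the concrete forms here are assumptions rather than the patented expressions: the edge score is taken as a weighted sum of the four terms, the exploration term uses the common UCB form sqrt(ln(N + 1) / n_e), and the diversity reward is taken as 1 / (1 + n_e). The names `EdgeStats`, `edge_score`, `sample_subgraph`, `record_outcome` and the parameters `alpha`, `beta`, `gamma`, `delta` are hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class EdgeStats:
    """Bookkeeping for one interaction edge e (hypothetical structure)."""
    weight: float        # current adjacency-matrix weight of the edge
    n_selected: int = 0  # times the edge was sampled into a subgraph
    n_correct: int = 0   # times it was sampled and the answer was correct


def edge_score(e, total_selected, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """UCB-style score; every functional form below is an assumption."""
    n = max(e.n_selected, 1)                      # avoid division by zero
    avg_reward = e.n_correct / n                  # second term: average reward
    explore = math.sqrt(math.log(total_selected + 1) / n)  # third term
    diversity = 1.0 / (1.0 + e.n_selected)        # fourth term (assumed form)
    return (alpha * e.weight + beta * avg_reward
            + gamma * explore + delta * diversity)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_subgraph(edges, rng, **params):
    """Keep each edge independently with probability sigmoid(score)."""
    total = sum(e.n_selected for e in edges.values())
    probs = {k: sigmoid(edge_score(e, total, **params)) for k, e in edges.items()}
    chosen = [k for k, p in probs.items() if rng.random() < p]
    for k in chosen:
        edges[k].n_selected += 1
    return chosen, probs


def record_outcome(edges, chosen, correct):
    """Credit every edge of a sampled subgraph that led to a correct answer."""
    if correct:
        for k in chosen:
            edges[k].n_correct += 1


# Usage with two made-up edges between hypothetical nodes.
edges = {
    ("classification_agent", "classification_db"): EdgeStats(weight=0.5),
    ("classification_agent", "risk_agent"): EdgeStats(weight=0.5),
}
chosen, probs = sample_subgraph(edges, random.Random(0))
record_outcome(edges, chosen, correct=True)
```

Sampling each edge independently is one simple reading of "sampling according to each edge's probability"; a real system would also need to check that the sampled subgraph forms an executable workflow.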

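The first-stage node screening of claims 1 and 5 (a node sparsity penalty with rank-dependent scaling factors, followed by removal of nodes whose average weight falls below a threshold) can be sketched in Python as follows. This is a minimal illustration, not the patented formula: the linear mapping from rank to scaling factor, the bounds `low` and `high`, and all function names are assumptions, and the nuclear-norm term and the gradient update are left out.

```python
def average_node_weights(intra):
    """Average weight of all intra-round edges touching each node.

    `intra` is a square list-of-lists matrix; intra[i][j] is the weight of the
    edge from node i to node j. Self-loops are ignored.
    """
    n = len(intra)
    averages = []
    for i in range(n):
        touching = [intra[i][j] for j in range(n) if j != i]
        touching += [intra[j][i] for j in range(n) if j != i]
        averages.append(sum(touching) / len(touching) if touching else 0.0)
    return averages


def scaling_factors(averages, low=0.5, high=2.0):
    """Rank nodes by average weight: strongest gets `low`, weakest gets `high`.

    The linear interpolation between ranks is an assumption.
    """
    order = sorted(range(len(averages)), key=lambda i: averages[i], reverse=True)
    steps = max(len(order) - 1, 1)
    factors = [0.0] * len(averages)
    for rank, i in enumerate(order):
        factors[i] = low + (high - low) * rank / steps
    return factors


def node_sparsity_penalty(intra, low=0.5, high=2.0):
    """Sum over nodes of scaling factor times average intra-round edge weight."""
    averages = average_node_weights(intra)
    factors = scaling_factors(averages, low, high)
    return sum(f * a for f, a in zip(factors, averages))


def prune_nodes(intra, threshold):
    """Drop nodes whose average weight is below `threshold`.

    Returns the kept node indices and the corresponding submatrix.
    """
    averages = average_node_weights(intra)
    kept = [i for i, a in enumerate(averages) if a >= threshold]
    return kept, [[intra[i][j] for j in kept] for i in kept]


# Usage: node 2 has only weak edges and is removed.
intra = [
    [0.0, 0.9, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
kept, sub = prune_nodes(intra, threshold=0.3)
```

Because inefficient nodes receive the larger factor, minimizing this penalty pushes their edge weights down faster, which is what lets the threshold test separate them from efficient nodes.
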
Description

Workflow optimization method for multi-agent systems based on heterogeneous graphs

Technical Field

The invention belongs to the field of artificial intelligence systems, and in particular relates to a method for automatically designing and optimizing the workflow of a multi-agent system (Multi-Agent System, MAS) based on a large language model (Large Language Model, LLM).

Background

LLM-based multi-agent systems have become a powerful framework for handling complex tasks: with large language models at the core, different agents are assigned roles and specific capabilities and interact cooperatively. Applications include scientific research, software development, and text authoring. The workflow of a multi-agent system covers the sequence of interaction, communication, and tool use among agents, and the performance of the system depends heavily on the quality of the workflow. Existing multi-agent workflow generation methods have the following defects.

In most current scenarios the workflow is preset manually. Although this can work well in a specific domain, it depends on human effort and limits scalability.

Excessive dependence on model capability. Many automatic workflow generation methods rely on a central large model to perform the optimization, which requires that model to have strong context and reasoning capabilities. This raises cost and places excessive demands on model scale, making these methods hard to apply widely in resource-constrained scenarios.
Insufficient use of historical information. In existing methods, the historical experience of the optimization process is fed back to the central large model in natural language so that it can optimize and generate a new workflow. Because the context window of a large model is limited, only a few recent experiences can be used in each decision and the full history cannot be exploited, which increases the risk of falling into a local optimum.

Neglect of dynamic interactions between agents and tools. Existing workflow optimization methods focus on optimizing interactions between agents and do not include the logic by which agents call tools in the optimization objective, which makes it difficult for the system to produce good results in practical scenarios where many external tools must be called to enhance system capability.

Disclosure of Invention

To address the defects of the prior art, the invention provides a workflow optimization method for multi-agent systems based on heterogeneous graphs. It is a heterogeneous graph optimization technique that converts the MAS workflow generation problem into an adjacency matrix learning problem; it improves the efficiency and performance of workflow optimization and achieves scalable adaptation across domains (such as scientific research, software development, and reasoning tasks).
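
As an illustration of what treating the workflow as an adjacency matrix learning problem can look like in code, the following Python sketch builds an initial fully connected heterogeneous graph over agent and tool nodes for K rounds, with one intra-round matrix per round and one inter-round matrix per pair of consecutive rounds. The class name `HeteroWorkflowGraph`, the uniform initial weight of 0.5, the use of the inter-round diagonal for self-thinking, and the example node names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroWorkflowGraph:
    agents: list
    tools: list
    rounds: int               # K, the maximum number of thinking rounds
    init_weight: float = 0.5  # assumed uniform initial weight
    intra: list = field(init=False)  # intra[k][i][j]: edge i -> j inside round k
    inter: list = field(init=False)  # inter[k][i][j]: node i (round k) -> node j (round k+1)

    def __post_init__(self):
        n = len(self.nodes)
        w = self.init_weight
        # Complete graph inside each round, without self-loops.
        self.intra = [
            [[0.0 if i == j else w for j in range(n)] for i in range(n)]
            for _ in range(self.rounds)
        ]
        # Between consecutive rounds every pair is linked; the diagonal entry
        # i -> i is used here (by assumption) for a node's self-thinking.
        self.inter = [
            [[w for _ in range(n)] for _ in range(n)]
            for _ in range(self.rounds - 1)
        ]

    @property
    def nodes(self):
        """Agent nodes first, then tool nodes."""
        return self.agents + self.tools

    def node_type(self, i):
        return "agent" if i < len(self.agents) else "tool"


# Usage with made-up node names.
graph = HeteroWorkflowGraph(agents=["planner", "coder"], tools=["search"], rounds=3)
```

Every entry of `intra` and `inter` is a learnable weight in this picture; sampling a workflow amounts to choosing which entries to keep for one execution.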
To achieve the above purpose, the invention adopts the following technical scheme: a workflow optimization method for multi-agent systems based on heterogeneous graphs, comprising the following steps:

Step one: model the multi-agent system as a heterogeneous graph whose nodes comprise agent nodes and tool nodes; initialize the agent nodes, assigning each a specific role and capabilities; fully connect all agent nodes and tool nodes through interaction edges; set multiple rounds of workflow execution for each task to form an initial heterogeneous complete graph; and assign an initial weight to each element of the adjacency matrix.

Step two: sample a subgraph from the heterogeneous graph using a subgraph sampling strategy, execute the current task with the subgraph as the workflow, and evaluate system performance.

Step three: continue to sample subgraphs and evaluate system performance, update the adjacency matrix weights through gradient backpropagation, and remove nodes whose average weight is below a set threshold, to obtain an intermediate heterogeneous subgraph that contains only effective nodes, together with its adjacency matrix.

Step four: re-initialize the adjacency matrix weights based on the intermediate heterogeneous subgraph, and perform second-stage training with maximized system performance as the training objective while introducing low-rank sparsity.

Step five: after the two-stage training is complete, obtain the optimal heterogeneous graph and determine an optimal workflow from it for automatic execution of tasks to be processed.

Compared with the prior art, the invention has the following beneficial effects:

Advantages of heterogeneous graph modeling. According to the invention