CN-122019081-A - Multi-level agent arrangement system and dynamic routing method thereof


Abstract

The invention discloses a multi-level agent arrangement system and a dynamic routing method thereof, and relates to the technical field of artificial intelligence. The system comprises an application layer, an arrangement layer, an agent registration center, a confidence gating module, a dynamic routing module and an agent layer. The application layer receives a user request and generates a session object. The arrangement layer creates a root task node based on the session object and performs recursive decomposition of an HTN hierarchical task network to generate an atomic task sequence. The agent registration center stores agent configuration records. For each atomic task in the atomic task sequence, the confidence gating module reads an initial candidate set from the agent registration center and generates a gated candidate set based on Conformal confidence gating. The invention gives the multi-agent cooperation process clear structural boundaries, interpretable decision logic and stable execution results.

Inventors

ZHAO CHANGJUN
JIANG HAILIANG

Assignees

北京天诚星源信息技术有限公司

Dates

Publication Date: 20260512
Application Date: 20260107

Claims (10)

1. A multi-level agent arrangement system, characterized by comprising an application layer, an arrangement layer, an agent registration center, a confidence gating module, a dynamic routing module and an agent layer, wherein: the application layer is used for receiving a user request and generating a session object; the arrangement layer is used for creating a root task node based on the session object and performing recursive decomposition of an HTN hierarchical task network to generate an atomic task sequence; the agent registration center is used for storing agent configuration records; the confidence gating module is used for reading, for each atomic task in the atomic task sequence, an initial candidate set from the agent registration center and generating a gated candidate set based on Conformal confidence gating; the dynamic routing module is used for selecting a selected routing arm from the gated candidate set based on ContextualBandit and sending the atomic task, through a task delegation interface of the arrangement layer, to the agent corresponding to the selected routing arm so as to obtain an execution result; the agent layer is used for executing the atomic task and returning the execution result; and the arrangement layer is further used for summarizing the execution results of the atomic tasks and outputting the summarized result to the application layer.
2. The system of claim 1, wherein the root task node created by the arrangement layer comprises a task identification field, a task description field, a task type field, and a subtask list field; the arrangement layer writes the original text of the user request into the task description field and initializes the subtask list field to an empty list; the arrangement layer performs a recursive decomposition operation on the root task node, the recursive decomposition operation comprising: invoking a large language model and sending a decomposition instruction containing the content of the task description field of the current task node; receiving the subtask description list returned by the large language model; creating a subtask node for each subtask description in the subtask description list and adding it to the subtask list field of the current task node; continuing the recursive decomposition operation for each subtask node whose task type field is the compound type; and ending the decomposition for each subtask node whose task type field is the atomic type; and, after the recursive decomposition operation is completed, the arrangement layer traverses the leaf task nodes whose task type field is the atomic type to generate the atomic task sequence and delivers the atomic task sequence to the confidence gating module.
3. The system of claim 2, wherein each agent configuration record stored by the agent registration center comprises an agent identification field, a capability tag list field, and a tool binding list field; after receiving the atomic task sequence, the confidence gating module reads, for the current atomic task, all agent configuration records from the agent registration center to form the initial candidate set; and the confidence gating module invokes the large language model, sends a capability extraction instruction containing the content of the task description field of the current atomic task, receives the capability tag list returned by the large language model, and stores the capability tag list as the required capability set of the current atomic task.
4. The system of claim 3, wherein the confidence gating module creates a confidence evaluation table within the current session object, each row of the confidence evaluation table corresponding to one agent in the initial candidate set and including an agent identification column, a capability coverage count column, a capability miss count column, and a confidence pass flag column; the confidence gating module initializes the capability coverage count column and the capability miss count column to a preset initial value and initializes the confidence pass flag column to a pending state; the confidence gating module performs a capability matching operation to update the capability coverage count column and the capability miss count column, the capability matching operation including traversing each capability tag in the required capability set and checking whether the capability tag is present in the capability tag list field of the corresponding agent; the confidence gating module performs a confidence decision operation, the confidence decision operation including setting the confidence pass flag column to a pass state when the capability miss count column satisfies a preset miss condition, and, when the capability miss count column does not satisfy the preset miss condition, setting the confidence pass flag column to a pass state or a reject state according to the capability coverage count column and a preset ratio threshold; and the confidence gating module forms the gated candidate set of the current atomic task from the agents whose confidence pass flag column is in the pass state.
5. The system of claim 4, wherein the dynamic routing module defines each agent in the gated candidate set as a routing arm for the current atomic task and creates an arm state record for each routing arm in the current session object, the arm state record including an arm identification field, a test count field, and a success count field, the life cycle of the test count field being limited to the execution period of the current user request; the dynamic routing module creates a context feature record in the current session object, the context feature record including a 1st feature slot, a 2nd feature slot, a 3rd feature slot, and a 4th feature slot, the 1st feature slot storing the position index of the current atomic task in the atomic task sequence, the 2nd feature slot storing the number of elements of the required capability set, the 3rd feature slot storing the number of elements of the gated candidate set, and the 4th feature slot storing the number of atomic tasks completed in the current session object; the dynamic routing module performs an arm score calculation operation for each routing arm, the arm score calculation operation including generating a score from the match between the corresponding agent and the required capability set of the current atomic task and from a tool score, and selects the routing arm with the highest final score as the selected routing arm; the atomic task is sent to the agent corresponding to the selected routing arm to obtain an execution result; the dynamic routing module updates the success count field and writes the execution result into an output buffer of the atomic task when the execution result meets a preset valid output condition; and the content of the output buffer is summarized and returned to the application layer after all atomic tasks are completed.
6. A dynamic routing method of a multi-level agent arrangement system, characterized in that the method is executed by the system according to any one of claims 1 to 5 and comprises the steps of: receiving a user request and generating a session object; creating a root task node based on the session object and performing recursive decomposition of an HTN hierarchical task network to obtain an atomic task sequence; invoking a large language model for each atomic task in the atomic task sequence to generate a required capability set; screening, by a confidence gating module, a gated candidate set out of an initial candidate set based on Conformal confidence gating; mapping, by a dynamic routing module, the gated candidate set into routing arms based on ContextualBandit and selecting a selected routing arm according to an arm score calculation; delegating the atomic task to the agent corresponding to the selected routing arm for execution to obtain an execution result and writing the execution result into an output buffer; and summarizing and outputting the content of the output buffer.
7. The method of claim 6, wherein the recursive decomposition operation includes: sending a decomposition instruction to the large language model for the current task node and receiving a subtask description list; creating subtask nodes one by one for the subtask description list and writing them into the subtask list field; classifying each subtask node as the compound type or the atomic type according to its task type field and continuing the recursive decomposition operation for subtask nodes of the compound type; and, after the recursive decomposition operation is completed, traversing all leaf task nodes of the atomic type to generate the atomic task sequence, the order of the atomic task sequence being determined by the traversal order.
8. The method of claim 6, wherein the agent screening operation for the current atomic task includes: reading all agent configuration records from an agent registration center to form an initial candidate set; invoking a large language model and sending a capability extraction instruction to obtain a capability tag list, and storing the capability tag list as a required capability set; creating a confidence assessment table within the session object and creating a corresponding row for each agent in the initial candidate set; initializing a capability coverage count column, a capability miss count column, and a confidence pass flag column; and traversing the required capability set for each agent and updating the capability coverage count column or the capability miss count column according to whether each capability tag is present in the capability tag list field.
9. The method of claim 8, wherein the confidence decision operation includes: reading the capability coverage count column and the capability miss count column of the current row in the confidence assessment table; setting the confidence pass flag column to a pass state when the capability miss count column equals a preset miss boundary value; when the capability miss count column is greater than the preset miss boundary value, comparing the ratio of the capability coverage count column to the total number of elements of the required capability set against a preset ratio threshold and setting the confidence pass flag column to a pass state or a reject state accordingly; and selecting from the confidence assessment table the rows whose confidence pass flag column is in the pass state to form the gated candidate set.
10. The method of claim 6, wherein the routing operation includes: defining each agent in the gated candidate set as a routing arm, creating an arm state record for each routing arm in the session object, and initializing a test count field and a success count field for each routing arm; creating a context feature record containing a 1st feature slot to a 4th feature slot; calculating a capability match number and a tool availability for each routing arm and generating a base score; determining an exploration bonus value based on the test count field and combining it with the base score into a final score; selecting the routing arm with the largest final score as the selected routing arm and updating its test count field; sending the atomic task to the agent corresponding to the selected routing arm and receiving an execution result; updating the success count field and writing the execution result into the output buffer according to whether the execution result meets a preset valid output condition; repeating the routing operation in sequence for the atomic task sequence until all atomic tasks are completed; and summarizing the content of the output buffer and returning the summary.
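
The screening and routing steps recited in claims 8 to 10 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims give no formulas, so the default thresholds, the additive base score, the square-root exploration bonus and the "non-empty result" validity test are all assumptions made for illustration, as are every class and function name. The context feature record is omitted because the claims do not say how its slots enter the score.

```python
import math
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """Agent configuration record: id, capability tags, tool bindings."""
    agent_id: str
    capability_tags: list[str]
    tool_bindings: list[str] = field(default_factory=list)


@dataclass
class ArmState:
    """Arm state record; trial_count stands for the claims' test count field."""
    arm_id: str
    trial_count: int = 0
    success_count: int = 0


def gate_candidates(agents, required, miss_boundary=0, ratio_threshold=0.5):
    """Capability matching followed by the pass/reject decision.

    miss_boundary and ratio_threshold are assumed defaults; the claims only
    call them "preset" values.
    """
    gated = []
    for agent in agents:
        covered = sum(1 for tag in required if tag in agent.capability_tags)
        missed = len(required) - covered
        if missed <= miss_boundary:
            passed = True  # nothing (or little enough) is missing
        else:
            passed = covered / len(required) >= ratio_threshold
        if passed:
            gated.append(agent)
    return gated


def select_arm(gated, arms, required, exploration_weight=1.0):
    """Pick the arm with the largest final score and count the trial.

    Final score = base score + exploration bonus. Both formulas are
    assumptions; the bonus shrinks as an arm accumulates trials.
    """
    if not gated:
        raise ValueError("gated candidate set is empty")
    total_trials = sum(arms[a.agent_id].trial_count for a in gated)
    best, best_score = None, -math.inf
    for agent in gated:
        state = arms[agent.agent_id]
        match_count = sum(1 for tag in required if tag in agent.capability_tags)
        tool_available = 1.0 if agent.tool_bindings else 0.0
        base = match_count + tool_available
        bonus = exploration_weight * math.sqrt(
            math.log(total_trials + 2) / (state.trial_count + 1)
        )
        score = base + bonus
        if score > best_score:
            best, best_score = agent, score
    arms[best.agent_id].trial_count += 1
    return best


def record_result(arms, agent_id, result, output_buffer):
    """Count a success and buffer the result if it is valid output.

    Assumed validity condition: the result is non-empty.
    """
    if result:
        arms[agent_id].success_count += 1
        output_buffer.append(result)
```

In this sketch an agent that covers every required tag passes immediately, while a partially matching agent passes only if its coverage ratio reaches the threshold. Because the arm states are created per request, the trial counts start from zero each time, which matches the claims' statement that the test count field lives only for the current user request.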

Description

Multi-level agent arrangement system and dynamic routing method thereof

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a multi-level agent arrangement system and a dynamic routing method thereof.

Background

With the development of artificial intelligence technology, intelligent systems based on large language models are gradually evolving from a single-dialogue form to a collaborative execution form oriented to complex tasks. In early applications, an agent typically ran as a single instance and generated results directly through one or more rounds of model inference, which meets basic requirements in simple scenarios such as question-and-answer retrieval and text generation. However, as application scenarios evolve toward enterprise-level business processes, cross-system collaboration and complex decision support, single agents increasingly expose significant drawbacks in capability coverage, execution stability and controllability.

To address the limited capability of a single agent, the prior art has gradually introduced multi-agent cooperation, in which several agents with different capability emphases are configured to complete tasks jointly. Some systems adopt rule-based routing, mapping user requests to preset agents according to keywords or rule conditions; others execute multiple agents in series in a preset order, using a fixed workflow or graph structure. Such schemes improve the functional completeness of the system to some extent, but they still rely on routing rules or execution paths designed manually in advance. When the task structure changes or the business requirements are adjusted, the rules or workflows must be reconfigured, so system flexibility is insufficient and maintenance cost is high.

In further exploration, some technical solutions attempt to introduce automatic planning: complex tasks are broken down by model reasoning, and the subtasks are then executed by different agents. Such schemes usually take natural-language planning results as the basis for execution, but in actual operation they often lack strict task structure constraints, so the executability and consistency of the planning results are difficult to guarantee. On the one hand, the granularity of task decomposition is unstable, and over-decomposition or under-decomposition occurs easily. On the other hand, the lack of explicit termination conditions and state identifiers makes the execution process uncontrollable, so such schemes are difficult to run stably in an engineering system.

Disclosure of Invention

In view of this, the invention provides a multi-level agent arrangement system and a dynamic routing method thereof. The arrangement layer recursively decomposes the user request based on an HTN hierarchical task network to generate an atomic task sequence, and the system combines an agent registration center, a Conformal confidence gating mechanism and a ContextualBandit dynamic routing strategy to achieve reliable screening, adaptive delegation and orderly execution of the atomic tasks. As a result, the multi-agent cooperation process has clear structural boundaries, interpretable decision logic and stable execution results without depending on historical training data, which effectively improves execution determinism, system controllability and practical deployability in complex task scenarios.
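
The recursive decomposition into an atomic task sequence can be sketched in Python as follows. This is only an illustration under stated assumptions. The large language model is replaced by a hypothetical stub, `stub_model`, that returns hard-coded subtasks. The `TaskNode` class, the dotted task identifiers and the `max_depth` safeguard are invented for the example and do not come from the patent text.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskNode:
    """Task node with the four fields named in the claims."""
    task_id: str
    description: str
    task_type: str  # "compound" or "atomic"
    subtasks: list["TaskNode"] = field(default_factory=list)


# Hypothetical model interface: description -> [(sub-description, type), ...]
Decomposer = Callable[[str], list[tuple[str, str]]]


def decompose(node: TaskNode, ask_model: Decomposer, max_depth: int = 5) -> None:
    """Recursively expand compound nodes; atomic nodes end the recursion."""
    if node.task_type == "atomic":
        return
    if max_depth == 0:
        # Assumed safeguard, not from the source: stop runaway recursion.
        node.task_type = "atomic"
        return
    for i, (desc, kind) in enumerate(ask_model(node.description), start=1):
        child = TaskNode(f"{node.task_id}.{i}", desc, kind)
        node.subtasks.append(child)
        decompose(child, ask_model, max_depth - 1)


def atomic_sequence(node: TaskNode) -> list[TaskNode]:
    """Collect atomic leaves depth-first; traversal order fixes the sequence."""
    if node.task_type == "atomic":
        return [node]
    sequence = []
    for child in node.subtasks:
        sequence.extend(atomic_sequence(child))
    return sequence


def stub_model(description: str) -> list[tuple[str, str]]:
    """Stand-in for the large language model call (illustrative data only)."""
    plan = {
        "write report": [("gather data", "compound"), ("draft text", "atomic")],
        "gather data": [("query database", "atomic"), ("search web", "atomic")],
    }
    return plan.get(description, [])
```

For example, starting from `TaskNode("t", "write report", "compound")`, calling `decompose(root, stub_model)` and then `atomic_sequence(root)` yields the leaves in depth-first order: query database, search web, draft text. The explicit task type field is what gives the recursion a definite termination condition, which is the property the background section says natural-language planning lacks.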
The technical scheme is as follows. The multi-level agent arrangement system comprises an application layer, an arrangement layer, an agent registration center, a confidence gating module, a dynamic routing module and an agent layer. The application layer is used for receiving a user request and generating a session object. The arrangement layer is used for creating a root task node based on the session object and performing recursive decomposition of an HTN hierarchical task network to generate an atomic task sequence. The agent registration center is used for storing agent configuration records. The confidence gating module is used for reading, for each atomic task in the atomic task sequence, an initial candidate set from the agent registration center and generating a gated candidate set based on Conformal confidence gating. The dynamic routing module is used for selecting a selected routing arm from the gated candidate set based on ContextualBandit and sending the atomic task, through a task delegation interface of the arrangement layer, to the agent corresponding to the selected routing arm to obtain an execution result. The agent layer is used for executing the atomic task and returning the execution result, and the arrangement layer is further used for summarizing the execution results of the atomic tasks and outputting the summarized result to the application layer. Further, the root task node created by