CN-121998063-A - Causal reasoning driven multi-agent cooperation method, system, medium and equipment
Abstract
The application provides a causal reasoning driven multi-agent collaboration method, system, medium and device. The method comprises: acquiring data transmitted across domains among a plurality of agents, and analyzing the data to generate a traffic data structure of the agent interconnection; establishing a causal model according to prior knowledge and the traffic data structure, and describing causal relationships among the agents with the causal model; identifying, according to the causal relationships, key factors affecting communication among the agents, and generating for the key factors a plurality of potential collaboration modes composed of a plurality of optimization strategies; acquiring a real-time task, and selecting a potentially optimal collaboration mode for the real-time task by reinforcement learning; and constructing counterfactual scenarios for the unselected potential collaboration modes, verifying the benefits of the potential collaboration modes, determining the optimal collaboration mode, and executing the cross-domain operation of the real-time task in the optimal collaboration mode.
Inventors
- XIAO HAN
- XU CHANGQIAO
- LIU XING
- ZHANG XIAOTIAN
- CHEN TIANXIANG
- CAO TENGFEI
Assignees
- Beijing University of Posts and Telecommunications (北京邮电大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20251208
Claims (10)
- 1. A causal reasoning driven multi-agent collaboration method, the method comprising: acquiring data transmitted across domains among a plurality of agents, and analyzing the data to generate a traffic data structure of the interconnection of the agents; establishing a causal model according to prior knowledge and the traffic data structure, and describing causal relationships among the plurality of agents by using the causal model; identifying key factors influencing communication among the plurality of agents according to the causal model, and generating, for the key factors, a plurality of potential collaboration modes composed of a plurality of optimization strategies; acquiring a real-time task, and selecting a potentially optimal collaboration mode for the real-time task from the plurality of potential collaboration modes by using a reinforcement learning algorithm; and constructing counterfactual scenarios for the unselected potential collaboration modes, verifying benefits of the plurality of potential collaboration modes, determining an optimal collaboration mode, and executing the cross-domain job of the real-time task in the optimal collaboration mode.
- 2. The causal reasoning driven multi-agent collaboration method of claim 1, wherein acquiring data transmitted across domains among a plurality of agents and analyzing the data to generate a traffic data structure of the interconnection of the agents comprises: acquiring data transmitted across domains among the plurality of agents, and generating a fluctuation pattern of the data and a data volume of the data; and analyzing the peak fluctuation in the fluctuation pattern, the other fluctuations smaller than the peak fluctuation, and the data volume, to generate the multiple modalities of the data transmitted among the agents and a key modality corresponding to the peak fluctuation.
- 3. The causal reasoning driven multi-agent collaboration method of claim 1, wherein establishing a causal model according to prior knowledge and the traffic data structure and describing causal relationships among the plurality of agents by using the causal model comprises: generating physical link information and task information among the plurality of agents according to the prior knowledge and the traffic data structure; abstracting the link information into a path graph representation; and integrating the link information and the task information to establish the causal relationships, wherein the causal relationships comprise a causal structure, a temporal causal graph and a causal path.
- 4. The causal reasoning driven multi-agent collaboration method according to claim 3, wherein generating task information among the plurality of agents according to the prior knowledge and the traffic data structure comprises: calculating the complexity of the task among the agents according to the following formula: f; wherein C is the complexity, m is the number of modality types, ω_i is the complexity weight of each modality, n is the number of steps required to process the task, λ_j (j = 1, 2, ..., n) is a step hierarchy coefficient, f is the interaction frequency among the agents, and μ is the interaction weight.
- 5. The causal reasoning driven multi-agent collaboration method according to claim 3, wherein generating task information among the plurality of agents according to the prior knowledge and the traffic data structure comprises: calculating the urgency of the task among the agents according to the following formula: ; wherein E is the urgency, T is the delay tolerance, k is a scaling factor, and b is the minimum delay bias.
- 6. The causal reasoning driven multi-agent collaboration method of claim 1, wherein acquiring a real-time task and selecting, by using a reinforcement learning algorithm, a potentially optimal collaboration mode for the real-time task from the plurality of potential collaboration modes comprises: acquiring historical collaboration information among the plurality of agents, establishing a multi-agent communication throughput model according to the historical collaboration information, and generating the data transmission amount of each round of transmission by using the multi-agent communication throughput model; acquiring the real-time network bandwidth, and generating an allocated bandwidth in combination with the data transmission amount; and selecting a decision tuple from the decision set of the potential collaboration modes, wherein the decision tuple is expressed as: π' = <B', n, v'>; wherein π' is the decision tuple, B' is the allocated bandwidth, n is the task step, and v' is the data transmission amount of each round of transmission.
- 7. The causal reasoning driven multi-agent collaboration method of claim 6, wherein acquiring historical collaboration information among the plurality of agents, establishing a multi-agent communication throughput model according to the historical collaboration information, and generating the data transmission amount of each round of transmission by using the multi-agent communication throughput model comprises: establishing the multi-agent communication throughput model according to the historical collaboration information, and predicting the change of the network bandwidth; and calculating the data transmission amount according to the following formula: ; wherein v' is the data transmission amount, B is the predicted network bandwidth, n is the task step, and T is the delay tolerance.
- 8. A causal reasoning driven multi-agent collaboration system for authentication and communication of edge gateways and devices, the system comprising: an acquisition module, configured to acquire data transmitted across domains among a plurality of agents, and to analyze the data to generate a traffic data structure of the interconnection of the agents; a first establishing module, configured to establish a causal model according to prior knowledge and the traffic data structure, and to describe causal relationships among the plurality of agents by using the causal model; a first generation module, configured to identify key factors influencing communication among the plurality of agents according to the causal model, and to generate, for the key factors, a plurality of potential collaboration modes composed of a plurality of optimization strategies; a selecting module, configured to acquire a real-time task, and to select a potentially optimal collaboration mode for the real-time task from the plurality of potential collaboration modes by using a reinforcement learning algorithm; and a transmission module, configured to construct counterfactual scenarios for the unselected potential collaboration modes, verify benefits of the plurality of potential collaboration modes, determine an optimal collaboration mode, and execute the cross-domain transmission of the real-time task in the optimal collaboration mode.
- 9. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the causal reasoning driven multi-agent collaboration method of any of claims 1 to 7.
- 10. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the causal reasoning driven multi-agent collaboration method of any of claims 1 to 7.
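The decision tuple π' = <B', n, v'> of claim 6 and the counterfactual verification step of claim 1 can be illustrated with a minimal Python sketch. Everything beyond the three tuple fields is an assumption made for illustration: the names `Decision`, `toy_benefit` and `verify`, the sample values, and in particular the benefit function, which is a hypothetical stand-in (data delivered before the delay tolerance expires) rather than the benefit measure of the application. The reinforcement learning choice is likewise replaced by a fixed pick.

```python
from typing import Callable, Iterable, NamedTuple


class Decision(NamedTuple):
    bandwidth: float  # B': allocated bandwidth
    steps: int        # n: task step count
    volume: float     # v': data sent in each round


def toy_benefit(d: Decision, delay_tolerance: float = 6.0) -> float:
    """Hypothetical benefit: data delivered before the delay tolerance expires."""
    round_time = d.volume / d.bandwidth
    rounds = min(d.steps, int(delay_tolerance // round_time))
    return rounds * d.volume


def verify(selected: Decision,
           candidates: Iterable[Decision],
           benefit: Callable[[Decision], float]) -> Decision:
    """Replay the unselected candidates and keep whichever scores best."""
    best, best_score = selected, benefit(selected)
    for d in candidates:
        if d == selected:
            continue
        score = benefit(d)  # counterfactual: "what if d had been chosen?"
        if score > best_score:
            best, best_score = d, score
    return best


candidates = [
    Decision(10.0, 3, 40.0),
    Decision(20.0, 3, 60.0),
    Decision(20.0, 3, 30.0),
]
selected = candidates[0]  # stand-in for the reinforcement learning choice
best = verify(selected, candidates, toy_benefit)
print(best)
```

In this toy run the initially selected tuple is replaced, because one of the unselected tuples delivers more data within the assumed delay tolerance. A real system would estimate the counterfactual benefit from the causal model rather than from a closed-form function.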
Description
Causal reasoning driven multi-agent cooperation method, system, medium and equipment
Technical Field
The application relates to the field of computer technology, and more particularly to a causal reasoning driven multi-agent collaboration method, system, medium and device.
Background
In recent years, generative AI technology based on large models has developed rapidly, and AGI (artificial general intelligence) is no longer out of reach. Meanwhile, AI agents built on large models represent the leading trend and application direction of AGI, and are becoming the mainstream form in which large models are put into commercial use. As agent technology continues to evolve, interconnection and interaction among different AI agents has become an important research topic. Traditional multi-node collaboration schemes are mostly based on heuristic rules or data-driven decision mechanisms. In a relatively stable, static environment, these schemes can accomplish established tasks. However, in highly dynamic multi-agent collaboration scenarios they are comparatively static and rigid: they adapt poorly to rapid environmental change and to dynamic adjustment of agent tasks, and are prone to benefit fluctuation, decision misalignment, or outright failure.
Disclosure of Invention
In view of the above, the application aims to provide a causal reasoning driven multi-agent collaboration method, system, medium and device that solve the problems that existing multi-node collaboration schemes are difficult to adapt to highly dynamic scenarios and suffer from benefit fluctuation or decision failure.
To achieve the above object, the present application provides a causal reasoning driven multi-agent collaboration method, the method comprising: acquiring data transmitted across domains among a plurality of agents, and analyzing the data to generate a traffic data structure of the interconnection of the agents; establishing a causal model according to prior knowledge and the traffic data structure, and describing causal relationships among the plurality of agents by using the causal model; identifying key factors influencing communication among the plurality of agents according to the causal model, and generating, for the key factors, a plurality of potential collaboration modes composed of a plurality of optimization strategies; acquiring a real-time task, and selecting a potentially optimal collaboration mode for the real-time task from the plurality of potential collaboration modes by using a reinforcement learning algorithm; and constructing counterfactual scenarios for the unselected potential collaboration modes, verifying benefits of the plurality of potential collaboration modes, determining an optimal collaboration mode, and executing the cross-domain job of the real-time task in the optimal collaboration mode.
As a further improvement of an embodiment of the present application, acquiring data transmitted across domains among a plurality of agents and analyzing the data to generate a traffic data structure of the agent interconnection includes: acquiring data transmitted across domains among the plurality of agents, and generating a fluctuation pattern of the data and a data volume of the data; and analyzing the peak fluctuation in the fluctuation pattern, the other fluctuations smaller than the peak fluctuation, and the data volume, to generate the multiple modalities of the data transmitted among the agents and a key modality corresponding to the peak fluctuation.
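The traffic analysis step described above can be illustrated with a minimal Python sketch. It assumes that each mode (modality) of traffic is available as a list of per-interval data amounts, takes the fluctuation of a mode to be the largest absolute change between consecutive samples, and treats the mode with the largest fluctuation as the key mode. The names `ModeStats` and `analyze_traffic`, the mode labels, the sample values and this definition of fluctuation are all illustrative assumptions, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass
class ModeStats:
    name: str         # mode label (hypothetical, e.g. "video")
    volume: int       # total data volume over the observation window
    fluctuation: int  # largest absolute change between consecutive samples


def analyze_traffic(samples_by_mode: dict[str, list[int]]) -> tuple[list[ModeStats], str]:
    """Summarize each mode and return the name of the one with the peak fluctuation."""
    stats = []
    for name, samples in samples_by_mode.items():
        # Differences between consecutive samples approximate the fluctuation pattern.
        deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
        stats.append(ModeStats(name, sum(samples), max(deltas, default=0)))
    key_mode = max(stats, key=lambda s: s.fluctuation).name
    return stats, key_mode


traffic = {
    "text": [10, 12, 11, 13],
    "image": [100, 140, 90, 120],
    "video": [500, 900, 300, 700],
}
stats, key_mode = analyze_traffic(traffic)
print(key_mode)
```

The returned list of per-mode summaries plays the role of the traffic data structure, and the key mode is the one whose traffic changes most sharply. A production system would likely use a more robust fluctuation measure than a single maximum difference.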
As a further improvement of an embodiment of the present application, establishing a causal model according to prior knowledge and the traffic data structure and describing causal relationships among the plurality of agents by using the causal model includes: generating physical link information and task information among the plurality of agents according to the prior knowledge and the traffic data structure; abstracting the link information into a path graph representation; and integrating the link information and the task information to establish the causal relationships, wherein the causal relationships comprise a causal structure, a temporal causal graph and a causal path.
As a further improvement of an embodiment of the present application, generating task information among the plurality of agents according to the prior knowledge and the traffic data structure includes: calculating the complexity of the task among the agents according to the following formula: f; wherein C is the complexity, m is the number of modality types, ω_i is the complexity weight of each modality, n is the number of steps required to process the task, λ_j (j = 1, 2, ..., n) is a step hierarchy coefficient, f is the interaction frequency among the agents, and μ is the interaction weight.
As a further improvement of an embodiment of the present application, generating t