CN-121981272-A - Multi-agent self-organizing collaborative decision-making method and device driven by multi-brain-region-like neural mechanism

Abstract

The application discloses a multi-agent self-organizing collaborative decision-making method and device driven by a multi-brain-region-like neural mechanism, and relates to the field of multi-agent collaborative decision-making. The method comprises: constructing a multi-agent cluster with a brain-region-like cognitive architecture according to a task target; judging, according to the cognitive state of each agent in the multi-agent cluster, whether each agent is cognitively synchronized with its neighbor agents, and performing information interaction upon cognitive synchronization to update the cognitive state of each agent; and performing game-learning-based multi-agent self-organizing collaborative decision-making according to the updated cognitive state of each agent to obtain the execution decision of each agent, so as to complete the task target. By constructing a cognitive synchronization framework that simulates the cooperative working mechanism of multiple functional brain regions of the human brain, the application endows the agents with distributed reasoning capability and achieves efficient, robust and adaptive collaborative decision-making based on game relationships without central scheduling.

Inventors

WANG XIAOYI
YANG LIWEN
XU MING
QI XUGUANG
BAI XUE
DING JIXIN
CHEN ZHAOYUE

Assignees

Beihang University (北京航空航天大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-02-06

Claims (10)

1. A multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism, characterized by comprising the following steps: constructing a multi-agent cluster with a brain-region-like cognitive architecture according to a task target; judging, according to the cognitive state of each agent in the multi-agent cluster, whether each agent is cognitively synchronized with its neighbor agents, and performing information interaction upon cognitive synchronization to update the cognitive state of each agent; and performing game-learning-based multi-agent self-organizing collaborative decision-making according to the updated cognitive state of each agent to obtain an execution decision of each agent, so as to complete the task target.
2. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 1, wherein constructing the multi-agent cluster with a brain-region-like cognitive architecture according to the task target specifically comprises the following steps: constructing an initial agent cluster according to the task target, wherein each agent in the initial agent cluster comprises basic brain-like cognitive modules, namely a perception module, a memory module, a reasoning module and a decision module; the perception module simulates the sensory cortex of the brain and is responsible for receiving raw observation data from the environment; the memory module simulates the hippocampus and is responsible for storing and retrieving historical experience, environment state information and information on interactions with other agents; the reasoning module simulates the prefrontal cortex and is responsible for performing situation assessment, strategy planning and value judgment based on the current perception information and the information stored by the memory module, and for outputting intermediate reasoning results; and the decision module simulates the motor cortex and is responsible for generating executable action decisions of the agent according to the outputs of the perception module, the memory module and the reasoning module; defining an initial communication topology, an initial cognitive state, and cross-module and cross-agent communication mechanisms for each agent in the initial agent cluster; and defining the dominant functional tendency of each agent in the initial agent cluster according to the task target to obtain the multi-agent cluster with a brain-region-like cognitive architecture, wherein the dominant functional tendencies comprise a perception-dominant functional tendency, a memory-dominant functional tendency, a reasoning-dominant functional tendency and a decision-dominant functional tendency.
3. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 2, wherein the cross-module and cross-agent communication mechanisms are described by a neural coupling dynamics equation (the formula itself is not reproduced in this text), the quantities appearing in the equation being: the first-order derivative of the cognitive state of agent i; the cognitive state of agent i; a nonlinear neural dynamics function within the brain regions of agent i; an estimate of the cognitive state vector of agent j; and the connection weight matrix between the brain regions of agent i and those of its neighbor agent j.
4. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 2, wherein defining the dominant functional tendency of each agent in the initial agent cluster according to the task target specifically comprises: determining the dominant functional tendency of each agent based on the position of the agent in the communication topology.
5. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 2, wherein defining the dominant functional tendency of each agent in the initial agent cluster according to the task target specifically comprises: for each agent, calculating the capability scores of the agent according to a perception performance index, a memory capacity index, a reasoning capability index and a decision execution capability index, wherein the capability scores comprise a perception capability score, a memory capability score, a reasoning capability score and a decision capability score; each agent broadcasting its capability scores to its neighbor agents by local communication; and each agent determining its dominant functional tendency by competing for roles based on its own capability scores and the capability scores of its neighbor agents.
6. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 2, wherein defining the dominant functional tendency of each agent in the initial agent cluster according to the task target specifically comprises: defining a dominant functional tendency parameter vector for each agent, and converting the parameter vector, in softmax form, into a role probability distribution over the roles; simulating multiple rounds of preset tasks based on the initial agent cluster, and calculating a global performance index after the task simulation; calculating a meta-gradient with respect to the dominant functional tendency parameter vector of each agent according to the global performance index, and updating the parameter vector according to the meta-gradient; calculating the role probability distribution of each agent according to the updated parameter vector; determining the dominant functional tendency of each agent according to the role probability distribution; judging whether the current dominant functional tendency of each agent meets a preset requirement; if yes, obtaining the dominant functional tendency of each agent; if not, taking the current dominant functional tendency of each agent as the dominant functional tendency of each agent in the initial agent cluster, and returning to the step of simulating multiple rounds of preset tasks based on the initial agent cluster and calculating the global performance index after the task simulation.
7. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 1, wherein judging whether each agent is cognitively synchronized with its neighbor agents according to the cognitive state of each agent in the multi-agent cluster, and performing information interaction upon cognitive synchronization to update the cognitive state of each agent, specifically comprises the following steps: for the cognitive state of each agent, determining the cognitive processing frequency of the agent by using a phase rhythm generator; updating the instantaneous phase of each agent by using a phase dynamics equation according to the cognitive processing frequency of the agent; for each agent, calculating the instantaneous phase difference between the agent and each neighbor agent at the current moment; judging whether the instantaneous phase difference at the current moment is smaller than a preset synchronization threshold; if yes, regarding the two agents whose instantaneous phase difference at the current moment is smaller than the preset synchronization threshold as synchronized, and performing information interaction between them; if not, returning to the step of determining, for the cognitive state of each agent, the cognitive processing frequency of the agent by using the phase rhythm generator.
8. The multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to claim 1, wherein performing game-learning-based multi-agent self-organizing collaborative decision-making according to the updated cognitive state of each agent to obtain the execution decision of each agent, so as to complete the task target, specifically comprises the following steps: for each agent, constructing a differentiated payoff function based on functional roles according to the updated cognitive state of the agent; updating the policy parameters according to the current policy parameters of the policy network adopted by the decision module of the agent and the gradient of the differentiated payoff function, to obtain updated policy parameters; judging whether an iteration convergence condition is currently met; if yes, obtaining the finally updated policy parameters of each agent, assigning the policy parameters to the policy network, and determining the execution decision of each agent by applying the parameterized policy network to the currently updated cognitive state; if not, taking the currently updated policy parameters as the current policy parameters, and returning to the step of updating the policy parameters according to the current policy parameters of the policy network adopted by the decision module of the agent and the gradient of the differentiated payoff function.
9. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to any one of claims 1-8.
10. A computer program product comprising a computer program which, when executed by a processor, implements the multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism according to any one of claims 1-8.
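
As an illustration of the synchronization gate recited in claim 7, the following is a minimal Python sketch. The claims do not reproduce the phase rhythm generator or the phase dynamics equation, so the sketch assumes a Kuramoto-style coupling between neighbors. The names `rhythm_frequency`, `step_phases`, `synchronized_pairs` and all parameter values (`base`, `gain`, `coupling`, `dt`, `threshold`) are hypothetical and chosen only for illustration; they are not taken from the application.

```python
import math

TWO_PI = 2.0 * math.pi


def rhythm_frequency(state, base=1.0, gain=0.2):
    """Hypothetical phase rhythm generator: map a cognitive state vector to a frequency."""
    mean = sum(state) / len(state) if state else 0.0
    return base + gain * math.tanh(mean)


def phase_difference(a, b):
    """Smallest absolute difference between two phases, in [0, pi]."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


def step_phases(phases, freqs, neighbors, coupling=0.5, dt=0.05):
    """One Euler step of an ASSUMED Kuramoto-style phase dynamics equation."""
    updated = []
    for i, theta in enumerate(phases):
        # Each neighbor pulls the agent's phase toward its own phase.
        pull = sum(math.sin(phases[j] - theta) for j in neighbors[i])
        updated.append((theta + dt * (freqs[i] + coupling * pull)) % TWO_PI)
    return updated


def synchronized_pairs(phases, neighbors, threshold=0.1):
    """Neighbor pairs (i, j) with i < j whose phase difference is below the threshold."""
    pairs = set()
    for i in range(len(phases)):
        for j in neighbors[i]:
            if phase_difference(phases[i], phases[j]) < threshold:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


if __name__ == "__main__":
    states = [[0.2, 0.4], [0.2, 0.4]]   # identical cognitive states (illustrative)
    neighbors = {0: [1], 1: [0]}        # two mutually connected agents
    phases = [0.0, 1.0]
    freqs = [rhythm_frequency(s) for s in states]
    for _ in range(400):
        phases = step_phases(phases, freqs, neighbors)
    # Only the pairs returned here would exchange information in this round.
    print(synchronized_pairs(phases, neighbors))
```

In this sketch, agents that are not yet within the threshold simply keep updating their phases, which mirrors the "if not, return" branch of the claim; information interaction would be triggered only for the returned pairs.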
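
The role-learning loop recited in claim 6 can likewise be sketched in Python under loudly stated assumptions. The application does not specify how the task simulation or the meta-gradient is computed; here a hypothetical closed-form `global_performance` (expected skill under the role distributions) stands in for the multi-round task simulation, central finite differences stand in for the meta-gradient, and the "preset requirement" is assumed to be a minimum probability for the dominant role. The `skills` matrix, `lr`, `confidence` and `max_rounds` are illustrative values only.

```python
import math

ROLES = ("perception", "memory", "reasoning", "decision")


def softmax(logits):
    """Convert a tendency parameter vector into a role probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def global_performance(params, skills):
    """HYPOTHETICAL global index: expected skill of the cluster under the role distributions."""
    return sum(
        p * s
        for logits, skill in zip(params, skills)
        for p, s in zip(softmax(logits), skill)
    )


def estimate_gradient(params, skills, eps=1e-4):
    """Central finite differences, used here as a stand-in for the meta-gradient."""
    grads = []
    for i, logits in enumerate(params):
        row = []
        for k in range(len(logits)):
            up = [list(v) for v in params]
            down = [list(v) for v in params]
            up[i][k] += eps
            down[i][k] -= eps
            diff = global_performance(up, skills) - global_performance(down, skills)
            row.append(diff / (2 * eps))
        grads.append(row)
    return grads


def learn_roles(skills, lr=0.3, confidence=0.6, max_rounds=500):
    """Update tendency parameters until every agent has a clearly dominant role."""
    params = [[0.0] * len(ROLES) for _ in skills]
    for _ in range(max_rounds):
        probs = [softmax(v) for v in params]
        if all(max(p) >= confidence for p in probs):  # assumed "preset requirement"
            break
        grads = estimate_gradient(params, skills)
        params = [[x + lr * g for x, g in zip(v, gv)] for v, gv in zip(params, grads)]
    probs = [softmax(v) for v in params]
    roles = [ROLES[p.index(max(p))] for p in probs]
    return roles, probs


if __name__ == "__main__":
    # Illustrative per-agent skill for each role (rows: agents, columns: ROLES).
    skills = [[0.9, 0.2, 0.1, 0.3], [0.1, 0.8, 0.2, 0.3], [0.2, 0.1, 0.3, 0.9]]
    roles, probs = learn_roles(skills)
    print(roles)
```

A real system would replace `global_performance` with the outcome of the simulated preset tasks, so the gradient would have to be estimated from rollouts rather than from a closed-form expression.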

Description

Multi-agent self-organizing collaborative decision-making method and device driven by multi-brain-region-like neural mechanism

Technical Field

The application relates to the field of multi-agent collaborative decision-making, and in particular to a multi-agent self-organizing collaborative decision-making method and device driven by a multi-brain-region-like neural mechanism.

Background

Multi-agent collaborative decision-making technology is currently widely applied in fields such as intelligent transportation, unmanned aerial vehicle clusters and industrial automation. Mainstream methods include centralized reinforcement learning, consensus algorithms and game equilibrium solving. However, these methods generally suffer from three problems: (1) they depend heavily on a central node or fully connected communication, and are therefore difficult to adapt to dynamic, large-scale and resource-limited real environments; (2) they lack a model of the agent's internal cognitive process, which leads to rigid collaborative behavior and difficulty in handling complex reasoning tasks; and (3) most of them assume that the agents are either fully cooperative or fully competitive, and so cannot effectively handle collaboration in mixed-motive scenarios.
Disclosure of Invention

The application aims to provide a multi-agent self-organizing collaborative decision-making method and device driven by a multi-brain-region-like neural mechanism, which address three problems: heavy dependence on a central node or fully connected communication, with the resulting difficulty in adapting to dynamic, large-scale and resource-limited real environments; the lack of a model of the agent's internal cognitive process, which leads to rigid collaborative behavior and difficulty in handling complex reasoning tasks; and the assumption made by most methods that agents are either fully cooperative or fully competitive, which prevents them from effectively handling collaboration in mixed-motive scenarios.

In order to achieve the above object, the application provides the following solutions.

In a first aspect, the application provides a multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism, comprising: constructing a multi-agent cluster with a brain-region-like cognitive architecture according to a task target; judging, according to the cognitive state of each agent in the multi-agent cluster, whether each agent is cognitively synchronized with its neighbor agents, and performing information interaction upon cognitive synchronization to update the cognitive state of each agent; and performing game-learning-based multi-agent self-organizing collaborative decision-making according to the updated cognitive state of each agent to obtain an execution decision of each agent, so as to complete the task target.

In a second aspect, the application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism described above.
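
The game-learning step of the first aspect can be illustrated with a small Python sketch of simultaneous gradient play in a mixed-motive setting. Everything specific here is an assumption rather than the application's method: a single scalar `theta[i]` stands in for the policy-network parameters of agent i, the payoff combines a role-specific target with a coordination term, and the names `payoff_gradient`, `game_learning`, `targets` and `coordination` are hypothetical.

```python
def payoff_gradient(i, theta, targets, coordination):
    """Gradient of agent i's ASSUMED payoff with respect to its own parameter.

    Assumed payoff (illustrative only):
        u_i = -(theta_i - target_i)**2 - coordination * sum_j (theta_i - theta_j)**2
    The first term is the agent's own (role-specific) goal, the second rewards
    agreement with the other agents, giving a mixed-motive game.
    """
    own = -2.0 * (theta[i] - targets[i])
    social = sum(
        -2.0 * coordination * (theta[i] - theta[j])
        for j in range(len(theta))
        if j != i
    )
    return own + social


def game_learning(targets, coordination=0.5, lr=0.1, tol=1e-10, max_iters=10_000):
    """Every agent ascends its own payoff gradient until the updates become negligible."""
    theta = [0.0] * len(targets)
    for _ in range(max_iters):
        grads = [payoff_gradient(i, theta, targets, coordination) for i in range(len(theta))]
        updated = [t + lr * g for t, g in zip(theta, grads)]
        converged = max(abs(a - b) for a, b in zip(updated, theta)) < tol
        theta = updated
        if converged:  # assumed iteration convergence condition
            break
    return theta


if __name__ == "__main__":
    # Two agents with conflicting individual targets but an incentive to coordinate.
    print(game_learning(targets=[0.0, 1.0]))
```

For two agents with targets 0 and 1 and `coordination=0.5`, setting both gradients to zero gives the equilibrium 3·θ₀ = θ₁ and 3·θ₁ = 2 + θ₀, that is θ₀ = 0.25 and θ₁ = 0.75: neither agent reaches its own target, and neither fully yields to the other.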
In a third aspect, the application provides a computer program product comprising a computer program which, when executed by a processor, implements the multi-agent self-organizing collaborative decision-making method driven by a multi-brain-region-like neural mechanism described above.

According to the specific embodiments provided by the application, the application has the following technical effects:

The application provides a multi-agent self-organizing collaborative decision-making method and device driven by a multi-brain-region-like neural mechanism, wherein the method comprises: constructing a multi-agent cluster with a brain-region-like cognitive architecture according to a task target; judging, according to the cognitive state of each agent in the multi-agent cluster, whether each agent is cognitively synchronized with its neighbor agents, and performing information interaction upon cognitive synchronization to update the cognitive state of each agent; and performing game-learning-based multi-agent self-organizing collaborative decision-making according to the updated cognitive state of each agent to obtain the execution decision of each agent, so as to complete the task target. By constructing a cognitive synchronization framework that simulates the cooperative working mechanism of multiple functional brain regions of the human brain, the application endows the agents with distributed reasoning capability and achieves efficient, robust and adaptive collaborative decision-making based on game relationships without central scheduling, thereby solving the problems that a central node or fully connected communication is heavily depended upon, and that the dynamic, large-scale and resource-limited