CN-122021711-A - Multi-agent action prediction method based on intention-driven differential attention model

Abstract

The application discloses a multi-agent action prediction method based on an intention-driven differential attention model, and belongs to the technical field of multi-agent action prediction. The method comprises: obtaining environmental information around an unmanned aerial vehicle group and a one-step action of the unmanned aerial vehicle group as input data; inputting the input data into a short-term intention encoder module, which comprises a gated recurrent unit and an intention encoder, and extracting correlations among the input data to obtain current intention features; inputting the current intention features into a differential attention module, which comprises a message attention sub-module and an environment attention sub-module, and performing weighted aggregation of the current intention features using differential attention weights to obtain perception features; and generating a current action instruction for the unmanned aerial vehicle group based on the perception features. The method improves the accuracy of action prediction for the unmanned aerial vehicle group.

Inventors

WANG RUI
YIN JIAJIE
HU LONG
HAO YIXUE
LI XIANZHI

Assignees

Huazhong University of Science and Technology (华中科技大学)

Dates

Publication Date: 20260512
Application Date: 20251219

Claims (10)

1. A multi-agent action prediction method based on an intent-driven differential attention model, the method comprising: acquiring environmental information around an unmanned aerial vehicle group and a one-step action of the unmanned aerial vehicle group as input data; inputting the input data into a short-term intention encoder module, and extracting correlations between the input data to obtain a current intention feature, wherein the short-term intention encoder module comprises a gated recurrent unit and an intention encoder; inputting the current intention feature into a differential attention module, and performing weighted aggregation of the current intention feature using differential attention weights to obtain a perception feature, wherein the differential attention module comprises a message attention sub-module and an environment attention sub-module; and generating a current action instruction of the unmanned aerial vehicle group based on the perception feature, to guide the next-step action of the unmanned aerial vehicle group.
2. The multi-agent action prediction method based on an intent-driven differential attention model of claim 1, wherein said inputting the input data into a short-term intention encoder module, and extracting correlations between the input data to obtain a current intention feature, comprises: inputting the input data into the gated recurrent unit, and updating the state of the input data through time-sequence iteration to obtain a hidden state vector; and inputting the hidden state vector and the input data into the intention encoder, mapping the hidden state vector and the input data into Gaussian distribution parameters through a variational autoencoder, sampling from the Gaussian distribution parameters, and extracting the correlations between the input data to obtain the current intention feature.
3. The multi-agent action prediction method based on an intent-driven differential attention model of claim 2, wherein said inputting the current intention feature into a differential attention module, and performing weighted aggregation of the current intention feature using differential attention weights to obtain a perception feature, comprises: inputting the current intention feature into the message attention sub-module, and performing dual-path projection and residual connection on the current intention feature to obtain a communication intention feature; and inputting the communication intention feature and the hidden state vector into the environment attention sub-module, and performing dual-path projection and differential attention weight distribution on the communication intention feature and the hidden state vector to obtain the perception feature; wherein the message attention sub-module comprises a dual-path projection module, an attention map difference module and a residual connection aggregation module.
4. The multi-agent action prediction method based on an intent-driven differential attention model of claim 3, wherein said inputting the current intention feature into the message attention sub-module, and performing dual-path projection and residual connection on the current intention feature to obtain a communication intention feature, comprises: inputting the current intention feature into the dual-path projection module to perform dual-path projection, obtaining a first query-key vector and a second query-key vector; inputting the first query-key vector and the second query-key vector into the attention map difference module, and calculating the difference between the first query-key vector and the second query-key vector through softmax attention maps to obtain an attention weight; and inputting the attention weight into the residual connection aggregation module, and performing weighted summation of the current intention feature to obtain the communication intention feature.
5. The multi-agent action prediction method based on an intent-driven differential attention model of claim 4, wherein said inputting the first query-key vector and the second query-key vector into the attention map difference module, and calculating the difference between the first query-key vector and the second query-key vector through softmax attention maps to obtain an attention weight, comprises: calculating a first softmax attention map through a primary attention path based on the first query vector and the first key vector; calculating a second softmax attention map through a noise baseline path based on the second query vector and the second key vector; and deriving the attention weight as the difference obtained by subtracting, from the first softmax attention map, the second softmax attention map multiplied by a noise suppression coefficient; wherein the first query-key vector comprises the first query vector and the first key vector, the second query-key vector comprises the second query vector and the second key vector, and the noise suppression coefficient is used for adjusting the suppression intensity of the noise baseline path.
6. The multi-agent action prediction method based on an intent-driven differential attention model of claim 5, wherein said inputting the communication intention feature and the hidden state vector into the environment attention sub-module, and performing dual-path projection and differential attention weight distribution on the communication intention feature and the hidden state vector to obtain the perception feature, comprises: inputting the communication intention feature and the hidden state vector into the environment attention sub-module, and performing dual-path projection on the communication intention feature and the hidden state vector to obtain a third query-key vector and a fourth query-key vector; calculating the difference between the third query-key vector and the fourth query-key vector through softmax attention maps to obtain a differential attention weight; and performing weighted summation of local features corresponding to the communication intention feature based on the differential attention weight to obtain the perception feature.
7. The multi-agent action prediction method based on an intent-driven differential attention model of claim 6, wherein said inputting the communication intention feature and the hidden state vector into the environment attention sub-module, and performing dual-path projection on the communication intention feature and the hidden state vector to obtain a third query-key vector and a fourth query-key vector, comprises: inputting the communication intention feature into a first linear transformation layer for linear transformation to generate a third query vector and a third key vector; inputting the hidden state vector into a second linear transformation layer for linear transformation to generate a fourth query vector and a fourth key vector; performing primary attention path projection on the third query vector and the third key vector to obtain the third query-key vector; and performing noise baseline path projection on the fourth query vector and the fourth key vector to obtain the fourth query-key vector.
8. A multi-agent action prediction system based on an intent-driven differential attention model, implemented using the multi-agent action prediction method based on an intent-driven differential attention model as claimed in any one of claims 1 to 7, characterized in that the system comprises: an acquisition module, configured to acquire environmental information around an unmanned aerial vehicle group and a one-step action of the unmanned aerial vehicle group as input data; a processing module, configured to input the input data into a short-term intention encoder module and extract correlations between the input data to obtain a current intention feature, wherein the short-term intention encoder module comprises a gated recurrent unit and an intention encoder, and to input the current intention feature into a differential attention module and perform weighted aggregation of the current intention feature using differential attention weights to obtain a perception feature, wherein the differential attention module comprises a message attention sub-module and an environment attention sub-module; and a generation module, configured to generate a current action instruction of the unmanned aerial vehicle group based on the perception feature, to guide the next-step action of the unmanned aerial vehicle group.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the multi-agent action prediction method based on an intent-driven differential attention model of any one of claims 1 to 7 when the program is executed.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the multi-agent action prediction method based on an intent-driven differential attention model as claimed in any of claims 1 to 7.
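
As a non-limiting illustration of the attention weighting recited in claims 4 and 5, the following minimal Python sketch forms two softmax attention maps (a primary path and a noise baseline path), subtracts the baseline map scaled by a noise suppression coefficient, and aggregates the intention features. The projection matrices `Wq1`, `Wk1`, `Wq2`, `Wk2`, the `1/sqrt(d)` score scaling, the value of `lam`, and the additive form of the residual connection are assumptions made for illustration; the claims do not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rand_matrix(rng, rows, cols, scale=0.5):
    """Hypothetical random initialisation, standing in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def differential_attention(features, Wq1, Wk1, Wq2, Wk2, lam):
    """features: one intention feature vector per agent.

    Returns (weights, outputs) where
    weights[i][j] = A1[i][j] - lam * A2[i][j] and
    outputs[i] = features[i] + sum_j weights[i][j] * features[j]
    (the additive residual is an assumption).
    """
    n, dim = len(features), len(features[0])
    scale = 1.0 / math.sqrt(len(Wq1))  # assumed score scaling
    # Dual-path projection: two independent query/key pairs.
    q1 = [matvec(Wq1, f) for f in features]
    k1 = [matvec(Wk1, f) for f in features]
    q2 = [matvec(Wq2, f) for f in features]
    k2 = [matvec(Wk2, f) for f in features]
    weights, outputs = [], []
    for i in range(n):
        a1 = softmax([dot(q1[i], k1[j]) * scale for j in range(n)])  # primary path
        a2 = softmax([dot(q2[i], k2[j]) * scale for j in range(n)])  # noise baseline
        w = [p - lam * q for p, q in zip(a1, a2)]
        agg = [sum(w[j] * features[j][c] for j in range(n)) for c in range(dim)]
        outputs.append([f + a for f, a in zip(features[i], agg)])
        weights.append(w)
    return weights, outputs


# Usage: 3 agents, 4-dimensional intention features, 2-dimensional projections.
rng = random.Random(0)
features = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
Wq1, Wk1, Wq2, Wk2 = (rand_matrix(rng, 2, 4) for _ in range(4))
weights, comm = differential_attention(features, Wq1, Wk1, Wq2, Wk2, lam=0.5)
for row in weights:
    print([round(w, 3) for w in row])
```

Because each softmax map sums to 1 per row, every row of the differential weights sums to `1 - lam`, and individual weights may be negative; this is how attention mass shared by both paths is cancelled rather than averaged across agents.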

Description

Multi-agent action prediction method based on intention-driven differential attention model

Technical Field

The application belongs to the technical field of multi-agent action prediction, and particularly relates to a multi-agent action prediction method based on an intention-driven differential attention model.

Background

In recent years, multi-agent systems have attracted widespread attention in robot clusters, unmanned aerial vehicle formations, intelligent transportation and other scenarios, owing to their advantages in distributed control, collaborative decision-making and complex task decomposition. In such systems, each agent typically makes decisions independently based on its local observations, while the partially observable structure of the environment makes it difficult for a single agent to accurately infer the environment state or predict the behavior of other agents. Multi-agent reinforcement learning has therefore gradually become a main technical route for improving collaborative capability, achieving better learning stability and decision consistency through centralized training with decentralized execution. In this context, the communication mechanism is an important auxiliary means of alleviating insufficient local observation and raising the level of collaboration. Existing methods commonly realize information sharing among agents through explicit communication, such as broadcast communication and point-to-point communication based on target selection. For the generation and processing of communication messages, common techniques include message encoding based on continuous vectors, discrete symbols, and natural language, with the specific design depending on the trade-off between expressive power and communication cost.
To improve communication efficiency and reduce the impact of redundant information, attention mechanisms are increasingly being introduced into multi-agent communication frameworks to screen the more relevant content out of a large number of teammate messages and to make cross-agent information integration more effective. However, in environments with high communication pressure and a large amount of redundant information, existing unmanned aerial vehicle groups suffer from the technical problems of short-term intention mismatch, the averaging effect, and interference from local observation noise.

Disclosure of Invention

The present application aims to solve at least one of the technical problems existing in the prior art. To this end, the application provides a multi-agent action prediction method based on an intention-driven differential attention model, which can significantly reduce the ineffective diffusion of attention across communication messages and improve the ability to focus on key communication sources.
In a first aspect, the present application provides a multi-agent action prediction method based on an intent-driven differential attention model, the method comprising: acquiring environmental information around an unmanned aerial vehicle group and a one-step action of the unmanned aerial vehicle group as input data; inputting the input data into a short-term intention encoder module, and extracting correlations between the input data to obtain a current intention feature, wherein the short-term intention encoder module comprises a gated recurrent unit and an intention encoder; inputting the current intention feature into a differential attention module, and performing weighted aggregation of the current intention feature using differential attention weights to obtain a perception feature, wherein the differential attention module comprises a message attention sub-module and an environment attention sub-module; and generating a current action instruction of the unmanned aerial vehicle group based on the perception feature, to guide the next-step action of the unmanned aerial vehicle group.
According to one embodiment of the present application, the inputting the input data into the short-term intention encoder module, and extracting correlations between the input data to obtain the current intention feature, includes: inputting the input data into the gated recurrent unit, and updating the state of the input data through time-sequence iteration to obtain a hidden state vector; and inputting the hidden state vector and the input data into the intention encoder, mapping the hidden state vector and the input data into Gaussian distribution parameters through a variational autoencoder, sampling from the Gaussian distribution parameters, and extracting the correlations between the input data to obtain the current intention feature.
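By way of a hedged, non-limiting illustration of the short-term intention encoder described in this embodiment, the following minimal Python sketch iterates a gated recurrent unit over a short input sequence to obtain a hidden state vector, maps the hidden state and the current input to Gaussian parameters, and samples an intention feature with the reparameterization commonly used in variational autoencoders. The class names `GRUCell` and `IntentEncoder`, the omission of bias terms, the linear mean and log-variance heads, the dimensions, and the random weights are all assumptions for illustration; the application does not specify them.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def rand_matrix(rng, rows, cols, scale=0.1):
    """Hypothetical random initialisation, standing in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Bias-free gated recurrent unit acting on the concatenation [x, h]."""

    def __init__(self, in_dim, hid_dim, rng):
        cat = in_dim + hid_dim
        self.Wz = rand_matrix(rng, hid_dim, cat)  # update gate
        self.Wr = rand_matrix(rng, hid_dim, cat)  # reset gate
        self.Wh = rand_matrix(rng, hid_dim, cat)  # candidate state

    def step(self, x, h):
        xh = x + h  # list concatenation
        z = [sigmoid(v) for v in matvec(self.Wz, xh)]
        r = [sigmoid(v) for v in matvec(self.Wr, xh)]
        x_rh = x + [ri * hi for ri, hi in zip(r, h)]
        h_tilde = [math.tanh(v) for v in matvec(self.Wh, x_rh)]
        # One common convention; some libraries swap the roles of z and 1 - z.
        return [(1 - zi) * hi + zi * ht for zi, hi, ht in zip(z, h, h_tilde)]


class IntentEncoder:
    """Maps [x, h] to Gaussian parameters and samples an intention feature."""

    def __init__(self, in_dim, hid_dim, latent_dim, rng):
        cat = in_dim + hid_dim
        self.Wmu = rand_matrix(rng, latent_dim, cat)
        self.Wlogvar = rand_matrix(rng, latent_dim, cat)

    def encode(self, x, h, rng):
        xh = x + h
        mu = matvec(self.Wmu, xh)
        logvar = matvec(self.Wlogvar, xh)
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
        z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]
        return z, mu, logvar


# Usage: each input concatenates an observation and a one-step action (3 numbers here).
rng = random.Random(0)
cell = GRUCell(in_dim=3, hid_dim=4, rng=rng)
encoder = IntentEncoder(in_dim=3, hid_dim=4, latent_dim=2, rng=rng)
sequence = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5], [0.4, 0.4, -0.2]]
h = [0.0] * 4
for x in sequence:
    h = cell.step(x, h)  # time-sequence iteration of the hidden state
intent, mu, logvar = encoder.encode(sequence[-1], h, rng)
print(len(h), len(intent))
```

Sampling from the predicted Gaussian, rather than using the mean directly, keeps the intention feature stochastic; in a trained model the weights would be learned rather than drawn at random as they are here.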
According to one embodiment of the present application, the inputting the current intention feature into the differential attention module, and weighting and aggregating the current intention feature through the differential attention weight, to obtain the perception feature, inclu