CN-121984636-A - Unmanned aerial vehicle electronic control intelligent agent collaborative decision-making method and system
Abstract
The invention discloses a collaborative decision-making method and system for unmanned aerial vehicle electromagnetic control agents, belonging to the technical field of unmanned aerial vehicle prevention and control. The method comprises multi-source information fusion, threat priority calculation, collaborative decision-making, resource allocation, interference execution and policy optimization. Each control agent collects the position, speed and spectrum characteristic information of the target unmanned aerial vehicle and encodes it into a state vector, and the state vectors are fused by attention weighting to generate fused situation information. Threat priority values are calculated from the target distance, speed, type and recognition confidence, and the targets are ranked accordingly. Each control agent calculates state-action values by Q-learning, exchanges policy information with the other agents, calculates a cooperative benefit value, and selects the optimal action as its decision result. Interference power is allocated in proportion to threat priority. Interference signals are transmitted and changes in target behavior are monitored. Policy parameters are updated according to the efficacy evaluation value, forming closed-loop optimization. The method addresses problems such as insufficient information fusion precision, low collaborative decision efficiency and unreasonable resource allocation.
Inventors
- ZHAO XIUJUN
- Song Guancheng
- ZHOU YONG
Assignees
- 青岛诺德环境技术有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260202
Claims (10)
- 1. A collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents, characterized by comprising the following steps: S1, multi-source information fusion: a plurality of management and control agents each acquire position information, speed information and spectrum characteristic information of a target unmanned aerial vehicle; each management and control agent encodes the acquired information into a state vector and the agents transmit their state vectors to one another; the state vectors of the management and control agents are fused by attention-weighted fusion to generate fused situation information; S2, threat priority calculation: the distance, speed and type parameters of each target unmanned aerial vehicle are extracted from the fused situation information, a threat priority value is calculated for each target unmanned aerial vehicle as a weighted combination of an inverse-distance term, a direct-speed term and a type threat coefficient, and a threat priority sequence is generated by sorting the threat priority values from high to low; S3, collaborative decision-making: each management and control agent calculates state-action values by Q-learning from the threat priority sequence and its local state information, the management and control agents exchange policy information, a cooperative benefit value is calculated, the optimal action is selected according to the Q value and the cooperative benefit value, and an action decision result is output for each management and control agent, the action decision result comprising a target assignment, an interference frequency band and an interference power level; S4, resource allocation: interference power is allocated according to the threat priority sequence in proportion to the threat priority values, generating a power allocation value and a frequency band allocation result for each target unmanned aerial vehicle; S5, interference execution: each management and control agent configures interference parameters according to the action decision result and the power allocation value, transmits an interference signal, monitors the behavior change of the target unmanned aerial vehicle, and records the interference success or failure state; and S6, policy optimization: an efficacy evaluation value is calculated from the interference success or failure state, the policy parameters of the Q-learning are updated using the efficacy evaluation value, and the updated policy parameters are applied to the next round of the collaborative decision-making step.
- 2. The collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents according to claim 1, wherein in step S1 the attention-weighted fusion specifically comprises the following operations: S11, the $i$-th management and control agent encodes the collected position information, speed information and spectrum characteristic information of the target unmanned aerial vehicle into a state vector $s_i$ and broadcasts the state vector $s_i$ to the adjacent management and control agents; S12, after the agent receives the state vectors sent by each adjacent management and control agent, its own state vector is recorded as $s_i$ and the state vector of the $j$-th adjacent management and control agent is recorded as $s_j$, and the attention score $e_j$ of the $j$-th adjacent management and control agent is calculated according to the following formula: $e_j = \tanh(W_a [s_i ; s_j] + b_a)$; wherein $e_j$ is the attention score of the $j$-th adjacent management and control agent; $\tanh$ is the hyperbolic tangent function; $W_a$ is the attention weight matrix; $[s_i ; s_j]$ is the concatenation of state vector $s_i$ and state vector $s_j$; $b_a$ is the bias vector; S13, the fusion weight $\omega_j$ of the $j$-th adjacent management and control agent is calculated from the attention scores of the adjacent management and control agents according to the following formula: $\omega_j = \exp(e_j) / \sum_k \exp(e_k)$; wherein $\omega_j$ is the fusion weight of the $j$-th adjacent management and control agent; $\exp$ is the exponential function; $e_j$ is the attention score of the $j$-th adjacent management and control agent; $\sum_k$ is the summation over all adjacent management and control agents $k$; $e_k$ is the attention score of the $k$-th adjacent management and control agent; S14, the fused situation information $F$ is calculated according to the following formula: $F = \sum_j \omega_j \cdot s_j + s_i$; wherein $F$ is the fused situation information; $\sum_j$ is the summation over all adjacent management and control agents $j$; $\omega_j$ is the fusion weight of the $j$-th adjacent management and control agent; $\cdot$ is the multiplication operator; $s_j$ is the state vector of the $j$-th adjacent management and control agent; $s_i$ is the state vector of the agent itself.
- 3. The collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents according to claim 1, wherein in step S2 the calculation of the threat priority value comprises the following operations: S21, reading from the fused situation information the distance $d_k$ between the $k$-th target unmanned aerial vehicle and the center of the protection area and the speed $v_k$ at which the $k$-th target unmanned aerial vehicle approaches the protection area; S22, according to the type identification of the $k$-th target unmanned aerial vehicle, querying the corresponding type threat coefficient $c_k$ from the type threat coefficient table, and at the same time reading the identification confidence $p_k$ of the $k$-th target unmanned aerial vehicle; S23, calculating the threat priority value $T_k$ of the $k$-th target unmanned aerial vehicle according to the following formula: $T_k = \frac{w_d}{d_k} + w_v v_k + w_c c_k p_k$; wherein $T_k$ is the threat priority value of the $k$-th target unmanned aerial vehicle; $w_d$ is the distance weight coefficient; $d_k$ is the distance (meters) between the $k$-th target unmanned aerial vehicle and the center of the protection area; $w_v$ is the speed weight coefficient; $v_k$ is the speed (meters/second) at which the $k$-th target unmanned aerial vehicle approaches the protection area; $w_c$ is the type weight coefficient; $c_k$ is the type threat coefficient of the $k$-th target unmanned aerial vehicle; $p_k$ is the identification confidence of the $k$-th target unmanned aerial vehicle; S24, ranking all target unmanned aerial vehicles by threat priority value $T_k$ from high to low to generate the threat priority sequence.
- 4. The collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents according to claim 1, wherein in step S3 the collaborative decision-making comprises the following operations: S31, the $i$-th management and control agent constructs a local state $x_i$, the local state $x_i$ comprising the position coordinates of the management and control agent, its available interference power and the threat priority value of the target it is currently responsible for; S32, the $i$-th management and control agent defines an action $a_i$, the action $a_i$ comprising the selection of a target, the selection of an interference frequency band and the selection of an interference power level; S33, after the $i$-th management and control agent performs action $a_i$ and obtains the instant reward $r_i$, the Q value of the state-action pair $(x_i, a_i)$ is updated according to the following formula: $Q'(x_i, a_i) = (1 - \alpha) \cdot Q(x_i, a_i) + \alpha \cdot (r_i + \gamma \cdot \max_{a'} Q(x_i', a'))$; wherein $Q'(x_i, a_i)$ is the updated Q value; $1$ is a constant; $\alpha$ is the learning rate; $\cdot$ is the multiplication operator; $Q(x_i, a_i)$ is the Q value before updating; $r_i$ is the instant reward of the $i$-th management and control agent; $\gamma$ is the discount factor; $\max_{a'} Q(x_i', a')$ is the maximum Q value of the next state; S34, the $i$-th management and control agent broadcasts its local policy information to the adjacent management and control agents and, after receiving the policy information of the adjacent management and control agents, calculates the cooperative benefit value $C$ according to the following formula: $C = \lambda_s \cdot \frac{n_s}{n} - \lambda_r n_r$; wherein $C$ is the cooperative benefit value; $\lambda_s$ is the success reward coefficient; $\cdot$ is the multiplication operator; $n_s$ is the number of targets successfully controlled; $n$ is the total number of targets; $\lambda_r$ is the overlap penalty coefficient; $n_r$ is the number of targets redundantly interfered with by a plurality of management and control agents; S35, the $i$-th management and control agent adds the Q value and the cooperative benefit value to obtain a comprehensive evaluation value, selects the action $a_i^*$ with the largest comprehensive evaluation value, and outputs it as the action decision result.
- 5. The collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents according to claim 1, wherein the resource allocation in step S4, the interference execution in step S5 and the policy optimization in step S6 specifically comprise the following operations: S41, determining the current total available power $P_{total}$ of the system and calculating the power allocation value $P_k$ allocated to the $k$-th target unmanned aerial vehicle according to the following formula: $P_k = P_{total} \cdot \frac{T_k}{\sum_m T_m}$; wherein $P_k$ is the power allocation value (watts) allocated to the $k$-th target unmanned aerial vehicle; $P_{total}$ is the total available power (watts) of the system; $T_k$ is the threat priority value of the $k$-th target unmanned aerial vehicle; $\sum_m T_m$ is the sum of the threat priority values of all target unmanned aerial vehicles; S42, matching the corresponding interference frequency band from a preset frequency band library according to the spectrum characteristic information of the $k$-th target unmanned aerial vehicle; S43, each management and control agent configures its interference signal generator with the power allocation value $P_k$ and the matched interference frequency band, transmits an interference signal, and continuously monitors the flight state of the target unmanned aerial vehicle; S44, recording an interference success when the target unmanned aerial vehicle exhibits forced landing, return or hovering behavior, otherwise recording an interference failure, and counting the number of interference successes $N_s$ and the number of interference failures $N_f$; S45, calculating the efficacy evaluation value $E$ according to the following formula: $E = \frac{N_s}{N_s + N_f}$; wherein $E$ is the efficacy evaluation value; $N_s$ is the number of interference successes; $N_f$ is the number of interference failures; S46, updating the policy parameters according to the following formula: $\theta' = \theta + \eta \cdot (E - E_{ref}) \cdot g$; wherein $\theta'$ is the updated policy parameter; $\theta$ is the policy parameter before updating; $\eta$ is the policy learning rate; $\cdot$ is the multiplication operator; $E$ is the efficacy evaluation value; $E_{ref}$ is the reference efficacy value; $g$ is the policy gradient direction.
- 6. An unmanned aerial vehicle electromagnetic control agent collaborative decision-making system for implementing the method of any one of claims 1 to 5, characterized in that each management and control agent node comprises a sensing acquisition unit, an information fusion unit, a decision calculation unit, an interference execution unit and a feedback monitoring unit; the sensing acquisition unit comprises a radar detection module, a photoelectric tracking module and a radio detection module, wherein the output end of the radar detection module is connected with the first input end of the information fusion unit and is used for outputting the position information and the speed information of the target unmanned aerial vehicle; the information fusion unit comprises a state coding module and an attention fusion module, wherein the input end of the state coding module is connected with the output ends of the radar detection module, the photoelectric tracking module and the radio detection module respectively, and the output end of the state coding module is connected with the input end of the attention fusion module and is used for encoding the perception data into state vectors; the decision calculation unit comprises a threat assessment module, a Q value calculation module and an action selection module, wherein the input end of the threat assessment module is connected with the output end of the attention fusion module, and the output end of the threat assessment module is connected with the first input end of the Q value calculation module and is used for executing step S2 to calculate the threat priority values and output the threat priority sequence; the interference execution unit comprises a resource allocation module, an interference signal generator and a directional antenna, wherein the first input end of the resource allocation module is connected with the output end of the threat assessment module, the second input end of the resource allocation module is connected with the output end of the action selection module, and the output end of the resource allocation module is connected with the control end of the interference signal generator and is used for executing step S4 to calculate the power allocation values; the input end of the feedback monitoring unit is connected with the output end of the sensing acquisition unit, the first output end of the feedback monitoring unit is connected with the second input end of the Q value calculation module, and the second output end of the feedback monitoring unit is connected with the third input end of the resource allocation module, so that the behavior change of the target unmanned aerial vehicle is monitored and the efficacy evaluation and policy optimization of step S6 are executed.
- 7. The unmanned aerial vehicle electromagnetic control agent collaborative decision-making system of claim 6, wherein the attention fusion module comprises: a vector splicing sub-module, the first input end of which is connected with the output end of the state coding module to receive the state vector of the management and control agent node itself, the second input end of which is connected with the agent communication interface to receive the state vectors of the other management and control agent nodes, and the output end of which is connected with the input end of the attention calculating sub-module; an attention calculating sub-module, the output end of which is connected with the first input end of the weighted summation sub-module, and which is used for calculating the attention score and the fusion weight of each adjacent management and control agent; and a weighted summation sub-module, the second input end of which is connected with the agent communication interface to receive the state vector of each adjacent management and control agent node, and the output end of which is connected with the input end of the decision calculation unit and is used for outputting the fused situation information.
- 8. The unmanned aerial vehicle electromagnetic control agent collaborative decision-making system according to claim 6, wherein the Q value calculation module comprises: a Q value memory for storing the Q values of the state-action pairs; a Q value updater, the first input end of which is connected with the output end of the Q value memory to read the Q value, the second input end of which is connected with the first output end of the feedback monitoring unit to receive the instant reward, and the output end of which is connected with the input end of the Q value memory to write the updated Q value; and a policy parameter memory, the input end of which is connected with the first output end of the feedback monitoring unit to receive the policy parameter update value, and the output end of which is connected with the input end of the action selection module to output the current policy parameters.
- 9. The unmanned aerial vehicle electromagnetic control agent collaborative decision-making system according to claim 6, wherein the interference execution unit further comprises: a power amplifier, the input end of which is connected with the output end of the interference signal generator, the output end of which is connected with the input end of the directional antenna, and the control end of which is connected with the output end of the resource allocation module to receive the power allocation value; and a beam controller, the first input end of which is connected with the output end of the action selection module to receive the beam pointing information in the action decision result, the second input end of which is connected with the output end of the threat assessment module to receive the target position information, and the output end of which is connected with the beam control end of the directional antenna.
- 10. The unmanned aerial vehicle electromagnetic control agent collaborative decision-making system according to claim 6, wherein the feedback monitoring unit comprises: a behavior judging module, the input end of which is connected with the output end of the sensing acquisition unit to receive the real-time position information and speed information of the target unmanned aerial vehicle, and which is used for judging the interference success or failure state; an efficiency calculation module, the input end of which is connected with the output end of the behavior judging module to receive the interference success or failure state, and the output end of which is connected with the second input end of the Q value calculation module, and which is used for calculating the efficacy evaluation value and the instant reward; and a policy optimization module, the input end of which is connected with the output end of the efficiency calculation module to receive the efficacy evaluation value, and the output end of which is connected with the input end of the policy parameter memory in the Q value calculation module, and which is used for calculating the policy parameter update value.
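The arithmetic recited in claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, parameter names and numeric values are hypothetical, and the formula forms are assumptions inferred from the symbol definitions in the claims: an inverse-distance term plus a speed term plus a type coefficient scaled by confidence for the threat priority, the standard tabular Q-learning update, a success ratio minus an overlap penalty for the cooperative benefit, and a success ratio for the efficacy evaluation value.

```python
def threat_priority(distance_m, approach_speed, type_coeff, confidence,
                    w_d=1.0, w_v=1.0, w_c=1.0):
    """Assumed form of step S23: w_d / d + w_v * v + w_c * c * p."""
    return w_d / distance_m + w_v * approach_speed + w_c * type_coeff * confidence


def rank_targets(priorities):
    """Step S24: target indices ordered from highest to lowest priority."""
    return sorted(range(len(priorities)), key=lambda k: priorities[k], reverse=True)


def allocate_power(total_power_w, priorities):
    """Step S41: split total power in proportion to threat priority."""
    total = sum(priorities)
    return [total_power_w * p / total for p in priorities]


def q_update(q_old, reward, max_q_next, alpha=0.1, gamma=0.9):
    """Step S33: (1 - alpha) * Q + alpha * (r + gamma * max Q')."""
    return (1.0 - alpha) * q_old + alpha * (reward + gamma * max_q_next)


def cooperative_benefit(n_success, n_total, n_overlap,
                        success_coeff=1.0, overlap_penalty=0.5):
    """Assumed form of step S34: reward the success ratio, penalize overlap."""
    return success_coeff * n_success / n_total - overlap_penalty * n_overlap


def efficacy(n_success, n_failure):
    """Step S45: fraction of interference attempts that succeeded."""
    return n_success / (n_success + n_failure)


# Hypothetical targets: (distance in m, approach speed in m/s, type coeff, confidence).
targets = [(1000.0, 5.0, 0.5, 0.6), (200.0, 10.0, 0.8, 0.9)]
priorities = [threat_priority(*t, w_d=100.0, w_v=0.1, w_c=1.0) for t in targets]
order = rank_targets(priorities)          # the closer, faster target ranks first
powers = allocate_power(100.0, priorities)
print(order, [round(p, 1) for p in powers])
```

Because the allocation is a plain proportional split, the per-target powers always sum to the total available power, and a target with three times the threat priority of another receives three times the power. The weights chosen for the example are arbitrary and would need tuning, since the inverse-distance and speed terms have different units and scales.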
Description
Unmanned aerial vehicle electronic control intelligent agent collaborative decision-making method and system
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle prevention and control, and particularly relates to a collaborative decision-making method and system for unmanned aerial vehicle electromagnetic control agents.
Background
With the rapid development and popularization of unmanned aerial vehicle technology, unmanned aerial vehicles are widely used in civil fields such as aerial photography, logistics and agriculture, but they also pose serious challenges for low-altitude security. An unmanned aerial vehicle that intrudes illegally may threaten sensitive areas such as airports, critical facilities and large event venues, which has given rise to unmanned aerial vehicle electromagnetic control technology. Electromagnetic control blocks the communication link between the unmanned aerial vehicle and its remote controller by transmitting interference signals in specific frequency bands, forcing the unmanned aerial vehicle to land, return or hover, and thereby bringing the intruding unmanned aerial vehicle under effective control. In practice, the coverage and processing capacity of a single management and control node are limited, so a plurality of management and control nodes usually have to be deployed as a cooperative protection network to meet multi-target, large-area protection requirements. Existing unmanned aerial vehicle electromagnetic control technology has a number of technical problems in multi-node cooperation.
In terms of information fusion, existing systems generally use simple averaging or fixed weights when processing the perception data of multiple management and control nodes and cannot adapt to differences in quality and correlation among the nodes' observations, so the fused situation information is insufficiently precise and the accuracy of subsequent decisions suffers. In terms of threat assessment, existing methods mostly rely on single-factor judgment or simple threshold division, cannot jointly consider multi-dimensional factors such as target distance, approach speed, unmanned aerial vehicle type and recognition confidence, and therefore struggle to quantify the actual threat level of each target. In terms of collaborative decision-making, existing multi-node systems use either a centralized decision architecture or independent decisions at each node; the former depends heavily on communication and incurs large decision delays, while the latter easily leads to several nodes redundantly interfering with the same target, wasting resources, while some targets are left with no node responsible for them. In terms of resource allocation, existing systems mostly use static pre-allocation or equal allocation and cannot adjust dynamically to the target threat level, so high-threat targets receive insufficient interference power while resources are wasted on low-threat targets. In terms of policy optimization, existing systems lack real-time evaluation of interference efficacy and an online update mechanism for policy parameters, and cannot continuously improve the decision policy from execution feedback, which limits long-term performance and adaptability to new conditions.
Disclosure of Invention
In order to solve the problems described in the background, the invention provides a collaborative decision-making method for unmanned aerial vehicle electromagnetic control agents, comprising the following steps: S1, multi-source information fusion: a plurality of management and control agents each acquire position information, speed information and spectrum characteristic information of a target unmanned aerial vehicle; each management and control agent encodes the acquired information into a state vector and the agents transmit their state vectors to one another; the state vectors of the management and control agents are fused by attention-weighted fusion to generate fused situation information; S2, threat priority calculation: the distance, speed and type parameters of each target unmanned aerial vehicle are extracted from the fused situation information, a threat priority value is calculated for each target unmanned aerial vehicle as a weighted combination of an inverse-distance term, a direct-speed term and a type threat coefficient, and a threat priority sequence is generated by sorting the threat priority values from high to low; S3, a collaborative decisi