CN-121612310-B - Intelligent unmanned aerial vehicle flight path planning method and system
Abstract
The invention discloses an intelligent unmanned aerial vehicle flight path planning method and system. The method comprises the steps of environment perception, multi-source feature fusion attention extraction, path utility network training, exploration reward design, time-sequence adaptive reward fusion, and unmanned aerial vehicle path planning strategy optimization. The invention belongs to the field of path planning. In this scheme, dynamic utility labels are generated and weights are adaptively adjusted according to the task stage; an environmental risk factor is designed so that high-risk areas are actively pre-judged while low-risk areas keep their normal weights, giving a smooth transition between the exploration stage and the target stage. Neighbor samples reflect the characteristics of explored areas; a coverage guidance measure reward is designed to strengthen global coverage guidance, and a utility attenuation measure reward is designed to reduce collision risk. Time-sequence adaptive reward fusion then optimizes global coverage and target guidance, further improving the flight path planning effect of the unmanned aerial vehicle.
Inventors
- GUO QIANG
- MENG XU
- OUYANG ZHIHENG
- RUI RUI
- ZHANG CONGBO
- SHEN GANG
- SONG ZEMING
- WU XU
Assignees
- China Tower Corporation Limited, Tianjin Branch (中国铁塔股份有限公司天津市分公司)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-02-02
Claims (5)
- 1. An intelligent unmanned aerial vehicle flight path planning method, characterized by comprising the following steps: S1, environment sensing, namely acquiring multi-source heterogeneous environment data and converting it into a high-dimensional environment observation vector with unified dimension; S2, multi-source feature fusion attention extraction, namely building, based on the high-dimensional environment observation vector, a feature extraction network containing channel attention, spatial attention and fully connected dimension reduction to generate a low-dimensional environment feature vector; S3, path utility network training, namely constructing a fully connected path utility network based on the low-dimensional environment feature vector and performing network training; S4, exploration reward design, namely storing flight history data, searching neighbor samples, and designing dual-measure rewards of coverage guidance and utility attenuation; S5, time-sequence adaptive reward fusion, namely adaptively fusing the dual-measure rewards according to the current flight step number based on a time-sequence weight function to generate a final exploration reward; S6, unmanned aerial vehicle path planning strategy optimization, namely constructing a strategy network and a path utility network and performing iterative training to realize real-time path planning of the unmanned aerial vehicle; in step S3, the path utility network training specifically includes: path utility network construction, namely constructing a 3-layer fully connected network whose input is the low-dimensional environment feature vector and whose output is the path utility; dynamic label generation, namely designing dynamic task weights to adapt to dynamic environmental changes and task phase switching during the flight of the unmanned aerial vehicle, and introducing an environmental risk factor to correct the local obstacle density term, so that the utility label adapts dynamically; calculating the distance to the target point and the local obstacle density, normalizing both, and generating a dynamic utility label by combining the dynamic task weights and the environmental risk factor, expressed as: [formulas missing in source]; wherein the two utility weight coefficients respectively control target guidance and safety priority; the task phase coefficient is determined by the ratio of the current flight step number t to the total number of exploration steps of the unmanned aerial vehicle; the remaining quantities are the maximum and minimum values of the guidance quality, the normalized Euclidean distance from the unmanned aerial vehicle to the target point, the normalized local obstacle density, the environmental risk factor, the risk amplification factor, and the obstacle density threshold; model training, namely constructing a path utility network loss function and optimizing the path utility network parameters through gradient descent until the loss converges, the loss function being expressed as: [formula missing in source], wherein the defined quantities are the expectation and the path utility; in step S4, the exploration reward design specifically includes: historical data storage, namely storing the low-dimensional environment feature vectors and path utilities in real time during the flight of the unmanned aerial vehicle to construct a historical data set; neighbor search, namely, for the low-dimensional environment feature vector and path utility of the current step, searching the k most similar samples in the historical data set and calculating the neighbor means of the feature vector and of the path utility; calculating the feature distance and the utility distance between the current values and the neighbor means; calculating the coverage guidance measure reward and the utility attenuation measure reward respectively, expressed as: [formulas missing in source]; wherein, for the coverage guidance measure reward at step t, the defined quantities are a deviation measure and a balance parameter that controls the weighting between environment exploration and path utility; and, for the utility attenuation measure reward at step t, the defined quantities are a measure order parameter, a utility attenuation coefficient, and a low-utility penalty term; the two rewards are mapped to the interval [0,1] through Min-Max normalization to ensure a consistent reward scale; in step S5, the time-sequence adaptive reward fusion sets the total number of exploration steps, fixes a smoothness parameter, and calculates the coverage guidance measure reward weight from the current flight step number t through a time-sequence function, expressed as: [formula missing in source], wherein the defined quantities are the coverage guidance measure reward weight at step t, the smoothness parameter, and T, the midpoint of the total exploration steps; the normalized dual-measure rewards are fused according to the weight to generate the final exploration reward, expressed as: [formula missing in source].
- 2. The intelligent unmanned aerial vehicle flight path planning method of claim 1, wherein in step S2, the multi-source feature fusion attention extraction specifically comprises: constructing a feature extraction network of channel attention, spatial attention and fully connected dimension reduction; channel attention calculation, namely performing global average pooling on the high-dimensional environment observation vector, calculating channel weights through a fully connected layer and a Sigmoid function, and weighting the high-dimensional environment observation vector element-wise by the channel weights; spatial attention calculation, namely performing global average pooling and maximum pooling on the channel-weighted features, calculating spatial weights through a convolution layer and a Sigmoid function, and weighting the channel-weighted features element-wise by the spatial weights; and dimension reduction, namely inputting the weighted features into a fully connected layer and mapping them into the low-dimensional environment feature vector.
- 3. The intelligent unmanned aerial vehicle flight path planning method according to claim 2, wherein in step S1, the environment sensing acquires multi-source heterogeneous environment data of the unmanned aerial vehicle flight through multi-source data acquisition, and converts the acquired data into a high-dimensional environment observation vector with unified dimension through point cloud processing, image processing, normalization and data fusion.
- 4. The intelligent unmanned aerial vehicle flight path planning method according to claim 3, wherein in step S6, the unmanned aerial vehicle path planning strategy optimization specifically comprises: constructing a strategy network and a path utility network; data collection, namely the unmanned aerial vehicle executing the actions output by the strategy network and collecting state-action-reward-next-state samples; advantage function calculation, namely calculating advantage function estimates from the collected samples using the temporal-difference method; strategy optimization, namely taking the PPO clipped objective function as the loss and optimizing the strategy network parameters through gradient descent so as to maximize the cumulative exploration reward; iterative training, namely repeating data collection and strategy optimization until the strategy converges; and real-time path planning, namely performing path planning on real-time environment data after training of the unmanned aerial vehicle path planning network is completed.
- 5. An intelligent unmanned aerial vehicle flight path planning system for realizing the intelligent unmanned aerial vehicle flight path planning method according to any one of claims 1-4, characterized by comprising an environment sensing module, a multi-source feature fusion attention extraction module, a path utility network training module, an exploration reward design module, a time-sequence adaptive reward fusion module and an unmanned aerial vehicle path planning strategy optimization module; the environment sensing module acquires multi-source heterogeneous environment data and converts it into a high-dimensional environment observation vector with unified dimension; the multi-source feature fusion attention extraction module builds, based on the high-dimensional environment observation vector, a feature extraction network containing channel attention, spatial attention and fully connected dimension reduction, and generates a low-dimensional environment feature vector; the path utility network training module constructs a fully connected path utility network based on the low-dimensional environment feature vector and performs network training; the exploration reward design module stores flight history data, searches neighbor samples, and designs dual-measure rewards of coverage guidance and utility attenuation; the time-sequence adaptive reward fusion module adaptively fuses the dual-measure rewards according to the current flight step number based on a time-sequence weight function and generates the final exploration reward; the unmanned aerial vehicle path planning strategy optimization module constructs a strategy network and a path utility network and performs iterative training so as to realize real-time path planning of the unmanned aerial vehicle.
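As an illustration of the advantage estimation and strategy optimization recited in claim 4, the following minimal Python sketch computes temporal-difference advantages and the PPO clipped loss. The function names, the discount factor gamma, the clipping range clip_eps, and the choice of a one-step temporal-difference estimate are assumptions made for illustration; the claims do not specify them.

```python
def td_advantages(rewards, values, next_values, gamma=0.99):
    """One-step temporal-difference advantage per sample:
    A_t = r_t + gamma * V(s_{t+1}) - V(s_t).
    gamma and the one-step form are illustrative assumptions."""
    return [r + gamma * nv - v
            for r, v, nv in zip(rewards, values, next_values)]


def ppo_clip_loss(ratios, advantages, clip_eps=0.2):
    """Negative mean of the PPO clipped surrogate objective.

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i.
    Minimizing this loss by gradient descent maximizes the clipped objective.
    clip_eps is an illustrative assumption.
    """
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(ratios)
```

In a training loop, the rewards passed to td_advantages would be the fused exploration rewards, and the ratios would come from the strategy network before and after the update.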
Description
Intelligent unmanned aerial vehicle flight path planning method and system
Technical Field
The invention relates to the technical field of path planning, in particular to an intelligent unmanned aerial vehicle flight path planning method and system.
Background
Unmanned aerial vehicle flight path planning is a technique that, under given environmental constraints and task requirements, uses environment modeling, path search or optimization algorithms to generate a feasible flight path from a starting point to a target point that satisfies the performance indexes. General unmanned aerial vehicle flight path planning methods suffer from poor dynamic adaptability and an imbalance between safety and target guidance, leading to path failure or an excessively high collision risk. They also suffer from an imbalance between global coverage and local connectivity, a reward design disconnected from the flight data, and insufficient exploration of low-probability passable areas, which results in a poor path planning effect.
Disclosure of Invention
To address the problems that general unmanned aerial vehicle flight path planning methods have poor dynamic adaptability, an imbalance between safety and target guidance, and an excessively high risk of path failure or collision, this scheme generates dynamic utility labels and adaptively adjusts the weights according to the task stage, which resolves excessive target tracking in the early stage and excessive obstacle avoidance in the later stage. An environmental risk factor is designed to actively pre-judge high-risk areas and force the unmanned aerial vehicle away from them, while low-risk areas keep their normal weights so that flight efficiency is not affected. This realizes a smooth transition between the exploration stage and the target stage, actively avoids high-risk areas, and improves flight safety.
To address the problems that general methods have an imbalance between global coverage and local connectivity, a reward design disconnected from the flight data, and insufficient exploration of low-probability passable areas, resulting in a poor path planning effect, this scheme reflects the characteristics of explored areas through neighbor samples and rewards subsequent movement that deviates from explored areas toward unknown areas, which avoids repeated exploration and strengthens global coverage guidance. A utility attenuation measure reward is designed to reduce collision risk, and time-sequence adaptive reward fusion optimizes global coverage and target guidance, further improving the path planning effect.
The intelligent unmanned aerial vehicle flight path planning method provided by the invention comprises the following steps: S1, environment sensing; S2, multi-source feature fusion attention extraction; S3, path utility network training; S4, exploration reward design; S5, time-sequence adaptive reward fusion; and S6, unmanned aerial vehicle path planning strategy optimization.
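The neighbor-sample distances and the time-sequence fusion described above can be sketched in Python as follows. The source text does not reproduce the reward or weight formulas, so the logistic weight in coverage_weight, the convex combination in fuse_rewards, the Euclidean similarity measure, and all function and parameter names are assumptions for illustration only. The neighbor search, the feature and utility distances, and the Min-Max normalization follow the steps as described.

```python
import math


def neighbor_distances(feature, utility, history, k):
    """Return (feature distance, utility distance) between the current
    sample and the mean of its k nearest historical samples.

    history is a non-empty list of (feature_vector, path_utility) pairs.
    Similarity is measured by Euclidean distance (an assumption).
    """
    nearest = sorted(history, key=lambda s: math.dist(feature, s[0]))[:k]
    n = len(nearest)
    mean_feature = [sum(s[0][i] for s in nearest) / n
                    for i in range(len(feature))]
    mean_utility = sum(s[1] for s in nearest) / n
    return math.dist(feature, mean_feature), abs(utility - mean_utility)


def min_max(values):
    """Map a list of reward values onto [0, 1] by Min-Max normalization."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: all values equal
    return [(v - lo) / (hi - lo) for v in values]


def coverage_weight(t, midpoint, smoothness):
    """ASSUMED logistic weight: close to 1 early in the flight, 0.5 at the
    midpoint step, close to 0 late, with smoothness setting the slope."""
    return 1.0 / (1.0 + math.exp((t - midpoint) / smoothness))


def fuse_rewards(r_coverage, r_decay, t, midpoint, smoothness):
    """ASSUMED convex combination of the two normalized rewards."""
    w = coverage_weight(t, midpoint, smoothness)
    return w * r_coverage + (1.0 - w) * r_decay
```

Under these assumptions the coverage guidance reward dominates early steps and the utility attenuation reward dominates late steps, which matches the smooth exploration-to-target transition the scheme aims for.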
Further, in step S1, the environment sensing acquires multi-source heterogeneous environment data of the unmanned aerial vehicle flight through multi-source data acquisition, and converts the acquired data into a high-dimensional environment observation vector with unified dimension through point cloud processing, image processing, normalization and data fusion.
Further, in step S2, the multi-source feature fusion attention extraction specifically includes: constructing a feature extraction network of channel attention, spatial attention and fully connected dimension reduction; channel attention calculation, namely performing global average pooling on the high-dimensional environment observation vector, calculating channel weights through a fully connected layer and a Sigmoid function, and weighting the high-dimensional environment observation vector element-wise by the channel weights; spatial attention calculation, namely performing global average pooling and maximum pooling on the channel-weighted features, calculating spatial weights through a convolution layer and a Sigmoid function, and weighting the channel-weighted features element-wise by the spatial weights; and dimension reduction, namely inputting the weighted features into a fully connected layer and mapping them into the low-dimensional environment feature vector.
Further, in step S3, the path utility network training specifically includes: constructing a fully connected network whose input is the low-dimensional environment feature vector and whose output is the path utility; dynamic label generation, namely designing dynamic task weights to adapt to dynamic environmental changes and task stage switching during the flight of the unmanned aerial vehicle, and introducing an environment risk factor to correct a local barrier density i