CN-121806936-B - Multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping
Abstract
The invention discloses a multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping. A real parameter space and a virtual parameter space are constructed: key task elements such as target arrival, obstacle risk, formation holding error and minimum safety margin are uniformly organized into a multi-dimensional real parameter input, and the speed scheduling quantity is organized into the virtual parameter output. This gives a consistent parameter expression and a unified decision interface across scenes and risk levels, which improves engineering applicability and reusability in task-constrained scenes. Interaction response data are collected over a multi-scene set to build a trajectory sample set, and PPO reinforcement learning is used to train a network that maps the real parameters to the speed virtual parameter. An adaptive balance among objectives such as arrival efficiency, formation holding and speed smoothness is thereby formed without explicitly constructing complex manual rules or piecewise threshold control logic, which reduces the cost of manual parameter tuning and improves policy generalization.
Inventors
- WANG YIN
- LEI LEI
- KANG JIE
- SHEN GAOQING
Assignees
- Nanjing University of Aeronautics and Astronautics (南京航空航天大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-03-10
Claims (6)
- 1. A multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping, characterized by comprising the following steps: S1, establishing a task space in which an aircraft formation in a two-dimensional plane reaches a target area while avoiding obstacles, and unifying a target arrival criterion, an obstacle safety criterion, a formation holding and inter-aircraft spacing safety criterion, and a speed capability boundary criterion; S2, quantitatively characterizing the criteria unified in S1, uniformly organizing the target-related real parameters, the obstacle-related real parameters, the formation holding error and the minimum inter-aircraft spacing safety margin into a multi-dimensional real parameter vector, and taking the speed scheduling virtual parameter constructed according to the speed capability boundary criterion as the single virtual parameter output; wherein step S2 comprises: S21, constructing the target real parameters: based on the formation centroid position $p_c(t)$ and the target point $p_g$, calculating the target distance $d_g$, and taking the components $u_{g,x}$ and $u_{g,y}$ of the unit direction vector from the centroid to the target on the two axes of the two-dimensional coordinate system as direction information, so as to characterize the distance and the relative direction of the target simultaneously; S22, constructing the obstacle real parameters: for the obstacle set $\mathcal{O}$, calculating the net safety distance from the centroid to each obstacle, taking the smallest net safety distance as $d_o$, and taking the components of the unit direction vector pointing to this most dangerous obstacle as $u_{o,x}$ and $u_{o,y}$, so as to characterize the proximity and the relative direction of the obstacle; S23, constructing the formation holding error: for the preset edge set $\mathcal{E}$, calculating the distance $d_{ij}(t) = \lVert p_i(t) - p_j(t) \rVert_2$ across each edge and constructing a dimensionless formation holding error $e_f(t)$; wherein $N$ is the number of aircraft in the formation; $\mathcal{E}$ is the set of formation adjacency edges, specifying the aircraft pairs that must maintain a relative distance constraint; $(i,j) \in \mathcal{E}$ is one edge, indicating that the $i$-th and the $j$-th aircraft form a pair of adjacent constraints; $|\mathcal{E}|$ is the number of elements in the edge set; $t$ is the current time; $p_i(t)$ and $p_j(t)$ are the two-dimensional position vectors of the $i$-th and the $j$-th aircraft at time $t$; $\lVert\cdot\rVert_2$ denotes the 2-norm; $d_{ij}(t)$ is the actual distance between the $i$-th and the $j$-th aircraft at time $t$; $d_{ij}^{*}$ is the desired distance of the corresponding edge $(i,j)$; $d_0$ is the scale parameter used when the reference distances are uniformly configured; $e_f(t)$ is the formation holding error; S24, constructing the minimum safety margin: computing the distance between every pair of aircraft in the formation, taking the minimum as the minimum inter-aircraft distance, and comparing it in normalized form with the threshold $d_{\text{safe}}$ to obtain the minimum safety margin $m_s$; according to the value of $m_s$, either the minimum safe spacing constraint is judged to be satisfied, or a contact or limit-violation risk is judged to exist; S25, forming the real parameter vector and defining the virtual parameter: organizing the elements of S2 into the real parameter vector $x_t = [d_g,\ u_{g,x},\ u_{g,y},\ d_o,\ u_{o,x},\ u_{o,y},\ e_f,\ m_s]^{\top}$; wherein $d_g$ is the distance between the formation centroid and the target point; $u_{g,x}$ and $u_{g,y}$ are the components, on the two axes of the planar coordinate system, of the unit direction vector pointing from the formation centroid to the target point; $d_o$ is the net safety distance between the formation centroid and the nearest obstacle; $u_{o,x}$ and $u_{o,y}$ are the components, on the two axes of the planar coordinate system, of the unit direction vector pointing from the formation centroid to the nearest obstacle; $e_f$ is the formation holding error; $m_s$ is the minimum safety margin; the virtual parameter space defines only the speed virtual parameter $v$, used for speed scheduling and multi-objective constraint collaborative optimization, while pose information is used for calculating $x_t$ but does not directly enter the output definition of the virtual-real mapping, thereby avoiding the dimensional expansion and weakened interpretability caused by mapping pose directly to control quantities; S3, generating a scene set covering typical near-target and far-target situations, typical sparse-obstacle and dense-obstacle situations, and typical static-obstacle and dynamic-obstacle situations; S4, training, with the PPO algorithm, a mapping network that takes the real parameter vector constructed in S2 as input and the speed virtual parameter as output, so that the policy jointly optimizes arrival efficiency and formation error while satisfying the safety constraints, and suppresses the unsmooth execution caused by abrupt speed changes; S5, using the trained mapping network for online rolling execution: outputting a speed virtual parameter in each control period according to the real parameter vector, and mapping the network output, through a boundary constraint based on the Sigmoid function, into a final speed command that satisfies the speed bounds.
- 2. The multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping according to claim 1, wherein S1 comprises: S11, determining the target elements and the arrival criterion: the target is given as a target point and a target area radius, the target point position being denoted $p_g$ and the target area radius $r_g$; S12, determining the obstacle safety criterion, including the obstacle elements and the safety expansion distance: the obstacle set is denoted $\mathcal{O} = \{(c_k, r_k)\}_{k=1}^{M}$, where $k$ is the obstacle index, $M$ is the number of obstacles, $c_k$ is the center position of the obstacle and $r_k$ is its equivalent radius; to unify the safety distance standard, an obstacle safety expansion distance $\delta_o$ is introduced, where $\delta_o$ is a preset constant determined by the geometric envelope and the positioning error of the aircraft, in terms of the aircraft equivalent radius $r_a$ and the upper bound $\epsilon_p$ of the positioning error; obstacle discrimination is executed according to the expanded equivalent radius; S13, determining the formation holding and inter-aircraft spacing safety criterion, namely the formation shape and the safe spacing constraint threshold: formation holding does not explicitly prescribe a specific geometric shape but uses edge-set constraints to describe a preset set of adjacency relations $\mathcal{E}$ of aircraft index pairs $(i,j)$, where $N$ is the number of aircraft in the formation and $i$ and $j$ are aircraft indices; for each edge $(i,j) \in \mathcal{E}$ a reference distance $d_{ij}^{*}$ is set for calculating the formation holding error, and a minimum safe spacing threshold $d_{\text{safe}}$ is set for inter-aircraft spacing safety discrimination, where $d_{\text{safe}}$ is a preset constant determined by the geometric envelope and the positioning error of the aircraft, in terms of the aircraft equivalent radius $r_a$ and the upper bound $\epsilon_p$ of the positioning error; S14, determining the speed capability boundary criterion, namely the speed capability bounds and the control period: the value range of the speed virtual parameter is set to $[v_{\min}, v_{\max}]$, where $v_{\min}$ is the lower bound and $v_{\max}$ the upper bound of the speed virtual parameter; a maximum acceleration $a_{\max}$ and a maximum deceleration $a_{\text{dec}}$ are set for candidate speed generation, feasible domain screening and online rollback; and the control period is $\Delta t$, used as the unified time base for sample collection and policy execution.
- 3. The multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping according to claim 2, wherein S1 further comprises: S15, establishing the specification for acquired state quantities and calculated intermediate quantities: in each control period, acquiring the position $p_i(t)$ and the speed $v_i(t)$ of every aircraft, where $i$ is the aircraft index and $t$ is the current time, and calculating the formation centroid position: $p_c(t) = \frac{1}{N}\sum_{i=1}^{N} p_i(t)$; wherein $p_i(t)$ is the current position of the $i$-th aircraft and $N$ is the number of aircraft in the formation; the centroid position is used to calculate the real parameters of target distance, obstacle net safety distance and direction components.
- 4. The multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping according to claim 1, wherein S3 comprises: S31, generating the scene set: the target point $p_g$ is randomly sampled within the task area, the number of obstacles $M$ is sampled within a preset range, and the obstacle radii $r_k$ and positions $c_k$ are likewise sampled; in dynamic obstacle scenes, some obstacles are given uniform motion or preset trajectories so that encounter or crossing risks arise near the formation trajectory, thereby covering typical dynamic obstacle avoidance situations; S32, sampling the formation initial conditions: the initial positions are randomly perturbed around an initial center, and the initial speed is sampled within a specified interval; S33, trajectory acquisition and termination criteria: the system is advanced with the control period $\Delta t$; at each step the real parameter vector is calculated, the speed virtual parameter is output and the state is advanced, and the real parameter vector $x_t$ at the current time, the speed virtual parameter $v_t$, the real parameter vector $x_{t+1}$ at the next time, and the reward and constraint states are recorded; the termination conditions comprise reaching the target area, violating the obstacle contact limit, or violating the inter-aircraft spacing limit; S34, sample cleaning and scale unification: trajectories that diverge numerically or are clearly infeasible are removed, and each component of the real parameters is normalized or truncated to a unified range.
- 5. The multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping according to claim 1, wherein S4 comprises: S41, defining the input and output of the policy network: the policy network takes the real parameter vector $x_t$ as input and outputs the speed suggestion $v_t$, a boundary constraint ensuring that the output lies within $[v_{\min}, v_{\max}]$; the value network estimates the state value from the same input and is used for calculating the advantage function; S42, adopting the PPO (Proximal Policy Optimization) update scheme: the PPO clipped objective limits the change in the probability ratio between the new and the old policy, avoiding the training instability caused by excessively large policy updates; multiple epochs of mini-batch updates improve sample efficiency; the success rate and the average reward are monitored on validation scenes, and the model parameters are exported and saved for deployment once the performance requirements are met; S43, designing the reward function in blocks: the reward is divided into an arrival progress block, a time cost block, a formation holding block, a safety constraint block, a speed smoothness block and a terminal block, which are defined separately and then combined into the total reward $r_t$; wherein $r_t^{\text{prog}}$ denotes the arrival progress reward, $r_t^{\text{time}}$ the time cost reward, $r_t^{\text{form}}$ the formation holding reward, $r_t^{\text{smooth}}$ the speed smoothness reward, $r_t^{\text{safe}}$ the safety penalty term, and $r_t^{\text{term}}$ the terminal reward.
- 6. The multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping according to claim 1, wherein S5 comprises: S51, in each control period, acquiring the target-related parameters, the obstacle-related parameters, the formation holding error and the minimum safety margin as real parameters, and assembling them into a real parameter vector with unified scaling and a fixed order to serve as the policy network input; S52, feeding the real parameter vector into the policy network obtained after training has converged to obtain the network speed output, and mapping the network output, through a boundary constraint based on the Sigmoid function, into a final speed command that satisfies the speed bounds: $v_k = v_{\min} + (v_{\max} - v_{\min})\,\sigma\!\left(f_{\theta}(x_k, \tilde{v}_k)\right)$; wherein $k$ is the discrete control step index, $v_k$ is the speed command at step $k$, $v_{\min}$ is the lower speed bound, $v_{\max}$ is the upper speed bound, $\sigma(\cdot)$ is the Sigmoid function, $f_{\theta}(\cdot)$ is the forward mapping function of the policy network, $\theta$ denotes the parameters of the policy network, $x_k$ is the real parameter vector at step $k$, and $\tilde{v}_k$ is an auxiliary speed observation that characterizes the speed information of the current or the previous step.
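The assembly of the real parameter vector (S21 to S25) and the Sigmoid-based speed bounding (S52) described in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation. All function and argument names are hypothetical, and three formulas are assumptions because the extracted text does not preserve them: the net safety distance is taken as center distance minus obstacle radius minus expansion distance, the formation holding error as the mean absolute edge-length deviation divided by the scale parameter, and the safety margin as the minimum spacing normalized against the threshold. The trained policy network is not included; `raw_output` stands in for its output.

```python
import math
from itertools import combinations


def unit(dx, dy):
    """Return (norm, ux, uy); a zero vector maps to a zero direction."""
    n = math.hypot(dx, dy)
    return (n, dx / n, dy / n) if n > 0.0 else (0.0, 0.0, 0.0)


def real_parameter_vector(positions, target, obstacles, edges,
                          d_ref, d0, d_safe, delta_o):
    """Assemble an 8-dimensional real parameter vector (hypothetical helper).

    positions: list of (x, y) aircraft positions
    target:    (x, y) target point
    obstacles: non-empty list of (cx, cy, radius)
    edges:     list of (i, j) aircraft pairs with a distance constraint
    d_ref:     dict mapping (i, j) to the desired distance of that edge
    """
    n = len(positions)
    cx = sum(p[0] for p in positions) / n  # formation centroid
    cy = sum(p[1] for p in positions) / n

    # Target distance and direction components from the centroid.
    d_g, ugx, ugy = unit(target[0] - cx, target[1] - cy)

    # Most dangerous obstacle: smallest net safety distance.
    # ASSUMPTION: net distance = center distance - radius - expansion.
    candidates = []
    for ox, oy, r in obstacles:
        dist, ux, uy = unit(ox - cx, oy - cy)
        candidates.append((dist - r - delta_o, ux, uy))
    d_o, uox, uoy = min(candidates, key=lambda c: c[0])

    # Formation holding error.
    # ASSUMPTION: mean absolute edge-length deviation, scaled by d0.
    e_f = sum(abs(math.dist(positions[i], positions[j]) - d_ref[(i, j)])
              for i, j in edges) / (len(edges) * d0)

    # Minimum safety margin.
    # ASSUMPTION: (minimum spacing - threshold) / threshold.
    d_min = min(math.dist(p, q) for p, q in combinations(positions, 2))
    m_s = (d_min - d_safe) / d_safe

    return [d_g, ugx, ugy, d_o, uox, uoy, e_f, m_s]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def speed_command(raw_output, v_min, v_max):
    """Squash a raw network output into the interval [v_min, v_max]."""
    return v_min + (v_max - v_min) * sigmoid(raw_output)


# Example with three aircraft, one obstacle and one constrained edge.
x = real_parameter_vector(
    positions=[(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
    target=(1.0, 11.0),
    obstacles=[(5.0, 1.0, 1.0)],
    edges=[(0, 1)],
    d_ref={(0, 1): 2.5},
    d0=2.0, d_safe=1.0, delta_o=0.5,
)
v_cmd = speed_command(0.0, v_min=2.0, v_max=6.0)
```

Because the Sigmoid output lies between 0 and 1, the command stays within the speed bounds for any network output, so no separate clipping step is needed in this sketch. The acceleration and deceleration limits of S14 are not modelled here.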
Description
Multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping
Technical Field
The invention belongs to the technical field of aerospace, and particularly relates to a multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping.
Background
In the prior art, target arrival and obstacle avoidance are commonly achieved by waypoint-based trajectory planning and tracking control, by local obstacle avoidance methods based on artificial potential fields or velocity obstacles, and by constrained control methods based on model predictive control or online optimization. For formation tasks, common schemes include leader-follower control, consensus control and virtual structure methods, in which formation holding is achieved by regulating relative position errors or inter-aircraft distances. In practical applications, the multiple objectives and constraints are often strongly coupled. Improving arrival efficiency usually requires raising the speed or shortening the detour distance, but this may reduce obstacle avoidance safety, increase formation error, leave insufficient minimum spacing or cause frequent control saturation; excessive conservatism, in turn, increases task time and energy consumption and may even cause the target to be missed. In particular, with dense obstacles, narrow passages or wind disturbance, conventional methods struggle to guarantee performance and safety constraints simultaneously. In recent years, deep learning and reinforcement learning have been applied to aircraft decision making and parameter tuning, with the aim of improving adaptability to the environment in a data-driven manner.
However, in multi-objective, multi-constraint scenes, the weight design of the reward and cost functions often relies on experience, and training tends to be unstable, to converge slowly, or to gain performance at the expense of safety. Moreover, a policy trained in simulation still suffers from insufficient generalization caused by the gap between the virtual and the real environment. Few existing methods take multi-dimensional real parameters and virtual parameters as their core, that is, take key real parameters such as targets, obstacles, formation errors and safety margins as structured input, output virtual parameters that can be used directly for scheduling by the control and planning modules, and at the same time accomplish the collaborative optimization of multi-objective performance and constraints, together with feasibility guarantees, within one framework.
Disclosure of Invention
The invention aims to provide a multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping, so as to solve the above problems.
To achieve this purpose, the technical scheme provided by the invention is a multi-objective performance constraint collaborative optimization method for aircraft based on multi-dimensional virtual-real parameter mapping, comprising the following steps: S1, establishing a task space in which an aircraft formation in a two-dimensional plane reaches a target area while avoiding obstacles, and unifying a target arrival criterion, an obstacle safety criterion, a formation holding and inter-aircraft spacing safety criterion, and a speed capability boundary criterion; S2, quantitatively characterizing the criteria unified in S1, uniformly organizing the target-related real parameters, the obstacle-related real parameters, the formation holding error and the minimum inter-aircraft spacing safety margin into a multi-dimensional real parameter vector, and taking the speed scheduling virtual parameter constructed according to the speed capability boundary criterion as the single virtual parameter output; S3, generating a scene set covering typical near-target and far-target situations, typical sparse-obstacle and dense-obstacle situations, and typical static-obstacle and dynamic-obstacle situations; S4, training, with the PPO algorithm, a mapping network that takes the real parameter vector constructed in S2 as input and the speed virtual parameter as output, so that the policy jointly optimizes arrival efficiency and formation error while satisfying the safety constraints, and suppresses the unsmooth execution caused by abrupt speed changes; S5, using the trained mapping network for online rolling execution, outputting a speed virtual parameter in each control period according to the real parameter vector, and mapping the network output into a final speed command meeting a