
CN-116400729-B - Method for avoiding missiles by split airplane based on deep reinforcement learning and split airplane


Abstract

The invention discloses a method for a split aircraft to evade missiles based on deep reinforcement learning, and the split aircraft itself. The method comprises the following steps. S1: for control of the split aircraft, a decision network and an evaluation network are constructed based on the DDPG algorithm and trained by reinforcement learning. The split aircraft comprises a front sub-aircraft, a rear sub-aircraft, and a device for separating the two; the front sub-aircraft has a nose landing gear, a standby main landing gear, and a standby engine, and the rear sub-aircraft has a main landing gear and a main engine. S2: during flight, the x-axis distance, y-axis distance, and direction angle actually observed by the aircraft radar are input to the decision network as the state, and the decision network controls the aircraft's steering and the separation of the rear sub-aircraft from the front sub-aircraft. The invention steers the split aircraft so that its heading is aligned with that of the missile and separates the rear sub-aircraft from the front sub-aircraft before the missile hits the aircraft, thereby evading the missile by splitting.

Inventors

  • TAN JUNBO
  • WANG XUEQIAN
  • YANG ZHICHENG
  • LIANG BIN

Assignees

  • Tsinghua Shenzhen International Graduate School

Dates

Publication Date
20260505
Application Date
20230410

Claims (9)

  1. A method for a split aircraft to evade missiles based on deep reinforcement learning, characterized by comprising the following steps: S1, for control of the split aircraft, constructing a decision network and an evaluation network based on the DDPG algorithm and performing reinforcement learning training; the split aircraft comprises a front sub-aircraft, a rear sub-aircraft, and a device for separating the front and rear sub-aircraft, wherein the front sub-aircraft is provided with a nose landing gear, a standby main landing gear, and a standby engine, and the rear sub-aircraft is provided with a main landing gear and a main engine; during training, the observed x-axis distance, y-axis distance, and direction angle between the missile and the aircraft in the xy plane of the aircraft body coordinate system are input to the decision network as the state; the decision network outputs an action to the evaluation network; the evaluation network scores the reward and updates its network parameters according to the state and the action; the decision network performs gradient ascent according to the score to obtain an updated policy, and the aircraft's steering is controlled according to the updated policy so that the heading of the aircraft is aligned with that of the incoming missile, whereby the missile can only hit the rear sub-aircraft from behind and is prevented from hitting the front sub-aircraft at a large angle to the aircraft's velocity, the rear sub-aircraft is separated from the front sub-aircraft before the missile hits the aircraft, and the front sub-aircraft can return to stable flight as soon as possible; S2, inputting the x-axis distance, y-axis distance, and direction angle actually observed by the aircraft radar to the decision network as the state, the decision network controlling the aircraft's steering and separating the rear sub-aircraft from the front sub-aircraft; when the rear sub-aircraft separates from the front sub-aircraft, activating the standby engine of the front sub-aircraft and deploying its standby main landing gear, which was originally retracted, so that the front sub-aircraft completes its landing using the nose landing gear and the standby main landing gear.
  2. The method of claim 1, wherein in step S1, the step period for controlling the steering of the aircraft is 0.1 s.
  3. The method of claim 1, wherein in step S1, the steering angle of the aircraft is limited to between -1 degree and 1 degree at each step.
  4. The method of claim 1, wherein in step S1, controlling the direction of the aircraft to be consistent with the direction of the missile means controlling the direction angle between the aircraft and the missile to be no more than 1 degree.
  5. The method of claim 1, wherein in step S1, the reward is increased when the angle between the aircraft's direction and the missile's velocity direction is less than 1 degree, and is otherwise unchanged.
  6. The method of any one of claims 1 to 5, wherein the evaluation network is trained by the temporal-difference (TD) method with the TD error back-propagated, the decision network is trained by gradient ascent with maximizing the value function as its objective, and the network parameters are fixed once training is complete.
  7. A split aircraft comprising a front sub-aircraft having a nose landing gear, a standby main landing gear, and a standby engine; a rear sub-aircraft having a main landing gear and a main engine; a device for separating the front and rear sub-aircraft; and a control module configured to perform the method for a split aircraft to evade missiles based on deep reinforcement learning of any one of claims 1 to 6.
  8. The split aircraft of claim 7, wherein the front sub-aircraft and the rear sub-aircraft are each provided with twin vertical tails.
  9. The split aircraft of claim 7 or claim 8, wherein the device for separating the front and rear sub-aircraft is a thrust device.
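
The quantities recited in claims 1 to 6 can be illustrated with a minimal Python sketch. It is not the patented implementation: all function names, the world-frame position inputs, the heading convention, the reward bonus of 1.0, and the discount factor are assumptions introduced here for illustration. Only the 0.1 s step period, the ±1 degree steering limit, and the 1 degree alignment threshold come from the claims.

```python
import math

STEP_PERIOD_S = 0.1   # claim 2: control step period
MAX_TURN_DEG = 1.0    # claim 3: steering change per step limited to [-1, 1] degrees
ALIGN_TOL_DEG = 1.0   # claims 4 and 5: alignment threshold


def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def observe(aircraft_xy, aircraft_heading_deg, missile_xy, missile_heading_deg):
    """Build the three-element state: missile offset along the body x and y
    axes, plus the heading difference (assumed convention: heading measured
    counterclockwise from the world x axis)."""
    dx = missile_xy[0] - aircraft_xy[0]
    dy = missile_xy[1] - aircraft_xy[1]
    h = math.radians(aircraft_heading_deg)
    # Rotate the world-frame offset into the aircraft body frame.
    x_body = dx * math.cos(h) + dy * math.sin(h)
    y_body = -dx * math.sin(h) + dy * math.cos(h)
    angle = wrap_deg(missile_heading_deg - aircraft_heading_deg)
    return (x_body, y_body, angle)


def apply_steering(aircraft_heading_deg, action_deg):
    """Clip the commanded turn to the per-step limit and update the heading."""
    clipped = max(-MAX_TURN_DEG, min(MAX_TURN_DEG, action_deg))
    return wrap_deg(aircraft_heading_deg + clipped)


def step_reward(angle_deg, bonus=1.0):
    """Reward increment: a bonus (assumed value) when the headings are
    aligned to within the threshold, otherwise no change."""
    return bonus if abs(angle_deg) < ALIGN_TOL_DEG else 0.0


def td_error(reward, q_value, next_q_value, gamma=0.99):
    """One-step TD error used to train the evaluation network
    (gamma is an assumed discount factor)."""
    return reward + gamma * next_q_value - q_value
```

In a full DDPG loop, `observe` would supply the input to the decision network, `apply_steering` would be called once every `STEP_PERIOD_S` seconds with the network's output, and `td_error` would drive the evaluation-network update; the networks themselves are omitted here.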

Description

Method for avoiding missiles by split airplane based on deep reinforcement learning and split airplane

Technical Field

The invention relates to the technical field of aircraft, and in particular to a method for a split aircraft to evade missiles based on deep reinforcement learning, and to the split aircraft.

Background

A variant (morphing) aircraft is a new-concept aircraft that senses the external environment in real time and autonomously changes its airframe configuration according to the flight mission, flight state, flight environment, and similar information, so as to fly with optimal performance across different missions. Compared with a fixed-shape aircraft, a variant aircraft can achieve longer range, stronger environmental adaptability, and other advantages through deformation, which has made it one of the research hotspots in the aerospace field. The Mission Adaptive Wing (MAW) project was carried out jointly by the National Aeronautics and Space Administration (NASA) and the United States Air Force from the late 1970s to the early 1980s. NASA subsequently launched the Aircraft Morphing project, which focused on smart devices applied to the airframe. In 2003, the Defense Advanced Research Projects Agency (DARPA) of the United States Department of Defense launched the Morphing Aircraft Structures (MAS) project, which aimed to study larger-scale deformation methods than before and to enable variant aircraft to perform multiple missions. The design concept proposed in this project extends the leading- and trailing-edge control surfaces into a fully adaptive wing, so that the aircraft achieves flight control through wing deformation, as birds do.
Variant aircraft were specifically envisaged in the "21st Century Aviation Development Prospect" produced by the National Aeronautics and Space Administration (NASA) and are expected to become a reality around 2030. Since the start of the 21st century, China has carried out significant research on concepts related to variant aircraft. Some researchers have studied in depth the coordinated control of deformation and flight of variant aircraft. Others have studied the unsteady aerodynamic characteristics of the airfoil during continuous deformation. Further work has taken into account time delay and packet loss in sampled communication, studied distributed cooperative control strategies, and built a simulation platform for a distributed-actuation intelligent morphing wing. The Shenyang Aircraft Design Institute proposed a novel morphing unmanned aerial vehicle and used Q-learning (a reinforcement learning algorithm) to implement its reinforcement learning control module. Xi'an Jiaotong University studied a novel bionic morphing unmanned aerial vehicle modeled on the ability of birds to retract their wings in high-speed flight and spread them in low-speed flight. Compared with a traditional fixed-structure aircraft, a variant aircraft can change its shape and thereby improve its aerodynamic characteristics, but this inevitably brings new technical problems. The current state of research on variant aircraft is reviewed below, mainly in terms of deformation structure design and flight control system design.

A. Aircraft deformation structure design

At present, scholars and industry practitioners in China and abroad have proposed a variety of structural design schemes for variant aircraft.
Among these, wing deformation mechanisms have attracted the most attention; they work by changing the shape of the wing to change the aerodynamic lift of the aircraft. By scale, wing deformation can be divided into three types: small-scale, medium-scale, and large-scale. Small-scale deformation refers to local changes of the wing, such as flow-disturbing features and bulges; medium-scale deformation refers to changes of the airfoil, such as variable thickness, variable camber, and twist; and large-scale deformation refers to changes of the whole wing, such as folding, variable sweep, and span extension. For example, the University of Florida in the United States developed a deformable micro air vehicle that imitates gull flight and has three modes: neutral, forward-biased, and reverse-biased.

B. Aircraft control technique

The control principle of a variant aircraft is basically the same as that of a traditional aircraft, but the changes in aerodynamic characteristics, center-of-gravity position, moment of inertia, and so on caused by deformation alter the parameters of the aircraft's conventional six-degree-of-freedom, twelve-state system of equations, which brings a