CN-122015580-A - Guidance interception system and method based on hybrid acoustic-visual fusion and a neural network
Abstract
The invention discloses a guidance interception system and method based on hybrid acoustic-visual fusion and a neural network. The system comprises: a hybrid sensing unit composed of at least one camera and at least one two-position acoustic sensor array; a self-noise suppression unit that receives acoustic signal sequences and outputs noise-reduced acoustic signals; a target state estimation unit that fuses visual and acoustic data to obtain a state vector, and, after vision loss, updates the state vector from acoustic data alone and outputs predicted angles that drive a gimbal redirection unit to rotate the visible-light camera's gimbal to the predicted target position; and a guidance unit that completes tracking and interception using an Actor-Critic network whose inputs combine the state vector, the line-of-sight rate, the interceptor's own state, and a tracking-mode flag. The method solves the failure of direct tracking interception caused by loss of a dynamic target and preserves the unmanned aerial vehicle's continuous tracking capability in complex adversarial environments.
Inventors
- YANG LICHUAN
- NIE ZHI
- TANG SHIYAO
- WANG CHENXI
Assignees
- 四川腾盾科技有限公司
- 四川腾盾良远智能科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-13
Claims (9)
- 1. A guidance interception system based on hybrid acoustic-visual fusion and a neural network, characterized by comprising a hybrid sensing unit, a self-noise suppression unit, a target state estimation unit, a gimbal redirection unit and a guidance unit; the hybrid sensing unit comprises at least one camera and at least one two-position acoustic sensor array, wherein the two-position acoustic sensor array comprises a spatially separated first microphone subarray and second microphone subarray, and the camera and the acoustic sensor array synchronously acquire a visual image sequence and an acoustic signal sequence of a target, respectively; the self-noise suppression unit receives the acoustic signal sequence, generates a time-frequency mask in real time, multiplies the mask with the acoustic signal sequence point by point, and outputs a noise-reduced acoustic signal; the target state estimation unit receives the visual image sequence and the noise-reduced acoustic signal, computes from them a visual azimuth angle and an acoustic elevation angle respectively, and fuses the two to obtain the target's position and velocity state vector; the gimbal redirection unit is communicatively connected with the target state estimation unit and, upon receiving a vision-loss signal, drives the gimbal of the visible-light camera to rotate to the predicted target position according to the predicted azimuth and pitch angles; the guidance unit stores an Actor-Critic network model based on deep reinforcement learning, takes as its state-space input the state vector, the line-of-sight angular rate, the rotor unmanned aerial vehicle's own motion state, and a tracking-mode flag indicating the current tracking mode, and outputs low-level flight control commands for the rotor unmanned aerial vehicle.
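The point-by-point mask multiplication in claim 1 can be sketched as follows. In the patent the time-frequency mask is predicted by a learned network (a U-Net autoencoder, per claim 7); as a stand-in, this sketch derives a Wiener-style mask from an assumed pre-measured rotor-noise power profile. The function name, the 512-sample frame size, and the `alpha`/`floor` parameters are illustrative assumptions, not values from the patent.

```python
import numpy as np
from scipy.signal import stft, istft

def suppress_ego_noise(x, fs, noise_psd, alpha=2.0, floor=0.05):
    """Mask-based self-noise suppression sketch: STFT, point-by-point
    mask multiplication, inverse STFT (per claim 1)."""
    # Complex spectrogram of the noisy microphone signal
    f, t, X = stft(x, fs, nperseg=512)
    # Placeholder mask: Wiener-style ratio against an assumed per-frequency
    # rotor-noise power profile (the patent predicts this mask with a U-Net).
    mask = np.maximum(1.0 - alpha * noise_psd[:, None] / (np.abs(X) ** 2 + 1e-12),
                      floor)
    # Point-by-point multiplication of mask and spectrogram, then resynthesis
    _, y = istft(mask * X, fs, nperseg=512)
    return y
```

With a zero noise profile the mask is all ones and the signal passes through unchanged, which is a convenient sanity check on the STFT round trip.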
- 2. The guidance interception system of claim 1, wherein the first microphone subarray comprises two microphones separated by a first baseline length and the second microphone subarray comprises two microphones separated by a second baseline length; the first baseline length is greater than the second baseline length.
- 3. The guidance interception system of claim 2, wherein the first baseline length is 300 mm and the second baseline length is 150 mm; both subarrays are disposed more than 400 mm from the rotor plane.
- 4. A guidance interception method based on hybrid acoustic-visual fusion and a neural network, characterized in that, based on the guidance interception system of any one of claims 1-3, the method comprises: S1, acquiring visual image data, and searching for and locking onto a target unmanned aerial vehicle; S2, after the target is locked, entering a normal tracking mode and executing step S3 while acquiring the visual tracking confidence in real time, comparing it against a set threshold to determine the current tracking state, and remaining in the normal tracking mode if target vision is not lost; S3, synchronously acquiring a visual image sequence and an acoustic signal sequence of the target, analyzing them to obtain a visual azimuth angle and acoustic azimuth and pitch angles respectively, and fusing them through an extended Kalman filter to obtain the target's position and velocity state vector; S4, upon vision loss, acquiring the acoustic signal sequence, analyzing it to obtain the acoustic azimuth and pitch angles, updating the target's position and velocity state vector, executing step S5, driving the gimbal to rotate based on the current azimuth and pitch angles to perform acoustically driven visual recapture, returning to step S2 and the normal tracking mode if recapture succeeds, and repeating step S4 to drive the gimbal if it does not; S5, outputting low-level flight control commands from an Actor-Critic network based on deep reinforcement learning, taking as inputs the state vector, the line-of-sight angular rate, the rotor unmanned aerial vehicle's own motion state, and the tracking-mode flag.
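Steps S2 and S4 of claim 4 define a two-mode switch driven by the visual tracking confidence. A minimal sketch of that logic, with an assumed threshold value `tau` and hypothetical mode labels (the patent does not name them):

```python
def update_tracking_mode(mode, vis_confidence, reacquired, tau=0.5):
    """Sketch of the S2/S4 mode switch. `tau` is an assumed threshold.
    "NORMAL": vision and acoustics are fused (S3).
    "ACOUSTIC": acoustic bearings alone drive the gimbal until vision
    recaptures the target (S4)."""
    if mode == "NORMAL":
        # S2: stay in normal tracking while visual confidence holds
        return "NORMAL" if vis_confidence >= tau else "ACOUSTIC"
    # S4 loop: return to normal tracking once recapture succeeds,
    # otherwise keep slewing the gimbal from acoustic bearings
    return "NORMAL" if reacquired else "ACOUSTIC"
```

Either mode feeds step S5, since the tracking-mode flag is itself part of the guidance network's state input.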
- 5. The guidance interception method according to claim 4, wherein in step S3 the target is identified and located in the visual image sequence by a target detection algorithm, which outputs the target's pixel coordinates; the pixel coordinates are normalized, and coordinate transformations from the camera coordinate system through the gimbal and body coordinate systems to the navigation coordinate system are applied in sequence, from which the target's azimuth and pitch angles are finally computed.
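The pixel-to-angle chain of claim 5 can be sketched with a pinhole model and a chain of rotation matrices. The frame conventions below (camera x right, y down, z forward; navigation frame NED) and the function signature are assumptions for illustration; the patent does not specify them.

```python
import numpy as np

def pixel_to_angles(u, v, fx, fy, cx, cy, R_gimbal, R_body, R_nav):
    """Claim 5 sketch: pixel coordinates -> normalized camera ray ->
    camera/gimbal/body/navigation transforms -> azimuth and pitch."""
    # Normalized ray from pixel coordinates (pinhole model; for a wide-angle
    # lens, distortion correction would be applied before this step, claim 6)
    ray_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    # Sequential transforms: camera -> gimbal -> body -> navigation frame
    ray_nav = R_nav @ R_body @ R_gimbal @ ray_cam
    ray_nav /= np.linalg.norm(ray_nav)
    # Assumed NED convention: x north, y east, z down
    azimuth = np.arctan2(ray_nav[1], ray_nav[0])
    pitch = np.arctan2(-ray_nav[2], np.hypot(ray_nav[0], ray_nav[1]))
    return azimuth, pitch
```

With the camera boresight aligned north, the principal point maps to zero azimuth and zero pitch, and a pixel offset to the right yields a positive azimuth.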
- 6. The guidance interception method according to claim 5, wherein, if a wide-angle lens is used, distortion correction is performed before the pixel coordinates are normalized.
- 7. The guidance interception method according to claim 4, wherein, before the acquired acoustic signal sequence is analyzed in steps S3 and S4, noise suppression is performed by a trained deep convolutional autoencoder based on a U-Net architecture.
- 8. The guidance interception method according to claim 7, wherein, for the noise-suppressed acoustic signal, the time-delay difference between each microphone pair is computed by the GCC-PHAT algorithm; the direction of arrival of the sound source is then computed from the delay differences and the geometric layout of the microphone array, and the acoustic azimuth and pitch angles are output after numerical robustness processing.
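The delay estimation in claim 8 follows the standard GCC-PHAT formulation; a minimal sketch for one microphone pair, plus a far-field delay-to-angle conversion with the clipping that "numerical robustness processing" plausibly refers to. The speed of sound and function signatures are illustrative assumptions.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """GCC-PHAT time-delay estimate between one microphone pair (claim 8)."""
    n = len(sig) + len(ref)
    S = np.fft.rfft(sig, n=n)
    R = np.fft.rfft(ref, n=n)
    cross = S * np.conj(R)
    # PHAT weighting keeps phase only, for robustness to broadband rotor noise
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n=n)
    max_shift = n // 2 if max_tau is None else min(int(max_tau * fs), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

def delay_to_doa(tau, baseline, c=343.0):
    """Far-field direction of arrival from delay and baseline geometry;
    the clip guards arcsin against out-of-range values (numerical robustness)."""
    return np.arcsin(np.clip(c * tau / baseline, -1.0, 1.0))
```

A longer baseline (claim 3's 300 mm subarray) gives finer angular resolution per delay sample, while the shorter 150 mm baseline disambiguates the estimate, which is one plausible reading of why the patent uses two baselines.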
- 9. The guidance interception method according to claim 4, wherein in step S5 the flight control command is generated as follows: S501, combining the target's position and velocity state vector, the line-of-sight angular rate, the rotor unmanned aerial vehicle's own motion state and the tracking-mode flag into an input state vector and normalizing it; S502, feeding the normalized state vector into the pre-trained Actor network, which outputs a raw action vector; S503, applying safety clipping and smoothing filtering to the raw action vector, then mapping it to a low-level flight control command.
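Steps S501 and S503 of claim 9 can be sketched independently of the Actor network itself. The min-max normalization scheme, the first-order smoothing filter, and the `limits`/`beta` parameters are illustrative assumptions; the patent specifies only that normalization, safety clipping and smoothing occur.

```python
import numpy as np

def normalize_state(state, lo, hi):
    """S501 sketch: scale each state component into [-1, 1] before the
    Actor network, using assumed per-component bounds lo/hi."""
    return 2.0 * (np.asarray(state) - lo) / (hi - lo) - 1.0

def action_to_command(raw_action, prev_cmd, limits, beta=0.5):
    """S503 sketch: safety clipping, then first-order smoothing against
    the previous command, before mapping to a low-level flight command."""
    clipped = np.clip(raw_action, -limits, limits)       # safety clipping
    return beta * prev_cmd + (1.0 - beta) * clipped      # smoothing filter
```

The smoothing coefficient trades responsiveness against actuator stress: a large `beta` damps jitter from the stochastic policy output at the cost of command lag.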
Description
Guidance interception system and method based on hybrid acoustic-visual fusion and a neural network

Technical Field
The invention relates to the field of counter-unmanned-aerial-vehicle technology, and in particular to a guidance interception system and method combining acoustic-visual fusion with a neural network.

Background
Existing anti-unmanned aerial vehicle technology has the following limitations, particularly against small, high-speed, highly maneuverable fixed-wing unmanned aerial vehicles:
1. Traditional interception systems rely primarily on visible-light or radar sensors. When a target (e.g., a fast-flying fixed-wing drone) briefly leaves the field of view (FOV) of the main sensor (e.g., a camera) due to rapid maneuvers or environmental occlusion, the system easily loses the target, causing a tracking disruption and failure of the interception task. Existing systems lack an effective mechanism for quickly reacquiring the target from auxiliary information after the main sensor fails, so tracking stability is insufficient.
2. Traditional guidance laws, such as proportional navigation (PN) and its variants, while effective in many scenarios, are generally designed from simplified kinematic models and do not exploit the distinctive nonlinear flight dynamics of the intercepting platform (especially rotorcraft). In the terminal interception stage, a rotor unmanned aerial vehicle has high-maneuver capabilities such as hovering and lateral translation; traditional guidance laws cannot exploit this full potential, so the interception success rate drops against highly maneuvering targets.
3. The prior art deploys acoustic sensors on unmanned aerial vehicle platforms for passive detection, but faces a severe "self-noise" (ego-noise) problem: the interceptor's own rotors produce strong, non-stationary broadband noise that easily drowns out the target's weak acoustic signal, making it difficult for the onboard acoustic sensor to work effectively.
4. Rotor and fixed-wing unmanned aerial vehicles differ markedly in flight mechanics (speed, maneuverability, endurance) and acoustic signature. Most existing anti-drone systems are platform-generic and do not exploit this asymmetry in the specific "rotor intercepting fixed-wing" scenario to build a tactical advantage.

Disclosure of Invention
To solve these problems, the invention provides a guidance interception system and method based on hybrid acoustic-visual fusion and a neural network. When the vision sensor loses the target because of the target's high-speed maneuvering, the system actively guides the camera gimbal to rapidly redirect and reacquire the target by passively localizing the noise of the target fixed-wing unmanned aerial vehicle, thereby achieving uninterrupted closed-loop tracking and interception and improving the interception success rate and tracking robustness against dynamic fixed-wing targets in an asymmetric engagement scenario.
The invention provides a guidance interception system based on hybrid acoustic-visual fusion and a neural network, used by a rotor unmanned aerial vehicle to intercept a fixed-wing unmanned aerial vehicle. The specific technical scheme is as follows: the system comprises a hybrid sensing unit, a self-noise suppression unit, a target state estimation unit, a gimbal redirection unit and a guidance unit. The hybrid sensing unit comprises at least one camera and at least one two-position acoustic sensor array, wherein the two-position acoustic sensor array comprises a spatially separated first microphone subarray and second microphone subarray, and the camera and the acoustic sensor array synchronously acquire a visual image sequence and an acoustic signal sequence of a target, respectively. The self-noise suppression unit receives the acoustic signal sequence, generates a time-frequency mask in real time, multiplies the mask with the acoustic signal sequence point by point, and outputs a noise-reduced acoustic signal. The target state estimation unit receives the visual image sequence and the noise-reduced acoustic signal, computes from them a visual azimuth angle and an acoustic elevation angle respectively, and fuses the two to obtain the target's position and velocity state vector. The gimbal redirection unit is communicatively connected with the target state estimation unit and, upon receiving a vision-loss signal, drives the gimbal of the visible-light camera to rotate to the predicted target position according to the predicted azimuth and pitch angles. The guidance