CN-121982282-A - Airborne 'low-slow-small' target recognition and tracking method and system based on improved YoloV and ECO
Abstract
The invention relates to an airborne 'low-slow-small' target recognition and tracking method and system based on improved YoloV and ECO. It belongs to the technical field of unmanned aerial vehicle target recognition and tracking, and addresses the low recognition accuracy for 'low-slow-small' targets and the easy loss of tracking in the prior art. The method comprises: adding a small-scale detection head P2 at the detection end of the YoloV model to obtain the detection end of a 'low-slow-small' target detection model; adding a feature fusion layer F2, corresponding to the detection head P2, in the neck network of the YoloV model; adding a CBAM module at each feature fusion layer to obtain the improved neck network; training the 'low-slow-small' target detection model on a training set; performing 'low-slow-small' target detection and recognition on target images acquired in real time with the trained model and outputting a target recognition result; and determining a tracking target based on the target recognition result and continuously acquiring successive frame images of the tracking target with a preset tracking algorithm.
Inventors
- MI YING
- LIU CHAN
- PENG YANYUN
- BIAN WEIWEI
- HUANG QIFU
- JIANG DAWEI
- QIU XUYANG
Assignees
- 北京机械设备研究所 (Beijing Institute of Mechanical Equipment)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2024-10-31
Claims (10)
- 1. An airborne 'low-slow-small' target recognition and tracking method based on improved YoloV and ECO, comprising: acquiring a small-target data set through a photoelectric pod and preprocessing it to form a training set; improving the YoloV model to obtain a 'low-slow-small' target detection model, wherein a small-scale detection head P2 is added at the detection end of the YoloV model to obtain the detection end of the 'low-slow-small' target detection model, a feature fusion layer F2 corresponding to the detection head P2 is added in the neck network of the YoloV model, and a CBAM module is added at each feature fusion layer of the neck network to obtain the improved neck network; training the 'low-slow-small' target detection model on the training set to obtain a trained model, performing 'low-slow-small' target detection and recognition on target images acquired in real time with the trained model, and outputting a target recognition result; and determining a tracking target based on the target recognition result, and continuously acquiring successive frame images of the tracking target based on a preset tracking algorithm.
- 2. The method of claim 1, wherein the neck network comprises four feature fusion layers F2-F5, each feature fusion layer being connected to a respective detection head P2-P5, and each feature fusion layer comprising at least one convolution layer and a CBAM module, the CBAM module comprising a channel attention module CAM and a spatial attention module SAM.
- 3. The method of claim 1, wherein the loss function of the 'low-slow-small' target detection model is DDIoU, as shown in formula (1): DDIoU = e^(-a·d) · IoU* (1), where d = sqrt((x1 - y1)^2 + (x2 - y2)^2) is the distance between the center points of the predicted bounding box and the ground-truth bounding box, x1 and x2 are the abscissa and ordinate of the center point of the predicted bounding box, y1 and y2 are the abscissa and ordinate of the center point of the ground-truth bounding box, IoU* is the ratio of the intersection area to the union area of the two bounding boxes, a is an adjustment factor, and e is Euler's number.
- 4. The method of claim 1, wherein continuously acquiring successive frame images of the tracked target based on a preset tracking algorithm comprises: selecting the tracking target with a bounding box in the initial frame and determining its initial position; extracting texture and color features of the tracking target from each frame of the target image through the HOG and CN feature extraction algorithms, respectively; predicting the coarse position of the tracking target in the next frame based on the fused features and the initial position, and setting a search area in the next frame accordingly; and performing feature matching in the search area of the next frame to determine the precise position of the tracking target.
- 5. The method of claim 4, wherein performing feature matching in the search area of the next frame to determine the precise position of the tracking target comprises: determining target candidate regions in the search area, performing similarity matching between the fused features generated from the previous frame and the target candidate regions in the current frame, and generating a response map; comparing the peak-point response value of the response map with a peak-point threshold, and if the peak-point response value is greater than the threshold, determining the peak-point position as the precise position of the tracking target, the peak-point position being the target candidate region with the highest matching score against the fused features; and if the peak-point response value is below the threshold, performing an occlusion judgment: if the target is not occluded, the candidate region corresponding to the peak-point position is determined to be the precise position of the tracking target; if the target is occluded, the search area is replaced with the full-image area, target candidate regions are redetermined in the full-image area, similarity matching is performed again to obtain a new response map, and the target candidate region corresponding to the peak-point position of that response map is taken as the precise position of the tracking target.
- 6. The method of claim 5, wherein the occlusion judgment comprises: calculating the peak-to-sidelobe ratio PSLR and the average peak correlation energy APCE of the response map; judging whether the PSLR and the APCE simultaneously satisfy their respective occlusion conditions, and if so, determining that the target is occluded; if the two conditions are not simultaneously satisfied, the target is not occluded.
- 7. The method of claim 6, wherein the peak-to-sidelobe ratio PSLR of the response map is calculated by formula (2): PSLR = (g_max - m_s1) / S_s1 (2), where g_max is the main-lobe response value, m_s1 is the mean of the sidelobe response values, and S_s1 is the standard deviation of the sidelobe response values; and the average peak correlation energy APCE of the response map is calculated by formula (3): APCE = |F_max - F_min|^2 / mean_(v,h)((F_(v,h) - F_min)^2) (3), where F_max is the peak-point response value, F_min is the minimum response value in the response map, F_(v,h) is the response value at position (v, h), and APCE is the average peak correlation energy.
- 8. The method of claim 7, wherein the occlusion condition of the peak-to-sidelobe ratio PSLR is: PSLR < 0.75·PSLR_avg or PSLR < 0.9·PSLR_min, where PSLR_avg is the average of the historical PSLR values and PSLR_min is the minimum of the historical PSLR values; and the occlusion condition of the average peak correlation energy APCE is: APCE < 0.75·APCE_avg or APCE < 0.9·APCE_min, where APCE_avg is the average of the historical APCE values and APCE_min is the minimum of the historical APCE values.
- 9. The method of claim 1, wherein training the 'low-slow-small' target detection model on the training set to obtain a trained model comprises: inputting the samples of the training set into the constructed 'low-slow-small' target detection model for training, and adjusting the model parameters until the loss function converges or a preset number of training iterations is reached; verifying the trained model with a test set based on the adjusted parameters and evaluating the model accuracy; and if the model accuracy meets a preset accuracy requirement, saving the model parameters to obtain the trained 'low-slow-small' target detection model.
- 10. An airborne 'low-slow-small' target recognition and tracking system based on improved YoloV and ECO, the system comprising: a data processing module for acquiring a small-target data set through a photoelectric pod and forming a training set after preprocessing; a model construction module for improving the YoloV model to obtain a 'low-slow-small' target detection model, wherein a small-scale detection head P2 is added at the detection end of the YoloV model to obtain the detection end of the 'low-slow-small' target detection model, a feature fusion layer F2 corresponding to the detection head P2 is added in the neck network of the YoloV model, and a CBAM module is added at each feature fusion layer of the neck network to obtain the improved neck network; a result output module for training the 'low-slow-small' target detection model on the training set to obtain a trained model, performing 'low-slow-small' target detection and recognition on target images acquired in real time with the trained model, and outputting a target recognition result; and a tracking module for determining a tracking target based on the target recognition result and continuously acquiring successive frame images of the tracking target based on a preset tracking algorithm.
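The DDIoU loss of claim 3 can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the box format (x_min, y_min, x_max, y_max) and the default adjustment factor a=1.0 are assumptions, since the claims do not fix them.

```python
import numpy as np

def iou(box_p, box_t):
    """IoU* of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_p[0], box_t[0]), max(box_p[1], box_t[1])
    ix2, iy2 = min(box_p[2], box_t[2]), min(box_p[3], box_t[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_t = (box_t[2] - box_t[0]) * (box_t[3] - box_t[1])
    union = area_p + area_t - inter
    return inter / union if union > 0 else 0.0

def ddiou(box_p, box_t, a=1.0):
    """Formula (1): DDIoU = exp(-a*d) * IoU*, d = distance between box centers."""
    cx_p, cy_p = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cx_t, cy_t = (box_t[0] + box_t[2]) / 2, (box_t[1] + box_t[3]) / 2
    d = np.hypot(cx_p - cx_t, cy_p - cy_t)
    return np.exp(-a * d) * iou(box_p, box_t)

def ddiou_loss(box_p, box_t, a=1.0):
    """Loss decreases as boxes overlap more and their centers move closer."""
    return 1.0 - ddiou(box_p, box_t, a)
```

For perfectly aligned boxes d = 0 and IoU* = 1, so DDIoU = 1 and the loss is 0; the exponential factor penalizes center offset more aggressively than plain IoU, which matters for small targets whose IoU changes abruptly.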
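The PSLR/APCE occlusion test of claims 6-8 can be sketched as below. This is an illustrative NumPy sketch under stated assumptions: the size of the exclusion window that separates the main lobe from the sidelobe is not specified in the claims, so the `exclude` parameter here is a hypothetical choice.

```python
import numpy as np

def pslr(response, exclude=2, eps=1e-12):
    """Formula (2): (g_max - mean(sidelobe)) / std(sidelobe).
    The sidelobe is the response map with a small window around the peak
    excluded; the window half-width `exclude` is an assumption."""
    r = np.asarray(response, dtype=float)
    py, px = np.unravel_index(np.argmax(r), r.shape)
    g_max = r[py, px]
    mask = np.ones_like(r, dtype=bool)
    mask[max(0, py - exclude):py + exclude + 1,
         max(0, px - exclude):px + exclude + 1] = False
    side = r[mask]
    return (g_max - side.mean()) / (side.std() + eps)

def apce(response):
    """Formula (3): |F_max - F_min|^2 / mean((F_{v,h} - F_min)^2)."""
    r = np.asarray(response, dtype=float)
    f_max, f_min = r.max(), r.min()
    return (f_max - f_min) ** 2 / np.mean((r - f_min) ** 2)

def is_occluded(cur_pslr, cur_apce, hist_pslr, hist_apce):
    """Claim 8: occlusion is declared only when BOTH conditions hold."""
    pslr_low = (cur_pslr < 0.75 * np.mean(hist_pslr)) or \
               (cur_pslr < 0.9 * np.min(hist_pslr))
    apce_low = (cur_apce < 0.75 * np.mean(hist_apce)) or \
               (cur_apce < 0.9 * np.min(hist_apce))
    return bool(pslr_low and apce_low)
```

A sharp, isolated peak yields high PSLR and APCE; when the target is occluded the response map flattens, both statistics drop below their historical thresholds, and the tracker switches to full-image re-detection.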
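The decision flow of claim 5 (accept the peak, or run the occlusion check, or fall back to full-image re-detection) can be sketched as a small dispatcher. The `occlusion_check` and `full_image_search` hooks are caller-supplied placeholders standing in for the machinery of claims 6-8; their internals are not reproduced here.

```python
import numpy as np

def locate_target(response, peak_threshold, occlusion_check, full_image_search):
    """Claim 5's branching (sketch): take the peak of the similarity response
    map; accept it when the peak response exceeds the threshold, otherwise run
    the occlusion check and, only if occluded, re-detect over the full image.
    occlusion_check(response) -> bool and full_image_search() -> (row, col)
    are hypothetical hooks supplied by the caller."""
    r = np.asarray(response, dtype=float)
    peak_pos = np.unravel_index(np.argmax(r), r.shape)
    if r[peak_pos] > peak_threshold:
        return peak_pos                # high-confidence match in the search area
    if not occlusion_check(r):
        return peak_pos                # weak response but no occlusion: keep peak
    return full_image_search()         # occluded: full-image re-detection
```

Usage: `locate_target(resp, 0.5, lambda r: False, lambda: (0, 0))` returns the peak position of `resp` whenever the peak clears the threshold or no occlusion is detected.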
Description
Airborne 'low-slow-small' target recognition and tracking method and system based on improved YoloV and ECO

Technical Field

The invention relates to the technical field of unmanned aerial vehicle target recognition and tracking, in particular to an airborne 'low-slow-small' target recognition and tracking method and system based on improved YoloV and ECO.

Background

With the rapid development of unmanned aerial vehicle technology, unmanned aerial vehicles are increasingly widely applied in fields such as motion tracking, reconnaissance patrol, and traffic monitoring. In particular, small, slow, low-altitude flying targets (i.e. 'low-slow-small' targets) are highly concealed and maneuverable, so their effective recognition and real-time tracking has become a technical problem that urgently needs to be solved. Currently, detection means for 'low-slow-small' targets mainly include radar detection and photoelectric detection. Radar detection recognizes targets by analyzing target echo information, but it is susceptible to interference in complex environments, has low measurement accuracy, and is costly. Photoelectric detection obtains target images through photoelectric detection equipment and offers higher measurement accuracy at lower cost, but its recognition accuracy and tracking stability for distant small targets still need improvement. In recent years, target detection and tracking methods based on deep learning have made significant progress. For example, the YOLO (You Only Look Once) series of models has attracted wide attention for its fast and accurate target detection. However, existing YOLO models are mainly aimed at targets of conventional size; when detecting and tracking 'low-slow-small' targets, high-accuracy recognition and stable tracking are difficult to achieve because the targets are small and their features are not obvious.
In addition, existing tracking algorithms such as ECO (Efficient Convolution Operators), while performing well in certain tracking scenarios, often suffer from occlusion or loss of the track when dealing with 'low-slow-small' targets, owing to rapid target movement and interference from complex backgrounds.

Disclosure of Invention

In view of the above analysis, embodiments of the invention aim to provide an airborne 'low-slow-small' target recognition and tracking method and system based on improved YoloV and ECO, so as to solve the problems of low recognition accuracy and easy loss of tracking for existing 'low-slow-small' targets. A first aspect of the invention provides an airborne 'low-slow-small' target recognition and tracking method based on improved YoloV and ECO, comprising: acquiring a small-target data set through a photoelectric pod and preprocessing it to form a training set; improving the YoloV model to obtain a 'low-slow-small' target detection model, wherein a small-scale detection head P2 is added at the detection end of the YoloV model to obtain the detection end of the 'low-slow-small' target detection model, a feature fusion layer F2 corresponding to the detection head P2 is added in the neck network of the YoloV model, and a CBAM module is added at each feature fusion layer of the neck network to obtain the improved neck network; training the 'low-slow-small' target detection model on the training set to obtain a trained model, performing 'low-slow-small' target detection and recognition on target images acquired in real time with the trained model, and outputting a target recognition result; and determining a tracking target based on the target recognition result, and continuously acquiring successive frame images of the tracking target based on a preset tracking algorithm.
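The CBAM module added at each feature fusion layer can be sketched as below. This is a simplified, untrained NumPy illustration, not the patented network: the channel attention follows the standard shared-MLP over global average- and max-pooled descriptors, while the spatial attention here weights the channel-wise mean and max maps with two fixed scalars instead of the usual 7x7 convolution, a simplification for readability; all weights are randomly initialized.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam(x, reduction=4):
    """Sketch of a CBAM block on a (C, H, W) feature map.
    CAM: sigmoid(MLP(avg_pool) + MLP(max_pool)) scales each channel.
    SAM (simplified): sigmoid of a weighted sum of the channel-wise
    mean and max maps scales each spatial position."""
    c, h, w = x.shape
    # --- channel attention module (CAM) ---
    w1 = rng.standard_normal((c // reduction, c)) * 0.1   # shared MLP, layer 1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1   # shared MLP, layer 2
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)          # ReLU hidden layer
    avg_desc = x.mean(axis=(1, 2))                        # global average pool
    max_desc = x.max(axis=(1, 2))                         # global max pool
    ca = sigmoid(mlp(avg_desc) + mlp(max_desc))           # (C,) channel weights
    x = x * ca[:, None, None]
    # --- spatial attention module (SAM), simplified ---
    avg_map = x.mean(axis=0)                              # channel-wise mean
    max_map = x.max(axis=0)                               # channel-wise max
    sa = sigmoid(0.5 * avg_map + 0.5 * max_map)           # (H, W) spatial weights
    return x * sa[None, :, :]
```

Both attention maps lie in (0, 1), so the block rescales rather than replaces features; for small targets this lets the F2-F5 fusion layers suppress background clutter before the P2-P5 heads make predictions.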
As a further improvement of the application, the neck network comprises four feature fusion layers F2-F5, each feature fusion layer being connected to a respective detection head P2-P5, each feature fusion layer comprising at least one convolution layer and a CBAM module, and the CBAM module comprising a channel attention module CAM and a spatial attention module SAM. As a further improvement of the application, the loss function of the 'low-slow-small' target detection model is the DDIoU loss function, as shown in formula (1): DDIoU = e^(-a·d) · IoU* (1), where d is the distance between the center points of the predicted bounding box and the ground-truth bounding box, x1 and x2 are the abscissa and ordinate of the center point of the predicted bounding box, y1 and y2 are the abscissa and ordinate of the center point of the ground-truth bounding box, IoU* is the ratio of the intersection area to the union area of the two bounding boxes, a is an adjustment factor, and e is Euler's number. As a further improvement of the application, contin