CN-121999395-A - Unmanned aerial vehicle aerial image target detection method based on high-frequency detail characteristic compensation

Abstract

The invention discloses an unmanned aerial vehicle (UAV) aerial image target detection method based on high-frequency detail feature compensation. The method comprises: obtaining and preprocessing a UAV aerial image data set; constructing a target detection model, which specifically comprises reconstructing a multi-scale feature fusion network to enhance the retention of high-resolution features and detail compensation, constructing a multi-scale high-frequency feature compensation module to recover the detail lost during downsampling, and constructing a local spatial attention fusion module to optimize feature expression quality; and training and evaluating the constructed target detection model using the preprocessed data set. Through the high-frequency detail compensation and local spatial enhancement mechanisms, the invention improves the accuracy and robustness of target detection in UAV aerial images.

Inventors

WANG KAI
ZHANG YUNZUO
ZHANG XINCHENG
GAO KANG

Assignees

Shijiazhuang Tiedao University (石家庄铁道大学)

Dates

Publication Date: 2026-05-08
Application Date: 2026-02-02
Priority Date: 2025-11-24

Claims (4)

1. An unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation, characterized by comprising the following steps: Step 1, data acquisition and preprocessing: acquiring an unmanned aerial vehicle aerial image data set and preprocessing it; Step 2, constructing a target detection model; Step 2-1, reconstructing the multi-scale feature fusion network in the YOLOv11 target detection algorithm: on the basis of the original top-down and bottom-up feature fusion paths, enhancing the retention and detail compensation of the high-resolution feature maps and reducing the interference of low-resolution features on the fusion result, to obtain a detail-compensated multi-scale feature fusion network; Step 2-2, adding a multi-scale high-frequency feature compensation module to the YOLOv11 target detection algorithm, wherein the module preserves the details of the large-scale feature map, compensates for the details after feature downsampling, and enhances the feature details of the downsampled feature map so as to improve detection performance; Step 2-3, introducing a local spatial attention fusion module before the detection head of the YOLOv11 target detection algorithm to replace the original C3K2 structure, wherein the module is located before the large-scale output features are sent to the detection head and performs detail enhancement and noise suppression on the fused features through multi-branch convolution and an attention weighting mechanism so as to optimize feature expression quality; and Step 3, performing model training using the preprocessed unmanned aerial vehicle aerial image data set, loading the trained model, and performing performance evaluation.
2. The unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation according to claim 1, characterized in that the detail-compensated multi-scale fusion network is provided with a high-frequency detail compensation branch and a local spatial attention fusion module in addition to the top-down and bottom-up fusion paths; the high-frequency detail compensation branch is connected in parallel with the large-scale feature map and compensates for detail information lost during downsampling; the local spatial attention fusion module performs spatial attention weighting and multi-scale detail fusion on the input features, enhancing the expression of target-related local spatial information and highlighting fine-grained textures and key structural features, thereby helping to improve the feature distinguishability of small targets and low-contrast targets and providing more discriminative feature support for the subsequent detection process.
3. The unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation according to claim 1, characterized in that the high-frequency feature compensation module sequentially comprises an input receiver, a wavelet high-frequency convolution unit, a high-frequency sub-band fusion unit, a channel compression unit, and a lightweight channel attention unit; the wavelet high-frequency convolution unit performs a separate grouped convolution on the input feature map for each of the three high-frequency sub-bands LH, HL, and HH obtained by Haar wavelet decomposition, extracts the high-frequency detail components, and dynamically corrects the fixed wavelet kernels by introducing a learnable increment parameter; the high-frequency sub-band fusion unit applies non-negative frequency-domain weights to the three high-frequency components and performs weighted fusion; the channel compression unit uses a 1×1 convolution to map the fused high-frequency features to the target number of channels so as to reduce computational complexity; the lightweight channel attention unit applies channel weighting to the compressed features so as to strengthen important texture information; and finally, the generated high-frequency compensation features are input to the subsequent fusion structure so as to restore the edge and detail information lost during downsampling.
4. The unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation according to claim 1, characterized in that the local spatial attention fusion module consists of an identity shortcut branch and a lightweight feature extraction branch; the extraction branch first applies a depthwise separable convolution to the input for preliminary processing and then splits the features into three parallel multi-scale convolution branches, which respectively use a dilated convolution combined with a 1×1 convolution, a 1×1 convolution alone, and an asymmetric convolution composed of a 1×3 convolution and a 3×1 convolution, so as to extract local features with different receptive fields and directional characteristics; the outputs of the branches are concatenated along the channel dimension, modulated by a SimAM spatial attention mechanism, and preliminarily fused by a 1×1 convolution; the resulting features are concatenated again along the channel dimension with the input carried by the identity branch, and are finally mapped to the target number of channels by a second SimAM attention modulation followed by a 1×1 convolution; the structure enhances local texture expression and suppresses background noise, thereby improving feature discrimination for small targets and complex scenes before the features are sent to the detection head.
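
Two operations named in claims 3 and 4 can be illustrated with a minimal single-channel sketch in plain Python. It is not the claimed implementation. The stride of 2, the softplus used to keep the fusion weights non-negative, the LH/HL naming convention, and all function and variable names (`conv2x2_stride2`, `high_freq_compensation`, `simam`, `delta`, `raw_w`) are assumptions introduced here for illustration. The channel compression unit, the channel attention unit, and the multi-branch convolutions are omitted.

```python
import math

# Fixed 2x2 Haar high-frequency kernels, scaled by 1/2. Which of the first two
# is labelled "LH" and which "HL" differs between references (assumed here).
HAAR = {
    "LH": [[0.5, 0.5], [-0.5, -0.5]],   # difference between rows
    "HL": [[0.5, -0.5], [0.5, -0.5]],   # difference between columns
    "HH": [[0.5, -0.5], [-0.5, 0.5]],   # diagonal difference
}


def conv2x2_stride2(x, k):
    """Apply a 2x2 kernel with stride 2 to a 2-D list of even height and width."""
    return [[sum(k[i][j] * x[r + i][c + j] for i in range(2) for j in range(2))
             for c in range(0, len(x[0]), 2)]
            for r in range(0, len(x), 2)]


def high_freq_compensation(x, delta, raw_w):
    """Weighted fusion of the LH/HL/HH responses of one channel.

    delta: per-sub-band 2x2 learnable increments added to the fixed kernels.
    raw_w: per-sub-band unconstrained scalars; softplus maps them to
           non-negative fusion weights (the choice of softplus is assumed).
    """
    rows, cols = len(x) // 2, len(x[0]) // 2
    fused = [[0.0] * cols for _ in range(rows)]
    for name, base in HAAR.items():
        kernel = [[base[i][j] + delta[name][i][j] for j in range(2)]
                  for i in range(2)]
        weight = math.log1p(math.exp(raw_w[name]))  # softplus, always > 0
        band = conv2x2_stride2(x, kernel)
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * band[r][c]
    return fused


def simam(x, lam=1e-4):
    """SimAM gating of one channel: values far from the channel mean get larger gates."""
    vals = [v for row in x for v in row]
    mu = sum(vals) / len(vals)
    var = sum((v - mu) ** 2 for v in vals) / (len(vals) - 1)

    def gate(v):
        inv_energy = (v - mu) ** 2 / (4.0 * (var + lam)) + 0.5
        return 1.0 / (1.0 + math.exp(-inv_energy))

    return [[v * gate(v) for v in row] for row in x]


zero = {name: [[0.0, 0.0], [0.0, 0.0]] for name in HAAR}
# A vertical edge that falls inside the first 2x2 column block.
edge = [[0.0, 1.0, 1.0, 1.0] for _ in range(4)]
print(high_freq_compensation(edge, zero, {name: 0.0 for name in HAAR}))
print(simam([[1.0, 1.0], [1.0, 5.0]]))
```

With zero increments the kernels reduce to the fixed Haar filters, so a flat input yields an all-zero compensation map and only blocks containing an edge respond. In a trainable model, `delta` and `raw_w` would be parameters updated by backpropagation.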

Description

Unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation

Technical Field

The invention relates to an unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation, and belongs to the field of computer vision.

Background

In recent years, target detection technology based on remote sensing images has been widely applied in key scenarios such as traffic flow monitoring, disaster response assessment, and military reconnaissance. Compared with satellite platforms, unmanned aerial vehicle platforms offer low manufacturing cost, flexible deployment, and the ability to acquire high-resolution images at low and medium altitudes, and they have gradually become a mainstream means of air-to-ground perception. However, because targets in unmanned aerial vehicle aerial images are typically small in scale, densely distributed, heavily occluded, and set against complex backgrounds, the detection accuracy of traditional target detection algorithms drops markedly in such scenes.
Existing detection algorithms mainly fall into the following categories. Two-stage detectors (such as the Fast R-CNN series) achieve high localization accuracy, but their structure, which separates candidate region generation from classification and regression, leads to high inference latency and makes it difficult to meet the real-time requirements of unmanned aerial vehicle platforms. Single-stage detectors (such as SSD and YOLOv5/YOLOv8/YOLOv11) have a simple structure and high inference speed and are easy to deploy on resource-limited edge devices, but they have inherent weaknesses in multi-scale fusion and in modeling the details of small targets. Improved methods based on the feature pyramid structure (such as FPN and its variants BiFPN and PANet) enhance shallow semantics through top-down or bottom-up feature transfer, but the fusion process still relies heavily on low-resolution backbone features, so high-frequency texture information is easily lost. Transformer-based end-to-end detectors (such as RT-DETR) have strong global modeling capability, but high-level semantics often override local edge features during interaction, sensitivity to small-scale targets is lacking, and the computational complexity of the model is high, which is unfavorable for lightweight deployment.

In summary, existing algorithms still face the following technical bottlenecks. First, high-frequency detail features are compressed or eliminated during repeated downsampling, so that small targets become semantically unrecognizable. Second, low-resolution features dominate multi-scale feature fusion, and a compensation mechanism for high-resolution spatial information is lacking. Third, interfering textures are easily introduced in complex background regions, and the model lacks noise suppression capability for local key structures.
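
The loss of high-frequency detail under downsampling can be made concrete with a small worked identity. As an illustrative assumption (the source does not name the downsampling operator at this point), take 2×2 average pooling and the orthonormal 2×2 Haar transform of a block whose top row is (a, b) and bottom row is (c, d):

```latex
\begin{aligned}
LL &= \tfrac12(a+b+c+d), & LH &= \tfrac12(a+b-c-d),\\
HL &= \tfrac12(a-b+c-d), & HH &= \tfrac12(a-b-c+d),\\
a^2+b^2+c^2+d^2 &= LL^2+LH^2+HL^2+HH^2, &
\operatorname{avgpool}(a,b,c,d) &= \tfrac14(a+b+c+d) = \tfrac12\,LL .
\end{aligned}
```

Under this assumption the pooled value depends only on LL, so whatever energy the block carries in LH, HL, and HH is absent from the downsampled map. The labelling of LH versus HL follows one common convention and may differ from the one used in the source.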
Therefore, a novel unmanned aerial vehicle aerial image target detection method is needed that provides high-frequency detail compensation, explicitly enhances high-resolution features before fusion, and performs local spatial selective filtering before the detection head, so as to improve recognition accuracy for small targets and complex background scenes.

Disclosure of Invention

In view of the problems of existing methods, the invention aims to provide an unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation.

An embodiment of the invention provides an unmanned aerial vehicle aerial image target detection method based on high-frequency detail feature compensation, which comprises the following steps:

Step 1, data acquisition and preprocessing: acquiring an unmanned aerial vehicle aerial image data set and preprocessing it.

Step 2, constructing a target detection model.

Step 2-1, reconstructing the multi-scale feature fusion network in the YOLOv11 target detection algorithm: on the basis of the original top-down and bottom-up feature fusion paths, enhancing the retention and detail compensation of the high-resolution feature maps and reducing the interference of low-resolution features on the fusion result, to obtain a detail-compensated multi-scale feature fusion network.

Further, the multi-scale feature fusion network aims to increase the retention and detail compensation of the high-resolution feature maps. Specifically, a high-frequency detail compensation branch and a local spatial attention fusion mechanism are introduced on the basis of the original feature pyramid structure, so that the contribution of high-resolution features from the shallow network is strengthened in the fusion process. The network not only keeps the transmission of semantic information in the top-down path, but also enhances the return of spatial details thr