CN-121259289-B - Remote sensing image rotation target detection method based on dual-path feature enhancement


Abstract

The invention provides a remote sensing image rotation target detection method based on dual-path feature enhancement, and relates to the technical field of computer vision and remote sensing image processing. The method comprises: constructing a dual-path feature-enhanced remote sensing image rotation target detection network comprising a texture enhancement path and a direction modeling path, wherein, in the texture enhancement path, an adaptive wavelet reconstruction module enhances texture details and edge features in the input feature map, and, in the direction modeling path, a multi-scale angle-guided deformable encoder performs spatial alignment and direction consistency modeling on the input feature map and extracts features containing structure and direction information; fusing the features output by the two paths; constructing a joint loss function and training the network end to end; and detecting and locating rotated targets in remote sensing images with the trained network. The invention can effectively improve detection accuracy and robustness for multi-directional, multi-scale, densely distributed and small-size targets in remote sensing images.
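As a minimal illustrative sketch of the kind of processing the texture enhancement path relies on, the following Python code splits a single-channel feature map into low-frequency and high-frequency wavelet bands and recombines them with normalized weights. It is not the patented implementation. The choice of the Haar wavelet, the use of a sigmoid of the mean absolute response as the "response intensity", the merging of the three detail bands into one magnitude map, and the names haar_dwt2 and texture_fuse are all assumptions made for illustration; in the claimed module the weights are produced by learned layers.

```python
import numpy as np


def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns the low-frequency band and a tuple of three high-frequency
    (detail) bands, each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left sample of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0      # local average: coarse structure
    det_w = (a - b + c - d) / 2.0    # difference along the width axis
    det_h = (a + b - c - d) / 2.0    # difference along the height axis
    det_d = (a - b - c + d) / 2.0    # diagonal difference
    return low, (det_w, det_h, det_d)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def texture_fuse(x, eps=1e-6):
    """Fuse low- and high-frequency content with normalized weights.

    Assumption: each weight is a sigmoid of the band's mean absolute
    response; a trained network would learn this mapping instead.
    """
    low, details = haar_dwt2(x)
    # Assumption: merge the three detail bands into one magnitude map.
    high = np.sqrt(sum(band ** 2 for band in details))
    w_low = sigmoid(np.mean(np.abs(low)))
    w_high = sigmoid(np.mean(np.abs(high)))
    total = w_low + w_high + eps     # eps keeps the denominator non-zero
    a_low, a_high = w_low / total, w_high / total
    fused = a_low * low + a_high * high
    return fused, (a_low, a_high)
```

For a constant input, the three detail bands are zero and only the low-frequency band contributes, which illustrates why edge and texture information is carried by the high-frequency bands.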

Inventors

ZENG HUI
LI TIANHUI
WU XINLU
SHI XUEFEI
LIU TAO

Assignees

University of Science and Technology Beijing (北京科技大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-09-28

Claims (6)

1. A remote sensing image rotation target detection method based on dual-path feature enhancement, the method comprising:
S1, constructing a dual-path feature-enhanced remote sensing image rotation target detection network comprising a texture enhancement path and a direction modeling path, wherein:
in the texture enhancement path, an adaptive wavelet reconstruction module is used to enhance texture details and edge features in an input feature map, the feature map being extracted from a remote sensing image;
in the direction modeling path, a multi-scale angle-guided deformable encoder is used to perform spatial alignment and direction consistency modeling on the input feature map and to extract features containing structure and direction information; and
the features output by the two paths are fused through a cross-attention mechanism to generate multi-scale features with enhanced discriminative capability;
wherein the multi-scale angle-guided deformable encoder comprises:
a lightweight angle prediction module, configured to predict pixel-level angle information from the input feature map and to generate an angle guidance map that provides a target orientation or direction hint for each spatial position;
a rotation-sensitive visual encoder, configured to jointly model the angle guidance map and the input feature map, extract direction-sensitive features and generate a rotation-sensitive visual feature map; and
a vision-angle cross-attention module, configured to spatially align the rotation-sensitive visual features with the angle features in rotation-sensitive regions and to generate a multi-scale feature map with enhanced direction consistency;
wherein the lightweight angle prediction module is specifically configured to predict pixel-level angle information from the multi-scale feature maps output by the feature pyramid network and to generate the corresponding angle guidance maps, so as to provide a target orientation or direction hint for each spatial position, expressed as: [formula missing], in which the mapping denotes the lightweight angle prediction module;
wherein the rotation-sensitive visual encoder is specifically configured to:
expand the channels of the angle guidance map so that it is aligned in the channel dimension with [symbol missing], generating an angle feature, expressed as: [formula missing], in which the angle encoder consists of two convolution layers with normalization and nonlinear activation;
introduce spatial-aware modeling for [symbol missing] to construct a more discriminative spatial-context-enhanced feature map, expressed as: [formula missing];
add [symbol missing] and [symbol missing] element by element, apply global average pooling to extract the spatial mean of each channel as its global representation, and generate channel attention weights through a nonlinear mapping, expressed as: [formula missing], in which the three operators denote, respectively, the sigmoid activation function, global average pooling and a channel-wise mapping; and
modulate the angle feature with the channel attention weights and inject the result into the original input features to form the final rotation-sensitive visual feature map, expressed as: [formula missing];
wherein the vision-angle cross-attention module is specifically configured to construct the Query from the rotation-sensitive visual features, use the encoded angle features as the Value, and dynamically sample local regions through reference points and learnable offsets, thereby spatially aligning the rotation-sensitive visual features with the angle features and generating a multi-scale feature map with enhanced direction consistency;
S2, constructing a joint loss function, and training the dual-path feature-enhanced remote sensing image rotation target detection network end to end based on the generated multi-scale features with enhanced discriminative capability and the joint loss function, wherein the joint loss function comprises a classification loss, a rotation-aware detection box regression loss and an angle regression loss; and
S3, detecting and locating rotated targets in a remote sensing image using the trained dual-path feature-enhanced remote sensing image rotation target detection network.
2. The remote sensing image rotation target detection method based on dual-path feature enhancement according to claim 1, wherein the dual-path feature-enhanced remote sensing image rotation target detection network comprises a backbone network, a feature pyramid network, a dual-path feature enhancement framework and a target detection head, the dual-path feature enhancement framework comprising the texture enhancement path and the direction modeling path; the backbone network is used to extract multi-scale features from the input remote sensing image; the feature pyramid network is used to enhance the semantic expressiveness of features at different scales; the dual-path feature enhancement framework is used to process the features output by the backbone network and the feature pyramid network, enhancing texture details and edge features and performing spatial alignment and direction consistency modeling; and the target detection head is used to process the features output by the dual-path feature enhancement framework and to complete the classification and regression of rotated targets in the remote sensing image.
3. The remote sensing image rotation target detection method based on dual-path feature enhancement according to claim 1, wherein the adaptive wavelet reconstruction module comprises: a discrete wavelet transform sub-module, used to apply a multi-level discrete wavelet transform to the input shallow high-resolution feature map and to extract high-frequency features containing texture and edge information and low-frequency features containing semantic structure; an adaptive weighted fusion sub-module, used to dynamically fuse the high-frequency features and the low-frequency features according to their response intensity so as to improve the structural integrity of the features; and a selective residual attention sub-module, used to strengthen the structural expression of target regions based on the fused features, suppress background redundancy and output a detail-enhanced feature map.
4. The remote sensing image rotation target detection method based on dual-path feature enhancement according to claim 3, wherein, for the P3-level feature map output by the feature pyramid network, the fused feature is expressed as: [five formulas missing]; in which the symbols denote, respectively: the fused feature; element-wise multiplication; the sigmoid activation function; a small constant that prevents the denominator from being 0; ReLU, an activation function; GN, group normalization; Conv, a convolution operation; the weights of the low-frequency component and the high-frequency component; the normalized weights of the low-frequency component and the high-frequency component; and the low-frequency component and the high-frequency component after the first-level wavelet transform.
5. The remote sensing image rotation target detection method based on dual-path feature enhancement according to claim 4, wherein the selective residual attention sub-module is specifically configured to:
compute the residual between the fused feature and the original input feature, expressed as: [formulas missing], in which the original input feature is the P3-level feature map output by the feature pyramid network;
generate a spatial-channel joint attention map using a convolutional block attention module, expressed as: [formula missing], in which CBAM(·) denotes the convolutional block attention module;
selectively enhance the residual with the attention map to obtain a selectively enhanced feature map, expressed as: [formula missing]; and
control the injection strength of the selectively enhanced feature with a residual weighting factor, fuse it with the original input feature, and integrate the information through a convolution block to obtain the final detail-enhanced feature map, expressed as: [formula missing].
6. The remote sensing image rotation target detection method based on dual-path feature enhancement according to claim 1, wherein the classification loss is based on Focal Loss and is used to handle the imbalance between foreground and background; the rotation-aware detection box regression loss is based on Smooth L1 Loss and is used to optimize the prediction of target position and scale; and the angle regression loss is used to improve the accuracy of target direction prediction.
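The three-term joint loss named in the claims can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patented formulation: the focal loss uses the commonly cited defaults alpha = 0.25 and gamma = 2; the angle term is assumed to be a Smooth L1 loss on the angle difference wrapped to a period of pi, since the source does not give its form; and the weights w_cls, w_box and w_ang, like all function names, are hypothetical.

```python
import numpy as np


def focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss on post-sigmoid probabilities, averaged."""
    p = np.clip(probs, eps, 1.0 - eps)
    p_t = np.where(labels == 1, p, 1.0 - p)          # prob. of the true class
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma down-weights easy, well-classified samples.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


def smooth_l1(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones, averaged."""
    diff = np.abs(pred - target)
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return float(np.mean(loss))


def angle_loss(theta_pred, theta_gt, period=np.pi):
    """Assumption: Smooth L1 on the angle difference wrapped to
    [-period/2, period/2), so that 179 degrees and 1 degree count as close."""
    diff = (theta_pred - theta_gt + period / 2.0) % period - period / 2.0
    return smooth_l1(diff, np.zeros_like(diff))


def joint_loss(cls_probs, cls_labels, box_pred, box_gt, theta_pred, theta_gt,
               w_cls=1.0, w_box=1.0, w_ang=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    l_cls = focal_loss(cls_probs, cls_labels)
    l_box = smooth_l1(box_pred, box_gt)
    l_ang = angle_loss(theta_pred, theta_gt)
    return w_cls * l_cls + w_box * l_box + w_ang * l_ang
```

Wrapping the angle difference before applying the regression loss is one common way to avoid penalizing predictions that differ from the ground truth only by the periodicity of the box orientation; other angle losses would fit the claim wording equally well.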

Description

Remote sensing image rotation target detection method based on dual-path feature enhancement

Technical Field

The invention relates to the technical field of computer vision and remote sensing image processing, and in particular to a remote sensing image rotation target detection method based on dual-path feature enhancement.

Background

With the rapid development of satellite remote sensing and unmanned aerial vehicle imaging technologies, remote sensing images are widely used in urban planning, disaster monitoring, resource management and other fields, and rotation target detection in such images has become a key technical task. Compared with targets in natural scenes, targets in remote sensing images show large scale differences, a wide distribution of orientations and dense arrangements, which places higher demands on the accuracy and robustness of detection algorithms. As the resolution of remote sensing images keeps increasing, applications demand ever higher target localization accuracy and feature expressiveness.

In terms of spatial scale, remote sensing images are constrained by the overhead imaging viewpoint and often contain small targets and large ground objects at the same time, so a detection model must capture both fine-grained texture features and global semantics. However, shallow details are easily overwhelmed by deep semantics during cross-layer feature fusion, so information about small-size targets attenuates layer by layer and detection performance degrades.
In terms of geometry, because of factors such as target shape and imaging angle, targets such as ships and airplanes can appear in any orientation in an image and are densely distributed in scenes such as ports, airports and parking lots, which challenges the spatial alignment and direction awareness of the features. In addition, remote sensing images are often affected by complex conditions such as illumination changes and cloud occlusion, which further increases detection uncertainty and places higher demands on the robustness and adaptability of the algorithm.

Although existing methods alleviate these problems to some extent, they still have obvious shortcomings in small-target detail modeling, direction-aware feature construction and dense-region detection, and they struggle to meet the complex requirements of rotation target detection in remote sensing images. A new feature enhancement mechanism and a rotation-aware detection framework are therefore needed to improve the overall detection accuracy and generalization capability of the detection model.

Disclosure of Invention

To solve the technical problem that the prior art struggles to meet the complex requirements of rotation target detection in remote sensing images, embodiments of the invention provide a remote sensing image rotation target detection method based on dual-path feature enhancement.
The technical solution is as follows:

In one aspect, a remote sensing image rotation target detection method based on dual-path feature enhancement is provided, comprising:

S1, constructing a dual-path feature-enhanced remote sensing image rotation target detection network comprising a texture enhancement path and a direction modeling path, wherein: in the texture enhancement path, an adaptive wavelet reconstruction module is used to enhance texture details and edge features in an input feature map, the feature map being extracted from a remote sensing image; in the direction modeling path, a multi-scale angle-guided deformable encoder is used to perform spatial alignment and direction consistency modeling on the input feature map and to extract features containing structure and direction information; and the features output by the two paths are fused through a cross-attention mechanism to generate multi-scale features with enhanced discriminative capability;

S2, constructing a joint loss function, and training the dual-path feature-enhanced remote sensing image rotation target detection network end to end based on the generated multi-scale features with enhanced discriminative capability and the joint loss function, wherein the joint loss function comprises a classification loss, a rotation-aware detection box regression loss and an angle regression loss; and

S3, detecting and locating rotated targets in a remote sensing image using the trained dual-path feature-enhanced remote sensing image rotation target detection network.

Further, the dual-path feature-enhanced remote sensing image rotation target detection network comprises a backbone network, a feature pyramid network, a dual-path feature enhancement framework and a target detection he