CN-121982531-A - Remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion

Abstract

The invention discloses a remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion, relating to the technical field of remote sensing image processing. The method comprises: S1, constructing a partition space enhancement feature module PSEF; S2, constructing a frequency domain contrast fusion module PDCF; S3, constructing a shallow feature enhancement module LFE, which comprises a large receptive field attention structure submodule LSKA and a lightweight feature efficient fusion submodule C2f_effect; and S4, cooperatively integrating the partition space enhancement feature module, the frequency domain contrast fusion module and the shallow feature enhancement module into a YOLOv S target detection model to obtain an improved detection model for remote sensing target detection. The method addresses problems in existing remote sensing target detection such as weak small-target characterization, easily lost detail and insufficient use of edge texture.

Inventors

NIU WEIHUA
GUO XUN

Assignees

North China Electric Power University (Baoding)

Dates

Publication Date: 20260505
Application Date: 20260123

Claims (10)

1. A remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion, characterized by comprising the following steps: S1, constructing a partition space enhancement feature module PSEF; S2, constructing a frequency domain contrast fusion module PDCF; S3, constructing a shallow feature enhancement module LFE, which comprises a large receptive field attention structure submodule LSKA and a lightweight feature efficient fusion submodule C2f_efficiency; S4, cooperatively integrating the partition space enhancement feature module, the frequency domain contrast fusion module and the shallow feature enhancement module into a YOLOv S target detection model to obtain an improved detection model, and carrying out remote sensing target detection.
2. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 1, wherein in S1:
S11, partitioned downsampling is performed on the input feature, dividing it into four first-level sub-features of the same size, the expression being: [expression omitted in source]; where the symbols denote, in order: the first-level sub-features obtained after the partitioned downsampling; the first-level sub-feature index; the feature input from the upper level; and the partitioned downsampling operation;
S12, feature extraction and channel information fusion are performed on each of the four first-level sub-features to obtain fusion features;
S13, the fusion feature is divided into three second-level sub-features along the channel dimension, the expression being: [expression omitted in source]; where the symbols denote, in order: the second-level sub-features obtained after the fusion feature is divided along the channel dimension; the second-level sub-feature index after channel division; and the channel splitting operation;
S14, enhancement processing is performed on two of the second-level sub-features to obtain two enhanced features, the remaining second-level sub-feature, which does not take part in the enhancement processing, is concatenated with the two enhanced features along the channel dimension, and cross-channel information fusion is performed by pointwise convolution to obtain a partition feature, the expression being: [expression omitted in source]; where the symbols denote, in order: the partition feature corresponding to the first-level sub-feature; the operation of concatenating along the channel dimension followed by 1×1 convolution fusion; the coordinate attention operation applied to the feature map; the depthwise separable convolution operation applied to the feature map; the first second-level sub-feature; the second second-level sub-feature; and the third second-level sub-feature;
S15, the four partition features are concatenated and fused to obtain the output feature of the partition space enhancement feature module, the expression being: [expression omitted in source]; where the symbols denote, in order: the first partition feature; the second partition feature; the third partition feature; and the fourth partition feature.
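As an illustration only, steps S11 and S13 of claim 2 can be sketched in plain Python. The claim does not say how the four first-level sub-features are sampled, so the sketch assumes a pixel-interleaved split in which each path keeps one pixel of every 2 × 2 cell; the function names `partition_downsample` and `split_channels` are hypothetical, and the learned operations of S12, S14 and S15 are not modeled.

```python
def partition_downsample(x):
    """Split an H x W map (even H and W) into four (H/2) x (W/2) maps.

    Assumption (not stated in the claim): each path keeps one pixel of
    every 2 x 2 cell, using the (row, column) offsets (0,0), (0,1),
    (1,0) and (1,1) in that order.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return [[row[dc::2] for row in x[dr::2]] for dr, dc in offsets]


def split_channels(channels, parts=3):
    """Split a list of channels into `parts` contiguous, equal groups (S13)."""
    n = len(channels)
    if n % parts:
        raise ValueError("channel count must be divisible by parts")
    step = n // parts
    return [channels[i * step:(i + 1) * step] for i in range(parts)]


# A 4 x 4 single-channel map with values 0..15 in row-major order.
feature = [[r * 4 + c for c in range(4)] for r in range(4)]
subs = partition_downsample(feature)

# Six placeholder channels divided into three second-level groups.
groups = split_channels(["c0", "c1", "c2", "c3", "c4", "c5"])
```

Under this assumed sampling, every input pixel lands in exactly one of the four sub-features, so the downsampling halves the resolution without discarding information, which is consistent with the claim's stated aim of preserving small-target detail.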
3. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 2, characterized in that the enhancement processing in S14 specifically comprises: introducing a spatial position modeling mechanism into one second-level sub-feature to encode the position information in the feature map; and superposing and fusing the other second-level sub-feature with the corresponding first-level sub-feature, then extracting effective feature information through a lightweight convolution operation.
4. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 3, wherein in S2:
S21, local context modeling is performed on the input feature, a real-valued tensor whose dimensions are the number of channels, the height and the width of the feature; neighborhood information in the feature map is extracted by successive convolution operations to obtain an initial enhanced feature;
S22, linear mapping is performed on the initial enhanced feature to obtain a value vector, and the value vector is unfolded within a local window to obtain local features, the expression being: [expression omitted in source]; where the symbols denote, in order: all values in the local window centered at a given position; the size of the window; and the horizontal offset and the vertical offset within the local window;
S23, a Haar wavelet transform is introduced to perform frequency domain decomposition on the feature obtained by applying a padding operation to the initial enhanced feature, dividing it into a high-frequency component and a low-frequency component;
S24, feature mapping is performed on the high-frequency component and the low-frequency component respectively to generate a high-frequency weight and a low-frequency weight, the expressions being: [expressions omitted in source]; where the symbols denote, in order: the linear transformation weight matrix of the high-frequency component; and the linear transformation weight matrix of the low-frequency component;
S25, feature recombination and fusion are performed in a weighted manner to obtain the output feature of the frequency domain contrast fusion module, wherein the high-frequency weight and the low-frequency weight are normalized, and after normalization the high-frequency weight is applied to the local features obtained in S22, the expression being: [expression omitted in source]; where the symbols denote, in order: a convolution operation using two standard convolutions; the normalization function; the low-frequency weights in the local window centered at a given position; the high-frequency weights in the local window centered at that position; and element-wise multiplication.
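The frequency split in S23 and the weight normalization in S25 can be illustrated with a short standard-library sketch. It is not the patented module: the learned linear mappings of S22 and S24 are left out, softmax is only an assumed choice for the unspecified normalization function, and the names `haar2d`, `softmax` and `weighted_window_sum` are hypothetical.

```python
import math


def haar2d(x):
    """One-level orthonormal 2-D Haar transform of an even-sized H x W map.

    Returns (low, d_col, d_row, d_diag), each of size (H/2) x (W/2).
    `low` is the low-frequency band; the other three bands carry the
    high-frequency detail (column, row and diagonal differences).
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("pad the input so height and width are even")
    bands = [[], [], [], []]
    for i in range(0, h, 2):
        rows = [[], [], [], []]
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # local average
            rows[1].append((a - b + c - d) / 2)  # change across columns
            rows[2].append((a + b - c - d) / 2)  # change across rows
            rows[3].append((a - b - c + d) / 2)  # diagonal change
        for band, row in zip(bands, rows):
            band.append(row)
    return tuple(bands)


def softmax(scores):
    """Numerically stable softmax (assumed normalization function)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_window_sum(weights, values):
    """Fuse the values of one local window with normalized weights."""
    return sum(w * v for w, v in zip(softmax(weights), values))


low, d_col, d_row, d_diag = haar2d([[1, 2], [3, 4]])
fused = weighted_window_sum([0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0])
```

With equal weights the fused value is simply the window mean; larger weights on positions with strong high-frequency response would shift the result toward edge and texture detail, which is the behavior the claim describes in words.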
5. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 4, wherein the high-frequency component comprises detail information, and the low-frequency component comprises the overall structure and background semantic information of the image.
6. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 5, wherein in S3:
S31, in the large receptive field attention structure submodule, features are modeled with separable large-size convolution kernels, the expressions being: [expressions omitted in source]; where the symbols denote, in order: the intermediate feature map after depthwise separable convolution; the weights of the convolution kernel of a given channel; the convolution operation; the corresponding channel of the input feature map; the feature map after dilated convolution; the original convolution kernel size; the dilation rate; the height of the feature map; and the width of the feature map;
S32, the lightweight feature efficient fusion submodule is constructed with a branched feature fusion structure.
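The idea behind S31, covering a large receptive field with separable and dilated kernels instead of one dense large kernel, can be sketched as follows. This is a generic illustration, not the LSKA submodule itself: the kernels here are fixed rather than learned, the attention multiplication is omitted, and the function names are hypothetical. The one formula relied on is the standard span of a dilated kernel, d(k - 1) + 1.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel of `kernel_size` taps at a given dilation."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Same-length 1-D cross-correlation with zero padding (odd kernel)."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = (k // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            pos = i - half + j * dilation
            if 0 <= pos < len(signal):  # positions outside count as zero
                acc += w * signal[pos]
        out.append(acc)
    return out


def separable_conv2d(image, kernel_h, kernel_v, dilation=1):
    """Apply a horizontal 1-D kernel to every row, then a vertical one."""
    rows = [dilated_conv1d(row, kernel_h, dilation) for row in image]
    cols = [dilated_conv1d(list(col), kernel_v, dilation) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


# A unit impulse shows which positions a dilated 3-tap kernel reaches.
response = dilated_conv1d([0, 0, 0, 1, 0, 0, 0], [1, 1, 1], dilation=2)

# A 5 x 5 impulse image filtered by two 3-tap kernels yields a 3 x 3 block.
impulse = [[1.0 if (r, c) == (2, 2) else 0.0 for c in range(5)] for r in range(5)]
block = separable_conv2d(impulse, [1, 1, 1], [1, 1, 1])
```

Two 1-D passes with k taps each cost about 2k multiplications per output position instead of k² for the dense 2-D kernel, and dilation widens the span further without adding taps.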
7. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 6, wherein in S32:
S321, channel adjustment is performed on the input feature, the expression being: [expression omitted in source]; where the symbols denote, in order: the feature map after channel adjustment; the input feature of the lightweight feature efficient fusion submodule; the channel fusion operation with a convolution kernel of size [size omitted in source]; the number of input channels; and the number of channels;
S322, the features are enhanced layer by layer through a plurality of efficient feature processing units FasterBlock, and cross-layer transfer and fusion of the features is carried out inside the lightweight feature efficient fusion submodule to obtain a high-resolution feature map, the expression being: [expression omitted in source]; where the symbols denote, in order: the output feature of the lightweight feature efficient fusion submodule; the number of FasterBlock units inside the lightweight feature efficient fusion submodule; the index of a FasterBlock unit; the first-part sub-feature and the second-part sub-feature into which the input feature of the given layer is divided along the channel dimension; the convolution operation with a kernel of size [size omitted in source] performed on part of the channels; the feature concatenation operation along the channel dimension; and the channel fusion operation using a convolution kernel of size [size omitted in source].
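The data flow of claim 7 (split the channels, convolve only part of them, pass features across layers, then concatenate) can be sketched as below. Several points are assumptions rather than content of the claim: the fraction of channels convolved (`ratio`), the halving split at the entry, and the stand-in `double` operation that replaces a learned convolution. The channel fusion convolutions before and after the structure are omitted, and all function names are hypothetical.

```python
def partial_conv(channels, conv, ratio=0.25):
    """Apply `conv` to the first share of channels; pass the rest through.

    `ratio` is an assumed hyperparameter; the claim does not state it.
    """
    n_conv = max(1, int(len(channels) * ratio))
    head, tail = channels[:n_conv], channels[n_conv:]
    return [conv(ch) for ch in head] + tail


def c2f_style_fusion(channels, block, n_blocks):
    """Split channels in two, chain `block` on the latest branch, and
    concatenate every branch along the channel axis."""
    half = len(channels) // 2
    branches = [channels[:half], channels[half:]]
    for _ in range(n_blocks):
        branches.append(block(branches[-1]))
    return [ch for branch in branches for ch in branch]


def double(channel):
    """Stand-in for a learned convolution: scale every value by 2."""
    return [2 * v for v in channel]


# Four channels, each holding a single value for readability.
channels = [[1.0], [2.0], [3.0], [4.0]]
partial = partial_conv(channels, double, ratio=0.25)
fused = c2f_style_fusion(
    channels, lambda branch: partial_conv(branch, double, ratio=0.5), n_blocks=2
)
```

With an input split into two branches of c channels each and n blocks, the concatenated result has (2 + n) · c channels, which is why a final channel fusion convolution is needed to bring the width back down.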
8. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 7, wherein in S4, the improved detection model uses the YOLOv S target detection model as its base network, retains the overall framework of the YOLOv S target detection model, and improves the backbone network part and the neck network part.
9. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 8, wherein, for the backbone network part, the 2nd and 4th CBS modules are replaced by partition space enhancement feature modules, so that the feature characterization capability for small-scale targets in remote sensing images is improved through partitioned downsampling and spatial feature enhancement processing, and a frequency domain contrast fusion module is introduced after the SPPF module to perform frequency domain decomposition and contrast fusion processing on the features.
10. The remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion according to claim 9, wherein, for the neck network part, a new feature branch is led out from the 1st C2f module of the backbone network to construct the shallow feature enhancement module.
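The structural changes in claims 9 and 10 can be expressed as edits to a list of layer names. The base layout below is an assumption modeled on a YOLOv8-style backbone (alternating CBS and C2f stages ending in SPPF); the source text only names the model as "YOLOv S" and does not list its layers. The function names are hypothetical.

```python
# Assumed base backbone layout; not given in the source text.
BASE_BACKBONE = [
    "CBS", "CBS", "C2f", "CBS", "C2f", "CBS", "C2f", "CBS", "C2f", "SPPF",
]


def build_improved_backbone(base):
    """Replace the 2nd and 4th CBS modules with PSEF and add PDCF after SPPF."""
    layers = list(base)
    cbs_positions = [i for i, name in enumerate(layers) if name == "CBS"]
    for ordinal in (2, 4):  # 1-based ordinals, as in claim 9
        layers[cbs_positions[ordinal - 1]] = "PSEF"
    layers.insert(layers.index("SPPF") + 1, "PDCF")
    return layers


def lfe_branch_source(layers):
    """Index of the 1st C2f module, from which claim 10 leads out the
    new shallow feature branch toward the neck."""
    return layers.index("C2f")


improved = build_improved_backbone(BASE_BACKBONE)
branch_from = lfe_branch_source(improved)
```

Under this assumed layout the two PSEF modules sit at the second and fourth downsampling stages, and the shallow branch starts at the highest-resolution C2f output, where small targets still occupy several pixels.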

Description

Remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion

Technical Field

The invention relates to the technical field of remote sensing image processing, in particular to a remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion.

Background

A remote sensing image is data obtained by imaging ground surface targets with remote sensing sensors carried by satellites, unmanned aerial vehicles and other platforms. Such images offer wide coverage, high acquisition efficiency and rich spatial and spectral information, and are widely used in fields such as ocean monitoring, disaster early warning and resource investigation. As remote sensing data acquisition capability continues to improve, detecting targets in remote sensing images efficiently and accurately has become an important research direction in remote sensing information processing.

Compared with natural scene images, however, remote sensing images commonly suffer from complex backgrounds, large differences in target scale, many small targets, sparse target distribution, limited resolution and blurred target boundaries. These factors make it considerably harder to distinguish targets from the background, so conventional methods are prone to missed detections and false detections, particularly in complex scenes and small target detection tasks.

Existing remote sensing image target detection methods mainly comprise traditional methods based on hand-crafted features and methods based on deep learning. Traditional methods rely on manually designed features combined with a classifier to perform detection; their feature expression capability is limited, and they adapt poorly to complex backgrounds and multi-scale variation.
With the development of deep learning, target detection methods based on convolutional neural networks have gradually become mainstream. Two-stage methods achieve higher detection accuracy but are computationally complex and insufficiently real-time, while single-stage methods detect faster but still have limited feature expression capability for small targets against complex backgrounds.

Disclosure of Invention

The invention aims to provide a remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion, which addresses problems in existing remote sensing target detection such as weak small-target characterization, easily lost detail and insufficient use of edge texture.

To achieve the above purpose, the invention provides a remote sensing target detection method based on partition space enhancement and frequency domain contrast fusion, comprising the following steps: S1, constructing a partition space enhancement feature module PSEF; S2, constructing a frequency domain contrast fusion module PDCF; S3, constructing a shallow feature enhancement module LFE, which comprises a large receptive field attention structure submodule LSKA and a lightweight feature efficient fusion submodule C2f_efficiency; S4, cooperatively integrating the partition space enhancement feature module, the frequency domain contrast fusion module and the shallow feature enhancement module into a YOLOv S target detection model to obtain an improved detection model, and carrying out remote sensing target detection.
Preferably, in S1: S11, partitioned downsampling is performed on the input feature, dividing it into four first-level sub-features of the same size, the expression being: [expression omitted in source]; where the symbols denote, in order: the first-level sub-features obtained after the partitioned downsampling; the feature input from the upper level; the partitioned downsampling operation; and the first-level sub-feature index; S12, feature extraction and channel information fusion are performed on each of the four first-level sub-features to obtain fusion features; S13, the fusion feature is divided into three second-level sub-features along the channel dimension, the expression being: [expression omitted in source]; where the symbols denote, in order: the second-level sub-features obtained after the fusion feature is divided along the channel dimension; the channel splitting operation; and the second-level sub-feature index after channel division; S14, carrying out enhancement processing on two of the second-level sub-features to obtain two enhanced features, concatenating the remaining second-level sub-feature, which does not take part in the enhancement processing, with the two enhanced features along the channel dimension, and carrying out cross-channel information fusion throug