CN-121982525-A - Multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention


Abstract

The invention discloses a multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention, belonging to the technical field of image processing. The input is processed in sequence by a shallow feature extraction module, a mixed attention module, a multi-scale reconstruction module and an up-sampling module. The mixed attention module dynamically focuses on key regions and models long-range dependence by alternately executing a progressive sparse local attention extraction process and a progressive sparse global attention extraction process. The multi-scale reconstruction module extracts and fuses features of different receptive fields through parallel multi-scale paths, then performs detail enhancement and spectral correlation modeling. Finally, up-sampling yields the super-resolution image. The invention can significantly reduce computational cost, effectively integrate local details and global semantics, enhance cross-scale feature expression, and obtain high-quality, high-fidelity reconstruction results that serve fine-grained agricultural management tasks such as crop growth monitoring and pest and disease identification.

Inventors

LIU DEYANG
WU YIXIANG
LIU YONG
CHEN XIAOFENG
ZHANG HONGYAN
WANG QIJIA
WANG XIANYANG

Assignees

Anqing Normal University (安庆师范大学)
Beidahuang Information Co., Ltd. (北大荒信息有限公司)

Dates

Publication Date: 2026-05-05
Application Date: 2026-01-19

Claims (10)

1. A multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention, characterized in that the method is implemented by a constructed super-resolution reconstruction network comprising, in sequence, a shallow feature extraction module, a mixed attention module, a multi-scale reconstruction module and an up-sampling module, the method comprising: S1, performing initial feature extraction and flattening on an input low-resolution multispectral agricultural remote sensing image through the shallow feature extraction module; S2, inputting the features obtained in S1 into the mixed attention module, which alternately executes a progressive sparse local attention extraction process and a progressive sparse global attention extraction process to iteratively obtain deep features that fuse local details and global semantics; S3, fusing the deep features obtained in S2 with the shallow features obtained in S1, inputting the fused features into the multi-scale reconstruction module, extracting and fusing features of different receptive fields through parallel multi-scale convolution paths, and performing detail enhancement and spectral correlation modeling to obtain multi-scale reconstruction features; S4, inputting the multi-scale reconstruction features obtained in S3 into the up-sampling module to increase the resolution, and outputting a super-resolution multispectral agricultural remote sensing image.
2. The method of claim 1, wherein performing, in S1, initial feature extraction and flattening on the input low-resolution multispectral agricultural remote sensing image through the shallow feature extraction module comprises: extracting the four channels R, G, B and NIR from the low-resolution agricultural remote sensing image, extracting shallow features of the four-channel data using a convolution layer with a 3×3 kernel, flattening the extracted two-dimensional spatial features into three-dimensional tensors, and applying layer normalization.
3. The method of claim 1, wherein the progressive sparse local attention extraction performed by the mixed attention module in S2 comprises: applying layer normalization to the input features; generating query, key and value matrices through linear projection; applying grouped convolution to the value matrix to generate local position codes; dividing the features into windows and computing, within each window, scaled dot-product attention with the local position codes; generating, within each window, a trainable progressive focusing matrix from the query matrix and the key matrix; and applying the trainable progressive focusing matrix to the value matrix to obtain progressive sparse attention features, wherein the parameters of the progressive focusing matrix change dynamically during training.
4. The method of claim 3, wherein the mixed attention module in S2 alternately executes the progressive sparse local attention extraction process and the progressive sparse global attention extraction process, and the progressive sparse global attention extraction process differs from the progressive sparse local attention extraction process in that an interval blocking strategy replaces the continuous blocking strategy when the input features are divided into windows.
5. The method of claim 1, wherein fusing the deep features with the shallow features in S3 before input to the multi-scale reconstruction module comprises: applying a 3×3 convolution to the deep features and adding the result to the shallow features element by element.
6. The method of claim 1, wherein extracting and fusing, in S3, features of different receptive fields by the multi-scale reconstruction module through parallel multi-scale convolution paths comprises: feeding the input features simultaneously into at least three parallel convolution paths, wherein the first path uses a 1×1 convolution, the second path uses a 3×3 convolution with a stride of 2 followed by an up-sampling layer, and the third path uses a 3×3 convolution with a stride of 2 followed by an up-sampling layer; and concatenating the feature maps output by the paths and fusing them by convolution.
7. The method of claim 6, wherein the detail enhancement performed by the multi-scale reconstruction module in S3 comprises: processing the features obtained after convolutional fusion through a convolution layer group with a residual connection structure, wherein the convolution layer group comprises two serially connected 3×3 convolution layers, and the output of the convolution layer group is added to its input element by element.
8. The method of claim 7, wherein the spectral correlation modeling performed by the multi-scale reconstruction module in S3 comprises: grouping the detail-enhanced features by channel, applying global average pooling to each group of features, computing inter-channel attention weights through a fully connected layer, and reweighting the original features with the attention weights.
9. A computer terminal device, comprising: one or more processors; and a memory coupled to the one or more processors and storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the steps of the method of any of claims 1-8.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1-8.
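
As an illustration of the window division in claims 3 and 4, the following NumPy sketch contrasts a continuous blocking strategy with an interval blocking strategy and applies plain scaled dot-product attention inside each window. It is a minimal sketch under stated assumptions, not the claimed implementation: the function names, the (H, W, C) layout and the window size are hypothetical, the query, key and value are taken to be the features themselves rather than linear projections, and the local position codes and the trainable progressive focusing matrix are omitted because the claims do not give their formulas.

```python
import numpy as np


def continuous_windows(x, w):
    """Continuous blocking: each window holds a w x w patch of adjacent pixels.

    x has shape (H, W, C); the result has shape (num_windows, w*w, C).
    """
    H, W, C = x.shape
    assert H % w == 0 and W % w == 0, "H and W must be multiples of w"
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)


def interval_windows(x, w):
    """Interval blocking: each window holds w x w pixels sampled at a fixed
    stride, so a single window spans the whole feature map.
    """
    H, W, C = x.shape
    assert H % w == 0 and W % w == 0, "H and W must be multiples of w"
    sh, sw = H // w, W // w  # sampling interval along each axis
    x = x.reshape(w, sh, w, sw, C)
    # One window per (row offset, column offset) pair.
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, w * w, C)


def window_attention(q, k, v):
    """Standard scaled dot-product attention, computed independently per window.

    Assumption: no position codes and no focusing matrix are applied here.
    q, k, v have shape (num_windows, tokens, C).
    """
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


# Toy 4 x 4 single-channel feature map whose values equal the pixel index.
x = np.arange(16, dtype=float).reshape(4, 4, 1)
local = continuous_windows(x, 2)   # first window: pixels 0, 1, 4, 5
spread = interval_windows(x, 2)    # first window: pixels 0, 2, 8, 10
local_out = window_attention(local, local, local)
spread_out = window_attention(spread, spread, spread)
```

In this toy case both strategies produce four windows of four tokens each, so the attention cost per window is the same; only the spatial reach of each window changes, which is how the interval strategy can relate distant pixels without attending over the full image.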

Description

Multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention

Technical Field

The invention belongs to the technical field of image processing, and particularly relates to a multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention.

Background

High-resolution multispectral agricultural remote sensing images are a key data source for fine-grained agricultural management tasks such as crop growth monitoring and pest and disease identification. However, owing to objective constraints such as sensor cost and the physical diffraction limit, directly acquired remote sensing images often have insufficient spatial resolution and cannot meet the requirements of high-precision analysis of ground-object detail. Remote sensing image super-resolution reconstruction techniques have therefore been developed to recover high-resolution images from low-resolution ones. Early approaches based on conventional methods such as interpolation had limited reconstruction capability. In recent years, deep-learning-based super-resolution methods have made remarkable progress thanks to their strong feature learning capability, and are gradually being applied to remote sensing image processing.

Existing deep-learning-based super-resolution reconstruction methods mainly follow two architectures: one based on convolutional neural networks, which extracts features through stacked convolutional layers and then upsamples, and one based on the Transformer, which models dependencies within the image using a self-attention mechanism. These methods perform well on general natural images, but still face several fundamental challenges when processing agricultural multispectral remote sensing images, which have their own distinctive properties.

First, the computational overhead is particularly high. Agricultural multispectral data typically contains multiple spectral channels, so its dimensionality and information content are much higher than those of common RGB images. As a result, existing complex network models, particularly Transformer-type models that rely on global interactions, face significant computational and memory pressure during processing. The high training and inference cost severely restricts deployment of such models in practical agricultural applications and makes it difficult to meet the strict timeliness requirements of scenarios such as field monitoring. Second, the receptive field of existing methods is limited, so long-range dependence and large-scale structures in agricultural scenes are difficult to model effectively. Macroscopic features such as the regular textures of contiguous farmland and the continuous contours of large ditches or shelter forests are widespread in agricultural remote sensing images. However, existing methods rely mostly on local operations and lack an efficient way to model the global semantic consistency and macrostructure of an image, so the reconstructed images lack overall structural consistency. Finally, insufficient multi-scale feature capture in agricultural scenes is also a key drawback. A typical agricultural remote sensing image simultaneously contains ground objects of different scales, such as farmland, crops and ditches. Traditional super-resolution methods often adopt a single-scale feature extraction strategy, which makes it difficult to balance the restoration of local crop details against the reconstruction of global farmland patterns and limits their practicality in multi-level agricultural analysis tasks.
Solving the above problems is difficult because reducing computational complexity often comes at the expense of model expressiveness, while expanding the receptive field and capturing multi-scale features tends to further increase the computational burden. How to achieve efficient, high-fidelity reconstruction under limited resources is therefore a long-standing technical problem.

Disclosure of Invention

To solve the above technical problems in the prior art, the invention provides a multi-scale agricultural remote sensing image super-resolution reconstruction method with progressive sparse attention. To achieve this object, the method is implemented by a constructed super-resolution reconstruction network comprising, in sequence, a shallow feature extraction module, a mixed attention module, a multi-scale reconstruction module and an up-sampling module; S1, performing initial feature extraction and