CN-121998830-A - High-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion

Abstract

The invention discloses a high-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion, and belongs to the technical field of remote sensing image processing. To address the problems of existing spatiotemporal fusion models, including insufficient prior constraints, distortion of spatial structure, and low reconstruction accuracy in heterogeneous landscapes, the method takes SwinTransformer as the spatiotemporal encoding backbone, constructs independent multi-scale LUCC and DEM prior encoding branches, injects land-cover semantic and terrain information stage by stage as conditions during decoding through a feature refinement adapter (FRA), and adopts SpatialMixingMLP to strengthen the spatial inductive bias. The invention effectively suppresses erroneous migration across land-cover types, boundary drift, and over-smoothing, and markedly improves NDVI reconstruction accuracy, spatial consistency, and ecological plausibility in fragmented patches and complex terrain. The method is applicable to multi-source remote sensing data such as Gaofen, Landsat, and MODIS, can be used to generate NDVI time-series products with high spatial and temporal resolution in batches, and provides stable and reliable data support for ecological monitoring, wetland protection, coastal zone management, and similar applications.

Inventors

CHEN ZIYING
YAN FENGQIN
MAO YUJIE
SU FENZHEN

Assignees

中国科学院地理科学与资源研究所 (Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences)

Dates

Publication Date: 20260508
Application Date: 20260401

Claims (10)

1. A high-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion, characterized by comprising the following steps: S1, acquiring and preprocessing high-spatial-resolution remote sensing images and low-spatial-, high-temporal-resolution remote sensing images, and computing the corresponding time-series NDVI data; S2, constructing spatiotemporal fusion input samples consisting of the high- and low-resolution NDVI at a reference time and the low-resolution NDVI at a target time, and taking the high-resolution NDVI at the target time as the supervision label; S3, preparing the LUCC land-cover data and DEM terrain data of the study area as static spatial priors, and spatially registering them with the samples and unifying their dimensions; S4, performing multi-scale spatiotemporal feature encoding on the NDVI samples with SwinTransformer as the main encoder, wherein the feed-forward network of the encoder adopts a SpatialMixingMLP structure; S5, constructing independent prior encoding branches that encode the LUCC and the DEM at multiple scales, respectively, to obtain semantic features and terrain features aligned with the pyramid scales of the image features; S6, during the stage-by-stage upsampling of the decoder, conditionally injecting the same-scale prior features into the decoding features through a feature refinement adapter FRA, thereby imposing stage-by-stage spatial constraints; S7, outputting the high-resolution NDVI at the target time through multi-scale upsampling decoding, and completing model training and long time-series product generation with a multi-constraint joint loss.
2. The method of claim 1, wherein the SpatialMixingMLP comprises, in sequence, a 1×1 convolution for channel expansion, a 7×7 depthwise separable convolution, channel shuffling, a 1×1 convolution for channel reduction, and GELU activation, which together strengthen the spatial inductive bias and improve the representation of boundaries and fragmented patches.
3. The method of claim 1, wherein the independent prior encoding branches comprise a LUCC semantic encoding branch and a DEM terrain encoding branch arranged in parallel, which output a multi-scale semantic feature pyramid and a multi-scale terrain feature pyramid, respectively, the features at each scale being strictly aligned in size with the image encoding features.
4. The method of claim 1, wherein the feature refinement adapter FRA performs the following steps: (1) concatenating the same-scale LUCC semantic features and DEM terrain features along the channel dimension, and applying a 3×3 convolution, GroupNorm normalization, and GELU activation to obtain fused prior features; (2) generating a channel-level scaling parameter γ and an offset parameter β through two layers of 1×1 convolution; (3) applying an affine transformation to modulate the decoded features, $F_{\mathrm{out}} = \gamma \otimes F_{\mathrm{dec}} + \beta$, where $\otimes$ denotes element-wise multiplication.
5. The method of claim 1, wherein the decoder employs, at each upsampling stage, a SpatialMixingMLP structure consistent with that of the encoder, so as to preserve the spatial detail of land-cover edges, textures, and fragmented vegetation patches.
6. The method of claim 1, wherein the multi-constraint joint loss used for training comprises an L1 loss or Charbonnier loss for NDVI value reconstruction, an SSIM loss or gradient loss for spatial structure, and weighting coefficients applied in LUCC class boundary regions to improve edge accuracy.
7. The method of claim 1, wherein the LUCC data are resampled with nearest-neighbor interpolation to preserve class semantics, and the DEM data are resampled with linear interpolation to preserve continuous terrain gradients.
8. The method of claim 1, wherein model inference takes the high- and low-resolution NDVI at the reference time and the low-resolution NDVI at the target time as inputs, together with the LUCC and the DEM as prior inputs, and obtains the high-resolution NDVI at the target date through block-wise inference and seamless stitching.
9. An NDVI reconstruction system based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion, characterized by comprising a data preprocessing module, a sample construction module, a prior preparation module, a SwinTransformer encoding module, an independent prior encoding module, an FRA conditional injection module, a multi-scale decoding module, and a training and inference module; the system performs the method of any of claims 1-8.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1-8.
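
The affine modulation in step (3) of claim 4 and the boundary-weighted Charbonnier option of claim 6 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The function names `fra_modulate` and `boundary_weighted_charbonnier`, the nested-list `[C][H][W]` layout, the default `boundary_weight` and `eps` values, and the normalization by the sum of weights are all hypothetical choices. γ and β are taken as given, with the same shape as the decoded features, rather than produced by the convolutions of steps (1) and (2).

```python
import math


def fra_modulate(f_dec, gamma, beta):
    """Element-wise affine modulation F_out = gamma * F_dec + beta.

    All three arguments are nested lists shaped [C][H][W] (assumed layout).
    """
    return [
        [
            [g * f + b for f, g, b in zip(f_row, g_row, b_row)]
            for f_row, g_row, b_row in zip(f_ch, g_ch, b_ch)
        ]
        for f_ch, g_ch, b_ch in zip(f_dec, gamma, beta)
    ]


def boundary_weighted_charbonnier(pred, target, boundary_mask,
                                  boundary_weight=2.0, eps=1e-3):
    """Weighted mean of the Charbonnier penalty sqrt(d^2 + eps^2).

    pred, target and boundary_mask are [H][W] nested lists. Pixels flagged in
    boundary_mask (LUCC class boundaries) get boundary_weight, others get 1.0.
    Both default values are illustrative assumptions.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, m_row in zip(pred, target, boundary_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            w = boundary_weight if m else 1.0
            total += w * math.sqrt((p - t) ** 2 + eps ** 2)
            weight_sum += w
    return total / weight_sum


# Toy example: one channel, 2x2 decoded feature map.
f_dec = [[[0.5, -0.2], [0.1, 0.0]]]
gamma = [[[2.0, 1.0], [1.0, 0.5]]]
beta = [[[0.1, 0.0], [0.0, -0.1]]]
f_out = fra_modulate(f_dec, gamma, beta)

# Toy loss: the only error sits on a boundary pixel, so it is up-weighted.
loss = boundary_weighted_charbonnier(
    pred=[[0.0, 0.0]], target=[[1.0, 0.0]], boundary_mask=[[1, 0]]
)
print(f_out, loss)
```

With γ fixed at 1 and β at 0 the modulation reduces to the identity, so the priors only alter the decoded features where the learned parameters depart from those values. In the loss, the same reconstruction error contributes more when it falls on a LUCC class boundary pixel than when it falls elsewhere.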

Description

High-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion

Technical Field

The invention belongs to the technical field of remote sensing image processing and intelligent analysis of geographic information, and particularly relates to a high-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion.

Background

With the rapid development of remote sensing Earth observation technology, vegetation indices with high spatial and temporal resolution have become core basic data for ecological monitoring, wetland protection, coastal zone management, fine-scale agricultural observation, and related fields. In practice, remote sensing data typically fall into two classes. Data with high spatial but low temporal resolution (such as Gaofen-6 PMS and the Landsat series) clearly capture surface detail but have long revisit periods and sparse time series. Data with low spatial but high temporal resolution (such as MODIS and Gaofen-6 WFV) continuously capture temporal changes in vegetation but lack spatial detail and characterize fragmented landscapes poorly. Spatiotemporal fusion (STF) generates NDVI sequences with both high spatial and high temporal resolution by combining high- and low-resolution data, and is the mainstream technical route for resolving this trade-off. Existing spatiotemporal fusion reconstruction methods for the normalized difference vegetation index (NDVI) fall into three main categories: traditional weighted fusion, time-series fitting enhancement, and deep learning fusion.
Early methods, represented by STARFM and ESTARFM, rely on assumptions of spatial similarity and temporal consistency and perform spatiotemporal interpolation of NDVI through weighted migration. They reflect temporal change well in homogeneous vegetation areas, but in heterogeneous landscapes with interleaved land-cover types and complex terrain they are prone to erroneous migration across land-cover types, boundary blurring, and distortion of spatial structure. NDVI-specific models such as STVIFM, SSFIT, and STFSR strengthen temporal continuity and spatial coupling during reconstruction, which improves temporal smoothness and the recovery of spatial structure to some extent, but their reliance on hand-designed rules limits their ability to generalize to complex surface processes. Fusion-fitting methods combine spatiotemporal fusion with filters such as Savitzky-Golay, improving sequence stability under cloud contamination and missing observations, but they constrain boundaries and fragmented patches only weakly. In recent years, deep learning spatiotemporal fusion methods based on convolutional neural networks and residual networks have markedly improved reconstruction accuracy by learning the nonlinear mapping between high- and low-resolution NDVI end to end.
However, existing deep models still have clear shortcomings. First, they depend excessively on the statistical characteristics of the data and lack explicit constraints from geographic and ecological prior knowledge, so implausible spatial structure and distorted ecological meaning readily arise in areas where land-cover types are highly mixed, such as coastal zones and delta wetlands. Second, auxiliary information is mostly fed in by simple concatenation, without a multi-scale alignment and conditional constraint mechanism, so prior knowledge can hardly guide the reconstruction process effectively. Third, pure Transformer models have a weak spatial inductive bias and recover fine structures such as tidal flat edges, aquaculture pond boundaries, and narrow vegetation strips poorly, so over-smoothing and boundary drift readily occur.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a high-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion. The method takes SwinTransformer as the spatiotemporal encoding backbone, constructs independent multi-scale prior encoding branches, injects land-cover semantic and terrain information stage by stage as conditions during decoding through a Feature Refinement Adapter (FRA), and adopts SpatialMixingMLP to strengthen the inductive bias toward spatial structure. It thereby improves the spatial consistency, boundary stability, and ecological plausibility of NDVI reconstruction in heterogeneous landscapes, and addresses the missing prior constraints, distorted spatial structure, and poor recovery of fragmented regions found in existing methods. The technical scheme adopted to solve the technical problems is as follows: A high-resolution NDVI reconstruction method based on LUCC-DEM spatial priors and SwinTransformer spatiotemporal fusion comprises