CN-121033675-B - Remote sensing image water body and shadow classification method based on double-classification attention network

Abstract

The invention discloses a remote sensing image water body and shadow classification method based on a dual-classification attention network, relating to the technical field of remote sensing image analysis and geographic information processing. The method comprises: preprocessing a Sentinel-2 original image to generate a multiband composite image at a preset resolution; calculating a normalized water body index and extracting a brightness channel to generate a water body mask and a shadow mask respectively; constructing a sample set and dividing it into a training set and a verification set; constructing a dual-classification attention network model based on the training set and optimizing the model; and performing sliding-window inference on the multiband composite image with the trained model, fusing and stitching the predicted binary masks, and outputting a dual-band water body/shadow binary mask image. The method avoids much of the manual labeling work required by traditional methods, effectively overcomes the interference of shadow areas over ice and snow with the water body recognition result, and shows better robustness and stability.
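
The mask-generation step summarized above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the threshold values `ndwi_thr` and `v_thr`, the small `eps` guard against division by zero, and the assumption that reflectances are scaled to the 0 to 1 range are all hypothetical. The sketch uses the fact that the V channel of HSV equals the per-pixel maximum of R, G and B, and it marks positive pixels with 255 as the claims do.

```python
import numpy as np

def ndwi(green, nir, eps=1e-6):
    """Normalized water index (green - nir) / (green + nir); eps is an assumed guard."""
    green = green.astype(np.float32)
    nir = nir.astype(np.float32)
    return (green - nir) / (green + nir + eps)

def brightness(rgb):
    """HSV value channel: the per-pixel maximum over the R, G, B axis."""
    return rgb.max(axis=-1)

def make_masks(green, nir, rgb, ndwi_thr=0.2, v_thr=0.15):
    """Return (water, shadow) uint8 masks with 255 = positive.

    ndwi_thr and v_thr are hypothetical empirical thresholds.
    """
    water = (ndwi(green, nir) > ndwi_thr).astype(np.uint8) * 255
    shadow = (brightness(rgb) < v_thr).astype(np.uint8) * 255
    return water, shadow
```

In practice both thresholds would be tuned per scene, since the source describes them only as empirical.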

Inventors

  • SHEN YI
  • WANG YI
  • CHEN QIANQIAN
  • ZHU TINGTING
  • XU LEI
  • CHEN YAO
  • WEI JIANXIONG

Assignees

  • 南京市计量监督检测院
  • 南京工业大学

Dates

  • Publication Date: 2026-05-08
  • Application Date: 2025-09-10

Claims (6)

  1. A remote sensing image water body and shadow classification method based on a dual-classification attention network, characterized by comprising the following steps:
     preprocessing the Sentinel-2 original image by nearest-neighbor interpolation to generate a multiband composite image at a preset resolution;
     calculating a normalized water body index and extracting a brightness channel based on the multiband composite image, and generating a water body mask and a shadow mask respectively;
     wherein calculating the normalized water body index and extracting the brightness channel based on the multiband composite image comprises: extracting the surface reflectance of the green band and the near-infrared band from the multiband composite image and calculating the normalized water body index $NDWI$ by the formula $NDWI = (\rho_{Green} - \rho_{NIR}) / (\rho_{Green} + \rho_{NIR})$, where $\rho_{Green}$ and $\rho_{NIR}$ are the surface reflectances of the green band and the near-infrared band respectively; binarizing the normalized water body index with an empirical threshold to generate the water body mask; extracting the red, green and blue bands of the multiband composite image, converting them to HSV color space and extracting the brightness channel $V$; and generating a preliminary shadow mask $M_s$, expressed as $M_s(x, y) = 1$ if $V(x, y) < T_V$ and $M_s(x, y) = 0$ otherwise, where $M_s(x, y) = 1$ indicates that position $(x, y)$ is determined to be a shadow region and $T_V$ is the brightness threshold, thereby obtaining the shadow mask;
     constructing a sample set containing the multiband composite image, the water body mask and the shadow mask, and dividing the sample set into a training set and a verification set;
     constructing a dual-classification attention network model based on the training set, and performing optimization training on the model;
     performing sliding-window inference on the multiband composite image with the trained dual-classification attention network model, fusing and stitching the predicted binary masks, and outputting a dual-band water body/shadow binary mask image;
     wherein the sliding-window inference on the multiband composite image with the trained model comprises:
     scanning the cropped multiband composite image in blocks with a block size of 512 × 512 pixels, and calculating for each band the global minimum and maximum over the whole image;
     traversing the whole image in the row and column directions with a window size of 512 × 512 pixels and an overlap of 64 pixels, wherein for any window with upper-left corner coordinate $(x_0, y_0)$, the original pixel values of that area in the three bands are first read and stored as a three-dimensional array with 3 channels, height $h$ and width $w$;
     normalizing all pixels within the window band by band, expressed as $x'_b(i, j) = (x_b(i, j) - \min_b) / (\max_b - \min_b)$, where $x'_b(i, j)$ is the normalized pixel value, $x_b(i, j)$ is the original value of pixel $(i, j)$ within the window in band $b$, and $\min_b$ and $\max_b$ are the minimum and maximum of band $b$ over the full image;
     organizing the normalized pixel values into a three-dimensional tensor of height $h$ pixels, width $w$ pixels and 3 channels as the model input;
     taking the generated three-dimensional tensor as the network input, the model simultaneously outputting a water body probability map $P_w$ and a shadow probability map $P_s$, where $P_w$ represents the probability that each pixel is predicted as water body and $P_s$ represents the probability that each pixel is predicted as shadow;
     binarizing the water body probability map and the shadow probability map respectively to generate the window-level water body and shadow binary masks $B_w$ and $B_s$;
     writing the water body and shadow binary masks generated by each window back to the corresponding position in the output file according to the upper-left corner coordinate; where windows overlap, applying a maximum-value strategy to the pixel values of the overlapping area, so that if several windows output different values at the same original image coordinate, the maximum value is taken as the result for the target pixel; and, after all windows have been written back and fused, obtaining the dual-band binary mask image.
  2. The remote sensing image water body and shadow classification method based on a dual-classification attention network according to claim 1, wherein constructing a sample set containing the multiband composite image, the water body mask and the shadow mask comprises:
     synchronously sliding and cropping the multiband composite image, the water body mask and the shadow mask with a fixed window of 256 × 256 pixels, with no overlap between rows or columns, edge areas smaller than 256 × 256 being discarded directly without padding;
     for each cropping window, generating a group of three files: a 256 × 256 × 3 multiband composite image sub-block, the corresponding 256 × 256 water body binary mask sub-block, and the corresponding 256 × 256 shadow binary mask sub-block;
     traversing all sub-blocks obtained, marking a sub-block as a water body sample if its water body mask contains at least one pixel with the value 255, marking it as a shadow sample if its shadow mask contains at least one pixel with the value 255, and marking it as both a water body sample and a shadow sample if both conditions are met, the water body and shadow labels being retained separately;
     randomly dividing the screened water body samples and shadow samples into a training set and a verification set at a ratio of 8:2.
  3. The remote sensing image water body and shadow classification method based on a dual-classification attention network according to claim 2, wherein constructing a dual-classification attention network model based on the training set comprises:
     using a four-layer downsampling encoder, each layer comprising in sequence a 3 × 3 convolution, with the number of output channels doubling layer by layer starting from 64, a ReLU activation function, a dropout layer with probability 0.2, a 3 × 3 convolution, a ReLU activation function, and 2 × 2 max pooling that downsamples to the next layer;
     executing four parallel branches on the output feature map of the last encoder layer as an atrous spatial pyramid pooling (ASPP) bottleneck module, namely one 1 × 1 convolution with ReLU activation and three 3 × 3 convolutions with dilation rates of 6, 12 and 18 respectively, each followed by ReLU activation;
     constructing a water body branch decoder and a shadow branch decoder respectively, and performing bilinear upsampling on the output of the ASPP bottleneck module;
     concatenating the upsampled features with the skip-connection features of the corresponding encoder layer along the channel dimension;
     feeding the concatenation result through a CBAM channel attention module, a CBAM spatial attention module and an attention gating module in sequence;
     concatenating the weighted features of the attention gating module with the upsampled features, and applying two 3 × 3 convolutions to restore the channel count of the current decoder level;
     generating a single-channel water body probability map at the end of the water body branch through a 1 × 1 convolution and Sigmoid activation;
     generating a single-channel shadow probability map at the end of the shadow branch through a 1 × 1 convolution and Sigmoid activation, and adding a mutual exclusion loss to suppress spatial overlap between water body and shadow, defined as $L_{mutex} = \lambda \sum_{i} P_w(i)\, P_s(i)$, where $L_{mutex}$ is the mutual exclusion loss, $P_w(i)$ and $P_s(i)$ are the probabilities output for pixel $i$ by the water body branch and the shadow branch respectively, and $\lambda$ is the mutual exclusion loss weight coefficient.
  4. The remote sensing image water body and shadow classification method based on a dual-classification attention network according to claim 3, wherein the optimization training of the dual-classification attention network model comprises:
     using a custom data loader to load the multiband images in the training set and the corresponding water body and shadow binary masks into the model in batches, with the input size fixed at 256 × 256;
     in the model compilation stage, configuring the loaded attention U-Net network with the Adam optimizer, updating parameters with an initial learning rate of $1 \times 10^{-4}$, defining the loss function of each of the water body branch and the shadow branch as the sum of binary cross-entropy and Dice loss, and setting the evaluation metric of each branch to classification accuracy;
     using a model checkpoint callback to automatically save the weights from the round with the lowest verification-set loss, performing forward inference on the complete verification set after each training round, and calculating and recording the accuracy, recall, F1 score and IoU of the water body branch and the shadow branch to evaluate the generalization performance of the model;
     when the verification-set loss has not decreased for three consecutive rounds, terminating training early and automatically loading the weight file with the best verification loss.
  5. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the remote sensing image water body and shadow classification method based on a dual-classification attention network according to any one of claims 1 to 4.
  6. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the remote sensing image water body and shadow classification method based on a dual-classification attention network according to any one of claims 1 to 4.
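
The sliding-window inference of claim 1 (global per-band min-max normalization, 512 × 512 windows with a 64-pixel overlap, maximum-value fusion in overlaps) can be sketched as follows. This is an illustrative reading rather than the patented code: `predict` is a hypothetical stand-in for the trained model, the binarization threshold `thr=0.5` is an assumption because the source formula did not survive, and shifting the last window back so that it ends at the image border is one possible edge policy that the claim does not specify.

```python
import numpy as np

def window_starts(size, win=512, overlap=64):
    """Start offsets covering [0, size); the last window is shifted back to fit (assumed policy)."""
    if size <= win:
        return [0]
    stride = win - overlap
    starts = list(range(0, size - win + 1, stride))
    if starts[-1] != size - win:
        starts.append(size - win)
    return starts

def sliding_window_inference(image, predict, win=512, overlap=64, thr=0.5):
    """image: (C, H, W) array; predict: callable mapping an (h, w, C) tensor
    to (p_water, p_shadow), each of shape (h, w). Returns a (2, H, W) uint8 mask."""
    c, height, width = image.shape
    # Global per-band minimum and maximum over the full image.
    lo = image.reshape(c, -1).min(axis=1).astype(np.float32)
    hi = image.reshape(c, -1).max(axis=1).astype(np.float32)
    scale = np.where(hi > lo, hi - lo, np.float32(1.0))  # avoid division by zero
    out = np.zeros((2, height, width), dtype=np.uint8)
    for y0 in window_starts(height, win, overlap):
        for x0 in window_starts(width, win, overlap):
            block = image[:, y0:y0 + win, x0:x0 + win].astype(np.float32)
            norm = (block - lo[:, None, None]) / scale[:, None, None]
            tensor = np.transpose(norm, (1, 2, 0))  # channels last, as model input
            p_water, p_shadow = predict(tensor)
            masks = np.stack([p_water > thr, p_shadow > thr]).astype(np.uint8) * 255
            h, w = masks.shape[1:]
            region = out[:, y0:y0 + h, x0:x0 + w]  # view into the output array
            np.maximum(region, masks, out=region)  # maximum-value fusion in overlaps
    return out
```

Because the normalization uses full-image statistics rather than per-window statistics, the same surface produces the same input value in every window that covers it, which keeps predictions in overlapping areas comparable before the maximum is taken.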

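The sample-set construction of claim 2 can be illustrated with a short sketch: non-overlapping 256 × 256 tiles, incomplete edge tiles discarded, and each tile labeled by whether its masks contain any pixel equal to 255. Two points are assumptions on my part: tiles carrying neither label are dropped (one reading of "screened" samples), and a single shuffled list is split 8:2, whereas the claim may intend separate splits for water body and shadow samples. The function and key names are hypothetical.

```python
import numpy as np

def tile_samples(image, water, shadow, tile=256):
    """image: (H, W, 3); water, shadow: (H, W) masks with 255 = positive.
    Returns labeled tiles; edge regions smaller than the tile are discarded."""
    height, width = water.shape
    samples = []
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            w = water[y:y + tile, x:x + tile]
            s = shadow[y:y + tile, x:x + tile]
            is_water = bool((w == 255).any())
            is_shadow = bool((s == 255).any())
            if is_water or is_shadow:  # assumed screening rule
                samples.append({
                    "image": image[y:y + tile, x:x + tile],
                    "water": w,
                    "shadow": s,
                    "is_water": is_water,
                    "is_shadow": is_shadow,
                })
    return samples

def split_samples(samples, train_ratio=0.8, seed=0):
    """Random 8:2 split into (training, verification) lists."""
    order = np.random.default_rng(seed).permutation(len(samples))
    cut = int(round(train_ratio * len(samples)))
    train = [samples[i] for i in order[:cut]]
    val = [samples[i] for i in order[cut:]]
    return train, val
```

A fixed `seed` keeps the split reproducible between runs, which matters when verification metrics are compared across training configurations.
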
Description

Remote sensing image water body and shadow classification method based on double-classification attention network

Technical Field

The invention relates to the technical field of remote sensing image analysis and geographic information processing, in particular to a remote sensing image water body and shadow classification method based on a dual-classification attention network.

Background

Given the extremely special geographical environment, satellite remote sensing has become a key means of monitoring the dynamic changes of ice lakes. Extracting ice lakes is essentially a matter of identifying water bodies on the surface of the ice sheet, and from the perspective of remote sensing image methods the approaches fall into two types: manual digitization and automated methods. Manual digitization allows fine pixel-by-pixel labeling, but it is time-consuming and tedious, is easily influenced by the subjective experience of the interpreter, lacks labeling consistency and scalability, and cannot meet the needs of large-scale, long-time-series monitoring. Automated methods are more efficient than manual work, but their suppression of false detections is limited by spectral confusion between cloud shadow and highly reflective rock areas, detections are still missed where rock and shadow interleave, and the false detection of rock and shadow against complex ice and snow backgrounds has no fundamental solution. In addition, existing automatic algorithms often depend on a high-precision Digital Elevation Model (DEM) or on complex physical and statistical models, and are difficult to extend to long-time-series, large-scale automated polar monitoring.
In recent years, the multi-scale encoder-decoder architecture and skip connections of deep Convolutional Neural Networks (CNN) have made it possible to fuse shallow spatial detail with deep semantic features during end-to-end training and to learn automatically to distinguish water bodies from background interference, which significantly improves segmentation accuracy and stability in the batch processing of multi-temporal, large-scale remote sensing images. However, many challenges remain in Antarctic surface water identification: high-quality annotated deep learning datasets are not yet complete, which limits the generalization ability of models and their sustainable use in long-term monitoring. To address the confusion between water bodies, shadows and highly reflective areas against complex polar backgrounds, a U-Net architecture based on a dual-classification attention mechanism is proposed. By introducing channel and spatial attention modules into the encoder-decoder network and explicitly separating shadow as an independent category, the network learns multi-scale features of both targets, water body and shadow, simultaneously during training, effectively suppresses false detections in highly reflective rock areas and shadow areas, and outputs water body and shadow binary masks that do not interfere with each other during inference.
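
The training objective named in claims 3 and 4 combines, for each branch, binary cross-entropy and Dice loss, and adds a mutual exclusion term that discourages a pixel from scoring high as both water body and shadow. The sketch below is a NumPy illustration under stated assumptions: the exact mutual exclusion formula did not survive in the source text, so the weighted sum of per-pixel products used here is an assumed form, and the weight `lam`, the Dice smoothing constant `smooth` and the clipping constant `eps` are hypothetical values.

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and 0/1 labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def dice_loss(p, y, smooth=1.0):
    """1 - Dice coefficient, with an assumed smoothing constant."""
    inter = np.sum(p * y)
    return float(1.0 - (2.0 * inter + smooth) / (np.sum(p) + np.sum(y) + smooth))

def mutual_exclusion(p_water, p_shadow, lam=0.1):
    """Assumed form: lam * sum_i P_w(i) * P_s(i); zero when the maps never overlap."""
    return float(lam * np.sum(p_water * p_shadow))

def total_loss(p_water, y_water, p_shadow, y_shadow, lam=0.1):
    """Sum of (BCE + Dice) for both branches plus the mutual exclusion term."""
    water_term = bce(p_water, y_water) + dice_loss(p_water, y_water)
    shadow_term = bce(p_shadow, y_shadow) + dice_loss(p_shadow, y_shadow)
    return water_term + shadow_term + mutual_exclusion(p_water, p_shadow, lam)
```

The product term is largest where both branches are confident at the same pixel, so minimizing it pushes the two probability maps apart without forcing either branch toward a particular class.
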
The method not only segments water bodies accurately where shadow and highly reflective rock interleave, reducing false and missed detections, but also distinguishes shadow explicitly through its two output branches, which significantly enhances the robustness of the model in complex polar environments. The network adaptively weights features at different scales to extract water body edges and shadow edges finely. The method can also generate a training set containing prior labels for both water body and shadow, which eases the shortage of labeled polar samples and improves the generalization ability of the model. More importantly, it supports large-scale, long-time-series automated batch processing, providing an efficient and scalable technical means for dynamic monitoring of Antarctic surface water bodies.

Disclosure of Invention

The invention is provided in view of the problems of existing remote sensing image water body and shadow classification methods. Therefore, the problem to be solved by the present invention is how to provide a remote sensing image water body and shadow classification method based on a dual-classification attention network. In order to solve the above technical problems, the invention provides the following technical scheme: In a first aspect, the invention provides a remote sensing image water body and shadow classification method based on a dual-classification attention network, which comprises the steps of preprocessing a Sentinel-2 original image by nearest-neighbor interpolation to generate a multiband composite image at a preset resolution; Calculating a normalized water index a