CN-121725372-B - Physical prior-based segmented multi-source shoreline extraction and calibration method and system


Abstract

The invention discloses a physical-prior-based segmented multi-source shoreline extraction and calibration method and system, relating to the technical field of remote sensing image processing and computer vision. At the data level, the method integrates consistency evaluation and reliability measurement of multi-source surface water products and uses water body persistence to construct a high-quality training sample set. At the model level, it develops a feature extraction architecture that fuses physical prior guidance with geometrically adaptive operators, and constructs a loss function that combines multidimensional physical constraints with noise tolerance. Together these achieve pixel-level accurate shoreline identification and topologically rigorous closed-loop quality calibration on large-scale remote sensing images.

Inventors

GAO YONGNIAN
SUN YONGQI
GUO XIAOYANG
ZHOU QIAN

Assignees

Hohai University (河海大学)

Dates

Publication Date: 20260512
Application Date: 20260214

Claims (8)

1. A physical-prior-based segmented multi-source shoreline extraction and calibration method, characterized by comprising the following steps: preprocessing original Landsat images to generate a cloud-free, spectrally stable annual composite base map of true surface reflectance; extracting base bands from the annual composite base map and calculating its NDWI from the selected base bands, thereby obtaining a five-channel multidimensional feature cube; constructing a multi-source consistency probability map (MCP) that serves as a confidence prior for subsequent Shore-SegFormer model training; superimposing the generated initial shoreline binary label map on the annual composite map to generate initial labels and, after sub-pixel-level topology verification and geometric alignment, performing synchronized cropping together with a sample classification and screening mechanism to obtain a mixed training set containing land-water boundary features; wherein the Shore-SegFormer model uses a hierarchical Transformer encoder as its backbone feature extraction network, the first layer of the encoder uses an overlapped patch embedding module to accept the five-channel multidimensional feature cube containing the spectral bands and the normalized difference water index, the NDWI and the boundary candidate prior feature map guide the neurons to attend in advance to the high-frequency region at the land-water junction, four levels of multiscale feature maps are output stage by stage through hierarchical multi-head self-attention, and a single-channel shoreline probability map is obtained through a multiscale feature fusion decoder; an explicit-gradient-guided edge-aware refinement module (EARM) is embedded after the low-level feature maps and combined with explicit guidance from the multi-source consistency probability map MCP to realize superimposed perception of the feature maps; a DR-LDSC module is embedded after the high-level feature maps to capture the fine bends of locally meandering shorelines and to use a long-range receptive field to ensure the overall logical continuity of cross-regional shorelines; a reliability weighting mechanism is introduced, and the noise arising when the initial shoreline binary label map is generated is overcome by constructing a triple-constraint composite loss function combined with the consistency probability map MCP; high-precision shoreline extraction and database construction for large-scale images are achieved through sliding-window inference, AWEI_nsh index fusion, eight-neighborhood connected component analysis and vector optimization; the original Landsat images are integrated on the GEE platform, sub-pixel-level alignment is achieved through geometric correction and resampling, the QA (quality assessment) band mask is called to remove clouds, and mean aggregation is performed to generate the annual composite base map; the five-channel multidimensional feature cube is constructed as follows: the NIR, SWIR1, Red and Green base bands are selected from the generated annual composite base map, and the NDWI of the annual composite base map is calculated as $\mathrm{NDWI} = (\mathrm{Green} - \mathrm{NIR}) / (\mathrm{Green} + \mathrm{NIR})$; the step amplitude of the NDWI is extracted with a 3×3 Scharr operator to capture the high-frequency abrupt boundary signal and generate the boundary candidate prior feature map, as follows: the horizontal and vertical gradient components $G_x$ and $G_y$ are obtained by convolving the NDWI with the horizontal kernel [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]] and the vertical kernel [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]; the gradient magnitude $G = \sqrt{G_x^2 + G_y^2}$ is calculated, which amplifies the spectral response at the land-water junction by capturing the high-frequency abrupt signals distributed in NDWI space and yields the boundary candidate prior feature map; and finally the base bands, the NDWI and the boundary candidate prior feature map are concatenated along the channel axis to construct the five-channel multidimensional feature cube.
2. The physical-prior-based segmented multi-source shoreline extraction and calibration method of claim 1, wherein the water body frequency WIF of each pixel at the annual scale is calculated by the following formula: $\mathrm{WIF} = \frac{1}{n}\sum_{i=1}^{n} W_i \times 100\%$; where n = 12 and $W_i$ is the binary water state of the i-th month; the land surface is divided into: permanent water PSW, corresponding to WIF ≥ 75%; intermittent water ISW, corresponding to 5% < WIF < 75%; and land, corresponding to WIF ≤ 5%; and the multi-source consistency probability map MCP is constructed at the same time by comparing the predictions of the multi-source surface water product data at the same spatio-temporal position: if the predictions are fully consistent, the weight is w = 1.0, and if the predictions conflict between two classes, the weight is w = 0.6; the MCP serves as the confidence prior for subsequent Shore-SegFormer training.
3. The physical-prior-based segmented multi-source shoreline extraction and calibration method, characterized in that permanent water PSW is defined as the shoreline target class; a morphological gradient operator is used to extract the spatial contour of the PSW, generating an initial binary shoreline label that is superimposed on the annual composite base map; edge gradient features of the annual composite base map are extracted, and local normalized cross-correlation (NCC) feature matching between the extracted edge features and the binary shoreline label is performed within a 5×5-pixel window; the peak correlation coefficient within the search window determines the sub-pixel displacement vector, achieving sub-pixel-level alignment correction between label and image; and the calibrated labels are uniformly mapped to a single-channel 8-bit grayscale layer.
4. The physical-prior-based segmented multi-source shoreline extraction and calibration method of claim 1, characterized in that the training set containing land-water boundary features is obtained by selecting the NIR, SWIR1 and Red bands, which are sensitive to the water response, to construct a 3-channel feature image, and cropping it synchronously with the corresponding labels into 256×256-pixel tiles; a pixel ratio threshold $T_p$ is defined, where $T_p$ denotes the proportion of pixels labeled as water among all pixels in each 256×256 training tile; and only tiles whose water pixel ratio satisfies $5\% < P_{water} < 95\%$ are retained as the mixed data set of land-water boundary features, thereby rejecting the redundant information of all-land and all-water areas.
5. The physical-prior-based segmented multi-source shoreline extraction and calibration method of claim 1, wherein the Shore-SegFormer model architecture is as follows: the backbone feature extraction network adopts a hierarchical Transformer architecture; at the input layer, an overlapped patch embedding module accepts the five-channel multidimensional feature cube $I \in \mathbb{R}^{H \times W \times 5}$, four levels of multiscale feature maps $\{C_1, C_2, C_3, C_4\}$ are output stage by stage through hierarchical multi-head self-attention, and the NDWI and the boundary candidate prior feature map guide the neurons to attend in advance to the high-frequency region at the land-water junction; the explicit-gradient-guided edge-aware refinement module EARM is deployed after the low-level feature maps $\{C_1, C_2\}$, where the generated boundary candidate prior feature map $G_{prior}$ is introduced for explicit guidance and used as a spatial weighting operator to perform superimposed perception on the feature maps, with the calculation logic as follows: [equation not preserved in the source text]; where X denotes the input feature map, Y the output feature map, $G_{prior}$ the boundary candidate prior feature map, Conv a convolution operation, BN batch normalization, and $\odot$ pixel-level element-wise multiplication; in view of the geometric characteristics of shorelines, which appear in images as elongated strips with arbitrary orientation and variable curvature, a DR-LDSC module is embedded after the high-level stages $\{C_3, C_4\}$ of the Shore-SegFormer encoder, where stage $C_3$ receives and enhances the feature map at 1/16 spatial resolution and mainly captures the fine bends of locally meandering shorelines; the DR-LDSC module adopts a dual-path parallel structure that processes the spatial dependencies of the horizontal H-path and the vertical V-path separately; before the strip convolution is executed, a lightweight convolution layer $f_{off}$ with 2K output channels dynamically learns the 2D offset $\Delta p_n$ of each sampling point to perform geometrically adaptive rotation and deformation, where for the 1×K horizontal path the generated offsets have dimension $\mathbb{R}^{H \times W \times 2K}$; for the asymmetric dilated convolution parameters, the kernel size is set to K = 15, the horizontal path uses a 1×15 kernel and the vertical path a 15×1 kernel, with dilation rate r = 3 at level $C_3$ and r = 5 at level $C_4$; the offset-corrected sampling coordinates are $p = p_0 + p_n + \Delta p_n$, pixel values at non-integer coordinates are obtained by bilinear interpolation, and the final outputs are $Y_H(p_0) = \sum_{n=1}^{K} w_n^{H}\, x(p_0 + p_n^{H} + \Delta p_n^{H})$ and $Y_V(p_0) = \sum_{n=1}^{K} w_n^{V}\, x(p_0 + p_n^{V} + \Delta p_n^{V})$; where $Y_H$ and $Y_V$ denote the output feature values after convolutional aggregation along the horizontal and vertical paths respectively, K denotes the kernel size, $w_n$ the convolution weight of the n-th sampling point, x the pixel value on the input feature map, $p_0$ the coordinates of the center pixel currently being computed, $p_n$ the original sampling offset of a standard convolution kernel on the regular grid, $\Delta p_n$ the adaptive position offset learned by the model, and $p$ the corrected sampling coordinates; $Y_H$ and $Y_V$ are fused by pixel-wise summation, a 1×1 convolution layer restores the feature dimension to the original dimension of each encoder level, and the result is merged with the input features through a residual structure and passed to the decoder; the multiscale feature fusion decoder consists of linear projection, upsampling alignment, feature concatenation, fusion convolution and a prediction layer, where the MLP projection layer uniformly maps the channel counts of the four-level feature maps $\{C_1, C_2, C_3, C_4\}$ to d = 256, bilinear interpolation aligns their resolution to 1/4 of the original, the aligned features are concatenated along the channel axis into a 1024-dimensional tensor, a 1×1 convolution performs cross-scale feature fusion and compresses the tensor back to 256 dimensions, and a single-channel shoreline probability map is output through the linear prediction layer and a Sigmoid operator.
6. The physical-prior-based segmented multi-source shoreline extraction and calibration method of claim 1, characterized by constructing a triple-constraint composite loss function: $L_{total} = \lambda_1 L_{wBCE} + \lambda_2 L_{wDice} + \lambda_3 L_{phy}$; where $L_{total}$ denotes the total value of the triple-constraint composite loss function, $\lambda_1$, $\lambda_2$ and $\lambda_3$ denote the balance weight coefficients of the different loss terms, and $L_{wBCE}$, $L_{wDice}$ and $L_{phy}$ denote the noise-tolerant weighted binary cross-entropy loss, the noise-tolerant weighted Dice loss and the physical consistency constraint loss respectively; the noise-tolerant weighted binary cross-entropy loss $L_{wBCE}$ uses the generated pixel-level reliability weights $w_n$ to dynamically re-weight the classification loss: $L_{wBCE} = -\frac{1}{N}\sum_{n=1}^{N} w_n \left[ y_n \log \hat{y}_n + (1 - y_n) \log (1 - \hat{y}_n) \right]$; where N denotes the total number of pixels in the image, n is the pixel index, $w_n$ denotes the pixel-level reliability weight of the n-th pixel used for dynamic weighting, $y_n$ denotes the true label value of the n-th pixel, and $\hat{y}_n$ denotes the probability with which the model predicts the n-th pixel to be shoreline; the noise-tolerant weighted Dice loss $L_{wDice}$, targeting the sparse distribution of shoreline pixels, introduces a weighted region overlap constraint: $L_{wDice} = 1 - \frac{2\sum_{n=1}^{N} w_n y_n \hat{y}_n + \epsilon}{\sum_{n=1}^{N} w_n y_n + \sum_{n=1}^{N} w_n \hat{y}_n + \epsilon}$; where $\epsilon$ denotes a smoothing coefficient that prevents a zero denominator and stabilizes gradient back-propagation; the physical consistency constraint loss $L_{phy}$ introduces an explicit geographic constraint by computing the Laplacian consistency error between the model's predicted probability map and the boundary candidate prior feature map, using a second-derivative constraint to force the extraction result to conform to the physical mechanism of surface water distribution; and the AdamW optimizer is used for iterative training until the model's IoU and Boundary F-score on the validation set reach a steady-state threshold.
7. The physical-prior-based segmented multi-source shoreline extraction and calibration method, characterized by comprising: performing pixel-level semantic prediction on the large-scale image to be monitored by sliding-window inference, with the window size set to 256×256 pixels and the window overlap set to 50%; to eliminate stitching seams at window edges, adopting a weighted fusion strategy based on a two-dimensional Gaussian kernel for the overlapping areas: $P = \frac{\sum_{k=1}^{m} W_k P_k}{\sum_{k=1}^{m} W_k}$, where m is the number of windows covering the pixel, $P_k$ is the probability output of the k-th window, and $W_k$ is the Gaussian weight coefficient; for high-interference artificial ground objects, introducing the no-shadow Automated Water Extraction Index AWEI_nsh to implement semantic refinement, with the green band G, the near-infrared band NIR and the short-wave infrared bands SWIR1 and SWIR2 of the Landsat images called through GEE, and the index calculated as $\mathrm{AWEI}_{nsh} = 4\,(G - \mathrm{SWIR1}) - (0.25\,\mathrm{NIR} + 2.75\,\mathrm{SWIR2})$; constructing a two-way prediction confidence fusion function that applies pixel-level bias correction to the model probability map: where the model's predicted probability satisfies 0.4 < y < 0.6 and the pixel lies in a high-significance region of the index, pixel-level recalibration is performed, forcibly correcting the model's semantic ambiguity at artificial shorelines, where y denotes the water probability predicted for the pixel by the deep learning model; identifying spatially isolated pixel clusters by eight-neighborhood connected component analysis, setting an area threshold $T_{area}$ = 500 pixels, and automatically removing patches whose number of connected pixels is less than $T_{area}$; and converting the corrected binary mask into vector format, detecting self-intersecting polygons, dangling segments and unclosed loops in the vector line segments, using the Snapping operator to automatically snap and connect broken line segments within a set tolerance, using the Cleaning operator to correct logical overlaps and unclosed structures in the vector line segments, and outputting a long-time-series standard shoreline database that is spatially continuous, topologically rigorous and consistent with the physical mechanism.
8. A physical-prior-based segmented multi-source shoreline extraction and calibration system, characterized in that the system runs the method of any one of claims 1-7 and comprises: an image data preprocessing and physical prior construction module, used to perform sub-pixel-level alignment and cloud-free compositing of the original Landsat images, extract the NDWI spatial gradient magnitude with a Scharr operator, construct a five-channel multidimensional feature cube containing spectral and physical dimensions, and provide the Shore-SegFormer model with initial input carrying explicit edge guidance; a multi-source gap-filling and reliability-weighted training set construction module, used to fuse multi-source surface water products for gap filling, automatically extract permanent water body edges based on water body persistence to generate labels, introduce the multi-source consistency probability map (MCP) to measure pixel reliability weights, apply NCC sub-pixel rectification, and automatically construct a mixed training set containing land-water boundary features; a Shore-SegFormer training module, used to construct the explicit-gradient-guided edge-aware refinement module EARM and the DR-LDSC module, couple the physical consistency and noise-tolerant composite loss function, and strengthen the model's handling of the geometric continuity of the shoreline and its robust capture of complex semantics; and a physical-topological dual calibration extraction module, used to produce a probability map by sliding-window inference, introduce the AWEI_nsh index and eight-neighborhood connected component analysis, perform geometric correction and closed-loop quality control with topological repair operators, and output a topologically rigorous shoreline vector result.
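
The NDWI and Scharr gradient steps recited in claim 1 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: it assumes the common Green/NIR form of NDWI, a gradient magnitude of sqrt(Gx^2 + Gy^2), and edge replication at the image border. The function names `ndwi` and `scharr_magnitude` and the small `eps` stabilizer are hypothetical.

```python
import math

# Scharr kernels as listed in claim 1.
GX = [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]]
GY = [[-3, -10, -3], [0, 0, 0], [3, 10, 3]]


def ndwi(green, nir, eps=1e-9):
    """Per-pixel NDWI = (Green - NIR) / (Green + NIR) on 2D lists.

    eps is an assumed guard against division by zero.
    """
    return [[(g - n) / (g + n + eps) for g, n in zip(g_row, n_row)]
            for g_row, n_row in zip(green, nir)]


def scharr_magnitude(img):
    """Gradient magnitude of a 2D list using the 3x3 Scharr kernels.

    Border pixels are handled by replicating the nearest edge value
    (an assumption; the source does not specify border handling).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    value = img[rr][cc]
                    gx += GX[dr + 1][dc + 1] * value
                    gy += GY[dr + 1][dc + 1] * value
            out[r][c] = math.hypot(gx, gy)
    return out
```

Applied to a scene whose left half is water and right half is land, the magnitude is near zero inside each region and peaks along the two pixel columns that straddle the boundary, which is the behavior the boundary candidate prior relies on.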
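
The water-frequency classification and consistency weighting of claim 2 can be sketched as below. The thresholds (75% and 5%) and the weights (1.0 and 0.6) come from the claim; the function names are hypothetical, and because the claim does not state a weight for disagreement among more than two classes, this sketch raises an error in that case rather than guessing.

```python
def water_frequency(monthly_states):
    """WIF in percent: mean of the binary monthly water states times 100."""
    return 100.0 * sum(monthly_states) / len(monthly_states)


def classify_surface(wif):
    """Map a WIF percentage to the three surface classes of claim 2."""
    if wif >= 75.0:
        return "PSW"   # permanent water
    if wif > 5.0:
        return "ISW"   # intermittent water
    return "Land"


def consistency_weight(labels):
    """Reliability weight from the class labels of several water products.

    1.0 when all products agree, 0.6 when exactly two classes conflict.
    Other cases are not specified in the source, so they are rejected.
    """
    distinct = len(set(labels))
    if distinct == 1:
        return 1.0
    if distinct == 2:
        return 0.6
    raise ValueError("weight for more than two conflicting classes is not specified")
```

For example, a pixel that is wet in 9 of 12 months has WIF = 75% and falls in the PSW class, while a pixel wet in a single month (about 8.3%) is ISW.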
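
The tile screening rule of claim 4 (keep only tiles whose water pixel ratio lies strictly between 5% and 95%) can be sketched as follows. The non-overlapping tiling and the dropping of partial edge tiles are assumptions made for brevity, and the helper names are hypothetical.

```python
def water_ratio(tile):
    """Fraction of pixels labeled as water (1) in a 2D binary tile."""
    total = sum(len(row) for row in tile)
    return sum(sum(row) for row in tile) / total


def iter_tiles(label, size=256):
    """Yield (top, left, tile) for non-overlapping size x size tiles.

    Partial tiles at the right and bottom edges are skipped (an assumption).
    """
    h, w = len(label), len(label[0])
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            yield top, left, [row[left:left + size] for row in label[top:top + size]]


def select_boundary_tiles(label, size=256, low=0.05, high=0.95):
    """Return the origins of tiles with low < water ratio < high."""
    return [(top, left) for top, left, tile in iter_tiles(label, size)
            if low < water_ratio(tile) < high]
```

The strict inequalities mean all-water and all-land tiles are discarded, so every retained tile contains a land-water transition.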
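
The horizontal path of the DR-LDSC module in claim 5 samples a dilated 1×K strip at offset-corrected positions and reads non-integer positions by bilinear interpolation. The sketch below evaluates that response at one center pixel. It is an illustration only: in the claim the offsets come from a learned convolution layer, whereas here they are passed in as plain numbers, coordinates are clamped to the image (an assumption), and the function names are hypothetical.

```python
import math


def bilinear(img, y, x):
    """Bilinear sample of a 2D list at fractional (y, x), clamped to the image."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def horizontal_strip_response(img, center, weights, offsets, dilation):
    """Sum of w_n * x(p0 + p_n + dp_n) over a 1 x K dilated strip.

    center  : (row, col) of the pixel p0
    weights : K kernel weights w_n
    offsets : K pairs (dy, dx), standing in for the learned offsets dp_n
    dilation: spacing r between regular-grid sampling points p_n
    """
    k = len(weights)
    cy, cx = center
    total = 0.0
    for n in range(k):
        grid_dx = (n - k // 2) * dilation  # regular-grid offset p_n
        dy, dx = offsets[n]
        total += weights[n] * bilinear(img, cy + dy, cx + grid_dx + dx)
    return total
```

With all offsets at zero this reduces to an ordinary dilated strip convolution; non-zero offsets let the strip bend to follow a curved shoreline. The vertical path is the same computation with the roles of rows and columns swapped.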
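
The reliability-weighted loss terms of claim 6 can be sketched on flat lists of pixels. The exact formulas are not legible in the source text, so the forms below are assumptions: a standard binary cross-entropy and a standard soft Dice loss, each with the per-pixel reliability weight applied. The physical consistency term is passed in as a precomputed number because its formula is only described in words. All names, the clamping constant `eps` and the default `smooth` value are hypothetical.

```python
import math


def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    """Mean of -w_n * [y log p + (1 - y) log(1 - p)] over all pixels."""
    total = 0.0
    for y, p, w in zip(y_true, y_pred, weights):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return -total / len(y_true)


def weighted_dice(y_true, y_pred, weights, smooth=1.0):
    """1 - (2 * sum(w y p) + s) / (sum(w y) + sum(w p) + s)."""
    inter = sum(w * y * p for y, p, w in zip(y_true, y_pred, weights))
    denom = sum(w * (y + p) for y, p, w in zip(y_true, y_pred, weights))
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def composite_loss(y_true, y_pred, weights, physics_term, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of the three loss terms; physics_term is supplied by the caller."""
    l1, l2, l3 = lambdas
    return (l1 * weighted_bce(y_true, y_pred, weights)
            + l2 * weighted_dice(y_true, y_pred, weights)
            + l3 * physics_term)
```

A pixel with reliability weight 0.6 contributes 60% of the cross-entropy penalty that a fully trusted pixel would, which is how label noise from disagreeing water products is discounted during training.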
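
Two post-processing steps from claim 7, Gaussian-weighted fusion of overlapping windows and eight-neighborhood removal of small components, can be sketched as follows. The Gaussian width `sigma`, the square-window assumption and the function names are hypothetical; the area threshold is a parameter so that the 500-pixel value from the claim can be supplied by the caller.

```python
import math
from collections import deque


def gaussian_window(size, sigma):
    """Square 2D Gaussian weight map centered on the window."""
    c = (size - 1) / 2.0
    return [[math.exp(-((r - c) ** 2 + (col - c) ** 2) / (2.0 * sigma ** 2))
             for col in range(size)] for r in range(size)]


def fuse_windows(shape, windows, sigma):
    """Fuse (top, left, prob_tile) windows as sum(W_k * P_k) / sum(W_k)."""
    h, w = shape
    num = [[0.0] * w for _ in range(h)]
    den = [[0.0] * w for _ in range(h)]
    for top, left, tile in windows:
        g = gaussian_window(len(tile), sigma)
        for r, row in enumerate(tile):
            for c, p in enumerate(row):
                num[top + r][left + c] += g[r][c] * p
                den[top + r][left + c] += g[r][c]
    return [[num[r][c] / den[r][c] if den[r][c] > 0 else 0.0
             for c in range(w)] for r in range(h)]


def remove_small_components(mask, min_area):
    """Zero out 8-connected components with fewer than min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_area:
                for y, x in component:
                    out[y][x] = 0
    return out
```

Because the Gaussian weights fall off toward the window edge, each window's prediction counts most near its own center, which suppresses seams where windows meet. Diagonal neighbors count as connected in the component analysis, so a thin diagonal shoreline is not broken into single-pixel fragments.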

Description

Physical prior-based segmented multi-source shoreline extraction and calibration method and system

Technical Field

The invention relates to the technical field of remote sensing image processing and computer vision, and in particular to a physical-prior-based segmented multi-source shoreline extraction and calibration method and system.

Background

The shoreline is a key transition zone between land and water. It helps maintain wetland biodiversity and purify inflowing runoff, and it is also a core window for monitoring the dynamic evolution of surface water bodies. Regional shoreline research makes it possible to track the erosion and deposition patterns of the land-water boundary accurately, providing core support for preventing bank collapse, safeguarding embankments and optimizing the scheduling of hydraulic engineering works. Dynamic shoreline monitoring is also an important basis for balancing the development of shore resources against ecological protection. The shoreline is highly dynamic in both space and time: it reflects long-term landform reshaping such as port construction and land reclamation, and it is continuously affected by short-term tidal fluctuation and natural erosion, so that the land-water boundary is always in complex dynamic alternation. Existing research focuses on specific local scales, and shoreline extraction accuracy is limited under frequent cloud cover and limited image quality. For complex landforms such as natural rocky coasts, tidal flats and artificial shorelines, traditional algorithms lack robustness to tidal interference and terrain complexity, so the extraction results have poor consistency and generality and cannot meet the requirement of high-precision monitoring of shoreline change in a variable environment.
Visual interpretation of high-resolution images is accurate but difficult to apply at global scale because of its high interpretation cost. Semantic segmentation models based on deep learning perform well in target extraction but depend heavily on large-scale, high-quality labeled samples. Because of the pronounced spatio-temporal dynamics and landform complexity of shorelines, such large-scale, high-precision samples are difficult to obtain, so existing models have insufficient feature mining and generalization capability at medium resolution. A systematic technical scheme is therefore needed that, at the data level, integrates consistency assessment and reliability measurement of multi-source surface water products and uses water body persistence to construct a high-quality training sample set, and that, at the model level, develops a feature extraction architecture fusing physical prior guidance with geometrically adaptive operators and constructs a loss function combining multidimensional physical constraints with noise tolerance, so as to achieve pixel-level accurate shoreline identification and topologically rigorous closed-loop quality calibration on large-scale remote sensing images.

Disclosure of Invention

The invention therefore provides a physical-prior-based segmented multi-source shoreline extraction and calibration method and system, which aim to solve the problems described in the background above.
To achieve this purpose, the invention provides the following technical scheme. The physical-prior-based segmented multi-source shoreline extraction and calibration method comprises the following steps: preprocessing original Landsat images to generate a cloud-free, spectrally stable annual composite base map of true surface reflectance; extracting base bands from the annual composite base map and calculating its NDWI from the selected base bands, thereby obtaining a five-channel multidimensional feature cube; integrating multi-source surface water product data and applying an adaptive gap-filling method and a multi-source fusion strategy to obtain a reconstructed monthly water dynamics data set, calculating the annual-scale water body frequency of each pixel from the monthly water dynamics data set to complete the water body type classification, comparing, at the same spatio-temporal position, how well the annual-scale per-pixel water body frequency of each multi-source surface water product matches the surface classification rule, and assigning pixel-level reliability weights to construct the multi-source consistency probability map MCP as the confidence prior for subsequent Shore-SegFormer model training; superimposing the generated initial shoreline binary label map and the an