CN-121982721-A - Mangrove remote sensing image segmentation extraction method based on improved HRNet-OCR semantic segmentation model


Abstract

The invention discloses a mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model, and belongs to the technical field of mangrove remote sensing extraction. The method comprises: S1, obtaining an original satellite remote sensing image of a mangrove and cropping it to obtain a mangrove remote sensing image; S2, drawing mangrove vector labels on the mangrove remote sensing image and rasterizing them; S3, cutting the rasterized labels and the study area image into tiles to obtain tile data; S4, dividing the tile data into a training set, a validation set and a test set; S5, applying data augmentation to the training set tiles to obtain a processed image dataset; and S6, constructing a TANet model based on the image dataset. The method strengthens the overall semantic representation of mangroves along complex boundaries and in fragmented areas, and addresses the misclassification and boundary blurring between mangroves, tidal flats and water bodies found in existing methods.
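Steps S3 and S4 above (cutting the image and label rasters into tiles, then dividing the tiles into three sets) can be sketched as follows. This is a minimal illustration only: the 256-pixel tile size, the 7:2:1 split ratio, the fixed random seed and the function names `tile_offsets` and `split_tiles` are all assumptions, since the abstract does not specify them.

```python
import random


def tile_offsets(height, width, tile=256, stride=256):
    """Top-left (row, col) offsets of all full tiles inside a height x width raster.

    Partial tiles at the right and bottom edges are dropped (an assumption).
    """
    return [(r, c)
            for r in range(0, height - tile + 1, stride)
            for c in range(0, width - tile + 1, stride)]


def split_tiles(tiles, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle tiles reproducibly and split them into train / validation / test lists."""
    items = list(tiles)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    # Whatever remains after the training and validation slices is the test set.
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


# Hypothetical raster size; the same offsets would index both image and label rasters.
offsets = tile_offsets(1024, 1536)
train, val, test = split_tiles(offsets)
```

Using one list of offsets for both the image and the rasterized label keeps each image tile aligned with its label tile, and splitting before augmentation keeps augmented copies out of the validation and test sets.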

Inventors

FU DONGYANG
LIN HAONAN
WANG CHUHONG
HUANG JINJUN
WU HANRUI
HUANG YU
XIONG LITIAN

Assignees

Guangdong Ocean University (广东海洋大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-02-02

Claims (10)

1. A mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model, characterized by comprising the following steps: S1, acquiring an original satellite remote sensing image of a mangrove, and obtaining a mangrove remote sensing image by cropping; S2, drawing vector labels on the mangrove remote sensing image and rasterizing them to obtain rasterized labels; S3, cutting the rasterized labels and the study area image into TIF tiles to obtain tile data; S4, dividing the tile data into a training set, a validation set and a test set; S5, applying data augmentation to the training set tile data to obtain a processed image dataset; S6, constructing, based on the image dataset, a mangrove extraction model TANet built on the HRNet-OCR high-resolution extraction network; and S7, evaluating the TANet model on the test set and comparing it with the original model and mainstream models to obtain the TANet segmentation extraction result.
2. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model of claim 1, wherein drawing vector labels on the mangrove remote sensing image and rasterizing them in step S2 comprises: S21, drawing mangrove vector labels in ENVI 5.6 through visual interpretation; S22, rasterizing the drawn vector labels to obtain binarized rasterized labels.
3. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model of claim 1, wherein the data augmentation of the training set tile data in step S5 comprises: S51, applying geometric transformations to the image, including flipping, scaling, translation and rotation; S52, applying photometric transformations to the image, including adjusting brightness, saturation and contrast; S53, applying random occlusion to the image to increase data diversity.
4. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model as set forth in claim 1, wherein constructing the mangrove extraction model TANet built on the HRNet-OCR high-resolution extraction network based on the image dataset in step S6 comprises: S61, feeding the training set images of the dataset in batches into the backbone network of the improved HRNet-OCR model, first extracting primary features through shallow convolution and then entering the HRNet high-resolution parallel structure, which processes high-, medium- and low-resolution feature branches in parallel to obtain, after feature extraction, a set of aligned multi-scale feature maps; S62, using an axial attention module as a spatial context enhancement module that receives the feature maps from S61 and models long-range dependencies independently along the row direction and the column direction, capturing the global structure of tidal flats and complex water body background areas and generating a high-level feature representation with global dependencies; S63, processing the high-level features obtained in S62 with an OCR object region construction module, which aggregates pixel features into object region representations by computing a relation matrix between pixel features and object regions, so as to provide coarse-grained object-level context information for subsequent semantic enhancement; S64, using an OCR object context representation module with embedded triplet attention as a semantic enhancement module, which performs feature interaction on the object region features from S63 across the channel, row and column dimensions, applies additional attention weights to generate a weighted feature representation, and retains the original information through a residual connection; S65, mapping the fused features obtained by the semantic enhancement module to the number of target classes for segmentation prediction, computing a combined loss between the prediction and the ground-truth labels, wherein the combined loss is a weighted sum of a weighted cross-entropy loss and a Dice loss, and updating the network parameters through backpropagation.
5. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model according to claim 4, wherein, in step S61, feeding the augmented training set images in batches into the backbone network of the improved HRNet-OCR model and extracting feature maps at different levels through the high-resolution network comprises: S611, the HRNet backbone uses a pre-trained W18 network that performs residual-block and pooling downsampling across four stages; on entering the first stage, initial features are extracted through 4 residual units to obtain four high-resolution features carrying local spatial information; S612, the HRNet backbone uses multi-resolution parallel convolution modules to extract deep features and obtain global multi-scale semantic features: starting from the second stage the network gradually adds lower-resolution branches while the high-resolution branch is always kept in parallel, information is exchanged among branches of different resolutions through repeated multi-scale fusion units at the end of each stage, and after four stages the feature maps are kept at 1/4, 1/8, 1/16 and 1/32 of the original image resolution, finally yielding four aligned sets of multi-scale global semantic information.
6. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model according to claim 4, wherein converting the multi-scale feature information extracted in S61 into object region representations using the OCR object context representation module in step S62 comprises: S621, upsampling the four feature maps of different resolutions output by S61 to a uniform resolution, concatenating them along the channel dimension and fusing them with a 1x1 convolution to obtain high-dimensional pixel representation features; S622, using a supervised auxiliary branch to compute, from the pixel representation features, a probability matrix of each pixel over the object classes, namely the soft object regions; S623, aggregating the pixel representation features weighted by the soft object regions to construct a global object region representation for each class, thereby mapping pixel-level features to object-level context features.
7. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model as set forth in claim 4, wherein using the axial attention module in step S63 to capture long-range dependencies between pixels while reducing computational complexity, and to extract global context features, comprises: S631, applying linear projections to the fused object region features obtained in S62 to generate query, key and value vectors, and introducing learnable relative position encodings to capture spatial structure information; S632, decomposing conventional two-dimensional attention into two independent one-dimensional attention operations, first applying one-dimensional attention along the width axis of the image to capture intra-row dependencies and then applying one-dimensional attention along the height axis to the output features to capture intra-column dependencies, thereby covering a global receptive field; S633, in each single-axis attention computation, computing the attention weights from both content interaction and position bias, introducing a residual connection to fuse the original input with the axial attention features, and finally obtaining axial-attention-enhanced features.
8. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model according to claim 4, wherein, in step S64, using a triplet attention module to compute interaction attention matrices over the channel and spatial dimensions of the feature map output by S63, generating a multi-dimensional enhanced feature representation, and retaining the original information through a residual connection comprises: S641, taking the feature map obtained in S63 as input to the semantic enhancement module and constructing three parallel interaction branches, namely channel-height, channel-width and height-width, to capture cross-dimension interaction information, the input features of the first two branches being permuted into a shape suitable for two-dimensional convolution; S642, in each branch, applying average pooling and max pooling to the feature map along the zeroth dimension with the Z-Pool mechanism, concatenating the two resulting statistical feature maps, and passing them through a 7x7 convolution layer and a Sigmoid activation function to obtain a normalized attention weight matrix; S643, multiplying the attention weight matrix element-wise with the corresponding input features for adaptive weighting, restoring the permuted feature maps to their original shape, and finally aggregating the outputs of the three branches by simple averaging to obtain triplet-attention-enhanced features.
9. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model according to claim 4, wherein, in step S65, the weighted feature representation obtained in S64 is mapped to the number of target classes for segmentation prediction, and a combined loss function is used to compute the loss between the segmentation result predicted by the network and the ground-truth mangrove segmentation labels, the application of the combined loss function comprising: S651, in the training stage, counting the proportion of pixels of each class in the training set and computing class weights so as to increase the model's attention to minority classes; S652, introducing a weighted cross-entropy loss function, which focuses on pixel-level classification accuracy, and a Dice loss function, which focuses on region overlap and boundary continuity; S653, computing the total loss of the improved TANet model as a weighted sum of the weighted cross-entropy loss and the Dice loss using balance coefficients; and S654, jointly optimizing the summed loss and updating the network parameters through the backpropagation algorithm until the network converges.
10. The mangrove remote sensing image segmentation extraction method based on the improved HRNet-OCR semantic segmentation model as set forth in claim 1, wherein evaluating the mangrove extraction model TANet on the test set in step S7 comprises: S71, comparing the TANet model on the test set with the mainstream network models PSPNet, UNet, HRNet, DeepLabv3+ and SegFormer; S72, using intersection over union (IoU), mean intersection over union (mIoU), accuracy, recall and F1-score as evaluation metrics to obtain a segmentation performance comparison.
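The combined loss of claim 9 (S651 to S653) and the evaluation metrics of claim 10 (S72) can be illustrated for the binary mangrove/background case with a minimal pure-Python sketch. It is not the patented implementation: the inverse-frequency form of the class weights, the single balance coefficient `lam`, the Dice smoothing term and all function names are assumptions, because the claims do not fix these details. Precision is computed in addition to accuracy because the F1-score is defined from precision and recall.

```python
import math


def class_weights(labels, num_classes=2):
    """S651: class weights from pixel proportions (inverse frequency is an assumption)."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    # Rarer classes get larger weights; a class with no pixels gets weight 0.
    return [len(labels) / (num_classes * c) if c else 0.0 for c in counts]


def weighted_cross_entropy(probs, labels, weights, eps=1e-7):
    """S652: class-weighted mean of -log p(true class) over all pixels."""
    total = sum(weights[y] * -math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


def dice_loss(probs, labels, cls=1, smooth=1.0):
    """S652: soft Dice loss for class `cls` (the smoothing term is an assumption)."""
    inter = sum(p[cls] for p, y in zip(probs, labels) if y == cls)
    denom = sum(p[cls] for p in probs) + sum(1 for y in labels if y == cls)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def combined_loss(probs, labels, weights, lam=0.5):
    """S653: balance-coefficient sum; lam and (1 - lam) are hypothetical values."""
    ce = weighted_cross_entropy(probs, labels, weights)
    return lam * ce + (1.0 - lam) * dice_loss(probs, labels)


def binary_metrics(pred, truth, positive=1):
    """S72: IoU, mIoU, accuracy, precision, recall and F1 for a two-class mask."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    tn = len(pairs) - tp - fp - fn

    def ratio(a, b):
        # Guard against empty denominators (for example, no positive pixels).
        return a / b if b else 0.0

    iou = ratio(tp, tp + fp + fn)
    background_iou = ratio(tn, tn + fp + fn)
    precision, recall = ratio(tp, tp + fp), ratio(tp, tp + fn)
    return {
        "iou": iou,
        "miou": (iou + background_iou) / 2,
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy example: four flattened pixels, per-pixel [background, mangrove] probabilities.
labels = [0, 0, 0, 1]
probs = [[0.5, 0.5]] * 4
loss = combined_loss(probs, labels, class_weights(labels))
```

In this sketch the cross-entropy term penalizes each misclassified pixel in proportion to its class weight, while the Dice term depends only on the overlap between the predicted and true mangrove regions, which is why the two are complementary when mangrove pixels are a small minority.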

Description

Mangrove remote sensing image segmentation extraction method based on improved HRNet-OCR semantic segmentation model

Technical Field

The invention relates to a mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model.

Background

Mangroves are a typical woody plant community growing in the intertidal zone of tropical and subtropical coasts, with significant ecological functions such as wind protection, sand fixation, wave attenuation, shoreline protection, seawater purification and biodiversity maintenance. Their well-developed root systems not only stabilize the shoreline effectively but also provide a natural ecological barrier and good water quality regulation for high-level pond aquaculture and coastal agriculture. Continuous monitoring of the distribution range and dynamic change of mangroves makes it possible to track the health of the coastal ecosystem accurately and provides a solid scientific basis for ecosystem-based fisheries management and for delineating ecological protection red lines. This matters not only for the ecological security of the coastal zone; it is also a cornerstone for developing ecological fisheries and a carbon sink economy in coastal areas and for reconciling ecological protection with economic development. However, the extremely complex growth environment of mangroves, including fragmented and irregular patch distribution, tidal flat backgrounds strongly affected by tides, and spectral features that are highly mixed with those of water bodies and bare flats, poses a serious challenge to existing automated mangrove extraction methods.

Traditional monitoring algorithms based on rules or on single spectral features tend to perform poorly in such scenes. They commonly suffer from misclassification and omission caused by the spectral similarity between mangroves and the background (such as near-shore water bodies and mudflats), and from insufficient ability to recognize small, scattered mangrove areas under shadow occlusion, water surface reflection and interference from near-shore buildings, so they struggle to meet the accuracy and robustness that practical applications require. Although deep learning semantic segmentation techniques (such as U-Net, the DeepLab series and their variants) have made breakthrough progress in general feature extraction and remote sensing image processing in recent years and have shown strong feature learning and context understanding capabilities, research on model design and training strategy optimization for complex scenes such as mangroves, with large canopy texture differences, fuzzy boundaries and uneven scale distribution, remains relatively scarce and shallow. Existing general-purpose models often fail to capture the multi-scale characteristics of mangroves fully, tend to produce jagged or discontinuous edges, and are limited in segmentation accuracy. Targeted research, exploration and optimization of deep learning semantic segmentation for high-precision, automatic mangrove extraction is therefore needed; it has theoretical value in filling the technical gap for this specific scene and is a key application breakthrough for overcoming the current technical bottleneck and meeting the needs of fine-grained mangrove monitoring and management.

Based on this, the invention provides a mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model, so as to address mangrove boundary misjudgment, omission of small targets and insufficient feature representation in existing methods. While maintaining high-resolution features, the invention integrates an axial attention (Axial Attention) mechanism and a triplet attention (Triplet Attention) mechanism and combines them with a combined loss function, markedly improving the model's ability to recognize complex mangrove boundaries and fragmented areas.

Disclosure of Invention

In view of the above shortcomings of the prior art, the invention provides a mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model. To achieve this aim, the invention adopts the following technical solution: a mangrove remote sensing image segmentation extraction method based on an improved HRNet-OCR semantic segmentation model, comprising the following steps: S1, acquiring an original satellite remote sensing image of a mangrove, and obtaining the mangrove remote sensing image by cropping; S2, drawing a vector label on the mangrove remote sensing image and carrying out ra