CN-122023827-A - Remote sensing image rammed earth site boundary automatic extraction method integrating U-Net and OBIA-CNN
Abstract
The invention provides a method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN. The method comprises: preprocessing an acquired remote sensing image containing rammed earth site boundaries to obtain standardized remote sensing image sample blocks; inputting the sample blocks into a U-Net network for pixel-level initial segmentation to obtain a pixel-level site candidate region; performing multi-scale segmentation with an OBIA method on the ROI image defined by the site candidate region to obtain image objects with spatial continuity and semantic consistency; and cutting an object-center sample block from each image object and inputting it into a convolutional neural network for feature extraction and classification to obtain the site boundary extraction result. By using the U-Net network for pixel-level initial segmentation and combining it with OBIA-CNN for fine object-level analysis and classification, the invention achieves stable, continuous and high-precision extraction of site boundaries against complex backgrounds, improving the integrity and stability of the extracted boundary while reducing interference from irrelevant background.
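The hand-off from the pixel-level segmentation to the object-level analysis can be illustrated with a minimal Python sketch. It assumes the U-Net output is already available as a 2-D probability map (a list of lists), uses a hypothetical threshold of 0.5, and takes the largest 4-connected region by pixel count as a stand-in for the maximum closed contour described in the claims; morphological filtering is omitted. The function name `largest_candidate_region` is illustrative and not part of the invention.

```python
from collections import deque


def largest_candidate_region(prob_map, threshold=0.5):
    """Binarize a probability map and keep its largest 4-connected region.

    Returns (mask, bbox); bbox is (row_min, col_min, row_max, col_max),
    inclusive, or None when no pixel reaches the threshold.
    The threshold value is an assumption, not taken from the source.
    """
    rows, cols = len(prob_map), len(prob_map[0])
    binary = [[p >= threshold for p in row] for row in prob_map]
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region  # pixel count used as a proxy for enclosed area
    mask = [[0] * cols for _ in range(rows)]
    if not best:
        return mask, None
    for y, x in best:
        mask[y][x] = 1
    ys = [y for y, _ in best]
    xs = [x for _, x in best]
    return mask, (min(ys), min(xs), max(ys), max(xs))


prob = [
    [0.9, 0.8, 0.1, 0.1, 0.7],
    [0.9, 0.7, 0.2, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.9, 0.1],
]
mask, bbox = largest_candidate_region(prob)
```

In this sketch the returned bounding box would delimit the ROI image that is passed on to the OBIA multi-scale segmentation.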
Inventors
- YU LI
- ZHANG XIU
- DAI SHUANG
- ZHANG XIANG
- DANG XINGHAI
- WANG DONGHUA
- WU GUOPENG
- CUI KAI
Assignees
- Lanzhou University of Technology (兰州理工大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (10)
- 1. A method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN, characterized by comprising the following steps: step S1, preprocessing an acquired remote sensing image containing rammed earth site boundaries to obtain standardized remote sensing image sample blocks; step S2, inputting the standardized remote sensing image sample blocks into a U-Net network for pixel-level initial segmentation to obtain a pixel-level site candidate region; step S3, performing multi-scale segmentation with an OBIA method on the ROI image defined by the obtained site candidate region to obtain image objects with spatial continuity and semantic consistency; and step S4, cutting an object-center sample block from each image object and inputting it into a convolutional neural network for feature extraction and classification to obtain a site boundary extraction result.
- 2. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 1, characterized in that the high-resolution remote sensing image data is preferably satellite/aerial data comprising a multispectral image and a panchromatic image; the multispectral image to be processed undergoes radiometric calibration, then atmospheric correction, then orthorectification, to obtain a multispectral reflectance image; the panchromatic image undergoes radiometric calibration, in which its digital number values are converted to top-of-atmosphere radiance or top-of-atmosphere reflectance according to the calibration coefficients provided in the image metadata, to obtain a panchromatic radiometrically calibrated image; the multispectral reflectance image and the panchromatic orthoimage are brought into a common coordinate system and registered, the spatial detail of the panchromatic orthoimage is injected into the multispectral reflectance image by a pansharpening method to obtain a fused high-resolution multispectral image, the fused image is resampled to a target resolution, and each band is normalized to obtain the network input image data.
- 3. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 2, characterized in that the resampling unifies the high-resolution multispectral image to a preset target spatial resolution and completes pixel-level alignment, and the pixel values of each band are normalized or standardized; the processed fused image is cut into fixed-size sample blocks using a sliding window, with an overlap of 25%-75% between adjacent sample blocks, and the overlapping areas are stitched by probability fusion or weighted fusion, to obtain standardized remote sensing image sample blocks with uniform spatial resolution, spectral characteristics and data format.
- 4. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to any one of claims 1-3, characterized in that the remote sensing image to be processed is cut into fixed-size image blocks using a sliding window, each channel of the image blocks is normalized/standardized, a single-channel feature map is output at the end of the U-Net network through a 1×1 convolution, and a single-channel probability map of the corresponding image block is obtained through a Sigmoid function; a fixed threshold is applied to the single-channel Sigmoid probability map for binarization, isolated noise points are removed by morphological filtering, and the maximum closed contour is extracted as the site candidate region and coarse contour information.
- 5. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 4, characterized in that the U-Net network achieves multi-scale feature fusion through an encoder-decoder structure with skip connections; the U-Net network comprises 4 encoders and decoders; the encoders extract features using 3×3 convolution layers, a ReLU activation function and BatchNorm layers; the decoders upsample the feature maps by bilinear interpolation so that their spatial size matches that of the corresponding encoder feature maps, concatenate the two along the channel dimension, and process the result with 3×3 convolution layers, completing feature fusion via the skip connections so as to fuse context information with detail features; training and validation of the U-Net network use only the subset of site images that have pixel-level manual annotation, from which pixel-level binary masks are generated as ground truth, and the remaining site images are used only in the test stage; the maximum closed contour is obtained by labeling the connected regions of the binarized mask, extracting the outer contour of each connected region, ranking the outer contours by the area they enclose, and selecting the one with the largest area; the fixed threshold is selected in advance on the validation set and is then held fixed for binarization in the inference stage.
- 6. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 4, characterized in that the multi-scale segmentation is based on a region merging criterion, and the image is divided, from coarse to fine, into image objects with spatial continuity and semantic consistency by setting different segmentation scales, shape factors and compactness factors; the segmentation scale is set according to the spatial distribution characteristics of the site target and the image resolution, and multi-scale segmentation is performed on the same ROI image with several sets of scale parameters; the shape factor and the compactness factor control the balance between spectral heterogeneity and shape regularity when objects are merged, the shape factor adjusting the relative weight of the spectral term and the shape term, and the compactness factor adjusting the compactness and smoothness of the objects.
- 7. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 6, characterized in that performing multi-scale segmentation on the same ROI image with several sets of scale parameters comprises: generating an upper, large-scale object layer with larger segmentation scale parameters, used to control the overall extent and outer contour of the site; and generating a lower, small-scale object layer with smaller segmentation scale parameters within the constraint of the upper-layer objects, used to capture local texture and structural differences; the upper and lower object layers are obtained by segmentation with two respective sets of scale parameters, and each lower-layer small-scale object is assigned, according to its geometric center, to the upper-layer large-scale object with the largest area share, thereby forming a hierarchical mapping and establishing parent-child object relations; the pixel sets in the ROI image defined by the site candidate region are converted into image units with definite geometric form and spatial relations, wherein each image unit is determined by a pixel set obtained through segmentation; the geometric form is computed from the object boundary contour and comprises the geometric features of area, perimeter, aspect ratio, compactness and rectangularity; and the spatial relations among the image units are determined by the topology and adjacency of the objects and comprise adjacency/containment relations, inter-object distances and relative orientation relations.
- 8. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to any one of claims 5 to 7, characterized in that the geometric center point of each generated image object is taken as the center of an object sample block, which is cut from the ROI image with a preset window size; when the centered window extends beyond the boundary of the ROI image, it is filled by zero padding or mirror padding; the multichannel image blocks obtained by cutting are saved as TIFF data in a preset format; and the convolutional neural network performs convolutional feature extraction on the object-center sample block, fuses the spectral, texture and deep feature information of the object, and outputs the object class, which is used to judge whether the object belongs to a site target.
- 9. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 8, characterized in that the convolutional neural network comprises 3 convolutional layers and 2 max pooling layers, each max pooling layer being arranged between two adjacent convolutional layers; the convolution kernel size of the first convolutional layer is 7×7 and that of the two subsequent convolutional layers is 5×5; the activation function is ReLU, combined with the max pooling layers for dimensionality reduction so that the spatial size of the feature maps is halved while the main responses are retained; the number of feature maps of each hidden layer is determined by the number of convolution kernels: the first hidden layer has 4 convolution kernels and outputs 4 feature maps, and the second hidden layer has 8 convolution kernels and outputs 8 feature maps, so that feature expression capacity is enhanced by increasing the number of channels layer by layer from shallow to deep; a hidden layer is an intermediate feature extraction layer of the convolutional neural network, formed by the feature maps output by a convolutional layer together with the subsequent activation and pooling.
- 10. The method for automatically extracting rammed earth site boundaries from remote sensing images by fusing U-Net and OBIA-CNN according to claim 9, characterized in that image objects are generated from the multi-scale segmentation result and organized in the form of object ID, object geometric boundary, object hierarchical relation and object attributes to build an image object library, which stores the boundary contour, parent-child hierarchy and spectral/texture/geometric attributes of each image unit; the whole image within the region of interest (ROI) defined by the site candidate region is segmented at multiple scales to generate a set of image objects, and each new object is input into the trained convolutional neural network for classification: taking the geometric center point of the image unit as the cutting center, a fixed-size object-center sample block is cut from the ROI image, normalized consistently with the training stage, and input into the convolutional neural network, which outputs the class probability or class label of the object as site/non-site; the classification results are written into the attribute fields of the corresponding objects, and a high-precision vectorized boundary is generated from the classification results; the training data set is generated from objects obtained by multi-scale segmentation with the OBIA method on the training/validation images, using the overlap ratio between the manually annotated pixel-level ground-truth mask and the object region as the labeling basis: when the proportion of site pixels within the object region exceeds a preset threshold the object is labeled as a site object, otherwise as a non-site object, and the correspondingly cut object-center sample blocks form the training samples; the training stage of the convolutional neural network uses cross-entropy loss as the objective function, updates the network parameters iteratively by mini-batch gradient descent, and selects the optimal model parameters on the validation set for the inference stage; an object-center sample block is cut for each object and input into the convolutional neural network to obtain the object class, the objects classified as site are aggregated to generate an object-level mask map, and the boundary of the object-level mask map is extracted and vectorized to obtain the site boundary line, wherein the vectorization extracts the outer contour by contour tracing and applies polyline fitting and smoothing.
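The sliding-window cutting with 25%-75% overlap in claim 3 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: windows are square, the last window in each direction is shifted back so that it ends at the image border, and overlapping predictions are fused with uniform weights (a simple case of weighted fusion). `tile_origins`, `fuse_tiles` and the `predict` callback are hypothetical names.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of windows of size `tile` covering [0, length)."""
    if not 0.25 <= overlap <= 0.75:
        raise ValueError("claim 3 specifies an overlap between 25% and 75%")
    if tile > length:
        raise ValueError("window is larger than the image")
    stride = max(1, round(tile * (1 - overlap)))
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        # Assumption: shift the final window back so it ends at the border.
        origins.append(length - tile)
    return origins


def fuse_tiles(height, width, tile, overlap, predict):
    """Average per-tile probabilities wherever windows overlap.

    `predict(r0, c0)` must return a tile x tile list of lists holding the
    probabilities for the window whose top-left corner is (r0, c0).
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for r0 in tile_origins(height, tile, overlap):
        for c0 in tile_origins(width, tile, overlap):
            probs = predict(r0, c0)
            for dr in range(tile):
                for dc in range(tile):
                    total[r0 + dr][c0 + dc] += probs[dr][dc]
                    count[r0 + dr][c0 + dc] += 1
    return [[total[r][c] / count[r][c] for c in range(width)]
            for r in range(height)]


# Toy check: a "network" that simply returns the true values of each window.
image = [[float(r * 10 + c) for c in range(6)] for r in range(6)]
fused = fuse_tiles(
    6, 6, 4, 0.5,
    lambda r0, c0: [row[c0:c0 + 4] for row in image[r0:r0 + 4]],
)
```

Uniform averaging is the simplest choice; a weighting that favours window centers over window edges is a common alternative, but the claims do not say which weighting is used.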
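The object-level sample preparation in claims 8 and 10 (labeling by overlap with the ground-truth mask, and cutting an object-center block with zero or mirror padding) can be sketched in plain Python. The sketch assumes a single-band image stored as a list of lists, an object given as a list of `(row, col)` pixels, a hypothetical label threshold of 0.5, and mirror padding that does not repeat the edge pixel; the claims describe multichannel blocks and do not fix these details. All function names are illustrative.

```python
def label_object(object_pixels, truth_mask, min_site_ratio=0.5):
    """Return 1 (site) when the share of truth pixels exceeds the threshold."""
    site = sum(truth_mask[r][c] for r, c in object_pixels)
    return 1 if site / len(object_pixels) > min_site_ratio else 0


def reflect_index(i, n):
    """Mirror an arbitrary index into [0, n) without repeating the edge."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def crop_center_patch(image, object_pixels, size, mode="zero"):
    """Cut a size x size window centred on the object's geometric center."""
    rows, cols = len(image), len(image[0])
    center_r = round(sum(r for r, _ in object_pixels) / len(object_pixels))
    center_c = round(sum(c for _, c in object_pixels) / len(object_pixels))
    half = size // 2
    patch = []
    for r in range(center_r - half, center_r - half + size):
        row = []
        for c in range(center_c - half, center_c - half + size):
            if 0 <= r < rows and 0 <= c < cols:
                row.append(image[r][c])
            elif mode == "mirror":
                row.append(image[reflect_index(r, rows)][reflect_index(c, cols)])
            else:
                row.append(0)  # zero padding outside the ROI image
        patch.append(row)
    return patch


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
truth = [[1, 1, 0], [1, 0, 0], [0, 0, 0]]
corner_object = [(0, 0)]
zero_patch = crop_center_patch(image, corner_object, 3)
mirror_patch = crop_center_patch(image, corner_object, 3, mode="mirror")
label = label_object([(0, 0), (0, 1), (1, 0), (1, 1)], truth)
```

Because the claim wording is "exceeds a preset threshold", the comparison in `label_object` is strict: an object whose site share equals the threshold is labeled non-site.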
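Claim 9 fixes the kernel sizes (7×7, then 5×5 twice) and states that each max pooling layer halves the feature map, but it does not state the padding, the stride or the input block size. Under the assumption of unpadded convolutions with stride 1 and 2×2 pooling, the spatial size along one axis follows the usual relation floor((n + 2p - k) / s) + 1, which the short Python sketch below traces for a hypothetical 64×64 object-center block.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial output size of non-overlapping max pooling along one axis."""
    return size // window


def trace_feature_sizes(input_size):
    """List (layer, size) for conv 7x7, pool, conv 5x5, pool, conv 5x5."""
    layers = [
        ("conv1 7x7", lambda s: conv_out(s, 7)),
        ("pool1", pool_out),
        ("conv2 5x5", lambda s: conv_out(s, 5)),
        ("pool2", pool_out),
        ("conv3 5x5", lambda s: conv_out(s, 5)),
    ]
    sizes = [("input", input_size)]
    size = input_size
    for name, step in layers:
        size = step(size)
        sizes.append((name, size))
    return sizes


# The 64-pixel input is an assumption chosen only for illustration.
for name, size in trace_feature_sizes(64):
    print(f"{name}: {size} x {size}")
```

With these assumptions the sizes run 64, 58, 29, 25, 12, 8; a different padding or input size would change the numbers but not the layer order described in the claim.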
Description
Remote sensing image rammed earth site boundary automatic extraction method integrating U-Net and OBIA-CNN
Technical Field
The invention relates to the technical field of rammed earth site boundary extraction, and in particular to a method for automatically extracting rammed earth site boundaries from remote sensing images by integrating U-Net and OBIA (Object-Based Image Analysis)-CNN.
Background
Rammed earth sites are important material carriers of ancient Chinese civilization and have non-renewable historical, artistic and scientific value. However, these sites, which are widely distributed in the arid regions of northwest China, have long been exposed to natural erosion and human activity and face a sustained and accelerating risk of damage; their effective protection has become an urgent global concern. Rammed earth sites consist mainly of soil materials, are affected over long periods by rainfall, wind-blown sand and temperature variation, face serious risks of erosion and damage, and urgently need real-time dynamic monitoring and protection. Traditional archaeological investigation relies mainly on manual ground survey, which is inefficient, costly and limited in coverage, and cannot meet the needs of large-scale site protection.
Protection practice in this field currently faces three interrelated core challenges that severely restrict the effectiveness of protection measures. First, there is a marked contradiction between the wide-area distribution of rammed earth sites and the limitations of traditional investigation methods. This directly leads to incomplete inventories of rammed earth sites and insufficient monitoring coverage, leaving a large number of sites in an "unknown" or "unmanaged" condition.
Second, the spectral and morphological ambiguity of rammed earth site boundaries poses a fundamental challenge to conventional automatic remote sensing recognition algorithms. Traditional pixel-based classification methods, and even a basic convolutional neural network (CNN), often struggle to distinguish the weak spectral differences between rammed earth sites and the surrounding arid environment, so extraction results suffer from low precision and serious omissions; these limitations severely restrict sustainable, efficient and routine monitoring of rammed earth sites.
Finally, there is a critical timing mismatch between the long-term, progressive nature of damage to rammed earth sites and the lag of protective intervention. For lack of a reliable predictive model, protective action is often taken passively after significant damage has occurred, which severely impairs the timeliness and effectiveness of preventive intervention.
In recent years, the combination of remote sensing technology and machine learning has offered a new approach to these challenges: used together for accurate positioning, identification and dynamic monitoring, machine learning significantly improves the accuracy of detecting sites in remote sensing images compared with traditional detection methods. With the development of remote sensing technology, methods based on machine learning and deep learning have gradually been applied to site extraction. The prior art mainly comprises:
1. Extraction methods based on spectral features, such as the discrete particle swarm optimization algorithm based on a maximum entropy model (MEDPSO), which perform well in fields such as water body extraction; when applied to rammed earth remains, however, the remains are spectrally similar to the surrounding bare soil background, so salt-and-pepper noise is easily produced and complex terrain noise is difficult to handle.
2. Deep learning methods based on semantic segmentation, using networks such as U-Net or Faster R-CNN. Although these methods are highly sensitive to the target area (high recall), they tend toward boundary expansion and over-segmentation, are prone to blurred or discontinuous edges when extracting irregular, slender features such as site boundaries, and have difficulty accurately restoring the geometric form of a site. U-Net and its variants have been used for pixel-level semantic segmentation of specific man-made or natural ground objects in remote sensing imagery (including optical imagery and LiDAR data); the extracted objects include buildings, hill fort settlement sites, mounds, ancient city wall sites and coastlines, and these results provide an important reference for applying U-Net to rammed earth site boundary extraction. In the existing research, the U-Net and other semantic segmentation networks are mainly