CN-122020128-A - Intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data


Abstract

The invention relates to the technical field of remote sensing image recognition and provides an intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data. The method comprises: acquiring digital orthophoto maps DOM1 and DOM2 and digital surface models DSM1 and DSM2 of a target area at two time phases, and preprocessing the digital orthophoto maps and digital surface models of the two time phases; performing semantic segmentation on the DOM data of the two time phases using a pre-trained vision foundation segmentation model, and outputting, for each time phase, a land cover classification map that provides the semantic basis; processing the DSM data of the two time phases and extracting elevation change information; and fusing the land cover classification maps with the elevation change information, and outputting change patches that record conversions between land classes and artificial facility areas with elevation change. By fusing high-resolution DOM and DSM information, the method constructs a multi-modal change detection model based on SAM LoRA fine-tuning and elevation constraints, improving the ability to identify three-dimensional changes such as those of artificial facilities.

Inventors

Hou Anye
Qi Gongtong
HAN LEI
WANG DAN
DU PENG
WANG GANG
HU ZHENBIAO
ZHANG JIANYUAN
LI LIN
SUN YIPING
ZHOU ZHIWEN
QIAO XIN

Assignees

青岛市勘察测绘研究院 (Qingdao Geotechnical Investigation and Surveying Research Institute)

Dates

Publication Date: 20260512
Application Date: 20260224

Claims (7)

1. An intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data, characterized by comprising the following steps: acquiring digital orthophoto maps DOM1 and DOM2 and digital surface models DSM1 and DSM2 of a target area at two time phases, and preprocessing the digital orthophoto maps and digital surface models of the two time phases; performing semantic segmentation on the DOM data of the two time phases using a pre-trained vision foundation segmentation model, and outputting, for each time phase, a land cover classification map that provides the semantic basis; processing the DSM data of the two time phases and extracting elevation change information; and fusing the land cover classification maps with the elevation change information, and outputting change patches that record conversions between land classes and artificial facility areas with elevation change.
2. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 1, wherein the preprocessing comprises resampling and spatial registration, so that the DOM and DSM data of the two time phases share a unified spatial reference and matching resolution.
3. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 1, wherein the pre-trained vision foundation segmentation model keeps the backbone weights $W$ of the image encoder frozen, and a low-rank adaptation matrix $\Delta W$ is inserted in parallel beside the key linear layers of each Transformer block of the image encoder for parameter fine-tuning; when an input remote sensing image $X_h$ is propagated forward, the image features are transformed both by the original frozen weights $W$ and by the low-rank increment $\Delta W$, and the two are added to form new weights $W'$ adapted to remote sensing features: $W' = W + \Delta W$, $\Delta W = BA$, where $A \in \mathbb{R}^{r \times k}$ is the dimension-reducing matrix that projects the input $k$-dimensional features into an $r$-dimensional space, $B \in \mathbb{R}^{d \times r}$ is the dimension-raising matrix that projects the $r$-dimensional features back into the output $d$-dimensional space, and $r$ is the rank, which determines the number of newly added parameters and the adaptability of the model, with $r \ll \min(d, k)$.
4. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 3, wherein the training of the vision foundation segmentation model takes minimizing the segmentation loss on remote sensing samples as the optimization objective $L$: $L = \min_{A,B} \sum_{h} L_{CE}\big(f_{W,A,B}(X_h),\, Y_h\big)$, where $f_{W,A,B}$ is the fine-tuned semantic segmentation model, $L_{CE}$ is the loss function, namely the cross-entropy loss computed on the remote sensing sample $(X_h, Y_h)$, $X_h$ is the remote sensing image, $Y_h$ is the corresponding ground-object class label, $h$ is the sample index, and $W$ denotes the frozen SAM backbone parameters; only the $A$ and $B$ matrices in all LoRA modules are updated, and the cross-entropy loss $L_{CE}$ measures the difference between the model's segmentation output and the true land cover label $Y_h$, so that the overall loss is minimized.
5. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 1, wherein the DSM data of the two time phases are processed as follows: a pixel-level difference is computed between the DSM data of the two time phases to obtain an initial elevation change intensity map $\Delta H$: $\Delta H(i, j) = DSM_2(i, j) - DSM_1(i, j)$; the initial elevation change intensity map $\Delta H$ is then denoised to obtain a smoothed elevation change intensity map, and regions of significant elevation change are extracted from the smoothed map based on a preset artificial facility elevation change threshold.
6. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 5, wherein the noise reduction uses a 3×3 moving-window mean filter: the height differences of each pixel and its 8 surrounding neighbors are averaged, and the average is taken as the final height difference of the center pixel: $\Delta H'(i, j) = \frac{1}{9} \sum_{m=-1}^{1} \sum_{n=-1}^{1} \Delta H(i + m, j + n)$, where $\Delta H'$ is the new elevation change intensity map obtained by smoothing the initial map $\Delta H$, $i$ and $j$ are the abscissa and ordinate respectively, and $m$ and $n$ are the abscissa increment and ordinate increment respectively.
7. The intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data according to claim 5, wherein the preset artificial facility elevation change threshold $\tau$ is 3 m to 4 m.
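
The low-rank adaptation of claims 3 and 4 can be illustrated with a small NumPy sketch. It assumes the standard LoRA form $W' = W + BA$, which matches the roles given to $A$ (reduce $k$ dimensions to $r$) and $B$ (raise $r$ dimensions to $d$) in claim 3. The function names, the matrix sizes, and the zero initialization of $B$ are assumptions made for the illustration and are not taken from the patent; the sketch covers a single linear layer, not the SAM image encoder.

```python
import numpy as np

def lora_forward(x, W, A, B):
    """Apply a frozen linear layer plus its parallel low-rank branch.

    x: (n, k) input features, W: (d, k) frozen weight,
    A: (r, k) dimension-reducing matrix, B: (d, r) dimension-raising matrix.
    """
    frozen = x @ W.T             # transformation by the frozen weight W
    low_rank = (x @ A.T) @ B.T   # transformation by the increment BA via r dims
    return frozen + low_rank

def merged_weight(W, A, B):
    """Return the adapted weight W' = W + BA."""
    return W + B @ A

def trainable_fraction(d, k, r):
    """Trainable LoRA parameters relative to the full d x k weight."""
    return r * (d + k) / (d * k)

rng = np.random.default_rng(0)
d, k, r, n = 64, 48, 4, 5        # hypothetical sizes, chosen only for the demo
W = rng.normal(size=(d, k))      # frozen: never updated during fine-tuning
A = rng.normal(size=(r, k))
B = np.zeros((d, r))             # assumed initialization: B starts at zero
x = rng.normal(size=(n, k))

y0 = lora_forward(x, W, A, B)    # with B = 0 the layer behaves like the frozen one
B = rng.normal(size=(d, r))      # stand-in for a B obtained by training
y1 = lora_forward(x, W, A, B)    # same result as using the merged weight W'

print(trainable_fraction(d, k, r))
```

Because the two branches are simply added, the parallel form and the merged weight give the same output, so the adapted layer adds no inference cost once $BA$ is folded into $W$. In training, only $A$ and $B$ would receive gradient updates from the cross-entropy loss of claim 4.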

Description

Intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data

Technical Field

The invention relates to the technical field of remote sensing image recognition, and in particular to an intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data.

Background

At present, multi-temporal observation and automatic change detection using spaceborne and airborne remote sensing are widely applied in fields such as farmland protection, ecological monitoring, natural resource management, and urban management. Traditional remote sensing change detection methods are mainly pixel-based, feature-based, or object-based, but they are easily disturbed by factors such as ground-object shadows and seasonal changes, so pseudo-changes occur frequently, detection accuracy and reliability are insufficient, and the methods adapt poorly to complex and variable real scenes. With the development of artificial intelligence, deep-learning-based change detection, in particular models that fuse convolutional neural networks (CNNs) with Transformers, has significantly improved detection accuracy and has become the mainstream direction of remote sensing change detection. However, as the spatial resolution of remote sensing images keeps improving, the imaging characteristics of ground objects become increasingly complex. Changes of the "same texture, different height" type, that is, changes between ground objects with similar spectral features but different three-dimensional forms, such as construction sites and building renovation, remain difficult to distinguish effectively with AI methods based on texture representation and local semantic analysis.
In addition, existing methods focus on locating changed areas and lack the ability to automatically recognize change semantics (that is, "what changed into what"), so they cannot meet the needs of fine-grained management.

Disclosure of Invention

To solve the problems in the prior art, the invention provides an intelligent urban and rural change detection method based on a large vision model and real-scene three-dimensional data, comprising the following steps:

acquiring digital orthophoto maps DOM1 and DOM2 and digital surface models DSM1 and DSM2 of a target area at two time phases, and preprocessing the digital orthophoto maps and digital surface models of the two time phases;

performing semantic segmentation on the DOM data of the two time phases using a pre-trained vision foundation segmentation model, and outputting, for each time phase, a land cover classification map that provides the semantic basis;

processing the DSM data of the two time phases and extracting elevation change information;

fusing the land cover classification maps with the elevation change information, and outputting change patches that record conversions between land classes and artificial facility areas with elevation change.

Specifically, the preprocessing includes resampling and spatial registration, so that the DOM and DSM data of the two time phases share a unified spatial reference and matching resolution.
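
The DSM processing and fusion steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The 3×3 mean filter and the 3 m threshold follow claims 6 and 7. The signed difference DSM2 minus DSM1, thresholding on its absolute value, the replicated-edge padding, the `from * 100 + to` transition code, the rule that a height change counts only where either time phase is labeled as an artificial facility class, and all function names and class ids are assumptions made for the example. The inputs are assumed to be already resampled and co-registered.

```python
import numpy as np

def mean_filter_3x3(img):
    """Average each pixel with its 8 neighbors (edges padded by replication)."""
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    total = np.zeros((rows, cols), dtype=float)
    for m in range(3):
        for n in range(3):
            total += padded[m:m + rows, n:n + cols]
    return total / 9.0

def elevation_change_mask(dsm1, dsm2, tau=3.0):
    """Difference two DSMs, smooth the result, and threshold at tau meters."""
    delta = dsm2.astype(float) - dsm1.astype(float)
    smooth = mean_filter_3x3(delta)
    # Assumption: both rises and drops of at least tau count as significant.
    return smooth, np.abs(smooth) >= tau

def fuse(cls1, cls2, elev_mask, facility_classes):
    """Combine the two class maps and the elevation mask into change layers."""
    class_changed = cls1 != cls2
    # Encode "from -> to" per pixel; 0 means no land-class conversion.
    # Assumes class ids lie in 1..99 so that from * 100 + to is unambiguous.
    transition = np.where(class_changed, cls1 * 100 + cls2, 0)
    is_facility = np.isin(cls1, facility_classes) | np.isin(cls2, facility_classes)
    facility_height_change = elev_mask & is_facility
    return transition, facility_height_change

# Toy 5 x 5 scene: a hypothetical 9 m structure appears on bare land.
dsm1 = np.zeros((5, 5))
dsm2 = np.zeros((5, 5))
dsm2[1:4, 1:4] = 9.0
cls1 = np.full((5, 5), 1)   # 1 = bare land (hypothetical class id)
cls2 = cls1.copy()
cls2[1:4, 1:4] = 2          # 2 = building (hypothetical class id)

smooth, mask = elevation_change_mask(dsm1, dsm2, tau=3.0)
transition, facility = fuse(cls1, cls2, mask, facility_classes=[2])
print(transition)
print(facility.astype(int))
```

In the toy scene the smoothing spreads the height change slightly beyond the building footprint, and the class maps confine the reported facility area to the pixels labeled as building. This shows how the semantic layer and the elevation layer constrain each other in the fused output.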
Specifically, the pre-trained vision foundation segmentation model keeps the backbone weights $W$ of the image encoder frozen, and a low-rank adaptation matrix $\Delta W$ is inserted in parallel beside the key linear layers of each Transformer block of the image encoder for parameter fine-tuning. When an input remote sensing image $X_h$ is propagated forward, the image features are transformed both by the original frozen weights $W$ and by the low-rank increment $\Delta W$, and the two are added to form new weights $W'$ adapted to remote sensing features: $W' = W + \Delta W$, $\Delta W = BA$, where $A \in \mathbb{R}^{r \times k}$ is the dimension-reducing matrix that projects the input $k$-dimensional features into an $r$-dimensional space, $B \in \mathbb{R}^{d \times r}$ is the dimension-raising matrix that projects the $r$-dimensional features back into the output $d$-dimensional space, and $r$ is the rank, which determines the number of newly added parameters and the adaptability of the model, with $r \ll \min(d, k)$.

Specifically, the training of the vision foundation segmentation model takes minimizing the segmentation loss on remote sensing samples as the optimization objective $L$: $L = \min_{A,B} \sum_{h} L_{CE}\big(f_{W,A,B}(X_h),\, Y_h\big)$, where $f_{W,A,B}$ is the fine-tuned semantic segmentation model, $L_{CE}$ is the loss function, namely the cross-entropy loss computed on the remote sensing sample $(X_h, Y_h)$, $X_h$ is the remote sensing image, $Y_h$ is the corresponding ground-object class label, $h$ is the sample index, and $W$ denotes the frozen SAM backbone parameters. By updating only the $A$ and $B$ matrices in all LoRA modules and measuring the difference between the model segmentation output and the true land cover label $Y_h$