CN-121999126-A - Satellite image building element parameterized reconstruction method and system based on coupling of physical geometric constraint and differentiable rendering

Abstract

The invention discloses a method and system for parameterized reconstruction of building elements from satellite images, based on coupling physical geometric constraints with differentiable rendering. The method first constructs a multi-scale, multi-task network based on the Swin Transformer to extract image features. Second, it introduces a physically guided shadow consistency constraint that uses the shadow geometric relationship to back-calibrate the height prediction, addressing the ambiguity of single-view height estimation. Meanwhile, it applies frame field theory to extract regularized vector contours of buildings. Finally, it adopts an inverse modeling strategy based on differentiable rendering to solve for the roof geometric parameters automatically and generate an LOD2-level parameterized three-dimensional model. The invention improves the physical consistency, contour regularity, and degree of modeling automation of three-dimensional reconstruction, and is suited to high-precision construction of large-scale urban digital bases.

Inventors

  • CHEN YUJIA
  • ZHU JINGLIANG
  • WU JIAMING
  • LIU DAN
  • JIA KAI
  • HAN XIN
  • XIAO BIN
  • XU CHUNMING

Assignees

  • 华设设计集团股份有限公司
  • 江苏狄诺尼信息技术有限责任公司

Dates

Publication Date
20260508
Application Date
20251224

Claims (10)

  1. A satellite image building element parameterized reconstruction method based on coupling of physical geometric constraints and differentiable rendering, characterized by comprising the following steps: Step S1, multi-source data preprocessing and enhancement: acquiring a monocular high-resolution optical satellite image of a target area, parsing the solar altitude angle, the solar azimuth angle, and the satellite observation parameters from the image metadata, and performing physical enhancement processing on the image; Step S2, Swin Transformer multi-task neural network construction: constructing a backbone network and a multi-task decoder based on the Swin Transformer, inputting the processed image into the network, and outputting in parallel a building semantic segmentation map, a pixel-level height prediction map, and a geometric contour frame field; Step S3, height inference under physical constraints: introducing an optical imaging physical model to construct a differentiable shadow rendering layer, and, based on the solar altitude angle and the solar azimuth angle, back-calibrating the pixel-level height prediction map using the consistency between theoretical shadows and actually observed shadows; Step S4, based on the geometric contour frame field, guiding contour evolution with an active skeleton model, and generating a building vector contour with regular geometric features through energy minimization; and Step S5, parameterized model fitting: adopting an analysis-by-synthesis strategy, solving for the geometric parameters of the building roof using differentiable rendering and gradient descent based on the building vector contour and the pixel-level height prediction map, and generating a three-dimensional parameterized model.
  2. The method according to claim 1, wherein in step S1, the physical enhancement processing specifically includes: simulating shadow distribution maps of the target area under different solar altitude angles based on a coarse digital surface model; and adding the simulated shadow distribution map to the original satellite image as an auxiliary channel to generate enhanced input data, so as to increase the sensitivity of the network to the correlation between shadow and height.
  3. The method according to claim 1, wherein in step S2, the backbone network adopts a hierarchical Swin Transformer architecture; the backbone network divides the input image into a patch sequence and extracts features through Swin Transformer blocks in a plurality of stages; and in each Swin Transformer block, a shifted-window multi-head attention mechanism (Shifted Window Attention) is used for cross-window information interaction, and four levels of multi-scale feature maps with different strides are output.
  4. The method according to claim 3, wherein the multi-task decoder comprises a neck network and three parallel task heads: the neck network adopts the UperNet structure to fuse the multi-scale feature maps; the segmentation head outputs a binary probability map distinguishing buildings from background; the height head outputs continuous height regression values and maps them to a preset height range; and the frame field head outputs two-channel complex coefficients corresponding to the PolyVector representation, which characterize the local principal tangent directions.
  5. The method according to claim 1, wherein in step S3, the differentiable shadow rendering layer calculates a shadow intensity field by a formula defined over the following quantities: the current pixel point; the upstream pixel point in the illumination direction; the predicted height; the distance between the two points; the shadow length coefficient; and a hyperparameter controlling the steepness of the Sigmoid function.
  6. The method according to claim 5, wherein in step S3, the specific procedure of the consistency back-calibration is: extracting an observed shadow mask from the original image using a color-invariant index; calculating the binary cross-entropy loss between the shadow intensity field and the observed shadow mask; and adding the binary cross-entropy loss to the total loss function as a physical constraint term, and updating the network parameters through backpropagation to adjust the pixel-level height prediction map.
  7. The method according to claim 1, wherein in step S4, the frame field uses the PolyVector representation: the two orthogonal principal directions at each pixel point are represented as the roots of a fourth-order polynomial; the network predicts the two complex coefficients of the polynomial, and the principal tangent directions are solved from these coefficients, so as to resolve the rotational symmetry problem of the direction field.
  8. The method according to claim 1, wherein in step S4, the energy function of the energy minimization comprises an alignment energy and a smoothing energy: the alignment energy drives the direction of the contour line segments into alignment with the principal directions of the frame field; the smoothing energy maintains the smoothness of the contour; and, in combination with corner recognition, a regular polygon containing horizontal, vertical, or specific-angle line segments is reconstructed.
  9. The method according to claim 1, wherein the implementation of step S5 includes: performing instance segmentation on the region within the building vector contour to obtain an independent building instance height map; performing primitive classification on the building instance, determining the roof type, and defining a set of geometric parameters to be optimized, the parameter set comprising at least the building base height, the ridge relative height, the roof slope, the rotation azimuth, and the geometric center offset; constructing a differentiable height generation function to generate a parameterized height map; and calculating the L2 loss between the parameterized height map and the building instance height map, and iteratively updating the geometric parameter set with the Adam optimizer until the loss converges.
  10. A satellite image building element parameterized reconstruction system based on coupling of physical geometric constraints and differentiable rendering, comprising: a data preprocessing and enhancement module for performing data parsing and shadow simulation enhancement; a multi-scale multi-task feature encoding module for performing feature extraction and multi-task prediction; a physically constrained height inference module for performing differentiable shadow rendering and height calibration; a frame field contour regularization module for performing frame field calculation and contour optimization; and an LOD2 parameterized modeling module for performing inverse differentiable parameter fitting.
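
The shadow rendering and consistency loss of claims 5 and 6 can be illustrated with a small NumPy sketch. The formula of claim 5 is not legible in this text, so the functional form below is an assumption, not the patented formula: for each pixel, it takes the maximum over upstream pixels of a sigmoid of (shadow length coefficient × height excess of the upstream pixel − distance). The function names `soft_shadow` and `shadow_bce`, the azimuth convention (clockwise from north, north up in the image), flat ground, and all default values are likewise hypothetical. A practical version would presumably be written in an automatic differentiation framework so that the loss can be backpropagated to the height head.

```python
import numpy as np

def soft_shadow(height, sun_elev_deg, sun_azim_deg, gsd=1.0, k=4.0, max_steps=64):
    """Soft shadow intensity in [0, 1] for a height map (assumed form, not the patent's).

    height: 2D array of heights in metres; north is up, east is right (assumed).
    sun_azim_deg: solar azimuth, clockwise from north (assumed convention).
    gsd: ground sample distance in metres per pixel.
    k: steepness of the sigmoid.
    """
    height = np.asarray(height, dtype=float)
    n_rows, n_cols = height.shape
    lam = 1.0 / np.tan(np.radians(sun_elev_deg))  # shadow length per metre of height
    az = np.radians(sun_azim_deg)
    d_row, d_col = -np.cos(az), np.sin(az)        # unit pixel step toward the sun
    rows, cols = np.indices(height.shape)
    shadow = np.zeros_like(height)
    for step in range(1, max_steps + 1):
        # Nearest-neighbour sample of the pixel `step` pixels upstream.
        r = np.rint(rows + step * d_row).astype(int)
        c = np.rint(cols + step * d_col).astype(int)
        inside = (r >= 0) & (r < n_rows) & (c >= 0) & (c < n_cols)
        h_up = height[np.clip(r, 0, n_rows - 1), np.clip(c, 0, n_cols - 1)]
        # Positive margin: the upstream point is tall enough to block the sun ray.
        margin = lam * (h_up - height) - step * gsd
        s = 1.0 / (1.0 + np.exp(-np.clip(k * margin, -60.0, 60.0)))
        shadow = np.maximum(shadow, np.where(inside, s, 0.0))
    return shadow

def shadow_bce(pred_shadow, observed_mask, eps=1e-7):
    """Binary cross-entropy between a soft shadow field and a 0/1 shadow mask."""
    p = np.clip(pred_shadow, eps, 1.0 - eps)
    y = np.asarray(observed_mask, dtype=float)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))
```

In this sketch a 10 m block lit from the east at 45° elevation casts a soft shadow roughly 10 pixels to its west at 1 m per pixel, and `shadow_bce` is smaller for the correct height than for an underestimated one, which is the signal that the back-calibration in claim 6 relies on.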
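
The parameter fitting loop of claim 9 (differentiable height generation function, L2 loss, Adam updates) can be sketched in the same spirit. This is a toy under stated assumptions, not the claimed implementation: it fits only three of the listed parameters (base height, ridge relative height, and a ridge centre offset), assumes a gable roof whose ridge is aligned with the image rows and whose half-width is already known from the vector contour, derives the roof slope as ridge height divided by half-width, and uses hand-written analytic gradients in place of an autodiff framework. The names `gable_height`, `loss_and_grad`, and `fit_adam` are hypothetical; the Adam update uses the commonly cited default coefficients.

```python
import numpy as np

def gable_height(params, x, half_width):
    """Parameterized height map: base height plus a tent-shaped gable profile.

    params: (base, ridge, centre); x: across-ridge pixel coordinate grid.
    """
    base, ridge, centre = params
    tent = np.maximum(0.0, 1.0 - np.abs(x - centre) / half_width)
    return base + ridge * tent

def loss_and_grad(params, x, target, half_width):
    """Mean squared (L2) loss and its analytic gradient w.r.t. the parameters."""
    base, ridge, centre = params
    u = x - centre
    tent = np.maximum(0.0, 1.0 - np.abs(u) / half_width)
    residual = base + ridge * tent - target
    d_base = np.ones_like(x)
    d_ridge = tent
    # The ridge position only affects pixels on the sloped part of the roof.
    d_centre = np.where(tent > 0.0, ridge * np.sign(u) / half_width, 0.0)
    grad = np.array([np.mean(2.0 * residual * d) for d in (d_base, d_ridge, d_centre)])
    return float(np.mean(residual ** 2)), grad

def fit_adam(x, target, half_width, init, lr=0.05, steps=2000,
             b1=0.9, b2=0.999, eps=1e-8):
    """Minimize the L2 loss with a plain NumPy Adam loop."""
    params = np.array(init, dtype=float)
    m = np.zeros_like(params)
    v = np.zeros_like(params)
    for t in range(1, steps + 1):
        _, g = loss_and_grad(params, x, target, half_width)
        m = b1 * m + (1.0 - b1) * g          # first-moment estimate
        v = b2 * v + (1.0 - b2) * g * g      # second-moment estimate
        m_hat = m / (1.0 - b1 ** t)          # bias correction
        v_hat = v / (1.0 - b2 ** t)
        params -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params
```

A typical call builds a coordinate grid such as `x = np.tile(np.arange(-16.0, 16.0), (32, 1))`, takes the building instance height map as `target`, and starts from a rough guess such as `(8.0, 1.0, 0.0)`. Rotation azimuth and a full two-dimensional centre offset, which claim 9 also lists, are left out here to keep the gradients short.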

Description

Satellite image building element parameterized reconstruction method and system based on coupling of physical geometric constraint and differentiable rendering

Technical Field

The invention belongs to the technical field at the intersection of artificial intelligence and three-dimensional Geographic Information Systems (GIS), and particularly relates to a satellite image building element parameterized reconstruction method and system based on coupling of physical geometric constraint and differentiable rendering.

Background

With the rapid development of earth observation technology, meter-level high-resolution optical satellite imagery has become a main data source for acquiring earth surface information and plays an important role in urban digital transformation and digital twin construction. Compared with costly airborne laser radar (LiDAR) and computationally heavy oblique photogrammetry, three-dimensional reconstruction from monocular satellite images offers wide coverage, low cost, and a short revisit period. However, recovering a three-dimensional structure from a two-dimensional projection involves inherent depth ambiguity, which makes the task a typical ill-posed inverse problem.

Existing deep-learning-based monocular reconstruction schemes face several technical bottlenecks. First, in feature extraction, traditional convolutional neural networks are limited by their local receptive field and struggle to capture the global context of large buildings, so segmentation results are often fragmented; the standard Vision Transformer, meanwhile, has high computational complexity and is difficult to apply directly to large-format remote sensing images.

Second, in height estimation, most existing methods treat the task as purely data-driven pixel-level regression and ignore the physical geometric constraints among the sun, the building, and its shadow. As a result, the model generalizes poorly to different illumination conditions or new scenes, and the estimated heights often violate physical common sense.

In addition, the post-processing and modeling procedures of the prior art have obvious defects. The edges of the probability map output by a semantic segmentation network are generally blurred and jagged, and traditional post-processing algorithms focus only on point fitting error while ignoring building-specific geometric and topological rules such as right angles and parallelism, so the generated vector contours are irregular and may even contain topological errors. Meanwhile, the parameterized modeling workflow is usually a disjoint multi-step procedure that has difficulty generating LOD2-level fine models and is highly susceptible to noise, which produces incorrect plane segmentation and serious error accumulation.

Therefore, an integrated technical solution that combines global perception, physical constraints, and geometric regularization is needed to meet the requirements of fully automatic, high-precision, and physically interpretable three-dimensional reconstruction.
Disclosure of Invention

In order to solve the above technical problems, an embodiment of the application provides a satellite image building element parameterized reconstruction method based on coupling of physical geometric constraints and differentiable rendering, characterized by comprising the following steps: Step S1, multi-source data preprocessing and enhancement: acquiring a monocular high-resolution optical satellite image of a target area, parsing the solar altitude angle, the solar azimuth angle, and the satellite observation parameters from the image metadata, and performing physical enhancement processing on the image; Step S2, Swin Transformer multi-task neural network construction: constructing a backbone network and a multi-task decoder based on the Swin Transformer, inputting the processed image into the network, and outputting in parallel a building semantic segmentation map, a pixel-level height prediction map, and a geometric contour frame field; Step S3, height inference under physical constraints: introducing an optical imaging physical model to construct a differentiable shadow rendering layer, and, based on the solar altitude angle and the solar azimuth angle, back-calibrating the pixel-level height prediction map using the consistency between theoretical shadows and actually observed shadows; Step S4, based on the geometric contour frame field, guiding contour evolution with an active skeleton model, and generating a building vector contour with regular geometric features through energy minimization; and Step S5, parameterized model fitting: adopting an analysis-by-synthesis strategy, solving for the geometric parameters of the building roof using differentiable rendering and gradient descent based on the build