CN-122023201-A - Dynamic shelter restoration method and system based on continuous street view panoramic image
Abstract
The invention discloses a dynamic occluder restoration method and system based on continuous street-view panoramic images. The method first obtains a target panoramic image A to be repaired and a reference panoramic image B, and generates original pixel-level occlusion masks. Second, matching points between the panoramic images are extracted to achieve cross-view geometric alignment, and a target perspective view C and a reference perspective view D are obtained through perspective re-projection. The target perspective view C and the reference perspective view D are then input into a three-dimensional reconstruction framework, and three-dimensional point cloud re-imaging is performed with a depth inspection mechanism under the camera pose of the target perspective view C. Finally, image restoration is performed on the three-dimensional point cloud re-imaging result, which is restored to the panoramic coordinate system, stitched with the original panoramic image, and output as a de-occluded panoramic image. The invention ensures that the restoration result conforms to geographic-space authenticity and geometric consistency, and effectively solves the problems of overlap conflicts and visual tearing during multi-view projection fusion.
Inventors
- Jiang Shijie
- Guan Fangli
- Guan Zelin
Assignees
- Hangzhou Dianzi University (杭州电子科技大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-16
Claims (10)
- 1. A dynamic occluder restoration method based on continuous street-view panoramic images, characterized by comprising the following steps: S1, acquiring a target panoramic image A to be repaired and a reference panoramic image B, and generating original pixel-level occlusion masks; S2, extracting matching points between the target panoramic image A and the reference panoramic image B with a feature matching algorithm, and calculating a homography matrix to achieve cross-view geometric alignment of the reference and target images; S3, obtaining a target perspective view C and a reference perspective view D through perspective re-projection based on the original pixel-level occlusion masks; S4, inputting the target perspective view C and the reference perspective view D into a three-dimensional reconstruction framework, projecting pixel coordinates to a three-dimensional world coordinate system, and removing the three-dimensional points corresponding to the occlusion mask according to the original pixel-level occlusion masks; S5, performing three-dimensional point cloud re-imaging with a depth inspection mechanism under the camera pose of the target perspective view C; and S6, performing image restoration on the three-dimensional point cloud re-imaging result, restoring it to the panoramic coordinate system, stitching it with the original panoramic image, and outputting the final de-occluded panoramic image.
- 2. The dynamic occluder restoration method based on continuous street-view panoramic images according to claim 1, wherein step S1 is specifically implemented as follows: acquiring a target panoramic image A to be repaired and at least one reference panoramic image B at a nearby position; after de-distortion processing of the target panoramic image A, performing semantic analysis on it with a semantic segmentation model, identifying and segmenting the dynamic occluding objects, and generating an original pixel-level occlusion mask for each dynamic occluding object.
- 3. The method is characterized in that step S2 is specifically implemented as follows: performing resolution scaling on the target panoramic image A and the reference panoramic image B to obtain their scaled images; obtaining matching coordinates between the two scaled images, together with the confidence score of each match, using a feature matching algorithm; restoring the matching coordinates to the resolution of the target panoramic image A and the reference panoramic image B through the scale factor; retaining the high-reliability matching points whose confidence exceeds the set confidence threshold, and judging the reference image unavailable if the number of matching points is smaller than the preset threshold; estimating the homography matrix H from the reference panoramic image B to the target panoramic image A from the high-reliability matching points using the random sample consensus (RANSAC) algorithm; multiplying the pixel coordinates of the reference panoramic image B by the homography matrix H to obtain their projected coordinates on the target panoramic image A; and resampling with bilinear interpolation to establish the pixel alignment relation between the target panoramic image A and the reference panoramic image B, thereby achieving geometric alignment.
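The alignment procedure of claim 3 (estimate H from point correspondences with RANSAC, then map reference pixels through H) can be sketched in minimal NumPy. This is an illustration only, not the patented implementation: the DLT solver and the RANSAC hyperparameters (iteration count, inlier threshold) are assumptions, and a real system would obtain the correspondences from a feature matcher at reduced resolution and rescale them first.

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate the 3x3 homography H (dst ~ H @ src) from >=4 point pairs
    with the direct linear transform: stack two rows per correspondence and
    take the null vector of the system via SVD."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    """Apply a homography to Nx2 points with homogeneous normalisation."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]

def ransac_homography(src, dst, iters=200, thresh=2.0, seed=0):
    """Minimal RANSAC loop: sample 4 pairs, fit, keep the model with the
    most inliers, then refit on all inliers of the best model."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = dlt_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return dlt_homography(src[best_inliers], dst[best_inliers])
```

After estimation, the reference image would be resampled at the projected coordinates with bilinear interpolation, as the claim states.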
- 4. The dynamic occlusion restoration method based on continuous street-view panoramic images according to claim 3, wherein step S3 is specifically implemented as follows: taking the geometric center of the largest original pixel-level occlusion mask in the target panoramic image A as the visual anchor point, and obtaining a target perspective view C and a reference perspective view D through perspective re-projection based on the cross-view geometrically aligned target panoramic image A and reference panoramic image B.
- 5. The method for repairing dynamic occlusion of panoramic images based on continuous street views according to claim 4, wherein the perspective re-projection is specifically implemented as follows: the normalized center coordinates (u_c, v_c) of the largest original pixel-level occlusion mask in the target panoramic image A are mapped to a horizontal yaw angle θ and a vertical pitch angle φ in the spherical coordinate system. For the panoramic images corresponding to the target panoramic image A and the reference panoramic image B, W and H are the width and height of the panoramic image, (λ, δ) are the spherical longitude and latitude corresponding to a panoramic image pixel, and (u, v) are the coordinates of each pixel point in the panoramic image. The offset of the normalized u-axis coordinate from the central axis of the image, multiplied by the corresponding full horizontal viewing range of 360°, gives the horizontal yaw angle of the target area in spherical space, θ = (u_c/W − 0.5) × 360°; the offset of the normalized v-axis coordinate from the central axis of the image, multiplied by the corresponding full vertical viewing range of 180°, gives the vertical pitch angle of the target area in spherical space, φ = (0.5 − v_c/H) × 180°. The corresponding spherical longitude λ and spherical latitude δ are calculated for each pixel in the same way, and the longitude and latitude are converted into a three-dimensional vector P = (P_x, P_y, P_z) on the unit sphere: the cosine of the latitude δ multiplied by the cosine of the longitude λ gives the component of P along the x axis, P_x = cos δ · cos λ; the sine of the latitude δ directly determines the component of P along the vertical y axis, i.e. the projection of P in the vertical upward direction, P_y = sin δ; and the cosine of the latitude δ, which is the projected modular length of P on the horizontal plane, multiplied by the sine of the longitude λ gives the component of P along the z axis, P_z = cos δ · sin λ. A rotation matrix is then constructed to rotate the panoramic sphere so that the center of the rotated field of view is aligned with the spatial direction given by the yaw angle θ and the pitch angle φ: the yaw rotation matrix rotates the panoramic image about the vertical y axis by θ, R_yaw(θ) = [[cos θ, 0, sin θ], [0, 1, 0], [−sin θ, 0, cos θ]]; the pitch rotation matrix rotates the panoramic image about the z axis by φ, R_pitch(φ) = [[cos φ, −sin φ, 0], [sin φ, cos φ, 0], [0, 0, 1]]. The composite rotation matrix combining yaw rotation and pitch rotation is obtained by multiplying the yaw rotation matrix and the pitch rotation matrix to give the forward rotation matrix R = R_pitch(φ) · R_yaw(θ). Finally, a distortion-free partial perspective view, aligned with the geometric center of the largest original pixel-level occlusion mask and conforming to the perspective rule of the human eye, is generated through linear interpolation.
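The spherical mapping of claim 5 can be illustrated with a short NumPy sketch. The conventions used here (y as the vertical axis, pitch positive upward, pitch rotation about the z axis, composite order R_pitch · R_yaw) are assumptions where the original text is ambiguous:

```python
import numpy as np

def mask_center_to_angles(u_c, v_c, W, H):
    """Map the mask-centre pixel to yaw/pitch in radians: the horizontal
    offset from the image centre spans 360 deg, the vertical offset 180 deg."""
    yaw = (u_c / W - 0.5) * 2.0 * np.pi
    pitch = (0.5 - v_c / H) * np.pi  # positive = upward (assumed convention)
    return yaw, pitch

def angles_to_unit_vector(lon, lat):
    """Longitude/latitude -> point on the unit sphere:
    P_x = cos(lat)cos(lon), P_y = sin(lat) (vertical), P_z = cos(lat)sin(lon)."""
    return np.array([np.cos(lat) * np.cos(lon),
                     np.sin(lat),
                     np.cos(lat) * np.sin(lon)])

def view_rotation(yaw, pitch):
    """Composite rotation aligning the view centre with (yaw, pitch):
    yaw about the vertical y axis, then pitch about the z axis."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    R_yaw = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    R_pitch = np.array([[cp, -sp, 0], [sp, cp, 0], [0, 0, 1]])
    return R_pitch @ R_yaw
```

Sampling the rotated sphere through a pinhole model and interpolating the panorama then yields the distortion-free partial perspective view.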
- 6. The dynamic occlusion restoration method based on continuous street-view panoramic images according to claim 5, wherein step S4 is specifically implemented as follows: using the depth map and the camera intrinsic and extrinsic parameters predicted by the Visual Geometry Grounded Transformer (VGGT) model, the three-dimensional coordinates in the world coordinate system are obtained by multiplying the pixel coordinates by the inverse of the camera intrinsic matrix, multiplying by the depth value corresponding to the pixel, multiplying by the inverse of the camera rotation extrinsic matrix, and finally subtracting the product of the inverse rotation extrinsic matrix and the translation extrinsic vector, i.e. P_w = R^(-1)·(d·K^(-1)·p) − R^(-1)·t, thereby reconstructing the three-dimensional point cloud. The point cloud is then screened with a numerical validity checking mechanism: invalid points whose values are NaN (not a number) or infinite are removed, and, according to the original pixel-level occlusion mask generated in step S3, three-dimensional points falling inside the mask area are marked as invalid and removed.
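The unprojection and validity check of claim 6 can be sketched as follows, assuming a standard pinhole model with extrinsics P_cam = R·P_world + t (the exact parameterization used by the patent's reconstruction framework may differ):

```python
import numpy as np

def unproject_to_world(u, v, depth, K, R, t):
    """Pixel (u, v) with depth d -> world point, following the claim:
    P_cam = d * K^{-1} [u, v, 1]^T, then P_world = R^{-1} P_cam - R^{-1} t."""
    p_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    return np.linalg.inv(R) @ p_cam - np.linalg.inv(R) @ t

def valid_points(points, mask_flags):
    """Numerical validity check: drop points containing NaN/inf, and points
    whose source pixel falls inside the occlusion mask (mask_flags True)."""
    finite = np.isfinite(points).all(axis=1)
    return points[finite & ~mask_flags]
```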
- 7. The dynamic occlusion restoration method based on continuous street-view panoramic images according to claim 6, wherein performing three-dimensional point cloud re-imaging with a depth inspection mechanism in step S5 comprises: projecting the three-dimensional point cloud reconstructed in step S4 onto the imaging plane of the target virtual camera according to the corresponding camera pose; when multiple views project to the same pixel position under the pose of the target perspective camera, retaining the pixel texture with the minimum depth value among the candidates whose confidence exceeds a preset threshold; and adopting a 3D-guided 2D projection strategy for the parts not filled by the point cloud.
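The depth inspection of claim 7 is essentially a z-buffer with a confidence gate. The nearest-point-wins rule follows the claim; the per-point loop, nearest-pixel rounding, and the 0.5 default threshold are illustrative simplifications:

```python
import numpy as np

def reimage_point_cloud(points_cam, colors, conf, K, hw, conf_thresh=0.5):
    """Project camera-frame 3D points onto the image plane and resolve
    collisions with a z-buffer: at each pixel keep the point with the
    smallest depth among points whose confidence exceeds the threshold."""
    h, w = hw
    depth_buf = np.full((h, w), np.inf)
    image = np.zeros((h, w, 3))
    for p, c, s in zip(points_cam, colors, conf):
        z = p[2]
        if z <= 0 or s <= conf_thresh:       # behind camera or low confidence
            continue
        ph = K @ p                            # pinhole projection
        u, v = int(round(ph[0] / z)), int(round(ph[1] / z))
        if 0 <= v < h and 0 <= u < w and z < depth_buf[v, u]:
            depth_buf[v, u] = z               # nearer point wins the pixel
            image[v, u] = c
    return image, depth_buf
```

Pixels left at infinite depth are the unfilled parts that the 3D-guided 2D projection strategy of claim 8 handles.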
- 8. The method of claim 7, wherein the 3D-guided 2D projection strategy comprises: for each connected domain not judged to be an occlusion blind zone, taking all anchor points as initial seeds and executing a spatially limited growth algorithm with a maximum radius limit on the growth; after growth finishes, the areas of the mask that growth cannot cover, because their distance to the anchor points exceeds the distance threshold, are judged to be occlusion blind zones, and the 3D-guided 2D projection strategy is not applied to any occlusion blind zone. The pixel positions of all projected points in the reference perspective view are recorded, the homography matrix projecting each reference perspective view to the target perspective view is calculated, and the coarse projection is obtained by applying the homography matrix to the reference perspective view. Taking each projected point as a control anchor, the displacement weight of the j-th anchor for the i-th non-anchor projection point is the natural constant e raised to a negative power, w_ij = exp(−(||p_i − q_j||^2 / σ_s + (d_i − D_j)^2 / σ_d)), where the exponent is the sum of two terms: the first term is the squared distance between the coarse two-dimensional projection coordinates p_i of the i-th pixel to be calculated and the projection coordinates q_j of the j-th anchor on the two-dimensional image plane, divided by the smoothing parameter σ_s; the second term is the square of the difference between the predicted depth value d_i of the i-th pixel to be calculated and the three-dimensional depth information D_j of the j-th anchor, divided by the depth sensitivity factor σ_d. The K anchors nearest to the coarse projection coordinates are then found by K-nearest-neighbor (KNN) search, the displacement weights are normalized over these K anchors, and local displacement in the x and y directions is realized through coordinate compensation: the displacement residual Δq_j of the j-th anchor equals the coordinates of the anchor in the reference perspective view minus its coordinates in the current view; the final projection coordinates of the i-th pixel to be calculated on the target panoramic image A equal the coarse projection coordinates of the pixel plus a weighted sum over all K neighbor anchors of the pixel, i.e. the normalized displacement weight ŵ_ij multiplied by the displacement residual Δq_j of the corresponding anchor and summed: p̂_i = p_i + Σ_{j∈K(i)} ŵ_ij · Δq_j. For the final projection coordinates of each pixel to be calculated on the target panoramic image A, the predicted depth value of the pixel is checked; if the depth difference is larger than the set threshold, the projection point is considered a false projection point and is removed.
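The anchor-weighted refinement of claim 8 reduces to a few lines of NumPy. The symbols follow the claim (σ_s smoothing parameter, σ_d depth sensitivity factor); the brute-force KNN, the default parameter values, and passing the anchor residuals in precomputed form are illustrative assumptions:

```python
import numpy as np

def guided_displacement(coarse_xy, pred_depth, anchors_xy, anchors_depth,
                        residuals, k=4, sigma_s=50.0, sigma_d=1.0):
    """Refine coarse homography projections with anchor-guided displacement.
    Weight of anchor j for pixel i:
        w_ij = exp(-( ||p_i - q_j||^2 / sigma_s + (d_i - D_j)^2 / sigma_d ))
    Weights are normalised over the K nearest anchors (in 2D) and used to
    average the anchors' displacement residuals."""
    out = np.empty_like(coarse_xy)
    for i, (p, d) in enumerate(zip(coarse_xy, pred_depth)):
        dist2 = np.sum((anchors_xy - p) ** 2, axis=1)
        nn = np.argsort(dist2)[:k]                       # brute-force KNN
        w = np.exp(-(dist2[nn] / sigma_s +
                     (d - anchors_depth[nn]) ** 2 / sigma_d))
        w = w / w.sum()                                  # normalise weights
        out[i] = p + (w[:, None] * residuals[nn]).sum(axis=0)
    return out
```

A subsequent depth-consistency check, as the claim describes, would discard refined projections whose depth disagrees with the prediction by more than a threshold.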
- 9. The dynamic occlusion restoration method based on continuous street-view panoramic images according to claim 8, wherein step S6 is specifically implemented as follows: S6.1, performing image restoration on the three-dimensional point cloud re-imaging result, and smoothing the boundary of the occlusion mask to reduce texture tearing, which comprises: residual filling, i.e. gradually filling the remaining uncovered occlusion holes and occlusion blind zones from the edge of the repair area inward, performing background synthesis with an image inpainting algorithm that restores damaged parts from the pixel information of undamaged areas; and boundary smoothing, i.e. generating an alpha weight map by applying Gaussian blur to the occlusion mask and blending through feathered fusion. S6.2, restoring the image-restored target perspective view C to the panoramic coordinate system through the inverse geometric transformation, stitching it with the target panoramic image A, and outputting the final de-occluded panoramic image.
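The feathered fusion of step S6.1 can be sketched as follows, assuming the alpha weight map is a separable Gaussian blur of the binary occlusion mask; the kernel radius and σ are illustrative choices, not taken from the patent:

```python
import numpy as np

def gaussian_blur_1d(a, sigma, axis):
    """Separable Gaussian blur along one axis (zero-padded 'same' convolution)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), axis, a)

def feathered_merge(original, repaired, mask, sigma=3.0):
    """Blend the repaired region back into the original image through a
    feathered alpha map: alpha ~ 1 deep inside the occlusion mask and falls
    off smoothly across the mask boundary, avoiding hard seams."""
    alpha = gaussian_blur_1d(gaussian_blur_1d(mask.astype(float), sigma, 0), sigma, 1)
    alpha = np.clip(alpha, 0.0, 1.0)
    return alpha[..., None] * repaired + (1.0 - alpha[..., None]) * original
```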
- 10. A dynamic occluder restoration system based on continuous street-view panoramic images, comprising: an original pixel-level occlusion mask generation module for acquiring a target panoramic image A to be repaired and at least one reference panoramic image B at an adjacent position and generating the original pixel-level occlusion masks; a cross-view geometric alignment module for extracting matching points between the target panoramic image A and the reference panoramic image B with a feature matching algorithm, and calculating a homography matrix to achieve cross-view geometric alignment of the reference and target images; a perspective view generation module for obtaining a target perspective view C and a reference perspective view D through perspective re-projection based on the original pixel-level occlusion masks; a three-dimensional world coordinate system projection module for inputting the target perspective view C and the reference perspective view D into a three-dimensional reconstruction framework, projecting pixel coordinates to the three-dimensional world coordinate system, and removing the three-dimensional points corresponding to the occlusion mask according to the original pixel-level occlusion masks; a three-dimensional point cloud re-imaging module for performing three-dimensional point cloud re-imaging with a depth inspection mechanism under the camera pose of the target perspective view C; and a de-occlusion panoramic image output module for performing image restoration on the three-dimensional point cloud re-imaging result, restoring it to the panoramic coordinate system, stitching it with the original panoramic image, and outputting the final de-occluded panoramic image.
Description
Dynamic shelter restoration method and system based on continuous street view panoramic image
Technical Field
The invention belongs to the technical field of computer vision and digital image processing, and particularly relates to a dynamic occluder restoration method and system based on continuous street-view panoramic images.
Background
Panoramic street-view images are an important data basis for smart-city analysis, three-dimensional city reconstruction, and autonomous-driving navigation. However, during urban street-view acquisition, dynamic targets on the road, such as pedestrians and vehicles, inevitably cause visual occlusion. For smart-city analysis, pedestrians, vehicles, and the like introduce a large amount of noise into the data, reducing the accuracy and robustness of the final model or algorithm; at the same time, the occlusion degrades the visual quality of the panoramic image and covers underlying geographic elements (such as pavement textures, building footings, and traffic signs), so that a large number of ordinary panoramic images cannot provide effective high-quality data for three-dimensional modeling. Existing methods for restoring occluded regions of panoramic images mainly comprise (1) methods based on traditional image inpainting, which often suffer texture blurring and geometric dislocation when handling large-area occlusion because they lack multi-view geometric constraints, and (2) methods based on generative adversarial networks (GANs), which achieve good visual effect but whose generated content is difficult to guarantee as authentic. In addition, panoramic images exhibit severe high-latitude distortion under the equirectangular projection format, and directly performing feature matching and target segmentation in the panoramic domain significantly reduces accuracy.
At present, a systematic scheme is lacking that repairs with real multi-angle data from continuous street views, effectively suppresses distortion, and achieves three-dimensionally consistent restoration.
Disclosure of Invention
Aiming at the problems of lack of authenticity, spatial inconsistency, distortion sensitivity, and hard restoration textures in prior-art panoramic occlusion restoration, the invention provides a dynamic occluder restoration method and system based on continuous street-view panoramic images, which combines multi-view adaptive perspective conversion with three-dimensional reconstruction to solve the prior-art problems of having no real data reference, lacking multi-view geometric constraints, and repairing distorted images. In one aspect of the invention, a dynamic occluder restoration method based on continuous street-view panoramic images is provided, comprising the following specific steps: S1, panoramic image preprocessing: acquiring a target panoramic image A to be repaired and at least one reference panoramic image B at an adjacent position; after de-distortion processing of the target panoramic image A, performing semantic analysis on it with a semantic segmentation model, identifying and segmenting the dynamic occluding targets, and generating an original pixel-level occlusion mask for each dynamic occluding target.
S2, cross-view feature matching and geometric alignment: extracting matching points between the target panoramic image A and the reference panoramic image B with a feature matching algorithm, selecting high-confidence matching points, calculating the homography matrix projecting the reference panoramic image B onto the target panoramic image A, and projection-resampling the reference panoramic image B with the homography matrix to achieve cross-view geometric alignment of the reference and target images. S3, adaptive perspective re-projection: taking the geometric center of the largest original pixel-level occlusion mask in the target panoramic image A as the visual anchor point, and re-projecting the target panoramic image and the aligned reference panoramic image into a target perspective view C and a reference perspective view D. S4, street-view three-dimensional reconstruction and occlusion removal: inputting the target perspective view C and the reference perspective view D into a three-dimensional reconstruction framework, projecting pixel coordinates to the three-dimensional world coordinate system through depth-map calculation, and removing the three-dimensional points corresponding to the occlusion mask according to the original pixel-level occlusion masks. S5, performing three-dimensional point cloud re-imaging with a depth inspection mechanism under the camera pose of the target perspective view C, and adopting a 3D-guided 2D projection strategy for the parts not filled by the point cloud.