CN-115294443-B - Orthographic image generation method and device and electronic equipment


Abstract

The invention provides an orthographic image generation method, an orthographic image generation device, and electronic equipment. The method comprises: extracting features of a current image, obtaining key points in the current image, and determining landmark points matched with the key points in the current image; and, if the orthographic property is greater than a threshold value, projecting the key frame image onto a ground plane, segmenting the projected key frame image into a plurality of image tiles, and fusing each image tile with the orthographic value tile of that image tile to generate an orthographic image of each image tile. The invention can generate high-quality orthographic images in real time, so that the position of a measured object can be accurately determined from the orthographic images.

Inventors

GAO GUANGZE
YUAN MENGKE
MENG WEILIANG
GUO JIANWEI
HE QUANBIN
XU SHIBIAO
ZHANG XIAOPENG

Assignees

中国科学院自动化研究所
北方天途航空技术发展（北京）有限公司

Dates

Publication Date: 20260508
Application Date: 20220427

Claims (9)

1. An orthographic image generation method, comprising: extracting features of a current image, obtaining key points in the current image, and determining landmark points matched with the key points in the current image; determining whether the current image is a key frame image based on the landmark points and, if so, optimizing local map information corresponding to the key frame image based on GPS errors and reprojection errors, and determining the orthographic property of the key frame image based on the optimized local map, wherein the local map information comprises all the key frame images, all the landmark points, and the similarity transformation from a visual coordinate system to a geographic coordinate system on the local map; if the orthographic property is greater than a threshold value, projecting the key frame image onto a ground plane, segmenting the projected key frame image into a plurality of image tiles, and fusing each image tile with the orthographic value tile of that image tile to generate an orthographic image of each image tile; wherein the local map information corresponding to the key frame image is optimized based on the following formulas: [objective function not reproduced in the source text]; $e_{i,j} = p_{i,j} - \pi(\exp(\xi_i) P_j)$; wherein $N$ represents all of the key frame images, $O_i$ represents all valid key points of key frame image $i$, $e_{i,j}$ represents the reprojection error, $p_{i,j}$ represents the key point in the $i$-th key frame image corresponding to the $j$-th landmark point, $P_j$ represents the position of the $j$-th landmark point in world coordinates, $\exp(\cdot)$ represents the transformation from Lie algebra form to matrix form, and $\pi(\cdot)$ represents the transformation from the camera coordinate system to the image coordinate system; the objective further uses the Huber function, an information matrix, a parameter that balances the reprojection error against the GPS error, and an operator that extracts the translation part of a transformation; and a function controls the tolerance to GPS errors: [formula not reproduced in the source text], defined in terms of the GPS error and a threshold on the GPS error.
2. The orthographic image generation method according to claim 1, wherein the determining landmark points matched with the key points in the current image comprises: determining the current camera initial pose corresponding to the current image based on camera parameters and a motion model; determining initial landmark points matched with the key points in the current image based on the current camera initial pose; and optimizing the current camera initial pose based on the matched key points and initial landmark points, and, after the optimization, removing from the initial landmark points those whose matching degree with the key points is smaller than a preset matching degree, to obtain the landmark points.
3. The orthographic image generation method according to claim 2, wherein the determining the current camera initial pose corresponding to the current image based on camera parameters and a motion model comprises: determining a camera translation vector in the camera parameters based on the similarity transformation of the current image from a visual coordinate system to a geographic coordinate system and on GPS information of the current image; determining a camera rotation matrix in the camera parameters based on the rigid body transformation representation of the motion model, the translation vectors of the current image and the previous image, and the rotation matrix of the previous image; and determining the current camera initial pose based on the camera translation vector and the camera rotation matrix.
4. The orthographic image generation method according to any one of claims 1 to 3, wherein the determining whether the current image is a key frame image based on the landmark points comprises: determining the current image to be the key frame image if the number of the landmark points is larger than a preset number, or if the geographic deviation between the landmark points of the current image and the landmark points of the previous image is larger than a preset value.
5. The orthographic image generation method according to any one of claims 1 to 3, wherein the fusing each image tile with the orthographic value tile of that image tile to generate an orthographic image of each image tile comprises: generating a corresponding Laplacian pyramid based on each image tile; generating a corresponding Gaussian pyramid based on the orthographic value tile; and fusing the Laplacian pyramid and the Gaussian pyramid layer by layer from a high layer to a low layer to generate an orthographic image of each image tile.
6. The orthographic image generation method according to claim 5, wherein the fusing the Laplacian pyramid and the Gaussian pyramid layer by layer from a high layer to a low layer to generate an orthographic image of each image tile comprises: if an overlapping area exists between the image tile and the orthographic value tile, taking, for each pixel of each layer of the Laplacian pyramid and the Gaussian pyramid, the orthographic value from the tile with the maximum orthographic value, and taking the pixel value from the image pixel corresponding to that maximum orthographic value tile; and recovering the orthographic image from the Laplacian pyramid based on the orthographic value and the pixel value of each pixel.
7. The orthographic image generation method according to any one of claims 1 to 3, wherein the orthographic value tile is used to characterize an orthographic value of a projected key frame image, the orthographic value of the projected key frame image being determined based on the following formula: [formula not reproduced in the source text]; wherein the formula is expressed in terms of the orthographic value of the projected key frame image, the altitude at which the camera captured the image, the angle between the camera viewing direction and the ground normal, the resolution of the orthographic image, half the diagonal pixel length of the key frame image, and the pixel distance from a pixel to the center of the image.
8. An orthographic image generating apparatus, comprising: a matching unit, configured to extract features of a current image, obtain key points in the current image, and determine landmark points matched with the key points in the current image; an optimizing unit, configured to determine whether the current image is a key frame image based on the landmark points and, if so, to optimize local map information corresponding to the key frame image based on GPS errors and reprojection errors, and to determine the orthographic property of the key frame image based on the optimized local map, wherein the local map information comprises all the key frame images, all the landmark points, and the similarity transformation from a visual coordinate system to a geographic coordinate system on the local map; and a generation unit, configured to, if the orthographic property is greater than a threshold value, project the key frame image onto a ground plane, segment the projected key frame image into a plurality of image tiles, and fuse each image tile with the orthographic value tile of that image tile to generate an orthographic image of each image tile; wherein the local map information corresponding to the key frame image is optimized based on the following formulas: [objective function not reproduced in the source text]; $e_{i,j} = p_{i,j} - \pi(\exp(\xi_i) P_j)$; wherein $N$ represents all of the key frame images, $O_i$ represents all valid key points of key frame image $i$, $e_{i,j}$ represents the reprojection error, $p_{i,j}$ represents the key point in the $i$-th key frame image corresponding to the $j$-th landmark point, $P_j$ represents the position of the $j$-th landmark point in world coordinates, $\exp(\cdot)$ represents the transformation from Lie algebra form to matrix form, and $\pi(\cdot)$ represents the transformation from the camera coordinate system to the image coordinate system; the objective further uses the Huber function, an information matrix, a parameter that balances the reprojection error against the GPS error, and an operator that extracts the translation part of a transformation; and a function controls the tolerance to GPS errors: [formula not reproduced in the source text], defined in terms of the GPS error and a threshold on the GPS error.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the orthographic image generation method of any one of claims 1 to 7.
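
The layer-by-layer fusion recited in claims 5 and 6 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the 2x2 box filter standing in for a Gaussian kernel, the pixel-replication upsampling, and the restriction to two single-channel tiles whose side length is divisible by 2^(levels-1) are all assumptions made for illustration.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (assumed stand-in for a Gaussian filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def upsample(img):
    """Double each dimension by pixel replication (assumed interpolation)."""
    return [[img[r // 2][c // 2] for c in range(2 * len(img[0]))]
            for r in range(2 * len(img))]


def gaussian_pyramid(img, levels):
    """Level 0 is the input; each further level is a downsampled copy of the previous one."""
    pyr = [[[float(v) for v in row] for row in img]]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr


def laplacian_pyramid(img, levels):
    """Each level stores the detail lost by downsampling; the last level is the coarsest image."""
    g = gaussian_pyramid(img, levels)
    lap = []
    for k in range(levels - 1):
        up = upsample(g[k + 1])
        lap.append([[a - b for a, b in zip(ra, rb)] for ra, rb in zip(g[k], up)])
    lap.append(g[-1])
    return lap


def fuse_tiles(img_a, score_a, img_b, score_b, levels=3):
    """Fuse two overlapping tiles: at every pyramid level and pixel, keep the
    Laplacian coefficient of the tile whose orthographic value is larger."""
    lap_a, lap_b = laplacian_pyramid(img_a, levels), laplacian_pyramid(img_b, levels)
    val_a, val_b = gaussian_pyramid(score_a, levels), gaussian_pyramid(score_b, levels)
    fused = []
    for k in range(levels):
        fused.append([[la if sa >= sb else lb
                       for la, lb, sa, sb in zip(ra, rb, rsa, rsb)]
                      for ra, rb, rsa, rsb in zip(lap_a[k], lap_b[k], val_a[k], val_b[k])])
    # Collapse from the highest (coarsest) layer down to the lowest (finest) layer.
    out = fused[-1]
    for k in range(levels - 2, -1, -1):
        up = upsample(out)
        out = [[u + l for u, l in zip(ru, rl)] for ru, rl in zip(up, fused[k])]
    return out
```

When one tile has the larger orthographic value everywhere, the fused result reduces to that tile, which is a convenient sanity check for the selection rule. A production version would more likely use a true Gaussian kernel and multi-channel images.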

Description

Orthographic image generation method and device and electronic equipment

Technical Field

The present invention relates to the field of computer vision, and in particular to an orthographic image generation method, an orthographic image generation device, and electronic equipment.

Background

Because an unmanned aerial vehicle has a limited field of view and a relatively low flying height, real-time generation of orthographic images from unmanned aerial vehicle aerial images is widely used. At present, orthographic images are generated by structure from motion (SfM) based on multi-view stereo geometry, but this method involves a large amount of computation, cannot be completed in real time, and has low efficiency. Alternatively, orthographic images are generated by simultaneous localization and mapping (SLAM), but this method can cause the camera pose to drift and the fusion quality to decrease, and cannot maintain good orthographic property, so that the position accuracy of a measured object in the generated orthographic image is low.

Disclosure of Invention

The invention provides an orthographic image generation method, an orthographic image generation device, and electronic equipment, which are used to overcome the low position accuracy of a measured object in orthographic images generated in the prior art.
The invention provides an orthographic image generation method, which comprises the following steps: extracting features of a current image, obtaining key points in the current image, and determining landmark points matched with the key points in the current image; determining whether the current image is a key frame image based on the landmark points and, if so, optimizing local map information corresponding to the key frame image based on GPS errors and reprojection errors, and determining the orthographic property of the key frame image based on the optimized local map, wherein the local map information comprises all the key frame images, all the landmark points, and the similarity transformation from a visual coordinate system to a geographic coordinate system on the local map; and, if the orthographic property is greater than a threshold value, projecting the key frame image onto a ground plane, segmenting the projected key frame image into a plurality of image tiles, and fusing each image tile with the orthographic value tile of that image tile to generate an orthographic image of each image tile. According to the orthographic image generation method provided by the invention, the determining landmark points matched with the key points in the current image comprises the following steps: determining the current camera initial pose corresponding to the current image based on camera parameters and a motion model; determining initial landmark points matched with the key points in the current image based on the current camera initial pose; and optimizing the current camera initial pose based on the matched key points and initial landmark points, and, after the optimization, removing from the initial landmark points those whose matching degree with the key points is smaller than a preset matching degree, to obtain the landmark points.
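The step of segmenting the projected key frame image into image tiles can be pictured with a small sketch. The text does not specify the tiling scheme, so the fixed square grid anchored at the origin of the ground plane, the function name `tiles_covering`, and the parameters `resolution` (metres per pixel) and `tile_px` (tile side in pixels) are all hypothetical choices for illustration only.

```python
import math


def tiles_covering(min_x, min_y, max_x, max_y, resolution, tile_px=256):
    """Return the (column, row) indices of the grid tiles overlapped by the
    ground-plane bounding box of a projected key frame image.

    Coordinates are in metres; each tile spans tile_px * resolution metres
    per side (assumed regular grid, not taken from the source text).
    """
    tile_m = tile_px * resolution
    c0, c1 = math.floor(min_x / tile_m), math.floor(max_x / tile_m)
    r0, r1 = math.floor(min_y / tile_m), math.floor(max_y / tile_m)
    return [(c, r) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# A footprint 30 m wide and 10 m tall at 0.1 m/pixel spans two 25.6 m tiles.
footprint_tiles = tiles_covering(0.0, 0.0, 30.0, 10.0, resolution=0.1)
```

Under this assumption each tile index identifies one image tile and its companion orthographic value tile, so later key frames that land on the same index are the ones that need to be fused.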
According to the orthographic image generation method provided by the invention, the determining the current camera initial pose corresponding to the current image based on camera parameters and a motion model comprises the following steps: determining a camera translation vector in the camera parameters based on the similarity transformation of the current image from a visual coordinate system to a geographic coordinate system and on GPS information of the current image; determining a camera rotation matrix in the camera parameters based on the rigid body transformation representation of the motion model, the translation vectors of the current image and the previous image, and the rotation matrix of the previous image; and determining the current camera initial pose based on the camera translation vector and the camera rotation matrix. According to the orthographic image generation method provided by the invention, the determining whether the current image is a key frame image based on the landmark points comprises the following step: determining the current image to be the key frame image if the number of the landmark points is larger than a preset number, or if the geographic deviation between the landmark points of the current image and the landmark points of the previous image is larger than a preset value. According to the orthographic image generation method provided by the invention, the local map information corresponding to the key frame image is optimized based on the following formula: $e_{i,j} = p_{i,j} - \pi(\exp(\xi_i) P_j)$, where $N$ represents all key frame images, $O_i$ represents all