CN-122023587-A - Multi-camera correction and boundary multi-scale seamless splicing method and system
Abstract
The invention relates to the technical field of image processing and discloses a multi-camera correction and boundary multi-scale seamless splicing method and system. The method comprises: obtaining a punctuation pair set between a video frame of each camera and a reference top-down base map; obtaining, by fitting a basis function model, a nonlinear geometric mapping function from the camera pixel coordinate system to the reference top-down base map coordinate system; resampling and correcting the video frame of each camera to the reference top-down base map coordinate system in real time by using the geometric mapping function; performing multi-scale feature alignment on the resulting top-view correction frames of the cameras in their overlapping areas to estimate a local refinement transformation; and, after applying the local refinement transformation to one of every two adjacent top-view correction frames, performing multi-scale frequency band fusion to generate a panoramic spliced video frame. The invention thus provides a multi-camera correction and boundary multi-scale seamless splicing method based on sparse punctuation constraints, which can generate panoramic images in real time.
Inventors
- FANG HANG
- LI HOUQIANG
- ZHOU WENGANG
- LI LI
Assignees
- 中国科学技术大学
Dates
- Publication Date: 20260512
- Application Date: 20260410
Claims (10)
- 1. A multi-camera correction and boundary multi-scale seamless splicing method, characterized by comprising the following steps: acquiring a punctuation pair set between a video frame of each deployed camera and a reference top-down base map; based on the punctuation pair set, obtaining a nonlinear geometric mapping function from a camera pixel coordinate system to a reference top-down base map coordinate system by fitting a basis function model; resampling and correcting the video frames of each camera to the reference top-down base map coordinate system in real time by using the geometric mapping function, so as to obtain a top-view correction frame for each camera; and performing multi-scale feature alignment on the top-view correction frames of the cameras in their overlapping areas to estimate a local refinement transformation, applying the local refinement transformation to one of every two adjacent top-view correction frames, and then performing multi-scale frequency band fusion to generate panoramic spliced video frames with consistent geometry and continuous brightness.
- 2. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 1, wherein the acquiring of the punctuation pair set between the video frame of each deployed camera and the reference top-down base map specifically comprises: a number of cameras are deployed, each camera providing an image frame at each moment; a reference top-down base map is provided, and its coordinate system is denoted the reference top-down base map coordinate system; a pixel point in a camera pixel coordinate system and a reference point in the reference top-down base map coordinate system are each a column vector (written as a transposed row vector) of an abscissa and an ordinate; each camera has a punctuation pair set whose pairs are indexed by k, running up to the total number of punctuation pairs of that camera; and the k-th punctuation pair consists of a camera pixel point, the corresponding reference point on the reference top-down base map, and a confidence weight for that pair.
- 3. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 2, wherein the obtaining, based on the punctuation pair set, of a nonlinear geometric mapping function from a camera pixel coordinate system to a reference top-down base map coordinate system by fitting a basis function model specifically comprises: normalizing the camera pixel coordinates and the reference top-down base map coordinates; representing the geometric mapping function by using a basis function vector containing polynomial terms; and solving the parameters of the basis function vector by minimizing an objective function that combines a robust loss function and a regularization term.
- 4. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 3, wherein the representing of the geometric mapping function by using a basis function vector containing polynomial terms specifically comprises: the basis function vector contains polynomial terms, up to a specified order, of the normalized abscissa and ordinate of the camera pixel point; and the geometric mapping function is written in terms of the basis function vector and the parameters of the basis function vector, which are to be estimated.
- 5. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 4, wherein the solving of the parameters of the basis function vector by minimizing an objective function that combines a robust loss function and a regularization term specifically comprises: the objective function is formed from a robust loss function, a regularization coefficient, and an F2 norm, and is evaluated on the normalized abscissa and ordinate of the camera pixel point in each punctuation pair; the objective function is solved by an iteratively reweighted least squares method; and a model order adaptive mechanism adaptively selects the order of the polynomial terms of the basis function vector according to the number and distribution of the punctuation pairs.
- 6. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 1, wherein the method for correcting the video frames of each camera to a reference top-view base map coordinate system by utilizing a geometric mapping function to obtain top-view correction frames of each camera specifically comprises the following steps: based on the geometric mapping function or the inverse of the geometric mapping function, a dense mapping lookup table from the reference top-down base map coordinates to camera pixel coordinates is pre-computed, and the top-down correction frame is generated by querying the dense mapping lookup table and performing an interpolation operation.
- 7. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 6, wherein the pre-computing, based on the geometric mapping function or the inverse of the geometric mapping function, of a dense mapping lookup table from reference top-down base map coordinates to camera pixel coordinates, and the generating of the top-view correction frame by querying the dense mapping lookup table and performing an interpolation operation, specifically comprises: defining a discrete grid on the reference top-down base map coordinate system corresponding to the reference top-down base map; defining an inverse sampling function that takes the abscissa and ordinate of a reference point in the reference top-down base map coordinate system to the camera pixel coordinate system of a given camera, based on the geometric mapping function corresponding to that camera; constructing a dense deformation field for each camera, which stores, for each grid point, the continuous sampling abscissa and the continuous sampling ordinate in that camera's image frame at the current moment, this dense deformation field being the dense mapping lookup table; and obtaining the top-view correction frame of the camera by querying the dense mapping lookup table and interpolating the image frame.
- 8. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 1, wherein the performing of multi-scale feature alignment on the top-view correction frames of the cameras in the overlapping area to estimate the local refinement transformation specifically comprises: constructing a multi-scale feature pyramid for each top-view correction frame; and sequentially optimizing the parameters of the local refinement transformation over a plurality of scales of the feature pyramid, from coarse to fine, wherein the local refinement transformation is a translation model, an affine model, a homography model, or a segmented local model.
- 9. The multi-camera correction and boundary multi-scale seamless splicing method according to claim 1, wherein the applying of the local refinement transformation to one of every two adjacent top-view correction frames, followed by multi-scale frequency band fusion to generate a panoramic spliced video frame with consistent geometry and continuous brightness, specifically comprises: for two adjacent top-view correction frames, applying the local refinement transformation to one of them to obtain a refined top-view correction frame; constructing a weight map; constructing a Laplacian pyramid for each of the two frames being fused, and a Gaussian pyramid for the weight map; performing weighted fusion on each pyramid layer, combining the corresponding layer images of the two Laplacian pyramids according to the corresponding layer image of the weight-map Gaussian pyramid, to obtain each layer image of the fused Laplacian pyramid; and reconstructing the fused Laplacian pyramid to obtain the final panoramic spliced video frame.
- 10. A computer system comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 9 when the computer program is executed.
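The fitting step of claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes coordinates already normalized to about [-1, 1], a Huber-type robust loss, and a ridge penalty on the parameters. The names `poly_basis` and `fit_mapping`, the threshold `delta`, and the coefficient `lam` are hypothetical, and the model order adaptive mechanism is omitted.

```python
import numpy as np

def poly_basis(uv, order):
    """Monomials u^a * v^b with a + b <= order for normalized (u, v) rows."""
    u, v = uv[:, 0], uv[:, 1]
    cols = [u**a * v**b for a in range(order + 1) for b in range(order + 1 - a)]
    return np.stack(cols, axis=1)

def fit_mapping(uv, xy, conf, order=2, lam=1e-6, delta=0.05, iters=30):
    """Iteratively reweighted least squares (assumed Huber weights, ridge term)."""
    phi = poly_basis(uv, order)                    # (K, M) design matrix
    w = np.asarray(conf, dtype=float).copy()       # start from confidence weights
    ridge = lam * np.eye(phi.shape[1])
    for _ in range(iters):
        gram = phi.T @ (w[:, None] * phi) + ridge
        params = np.linalg.solve(gram, phi.T @ (w[:, None] * xy))
        r = np.linalg.norm(phi @ params - xy, axis=1)   # residual per pair
        # Huber weighting: full weight for small residuals, delta / r otherwise.
        w = conf * np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
    return params                                  # (M, 2): one column per axis

# Synthetic check: a quadratic ground-truth mapping with one gross outlier.
rng = np.random.default_rng(0)
uv = rng.uniform(-1.0, 1.0, size=(40, 2))
true_params = rng.normal(size=(6, 2))
xy = poly_basis(uv, 2) @ true_params
xy[0] += 5.0                                       # corrupt the first pair
conf = np.ones(len(uv))
params = fit_mapping(uv, xy, conf)
err = np.abs(poly_basis(uv[1:], 2) @ params - xy[1:]).max()
print("max error on clean pairs:", err)
```

The reweighting bounds the pull of the corrupted pair, so the fitted mapping stays close to the ground truth on the clean pairs; an ordinary least squares fit would spread the outlier's error over the whole frame.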
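The lookup-table resampling of claims 6 and 7 can be sketched in the same spirit. The example below assumes a single-channel frame, bilinear interpolation, and an inverse sampling function supplied as a plain Python callable; `build_lut`, `remap_bilinear`, and the toy `inverse_map` are hypothetical names introduced only for illustration.

```python
import numpy as np

def build_lut(inverse_map, height, width):
    """Evaluate the inverse sampling function once on the base-map grid."""
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    u, v = inverse_map(xs, ys)        # continuous camera coordinates per cell
    return u, v

def remap_bilinear(frame, u, v):
    """Sample a grayscale frame at (u, v); cells outside the frame become 0."""
    h, w = frame.shape
    valid = (u >= 0) & (u <= w - 1) & (v >= 0) & (v <= h - 1)
    uc, vc = np.clip(u, 0, w - 1), np.clip(v, 0, h - 1)
    u0 = np.minimum(np.floor(uc).astype(int), w - 2)
    v0 = np.minimum(np.floor(vc).astype(int), h - 2)
    du, dv = uc - u0, vc - v0
    out = (frame[v0, u0] * (1 - du) * (1 - dv)
           + frame[v0, u0 + 1] * du * (1 - dv)
           + frame[v0 + 1, u0] * (1 - du) * dv
           + frame[v0 + 1, u0 + 1] * du * dv)
    return np.where(valid, out, 0.0)

# Toy inverse mapping (an assumption): scale by 0.5 and shift u by 0.25.
def inverse_map(x, y):
    return 0.5 * x + 0.25, 0.5 * y

frame = np.add.outer(10.0 * np.arange(6), np.arange(8.0))  # frame[v, u] = 10v + u
u, v = build_lut(inverse_map, height=4, width=5)           # computed once
rectified = remap_bilinear(frame, u, v)                    # repeated per frame
```

Because the table depends only on the fitted mapping and not on the frame content, it is built once and reused for every frame, which is what makes per-frame correction cheap enough for real-time use.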
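The multi-scale frequency band fusion of claim 9 follows the general pattern of Laplacian pyramid blending, which can be sketched as below. This is a simplified illustration under stated assumptions: 2x2 box averaging stands in for Gaussian smoothing, pixel replication stands in for interpolated upsampling, image sides must be divisible by 2 to the power of `levels`, and the convention that the weight map multiplies the first frame is an assumption rather than something the source states.

```python
import numpy as np

def down(img):
    """Halve resolution by 2x2 averaging (stand-in for Gaussian blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):
    """Double resolution by pixel replication."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels):
        pyr.append(down(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    # Band-pass layers plus the coarsest low-pass image as the last entry.
    return [g[l] - up(g[l + 1]) for l in range(levels)] + [g[-1]]

def blend(a, b, weight, levels=3):
    """Fuse each layer as w * A + (1 - w) * B, then collapse the pyramid."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gw = gaussian_pyramid(weight, levels)
    fused = [w * x + (1 - w) * y for w, x, y in zip(gw, la, lb)]
    out = fused[-1]
    for layer in reversed(fused[:-1]):
        out = up(out) + layer
    return out

rng = np.random.default_rng(0)
a = rng.random((16, 16))          # stands in for one top-view correction frame
b = rng.random((16, 16))          # stands in for the refined adjacent frame
weight = np.zeros((16, 16))
weight[:, :8] = 1.0               # left half taken from a, right half from b
pano = blend(a, b, weight)
```

Blending per frequency band lets low frequencies (exposure differences) mix over a wide region while high frequencies (edges, texture) switch over a narrow one, which is why the seam is less visible than with a single-scale weighted average.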
Description
Multi-camera correction and boundary multi-scale seamless splicing method and system
Technical Field
The invention relates to the technical field of image processing, and in particular to a multi-camera correction and boundary multi-scale seamless splicing method and system.
Background
With the large-scale deployment of security systems, multi-source surveillance video has become a high-density spatio-temporal data stream. In practice, however, each camera's picture lives in its own independent pixel coordinate system and lacks a unified spatial reference and a continuous global viewpoint. Cross-camera target positioning, event backtracking, and coordinated response therefore depend heavily on manual experience and memory of camera locations, and an intuitive, holistic understanding of the situation is difficult to form. The application focuses on the following core problem: given a reference top-down base map (or orthographic image) and multiple wide-angle video streams, learn a dense geometric mapping from each stream's pixel coordinates to a unified top-down coordinate system, and, in that unified coordinate system, fuse the multiple pictures in real time so that the output is geometrically consistent, continuous in brightness, and temporally stable.
This problem poses three challenges. First, wide-angle distortion and viewpoint differences make the mapping markedly nonlinear, and in complex scenes traditional homography or calibration models struggle to achieve both accuracy and generalization. Second, the supervision usually comes from a small number of sparse, noisy corresponding points, so robust parameter estimation that generalizes to a full-frame dense mapping is central to splicing accuracy. Third, the overlapping areas of the video streams exhibit exposure differences, occlusion by moving targets, and boundary artifacts; whether multi-scale seamless fusion can be completed and temporal flicker suppressed under real-time constraints directly affects the "what you see is what you get" usability.
Disclosure of Invention
To solve the above technical problems, the invention provides a multi-camera correction and boundary multi-scale seamless splicing method and system, and adopts the following technical scheme.
In a first aspect, the invention provides a multi-camera correction and boundary multi-scale seamless splicing method, comprising: acquiring a punctuation pair set between a video frame of each deployed camera and a reference top-down base map; based on the punctuation pair set, obtaining a nonlinear geometric mapping function from a camera pixel coordinate system to a reference top-down base map coordinate system by fitting a basis function model; resampling and correcting the video frames of each camera to the reference top-down base map coordinate system in real time by using the geometric mapping function, so as to obtain a top-view correction frame for each camera; and performing multi-scale feature alignment on the top-view correction frames of the cameras in their overlapping areas to estimate a local refinement transformation, applying the local refinement transformation to one of every two adjacent top-view correction frames, and then performing multi-scale frequency band fusion to generate panoramic spliced video frames with consistent geometry and continuous brightness.
In one embodiment, the acquiring of the punctuation pair set between the video frame of each camera and the reference top-down base map specifically includes: a number of cameras are deployed, each camera providing an image frame at each moment; a reference top-down base map is provided, and its coordinate system is denoted the reference top-down base map coordinate system; a pixel point in a camera pixel coordinate system and a reference point in the reference top-down base map coordinate system are each a column vector (written as a transposed row vector) of an abscissa and an ordinate; each camera has a punctuation pair set whose pairs are indexed by k, running up to the total number of punctuation pairs of that camera; and the k-th punctuation pair consists of a camera pixel point, the corresponding reference point on the reference top-down base map, and a confidence weight for that pair.
In one embodiment, the fitting of a basis function model based on the punctuation pair set to obtain a nonlinear geometric mapping function from a camera pixel coordinate system to a reference top-down base map coordinate system specifically includes: normalizing the pixel co