
CN-116266356-B - Panoramic video transition rendering method and device and computer equipment


Abstract

The application relates to a panoramic video transition rendering method, apparatus, computer device, storage medium, and computer program product. The method performs feature matching on the non-occluded image areas of every two adjacent frames of at least one image group in an image group set to obtain matching point pairs between those areas, optimizes the translation between each pair of adjacent frames according to the matching point pairs, and determines the advancing direction of a virtual camera at the moment corresponding to each original video frame from the optimized translations. In addition, a histogram statistical analysis is applied to the optimized translations between adjacent frames, which effectively mitigates the poor estimation of the advancing direction in weakly textured and nearly static scenes.
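The patent publishes no source code, but the histogram statistical analysis mentioned above can be illustrated with a minimal sketch: vote each inter-frame translation into its nearest preset direction bin, then take the most frequent bin per sub-sequence as the main direction. Function and parameter names (`dominant_direction`, `bin_axes`, `window`) are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def dominant_direction(translations, bin_axes, window=30):
    """Vote each inter-frame translation into its nearest preset direction
    bin, then take the most frequent bin per fixed-length sub-sequence."""
    dirs = []
    for t in translations:
        n = np.linalg.norm(t)
        if n < 1e-6:          # near-zero motion casts no vote
            dirs.append(None)
            continue
        v = t / n
        # Nearest preset axis = smallest angle = largest dot product.
        dirs.append(int(np.argmax(bin_axes @ v)))
    majors = []
    for start in range(0, len(dirs), window):
        chunk = [d for d in dirs[start:start + window] if d is not None]
        if not chunk:
            majors.append(None)
            continue
        counts = np.bincount(chunk, minlength=len(bin_axes))
        majors.append(int(np.argmax(counts)))
    return majors
```

Because the majority vote is taken over a window rather than per frame, a few mismatched translations in a weakly textured or nearly static scene cannot flip the estimated direction, which is the robustness property the abstract claims.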

Inventors

  • YUAN WENLIANG
  • CHEN CONG
  • XIE LIANG
  • JIANG WENJIE

Assignees

  • 影石创新科技股份有限公司

Dates

Publication Date
2026-05-05
Application Date
2021-12-17

Claims (13)

  1. A panoramic video transition rendering method, the method comprising: acquiring an image group set, wherein each image group in the image group set is obtained by extracting frames from an original video, and each original video frame in the original video is synthesized from multiple fisheye images; determining a plurality of occlusion area masks according to at least one image group in the image group set, and determining the non-occluded image area of each frame of image in the image group set according to the occlusion area masks; performing feature matching on the non-occluded image areas of every two adjacent frames of at least one image group in the image group set to obtain matching point pairs between the non-occluded image areas of the two adjacent frames; optimizing the translation between every two adjacent frames according to the matching point pairs between their non-occluded image areas, and determining the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the optimized translations between adjacent frames, wherein the translation of the virtual camera at the moment corresponding to each original video frame is obtained by interpolating the translations between adjacent frames and is taken as the advancing direction of the virtual camera at that moment; calculating a rotation matrix of the virtual camera at the moment corresponding to each original video frame according to the advancing direction of the virtual camera at that moment; and performing transition rendering on each original video frame according to the rotation matrix of the virtual camera at the moment corresponding to the frame and the rotation of the panoramic camera relative to the world coordinate system when the frame was shot.
  2. The method of claim 1, wherein determining a plurality of occlusion area masks according to at least one image group in the image group set comprises: partitioning the multiple fisheye images corresponding to each frame of image of at least one image group in the image group set to obtain a set of partitioned areas corresponding to the at least one image group; determining a maximum gray mean value among the gray mean values of the partitioned areas, and calculating the difference between the gray mean value of each partitioned area and the maximum gray mean value; taking each partitioned area whose difference exceeds a preset threshold as a target area, the target area being an occluded area in its corresponding fisheye image; and determining the plurality of occlusion area masks according to the occluded areas in the fisheye images corresponding to each frame of image.
  3. The method of claim 1, wherein performing feature matching on the non-occluded image areas of every two adjacent frames of at least one image group in the image group set to obtain matching point pairs between the non-occluded image areas comprises: extracting features from the non-occluded image area of each frame of image of at least one image group in the image group set to obtain feature points in each such area; and matching the feature points in the non-occluded image areas of every two adjacent frames of the at least one image group to obtain the matching point pairs corresponding to every two adjacent frames.
  4. The method of claim 3, wherein extracting features from the non-occluded image area of each frame of image comprises: for any non-occluded image area, taking that area as the current image area and extracting feature points in the current image area such that the extraction result satisfies a preset condition, the preset condition being that every two adjacent feature points are equidistant, or that the ratio of the area enclosed by all extracted feature points to the area of the current image area is larger than a preset threshold.
  5. The method of claim 3, further comprising, before matching the feature points in the non-occluded image areas of every two adjacent frames: screening the feature points in the non-occluded image area of each frame of image of the at least one image group based on a random sampling algorithm.
  6. The method of claim 1, wherein determining the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the optimized translations between adjacent frames comprises: determining the preset direction vector of every two adjacent frames in the camera coordinate system according to the translations before and after optimization; determining the real main direction corresponding to each sub-direction sequence according to the preset direction vectors of the adjacent frames in the camera coordinate system; and determining the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the real main direction corresponding to each sub-direction sequence.
  7. The method of claim 6, wherein determining the preset direction vector of every two adjacent frames in the camera coordinate system according to the translations before and after optimization comprises: weighting the translation before optimization and the translation after optimization between every two adjacent frames to obtain an integrated translation between the two frames; converting the integrated translation between every two adjacent frames into the camera coordinate system to obtain a direction vector of the two frames in the camera coordinate system; and calculating the angle between the direction vector of every two adjacent frames in the camera coordinate system and each preset direction vector, and determining, from the corresponding angles, the preset direction vector into which the direction vector of the two adjacent frames falls.
  8. The method of claim 7, wherein determining the real main direction corresponding to each sub-direction sequence according to the preset direction vectors of the adjacent frames in the camera coordinate system comprises: segmenting the direction vector sequence according to its time order to obtain a plurality of sub-direction sequences; counting, from the preset direction vector into which the direction vector of every two adjacent frames falls, the number of times each sub-direction sequence falls on each preset direction vector, and taking the preset direction vector with the largest count in each sub-direction sequence as the main direction corresponding to that sub-direction sequence; and if the total count of the main direction corresponding to a sub-direction sequence is within a preset range, taking that main direction as the real main direction corresponding to the sub-direction sequence.
  9. The method of claim 6, wherein determining the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the real main direction corresponding to each sub-direction sequence comprises: smoothing and interpolating the real main directions corresponding to the sub-direction sequences to obtain a corrected direction vector at the moment corresponding to each original video frame; converting the corrected direction vector at each moment into the world coordinate system to obtain the direction vector at the moment corresponding to each original video frame; and taking that direction vector as the advancing direction of the virtual camera at the moment corresponding to each original video frame.
  10. A panoramic video transition rendering device, the device comprising: an acquisition module configured to acquire an image group set, wherein each image group in the image group set is obtained by extracting frames from an original video, and each original video frame in the original video is synthesized from multiple fisheye images; a first determining module configured to determine a plurality of occlusion area masks according to at least one image group in the image group set, and to determine the non-occluded image area of each frame of image in the image group set according to the occlusion area masks; a second determining module configured to perform feature matching on the non-occluded image areas of every two adjacent frames of at least one image group in the image group set to obtain matching point pairs between the non-occluded image areas of the two adjacent frames; a third determining module configured to optimize the translation between every two adjacent frames according to the matching point pairs between their non-occluded image areas, and to determine the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the optimized translations, wherein the translation of the virtual camera at the moment corresponding to each original video frame is obtained by interpolating the translations between adjacent frames and is taken as the advancing direction of the virtual camera at that moment; a calculation module configured to calculate a rotation matrix of the virtual camera at the moment corresponding to each original video frame according to the advancing direction of the virtual camera at that moment; and a rendering module configured to perform transition rendering on each original video frame according to the rotation matrix of the virtual camera at the moment corresponding to the frame and the rotation of the panoramic camera relative to the world coordinate system when the frame was shot.
  11. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
  12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 9.
  13. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 9.
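The occlusion-mask step of claim 2 compares each block's gray mean against the brightest block, on the reasoning that a hand or mount pressed close to a fisheye lens appears as a persistently dark region. A minimal Python sketch of that comparison follows; the grid size and threshold are illustrative assumptions, since the patent only specifies "a preset threshold".

```python
import numpy as np

def occlusion_mask(gray, grid=(8, 8), thresh=60.0):
    """Split a grayscale fisheye frame into blocks and flag blocks whose
    mean gray value is far below the brightest block's mean as occluded
    (e.g. by a hand or vehicle-mounted bracket)."""
    h, w = gray.shape
    bh, bw = h // grid[0], w // grid[1]
    means = np.zeros(grid)
    for i in range(grid[0]):
        for j in range(grid[1]):
            means[i, j] = gray[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean()
    max_mean = means.max()
    mask = np.ones_like(gray, dtype=bool)  # True = usable (not occluded)
    for i in range(grid[0]):
        for j in range(grid[1]):
            if max_mean - means[i, j] > thresh:
                mask[i*bh:(i+1)*bh, j*bw:(j+1)*bw] = False
    return mask
```

Feature extraction and matching (claims 3 to 5) would then be restricted to pixels where the mask is `True`, which is how the method avoids the mismatches described in the background.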
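Claims 1 and 9 compute a rotation matrix for the virtual camera from its advancing direction but do not specify the construction. One standard way to build such a matrix, shown here purely as an assumed sketch, is a look-at style construction: take the advancing direction as the camera's forward axis and derive the right and up axes from a world up vector.

```python
import numpy as np

def camera_rotation(forward, world_up=np.array([0.0, 0.0, 1.0])):
    """Build an orthonormal rotation matrix whose third column is the
    advancing (forward) direction; right and up columns are derived
    from a world up vector (look-at construction, assumed here)."""
    f = forward / np.linalg.norm(forward)
    r = np.cross(world_up, f)
    n = np.linalg.norm(r)
    if n < 1e-6:  # forward parallel to up: fall back to the x axis
        r = np.array([1.0, 0.0, 0.0])
    else:
        r = r / n
    u = np.cross(f, r)
    return np.stack([r, u, f], axis=1)  # columns: right, up, forward
```

Composing this matrix with the panoramic camera's rotation relative to the world coordinate system at shooting time (the final step of claim 1) re-orients each rendered frame so the virtual camera faces the direction of travel.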

Description

Panoramic video transition rendering method and device and computer equipment

Technical Field

The present application relates to the field of panoramic video processing technology, and in particular to a panoramic video transition rendering method, apparatus, computer device, storage medium, and computer program product.

Background

In the related art, when a panoramic video is rendered, either the fixed absolute view angle of the panoramic camera or a smoothly rotated view angle of the panoramic camera is generally used as the view angle of the virtual camera. However, a fixed absolute view angle may not be the view the user is directly interested in, and with a smoothly rotated view angle, most of the views may likewise not interest the user. In practice, the user generally pays more attention to what is in front of the panoramic camera, so the panoramic video rendering methods in the related art cannot meet users' requirements.
To solve these problems, a patent entitled "An automatic view angle adjusting panoramic video rendering method" discloses a method that: acquires the rotation of the panoramic camera relative to the world coordinate system when shooting the current video frame, together with the multiple fisheye images corresponding to the current and previous video frames of the panoramic video; extracts corner points from the fisheye images corresponding to the previous video frame to obtain corner sequences to be tracked; tracks those corner sequences to obtain matching point pairs between the fisheye images of the current and previous video frames; optimizes the displacement of the panoramic camera for the current frame relative to the previous frame according to the matching point pairs, and takes the optimized displacement as the advancing direction of the virtual camera; calculates the rotation matrix of the current virtual camera; and performs a transition on the current video frame using the rotation of the panoramic camera relative to the world coordinate system when shooting the current frame and the rotation of the current virtual camera.
Although this addresses the user's requirement, in actual shooting the panoramic camera is typically held in a person's hands or mounted on a motorcycle or car. Because the panoramic camera uses fisheye lenses with a 360-degree shooting range, part of the lens view is unavoidably blocked during shooting by the person's hand or by the vehicle-mounted camera bracket, which causes mismatches during feature matching and makes the advancing direction of the virtual camera inaccurate.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a panoramic video transition rendering method, apparatus, computer device, storage medium, and computer program product that can improve the accuracy of the advancing direction of a virtual camera.

In a first aspect, the application provides a panoramic video transition rendering method. The method comprises the following steps: acquiring an image group set, wherein each image group in the image group set is obtained by extracting frames from an original video, and each original video frame in the original video is synthesized from multiple fisheye images; determining a plurality of occlusion area masks according to at least one image group in the image group set, and determining the non-occluded image area of each frame of image in the image group set according to the occlusion area masks; performing feature matching on the non-occluded image areas of every two adjacent frames of at least one image group in the image group set to obtain matching point pairs between the non-occluded image areas of the two adjacent frames; optimizing the translation between every two adjacent frames according to the matching point pairs between their non-occluded image areas, and determining the advancing direction of the virtual camera at the moment corresponding to each original video frame according to the optimized translations between adjacent frames; calculating a rotation matrix of the virtual camera at the moment corresponding to each original video frame according to the advancing direction of the virtual camera at that moment; and performing transition rendering on each original video frame according to the rotation matrix of the virtual camera at the moment corresponding to the frame and the rotation of the panoramic camera relative to the world coordinate system when the frame was shot.