
CN-119835536-B - Video frame stitching method, apparatus, device and storage medium

CN119835536B

Abstract

The embodiment of the present application provides a video frame stitching method, apparatus, device, and storage medium. The method acquires video frames to be stitched that are captured by a camera, identifies image feature points in those frames, matches the spatial feature points corresponding to the image feature points, screens the feature point pairs formed by the image feature points and the spatial feature points to obtain target feature point pairs, calculates extrinsic parameters of the camera from the target feature point pairs, and stitches the video frames according to the camera extrinsics to obtain stitched target video frames. By identifying the image feature points of the frames to be stitched, matching the corresponding spatial feature points, and screening the resulting feature point pairs, the gross noise points produced when the matched feature points are extracted are filtered out. This improves the accuracy of the camera extrinsic calculation in harsh environments and thereby ensures the success rate and accuracy of the subsequent video frame stitching.

Inventors

  • MENG QINGCHUN

Assignees

  • 中电信人工智能科技(北京)有限公司 (China Telecom Artificial Intelligence Technology (Beijing) Co., Ltd.)

Dates

Publication Date
2026-05-12
Application Date
2024-12-12

Claims (8)

  1. A video frame stitching method, the method comprising: acquiring video frames to be stitched that are captured by a camera; identifying image feature points from the video frames to be stitched; matching, according to the image feature points, the spatial feature points corresponding to the image feature points, wherein the spatial feature points are obtained by measuring in advance the spatial coordinates of the real objects corresponding to the picture features; screening feature point pairs formed by the image feature points and the spatial feature points to obtain target feature point pairs; calculating extrinsic parameters of the camera according to the target feature point pairs; and stitching the video frames to be stitched according to the extrinsic parameters of the camera to obtain a stitched target video frame; wherein identifying the image feature points from the video frames to be stitched comprises: performing de-distortion processing on the video frames to be stitched to obtain de-distorted video frames; identifying picture features in the de-distorted video frames by using a recognition model, wherein the recognition model is obtained by training on training images containing the picture features; and obtaining the image feature points according to the picture features; wherein obtaining the image feature points according to the picture features comprises: cropping the de-distorted video frames according to the picture features to obtain feature patches containing the picture features; and extracting feature corner points of the feature patches as the image feature points; wherein matching the spatial feature points corresponding to the image feature points according to the image feature points comprises: acquiring the spatial feature points corresponding to the picture features; acquiring a reference image containing reference feature points, wherein a first pairing relation exists between the coordinate system of the reference image and a spatial coordinate system, and the spatial coordinate system is the coordinate system in which the spatial feature points are located; matching the feature corner points with the reference feature points to obtain a second pairing relation between the feature corner points and the reference feature points; and matching the spatial feature points according to the first pairing relation and the second pairing relation.
  2. The method according to claim 1, wherein screening the feature point pairs formed by the image feature points and the spatial feature points to obtain the target feature point pairs comprises: setting a number of iterations; and filtering noise point pairs from the feature point pairs by using a random sample consensus (RANSAC) algorithm according to the number of iterations, to obtain the target feature point pairs.
  3. The method according to claim 1, wherein after calculating the extrinsic parameters of the camera from the target feature point pairs, the method further comprises: classifying the target feature point pairs to obtain first feature point pairs, second feature point pairs, and third feature point pairs; calculating a first reprojection error according to the first, second, and third feature point pairs; if the first reprojection error is larger than a preset error threshold, calculating a second reprojection error according to the first and second feature point pairs; if the second reprojection error is larger than the preset error threshold, calculating a third reprojection error according to the first feature point pairs; and if the third reprojection error is larger than the preset error threshold, replacing the extrinsic parameters of the camera with the extrinsic parameters calculated from the previous video frame.
  4. The method according to claim 1, wherein the extrinsic parameters of the camera include a rotation matrix and a translation matrix, and stitching the video frames to be stitched according to the extrinsic parameters of the camera to obtain the stitched target video frame comprises: performing an affine transformation on the de-distorted video frames by using the rotation matrix and the translation matrix to obtain transformed video frames; obtaining an overlapping area of the transformed video frames according to the translation matrix; and stitching the transformed video frames according to the overlapping area to obtain the stitched target video frame.
  5. The method according to claim 1, applied to a coal mine monitoring system, wherein the video frames to be stitched contain a coal mining machine, the coal mining machine comprises a machine body structure and an identification card on the machine body, the identification card carries a unique number, and identifying the image feature points from the video frames to be stitched comprises: identifying the machine body structure and the number of the identification card from the video frames to be stitched; acquiring the position coordinates of the machine body structure and of the identification card in the video frames to be stitched; and identifying the image feature points from the video frames to be stitched according to the position coordinates.
  6. A video frame stitching apparatus, the apparatus comprising: a video frame acquisition module, configured to acquire video frames to be stitched that are captured by a camera; an image feature point identification module, configured to identify image feature points from the video frames to be stitched; a spatial feature point matching module, configured to match, according to the image feature points, the spatial feature points corresponding to the image feature points, wherein the spatial feature points are obtained by measuring in advance the spatial coordinates of the real objects corresponding to the picture features; a feature point pair screening module, configured to screen feature point pairs formed by the image feature points and the spatial feature points to obtain target feature point pairs; an extrinsic parameter calculation module, configured to calculate extrinsic parameters of the camera according to the target feature point pairs; and a video frame stitching module, configured to stitch the video frames to be stitched according to the extrinsic parameters of the camera to obtain a stitched target video frame; wherein the image feature point identification module comprises: a de-distortion processing sub-module, configured to perform de-distortion processing on the video frames to be stitched to obtain de-distorted video frames; a recognition model processing sub-module, configured to identify picture features in the de-distorted video frames by using a recognition model, wherein the recognition model is obtained by training on training images containing the picture features; and an image feature point extraction sub-module, configured to obtain the image feature points according to the picture features, the image feature point extraction sub-module being specifically configured to crop the de-distorted video frames according to the picture features to obtain feature patches containing the picture features, and to extract feature corner points of the feature patches as the image feature points; wherein the spatial feature point matching module comprises: a spatial feature point acquisition sub-module, configured to acquire the spatial feature points corresponding to the picture features; a reference image acquisition sub-module, configured to acquire a reference image containing reference feature points, wherein a first pairing relation exists between the coordinate system of the reference image and a spatial coordinate system, and the spatial coordinate system is the coordinate system in which the spatial feature points are located; a feature corner point matching sub-module, configured to match the feature corner points with the reference feature points to obtain a second pairing relation between the feature corner points and the reference feature points; and a spatial feature point pairing sub-module, configured to match the spatial feature points according to the first pairing relation and the second pairing relation.
  7. An electronic device, comprising: a processor; and a memory having executable code stored thereon which, when executed, causes the processor to perform the video frame stitching method of any one of claims 1 to 5.
  8. One or more machine-readable media having executable code stored thereon which, when executed, causes a processor to perform the video frame stitching method of any one of claims 1 to 5.
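The RANSAC screening of claim 2 can be sketched as follows. This is a minimal, self-contained illustration rather than the patent's implementation: for brevity it fits a pure 2D translation between matched points (a real stitcher would estimate a homography or camera pose), and the iteration count, inlier tolerance, and sample data are all assumptions made for the sketch.

```python
import random

def ransac_filter_pairs(pairs, iterations=100, inlier_tol=3.0, seed=0):
    """Filter noisy (image_pt, reference_pt) correspondences with a minimal
    RANSAC loop. The model is a pure 2D translation, chosen only to keep the
    sketch short; the patent leaves the model to the implementation."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one pair suffices to hypothesise a translation.
        (ix, iy), (rx, ry) = rng.choice(pairs)
        dx, dy = rx - ix, ry - iy
        # Count the pairs consistent with the hypothesised translation.
        inliers = [
            p for p in pairs
            if abs(p[1][0] - p[0][0] - dx) <= inlier_tol
            and abs(p[1][1] - p[0][1] - dy) <= inlier_tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Four consistent pairs shifted by (10, 5) plus one gross noise pair.
pairs = [((0, 0), (10, 5)), ((1, 2), (11, 7)),
         ((3, 1), (13, 6)), ((5, 5), (15, 10)),
         ((2, 2), (90, 90))]  # noise pair to be screened out
good = ransac_filter_pairs(pairs)
print(len(good))  # 4 — only the consistent target feature point pairs remain
```

Raising the iteration count increases the chance that at least one sampled pair is noise-free, which is why claim 2 makes the number of iterations an explicit, settable parameter.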

Description

Video frame stitching method, apparatus, device and storage medium

Technical Field

The present application relates to the field of video processing technologies, and in particular to a video frame stitching method and apparatus, an electronic device, and a storage medium.

Background

Achieving production without manual supervision in dangerous environments such as mines is an important development direction in the mining field. By introducing advanced automation and artificial intelligence, the working state of the coal mining machine can be monitored in real time, the personal injury risk faced by miners in the mine is significantly reduced, working conditions are improved, and manpower is saved. In recent years, monocular cameras have received increasing attention because of their ease of installation and low cost. Monocular-camera video stitching captures views of the equipment from different angles with a multi-camera system and automatically stitches them into one or more wide-field or panoramic views through dedicated image algorithms. This not only reduces equipment cost markedly but also suits more flexible application scenarios. Although video stitching based on monocular cameras has made progress in the related arts, challenges remain in real-time performance, stitching accuracy, and coping with complex environmental changes. In coal mine monitoring, the coal dust raised while the coal mining machine is working interferes with the camera's view, so more blind spots appear when the feature points of the machine body are extracted for the matching algorithm. This in turn degrades the accuracy of the computed camera extrinsic parameters and may finally cause stitching failure or visible seams.
Disclosure of Invention

The embodiment of the present application provides a video frame stitching method to solve, or at least partially solve, the above problems. Correspondingly, the embodiment of the present application also provides a video frame stitching apparatus, an electronic device, and a storage medium to ensure the implementation and application of the method. To solve the above problems, an embodiment of the present application discloses a video frame stitching method, which includes: acquiring video frames to be stitched that are captured by a camera; identifying image feature points from the video frames to be stitched; matching, according to the image feature points, the spatial feature points corresponding to the image feature points; screening feature point pairs formed by the image feature points and the spatial feature points to obtain target feature point pairs; calculating extrinsic parameters of the camera according to the target feature point pairs; and stitching the video frames to be stitched according to the extrinsic parameters of the camera to obtain stitched target video frames. In an optional embodiment of the present application, identifying the image feature points from the video frames to be stitched includes: performing de-distortion processing on the video frames to be stitched to obtain de-distorted video frames; identifying picture features in the de-distorted video frames by using a recognition model, wherein the recognition model is obtained by training on training images containing the picture features; and obtaining the image feature points according to the picture features.
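After the camera extrinsics are calculated from the target feature point pairs, claim 3 validates them with a reprojection-error check and falls back to the previous frame's extrinsics when the error stays above a threshold. The following sketch shows the core of such a check under a standard pinhole model; the intrinsic values (fx, fy, cx, cy), the threshold, and the helper names are illustrative assumptions, not the patent's figures.

```python
import math

def reproject_error(points3d, points2d, rotation, translation,
                    fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    """Mean reprojection error (pixels) of measured 2D feature points against
    3D spatial points projected through a pinhole model with the candidate
    extrinsics. Intrinsics fx/fy/cx/cy are illustrative values."""
    total = 0.0
    for (X, Y, Z), (u, v) in zip(points3d, points2d):
        # Camera-frame coordinates: Xc = R @ Xw + t
        Xc = [sum(rotation[r][c] * p for c, p in enumerate((X, Y, Z)))
              + translation[r] for r in range(3)]
        # Pinhole projection onto the image plane.
        u_hat = fx * Xc[0] / Xc[2] + cx
        v_hat = fy * Xc[1] / Xc[2] + cy
        total += math.hypot(u_hat - u, v_hat - v)
    return total / len(points3d)

def select_extrinsics(err, threshold, current, previous):
    """Claim-3 style fallback: keep the newly solved extrinsics only while the
    reprojection error stays within the threshold; otherwise reuse the
    extrinsics computed from the previous video frame."""
    return current if err <= threshold else previous

# Identity extrinsics: a point 2 m straight ahead projects exactly onto the
# principal point (320, 240), so the error is zero and the new extrinsics
# are kept.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
err = reproject_error([(0.0, 0.0, 2.0)], [(320.0, 240.0)],
                      identity, [0.0, 0.0, 0.0])
extrinsics = select_extrinsics(err, threshold=2.0,
                               current="new", previous="old")
print(err, extrinsics)  # 0.0 new
```

The cascade in claim 3 simply repeats this check on progressively smaller subsets of the target feature point pairs before giving up and reusing the previous frame's extrinsics, which keeps the stitched output stable when dust momentarily degrades the matches.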
In an optional embodiment of the present application, obtaining the image feature points according to the picture features includes: cropping the de-distorted video frames according to the picture features to obtain feature patches containing the picture features; and extracting the image feature points from the feature patches. Matching the spatial feature points corresponding to the image feature points according to the image feature points includes: acquiring the spatial feature points corresponding to the picture features; acquiring a reference image containing reference feature points, wherein a first pairing relation exists between the coordinate system of the reference image and a spatial coordinate system, and the spatial coordinate system is the coordinate system in which the spatial feature points are located; matching the feature corner points with the reference feature points to obtain a second pairing relation between the feature corner points and the reference feature points; and matching the spatial feature points according to the first pairing relation and the second pairing relation. In an optional embodiment of the present application, the screening of the feature point pairs formed by