CN-121999032-A - Robot multi-camera positioning optimization method, device and storage medium

CN 121999032 A

Abstract

The application provides a robot multi-camera positioning optimization method, a device, and a storage medium, applied in the field of automatic control. The method comprises: acquiring at least one first key frame of a first camera within a specific time period, wherein the first key frame reflects a key pose of the first camera; acquiring at least one second key frame of a second camera within the specific time period, wherein the second key frame reflects a key pose of the second camera; performing bundle adjustment optimization on the first key frame and the second key frame to obtain a first pose corresponding to the first camera and a second pose corresponding to the second camera; and updating the historical robot pose of the robot within the specific time period according to the first pose and the second pose.

Inventors

  • CUI ZHIPENG

Assignees

  • Zhejiang Baima Technology Co., Ltd. (浙江白马科技有限公司)

Dates

Publication Date
2026-05-08
Application Date
2024-11-03

Claims (11)

  1. A multi-camera positioning optimization method for a robot, the robot comprising a first camera and at least one second camera, the method comprising: acquiring at least one first key frame of the first camera in a specific time period; acquiring at least one second key frame of the second camera in the specific time period; performing bundle adjustment optimization on the first key frame and the second key frame to obtain a first pose corresponding to the first camera and a second pose corresponding to the second camera; and updating the robot pose of the robot in the specific time period according to the first pose and the second pose.
  2. The method of claim 1, wherein the performing bundle adjustment optimization on the first key frame and the second key frame comprises: acquiring pixel coordinate information of the first key frame and the second key frame, wherein the pixel coordinate information comprises pixel coordinates of feature points of the first key frame and the second key frame; differencing the pixel coordinate information with a pixel coordinate estimate to establish a reprojection error function, wherein the pixel coordinate estimate comprises a first prediction variable and a second prediction variable, the first prediction variable at least comprises a pose variable of the first key frame, and the second prediction variable at least comprises a pose variable of the second key frame; and iteratively solving, by using a nonlinear least squares method, based on the reprojection error function, the pixel coordinates of the feature points of the first key frame and the second key frame, the first prediction variable, and the second prediction variable, to obtain the first pose and the second pose.
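The bundle adjustment of claim 2, a reprojection error minimized by nonlinear least squares, can be illustrated with a deliberately reduced NumPy sketch. This is not the patent's implementation: the rotation and 3-D points are held fixed and only the camera translation is refined by Gauss-Newton with a numeric Jacobian, and all function names are assumptions for the sketch.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3-D points X (N, 3) into pixel coordinates (N, 2)."""
    Xc = (R @ X.T).T + t                      # world -> camera frame
    uv = (K @ Xc.T).T
    return uv[:, :2] / uv[:, 2:3]

def residuals(K, R, t, X, observed):
    """Reprojection error: predicted pixels minus observed pixels, flattened."""
    return (project(K, R, t, X) - observed).ravel()

def refine_translation(K, R, t0, X, observed, iters=15):
    """Gauss-Newton refinement of the camera translation only (a real bundle
    adjustment would also optimise rotations and the 3-D points)."""
    t = np.asarray(t0, dtype=float).copy()
    for _ in range(iters):
        r = residuals(K, R, t, X, observed)
        J = np.empty((r.size, 3))             # numeric Jacobian w.r.t. t
        eps = 1e-6
        for j in range(3):
            dt = np.zeros(3); dt[j] = eps
            J[:, j] = (residuals(K, R, t + dt, X, observed) - r) / eps
        t -= np.linalg.solve(J.T @ J, J.T @ r)    # normal-equations step
    return t
```

Simulating observations from a known translation and starting the solver elsewhere recovers the true value after a handful of iterations, which is the behaviour the claim's iterative solving relies on.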
  3. The method of claim 1, wherein prior to the acquiring at least one first key frame of the first camera in a specific time period, the method further comprises: extracting feature points of a first real-time frame of the first camera and feature points of a second real-time frame of the second camera; performing feature point matching between the first real-time frame and the second real-time frame and their respective previous real-time frames to obtain a feature matching relation; and calculating, according to the feature matching relation and by using a pose algorithm, a first real-time pose corresponding to the first real-time frame and a second real-time pose corresponding to the second real-time frame.
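The feature point matching between consecutive real-time frames in claim 3 is commonly realised as nearest-neighbour descriptor matching with Lowe's ratio test. A minimal sketch, assuming plain NumPy descriptor arrays rather than any particular feature extractor:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour descriptor matching with Lowe's ratio test.

    desc_a: (Na, D) array, desc_b: (Nb, D) array; returns (i, j) index pairs
    where row i of desc_a matched row j of desc_b.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)   # distance to every candidate
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:            # best clearly beats runner-up
            matches.append((i, int(j1)))
    return matches
```

The resulting (i, j) pairs are the feature matching relation that a pose algorithm (e.g. PnP or essential-matrix decomposition) would then consume.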
  4. The method of claim 2, wherein the differencing the pixel coordinate information with the pixel coordinate estimate to establish a reprojection error function comprises: generating a transformation matrix according to the position transformation relation between the first camera and the second camera; and converting the pose variable of the second key frame into an expression in terms of the transformation matrix and the pose variable of the first key frame.
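In homogeneous 4x4 form, expressing the second key frame's pose through the first pose and the fixed inter-camera transform of claim 4 is a single matrix product. A minimal sketch; the symbol T_12 (the first-to-second extrinsic, assumed known from calibration) is notation introduced here:

```python
import numpy as np

def se3(R, t):
    """Pack a rotation matrix and translation into a 4x4 homogeneous transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def second_pose_from_first(T_w1, T_12):
    """Express the second camera's pose via the first camera's pose and the
    fixed first-to-second extrinsic transform T_12."""
    return T_w1 @ T_12
```

Conversely, T_12 can be recovered as inv(T_w1) @ T_w2, which is how the position transformation relation between the two cameras would be obtained from calibration.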
  5. The method of claim 3, wherein after calculating, according to the feature matching relation and by using a pose algorithm, a first real-time pose corresponding to the first real-time frame and a second real-time pose corresponding to the second real-time frame, the method further comprises: acquiring the first real-time pose and the second real-time pose corresponding to the first real-time frame and the second real-time frame respectively as poses to be fused; acquiring the position transformation relations among the first camera, the second camera, and the robot; performing, according to the position transformation relations, pose optimization on the poses to be fused based on a fusion algorithm to obtain an optimized pose of the robot, wherein the fusion algorithm comprises an extended Kalman filter algorithm; and updating the real-time pose of the robot based on the optimized pose.
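The fusion step of claim 5 can be sketched as one Kalman predict step followed by sequential updates from each camera's pose. This is an illustrative reduction: with the identity motion and measurement models assumed here the extended Kalman filter degenerates to a linear one, and the state layout and noise values are assumptions, not the patent's.

```python
import numpy as np

def fuse_poses(x, P, measurements, R_meas, Q):
    """One predict step plus sequential measurement updates.

    x, P: robot pose state and covariance; measurements: per-camera poses
    already mapped into the robot frame via the extrinsic relations.
    """
    n = len(x)
    P = P + Q                                  # predict: identity motion model
    H = np.eye(n)                              # measurement model (identity)
    for z in measurements:
        S = H @ P @ H.T + R_meas               # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
        x = x + K @ (z - H @ x)                # state update with innovation
        P = (np.eye(n) - K @ H) @ P            # covariance update
    return x, P
```

Fusing two nearby pose measurements pulls the state between them and shrinks the covariance, which is the behaviour the claim's pose optimization relies on.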
  6. The method of claim 2, wherein prior to the bundle adjustment optimization of the first key frame and the second key frame, the method further comprises: judging whether the first key frame and the second key frame have a common-view key frame, and in the case that the common-view key frame exists, acquiring the common-view key frame and a plurality of first key frames and second key frames preceding the common-view key frame, wherein the common-view key frame is a first key frame and a second key frame having image similarity; and the performing bundle adjustment optimization on the first key frame and the second key frame comprises: performing bundle adjustment optimization on the common-view key frame and the plurality of first key frames and second key frames preceding the common-view key frame, to obtain the first pose corresponding to the first camera and the second pose corresponding to the second camera.
  7. The method of claim 6, wherein the acquiring, in the presence of the common-view key frame, the common-view key frame and the plurality of first key frames and second key frames preceding the common-view key frame comprises: calculating bag-of-words model vectors of the plurality of first key frames as a plurality of first vectors; calculating bag-of-words model vectors of the plurality of second key frames as a plurality of second vectors; performing similarity calculation on the first vectors and the second vectors to obtain a similarity result; acquiring, in the case that the similarity result indicates that a first vector is similar to a second vector, the corresponding first key frame and second key frame as the common-view key frame; and acquiring the plurality of first key frames and second key frames preceding the common-view key frame.
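The bag-of-words similarity test of claim 7 can be sketched with visual-word histograms compared by cosine similarity. The vocabulary size and decision threshold below are illustrative assumptions; a real system would obtain the word ids from a trained visual vocabulary:

```python
import numpy as np

def bow_vector(word_ids, vocab_size):
    """Visual-word id histogram, L2-normalised into a bag-of-words vector."""
    v = np.bincount(word_ids, minlength=vocab_size).astype(float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def is_coview(words_a, words_b, vocab_size, threshold=0.7):
    """Flag two key frames as a common-view pair when the cosine similarity
    of their bag-of-words vectors reaches the threshold."""
    sim = float(bow_vector(words_a, vocab_size) @ bow_vector(words_b, vocab_size))
    return sim >= threshold, sim
```

Two frames sharing most visual words score close to 1, while frames with disjoint words score 0, giving the binary similarity result the claim uses to pick the common-view key frame.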
  8. The method of claim 1, wherein the updating the historical robot pose of the robot in the specific time period according to the first pose and the second pose comprises: acquiring the position transformation relations among the first camera, the second camera, and the robot; converting, according to the position transformation relations, the first pose or the second pose into an optimized pose of the robot; and updating the historical robot pose of the robot in the specific time period based on the optimized pose of the robot.
  9. A robot multi-camera positioning optimization apparatus, comprising: a first acquisition module, configured to acquire at least one first key frame of the first camera in a specific time period; a second acquisition module, configured to acquire at least one second key frame of the second camera in the specific time period; an optimization module, configured to perform bundle adjustment optimization on the first key frame and the second key frame to obtain a first pose corresponding to the first camera and a second pose corresponding to the second camera; and an updating module, configured to update the historical robot pose of the robot in the specific time period according to the first pose and the second pose.
  10. A computer-readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the robot multi-camera positioning optimization method of any one of claims 1 to 8.
  11. A robot multi-camera positioning optimization apparatus, comprising a processor and a memory, wherein at least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the robot multi-camera positioning optimization method according to any one of claims 1 to 8.

Description

Robot multi-camera positioning optimization method, device and storage medium

Technical Field

The present application relates to the field of robot positioning, and in particular to a method and apparatus for selecting multi-camera positioning key frames of a robot, and a storage medium.

Background

With the development of science and technology, robots are increasingly applied in human production and life. They are often used in place of manual labor for carrying articles, inspection, cleaning, and the like, and are generally provided with sensors, a navigation and positioning system, and control algorithms, so that they can sense the surrounding environment and perform corresponding actions. At present, various robots on the market assist people in their work, such as sweepers, mowers, and vacuum cleaners, providing convenience for production and daily life. One of the keys to realizing the intelligence and automation of robots is robot positioning technology, and visual positioning systems are widely used in robot positioning owing to their wide range of application scenarios and low cost. Prior-art visual positioning systems are mainly composed of a single set of visual sensors, such as a single monocular camera, a single binocular camera, or a single depth camera.
However, a visual positioning system mainly relies on its visual sensor to recognize environmental features in order to calculate its pose. When the robot's single set of visual sensors is occluded by an obstacle, or no obvious features exist within its field of view, the pose of the robot cannot be obtained accurately, so the visual positioning system cannot work effectively and the overall operation of the robot is affected. Under the condition that the robot is provided with multiple sets of visual sensors, how to reasonably select the real-time frame images obtained by the visual sensors as key frames, so as to use the key frames to optimize the calculation of the robot pose, is a problem to be solved at the present stage.

Disclosure of Invention

In view of the above, the embodiments of the present application provide a method, an apparatus, and a storage medium for selecting multi-camera positioning key frames of a robot, which can reasonably select the key frames acquired by multiple cameras when the robot performs multi-camera positioning, so as to optimize the calculation of the robot pose using the key frames and improve the accuracy of robot positioning.
The technical scheme is as follows. In a first aspect, an embodiment of the present application provides a method for selecting multi-camera positioning key frames of a robot, where the method includes: performing key frame selection on the real-time frames of a first camera to obtain a plurality of first key frames, and generating a first key frame sequence; performing key frame selection on the real-time frames of a second camera to obtain a plurality of second key frames, and generating a second key frame sequence; and performing optimized pose calculation based on the first key frame sequence and the second key frame sequence to obtain a plurality of first poses corresponding to the first camera and a plurality of second poses corresponding to the second camera, wherein the plurality of first poses and the plurality of second poses are used for optimizing the robot pose of the robot in a specific time period.

Further, the second key frame sequence includes the real-time frames of the second camera at the same instants as the plurality of first key frames.

Further, the performing key frame selection on the real-time frames of the first camera to obtain a plurality of first key frames and generating a first key frame sequence includes: performing feature matching screening on the real-time frames of the first camera to obtain the plurality of first key frames, and generating the first key frame sequence. The performing key frame selection on the real-time frames of the second camera to obtain a plurality of second key frames and generating a second key frame sequence includes: selecting the real-time frames of the second camera at the same instants as the first key frames as the second key frames.
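The timestamp-synchronised key frame selection described above can be sketched as follows. The screening criterion (declare a new key frame when too few features still match the previous one) and its threshold are assumptions standing in for the patent's unspecified feature matching screening, and feature-id sets stand in for real descriptor matching:

```python
def select_keyframes(frames_cam1, frames_cam2, min_match_ratio=0.4):
    """Select first-camera key frames by feature-overlap screening, then take
    the second-camera frame captured at the same instant.

    frames_*: lists of (timestamp, feature_id_set) pairs.
    """
    cam2_by_ts = dict(frames_cam2)
    key1, key2 = [], []
    last_feats = None
    for ts, feats in frames_cam1:
        # new key frame when the scene has changed enough (few surviving matches)
        if last_feats is None or len(feats & last_feats) < min_match_ratio * len(last_feats):
            key1.append((ts, feats))
            last_feats = feats
            if ts in cam2_by_ts:               # same-instant second-camera frame
                key2.append((ts, cam2_by_ts[ts]))
    return key1, key2
```

The two returned sequences are time-aligned by construction, which is what lets the later optimized pose calculation treat them jointly.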
Further, the performing key frame selection on the real-time frames of the first camera to obtain a plurality of first key frames and generating a first key frame sequence includes: performing feature matching screening on the real-time frames of the first camera to obtain the plurality of first key frames, and generating the first key frame sequence. The performing key frame selection on the real-time frames of the second camera to obtain a plurality of second key frames and generating a second key frame sequence includes: performing feature matching screening