US-12626397-B2 - Simultaneous localization and mapping (SLAM) method performed by electronic devices, electronic devices, and storage media
Abstract
A method performed by an electronic device includes: acquiring a search image based on a query image; acquiring first spatial features of the query image and second spatial features of the search image; and estimating a relative pose between the query image and the search image based on the first spatial features and the second spatial features.
Inventors
- Xiongfeng PENG
- Zhihua Liu
- Qiang Wang
- Yuntae Kim
Assignees
- SAMSUNG ELECTRONICS CO., LTD.
Dates
- Publication Date: 2026-05-12
- Application Date: 2023-02-06
- Priority Date: 2022-02-25
Claims (19)
- 1. A simultaneous localization and mapping (SLAM) method performed by an electronic device, the method comprising: acquiring a search image based on a query image; acquiring first spatial features of the query image and second spatial features of the search image; acquiring feature matching results by matching at least one of the first spatial features with at least one of the second spatial features; and estimating a relative pose between the query image and the search image based on the feature matching results, wherein the acquiring feature matching results comprises: generating one or more first feature matching pairs by performing a coarse matching operation based on the first spatial features and the second spatial features, and generating one or more second feature matching pairs by performing a fine matching operation based on the first spatial features and the second spatial features.
- 2. The SLAM method of claim 1, wherein the first spatial features and the second spatial features each comprise three dimensional (3D) point sets, and wherein the acquiring the first spatial features and the second spatial features each comprises: extracting image feature points that comprise image keypoints and feature descriptors; and estimating the 3D point sets by performing stereo matching on the image feature points.
- 3. The SLAM method of claim 1, wherein the acquiring the feature matching results comprises generating the one or more first feature matching pairs between results of clustering of the query image and results of clustering of the search image by clustering the 3D point sets of the query image and the 3D point sets of the search image.
- 4. The SLAM method of claim 3, wherein the generating of the one or more first feature matching pairs comprises: determining one or more first cubes by clustering the 3D point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; determining first cluster centroids of the respective first cubes and second cluster centroids of the respective second cubes; determining the second cluster centroids that respectively match the first cluster centroids; and determining the one or more first feature matching pairs based on the first cluster centroids and the second cluster centroids determined to match each other.
- 5. The SLAM method of claim 3, wherein the acquiring the feature matching results further comprises acquiring the one or more second feature matching pairs between the 3D point sets of the query image and the 3D point sets of the search image by performing nearest neighbor search and mutual verification on 3D points of the one or more first feature matching pairs, and wherein the determining the relative pose comprises determining the relative pose based on the one or more second feature matching pairs.
- 6. The SLAM method of claim 5, wherein the feature matching results further comprises third feature matching pairs, wherein the relative pose comprises a coarse relative pose and a fine matching pose, wherein the acquiring the feature matching results further comprises: estimating the coarse relative pose between the query image and the search image based on the one or more second feature matching pairs; and determining the third feature matching pairs between the 3D point sets of the query image and the 3D point sets of the search image by projecting the 3D point sets of the search image onto a coordinate system of the query image according to the coarse relative pose, and wherein the determining of the fine relative pose comprises determining the relative pose based on the third feature matching pairs.
- 7. The SLAM method of claim 1, wherein the determining of the relative pose comprises: estimating a prior relative pose between the query image and the search image based on the feature matching results; determining local points of the search image corresponding to keypoints of the query image based on the prior relative pose, and generating point matching pairs based on the local points corresponding to the keypoints; and estimating the relative pose based on the point matching pairs.
- 8. The SLAM method of claim 1, further comprising acquiring an optimized global map by optimizing a current global map based on the relative pose.
- 9. The SLAM method of claim 8, wherein the acquiring the optimized global map comprises: determining pose drift information based on the relative pose; and acquiring the optimized global map by determining an optimization strategy based on the pose drift information and optimizing the current global map according to the optimization strategy.
- 10. The SLAM method of claim 9, wherein the acquiring the optimized global map further comprises: acquiring the optimized global map by adjusting a prior global map through incremental bundle adjustment when the pose drift information satisfies a preset error condition; or acquiring the optimized global map by adjusting the prior global map through full bundle adjustment when the pose drift information does not satisfy the error condition.
- 11. The SLAM method of claim 10, wherein the acquiring the optimized global map further comprises: acquiring a first global map by optimizing a multi-degree-of-freedom pose of a keyframe of the prior global map based on the relative pose; and acquiring the optimized global map by optimizing a keyframe pose and map points of the first global map through whole bundle adjustment.
- 12. The SLAM method of claim 1, wherein the generating of the one or more first feature matching pairs comprises: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; and determining the one or more first feature matching pairs based on a comparison between the one or more first cubes and the one or more second cubes.
- 13. The SLAM method of claim 1, wherein the generating of the one or more first feature matching pairs comprises: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; determining first cluster centroids of the respective one or more first cubes and second cluster centroids of the respective one or more second cubes; and determining the one or more first feature matching pairs based on a comparison between the first cluster centroids and the second cluster centroids.
- 14. An electronic device comprising: at least one processor; a memory; and at least one application program stored in the memory and configured to be executed by the at least one processor, the at least one application program being configured to perform simultaneous localization and mapping (SLAM) by: acquiring a search image based on a query image, acquiring first spatial features of the query image and second spatial features of the search image, acquiring feature matching results by matching at least one of the first spatial features with at least one of the second spatial features, and estimating a relative pose between the query image and the search image based on the feature matching results, wherein the acquiring feature matching results comprises: generating one or more first feature matching pairs by performing a coarse matching operation based on the first spatial features and the second spatial features, and generating one or more second feature matching pairs by performing a fine matching operation based on the first spatial features and the second spatial features.
- 15. The electronic device of claim 14, wherein the at least one application program is further configured to generate the one or more first feature matching pairs by: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; and determining the one or more first feature matching pairs based on a comparison between the one or more first cubes and the one or more second cubes.
- 16. The electronic device of claim 14, wherein the at least one application program is further configured to generate the one or more first feature matching pairs by: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; determining first cluster centroids of the respective one or more first cubes and second cluster centroids of the respective one or more second cubes; and determining the one or more first feature matching pairs based on a comparison between the first cluster centroids and the second cluster centroids.
- 17. A non-transitory computer-readable storage medium having recorded thereon a program for executing simultaneous localization and mapping method comprising: acquiring a search image based on a query image, acquiring first spatial features of the query image and second spatial features of the search image, acquiring feature matching results by matching at least one of the first spatial features with at least one of the second spatial features, and estimating a relative pose between the query image and the search image based on the feature matching results, wherein the acquiring feature matching results comprises: generating one or more first feature matching pairs by performing a coarse matching operation based on the first spatial features and the second spatial features, and generating one or more second feature matching pairs by performing a fine matching operation based on the first spatial features and the second spatial features.
- 18. The non-transitory computer-readable storage medium of claim 17, wherein the generating of the one or more first feature matching pairs comprises: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; and determining the one or more first feature matching pairs based on a comparison between the one or more first cubes and the one or more second cubes.
- 19. The non-transitory computer-readable storage medium of claim 17, wherein the generating of the one or more first feature matching pairs comprises: determining one or more first cubes by clustering three dimensional (3D) point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; determining first cluster centroids of the respective one or more first cubes and second cluster centroids of the respective one or more second cubes; and determining the one or more first feature matching pairs based on a comparison between the first cluster centroids and the second cluster centroids.
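Illustrative sketch (not part of the claims): the coarse and fine matching steps recited in claims 4 and 5 can be pictured roughly as follows. This is a minimal sketch under several assumptions that the claims do not state: cubes are taken to be cells of a fixed axis-aligned grid, centroids and points are compared by Euclidean distance only (no feature descriptors), both point sets are assumed to be expressed in roughly the same coordinate frame, and mutual nearest-neighbor checking is used for the coarse step as well as the fine step. The names `cluster_into_cubes`, `mutual_nearest`, `coarse_to_fine`, `cube_size`, `coarse_dist`, and `fine_dist` are hypothetical.

```python
import numpy as np

def cluster_into_cubes(points, cube_size):
    """Group 3D points by the grid cube they fall in (assumed clustering rule).

    Returns (centroids, members): centroids is a (K, 3) array, members is a
    list of K index arrays into `points`.
    """
    keys = np.floor(points / cube_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten; shape differs across NumPy versions
    members = [np.flatnonzero(inverse == k) for k in range(inverse.max() + 1)]
    centroids = np.array([points[m].mean(axis=0) for m in members])
    return centroids, members

def mutual_nearest(a, b, max_dist):
    """Pairs (i, j) where a[i] and b[j] are each other's nearest neighbor."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    nn_ab = d.argmin(axis=1)  # nearest b for every a
    nn_ba = d.argmin(axis=0)  # nearest a for every b
    return [(i, int(j)) for i, j in enumerate(nn_ab)
            if nn_ba[j] == i and d[i, j] <= max_dist]

def coarse_to_fine(query_pts, search_pts, cube_size=1.0,
                   coarse_dist=0.5, fine_dist=0.1):
    """Return (coarse cube pairs, fine point pairs) as lists of index tuples."""
    q_cent, q_members = cluster_into_cubes(query_pts, cube_size)
    s_cent, s_members = cluster_into_cubes(search_pts, cube_size)

    # Coarse step: match cubes through their centroids.
    coarse = mutual_nearest(q_cent, s_cent, coarse_dist)

    # Fine step: nearest neighbor search with mutual verification,
    # restricted to the points inside each matched pair of cubes.
    fine = []
    for ci, cj in coarse:
        qi, sj = q_members[ci], s_members[cj]
        for i, j in mutual_nearest(query_pts[qi], search_pts[sj], fine_dist):
            fine.append((int(qi[i]), int(sj[j])))
    return coarse, fine
```

Restricting the fine search to matched cubes keeps each distance matrix small, which is one plausible reason for doing a coarse pass first. A descriptor-based comparison could replace the Euclidean distance without changing the overall structure.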
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on and claims priority under 35 U.S.C. § 119 to Chinese Patent Application No. 202210178991.7, filed on Feb. 25, 2022, in the China National Intellectual Property Administration, and Korean Patent Application No. 10-2022-0072356, filed on Jun. 14, 2022, in the Korean Intellectual Property Office, the disclosures of which are incorporated by reference herein in their entireties.
BACKGROUND
1. Field
The present disclosure relates to simultaneous localization and mapping (SLAM), and more particularly, to methods performed by electronic devices, electronic devices, and computer-readable storage media.
2. Description of the Related Art
Simultaneous localization and mapping (SLAM) refers to a technique for creating a real-time three-dimensional (3D) map of the space in which a device is located and detecting the pose (location and attitude) of the device by using a camera and a sensor of the device, such as a laser radar. Due to camera calibration errors and limited feature matching accuracy, cumulative errors unavoidably occur during visual SLAM. To address this, a SLAM system may additionally include a loop closing (LC) module. The LC module reduces cumulative errors by identifying a common view relationship between the current frame and a prior frame and optimizing a global map, thereby realizing drift-free simultaneous localization.
In general, current SLAM techniques establish visual constraints through feature matching or the like, and then calculate the relative pose between a query image and a search image to optimize a global map. However, this method causes relatively large visual variations and requires a relatively long period of time for optimizing a global map. Thus, there is a need to optimize the current LC modules of SLAM systems.
SUMMARY
Provided are methods performed by electronic devices, electronic devices, and computer-readable storage media.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments of the disclosure.
The present disclosure provides methods performed by electronic devices, electronic devices, computer-readable storage media, and technical solutions therefor as follows.
A method performed by an electronic device includes: acquiring a search image based on a query image; acquiring first spatial features of the query image and second spatial features of the search image; and estimating a relative pose between the query image and the search image based on the first spatial features and the second spatial features.
The first spatial features and the second spatial features each include three dimensional (3D) point sets. The acquiring the first spatial features and the second spatial features each includes: extracting image feature points that include image keypoints and feature descriptors; and estimating the 3D point sets by performing stereo matching on the image feature points.
The estimating the relative pose includes acquiring feature matching results by matching at least one of the first spatial features with at least one of the second spatial features; and determining the relative pose based on the feature matching results.
The feature matching results comprise first feature matching pairs. The acquiring the feature matching results comprises generating the first feature matching pairs between results of clustering of the query image and results of clustering of the search image by clustering the 3D point sets of the query image and the 3D point sets of the search image.
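As an illustration of determining a relative pose from feature matching results, the following is a minimal sketch assuming the matches are already available as two aligned arrays of 3D points. The text above does not name a solver; the closed-form SVD alignment (Kabsch method) used here is a standard technique substituted for illustration, and it includes no outlier rejection. The function name `estimate_relative_pose` and the convention of mapping search-image points into the query-image frame are assumptions.

```python
import numpy as np

def estimate_relative_pose(search_pts, query_pts):
    """Least-squares rigid transform (R, t) with query ~= R @ search + t.

    search_pts, query_pts: (N, 3) arrays of matched 3D points, where row i
    of one array corresponds to row i of the other. Needs N >= 3 points
    that are not all collinear.
    """
    search_c = search_pts.mean(axis=0)
    query_c = query_pts.mean(axis=0)

    # Cross-covariance of the centered point sets.
    H = (search_pts - search_c).T @ (query_pts - query_c)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = query_c - R @ search_c
    return R, t
```

With noisy or partly wrong matches, a robust wrapper (for example, repeatedly fitting on random subsets and keeping the transform with the most inliers) would normally sit around a solver like this one.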
The generating of the first feature matching pairs includes: determining one or more first cubes by clustering the 3D point sets of the query image; determining one or more second cubes by clustering the 3D point sets of the search image; determining first cluster centroids of the respective first cubes and second cluster centroids of the respective second cubes; determining the second cluster centroids that respectively match the first cluster centroids; and determining the first feature matching pairs based on the first cluster centroids and the second cluster centroids determined to match each other.
The feature matching results further include second feature matching pairs. The acquiring the feature matching results further comprises acquiring second feature matching pairs between the 3D point sets of the query image and the 3D point sets of the search image by performing nearest neighbor search and mutual verification on 3D points of the first feature matching pairs. The determining the relative pose comprises determining the relative pose based on the second feature matching pairs.
The feature matching results further include third feature matching pairs. The relative pose includes a coarse relative pose and a fine matching pose. The acquiring the feature matching results further incl