US-12624952-B2 - Image-based cooperative simultaneous localization and mapping system and method
Abstract
An image-based cooperative simultaneous localization and mapping system and method using rendezvous. The image-based cooperative simultaneous localization and mapping method in a multi-platform system performs: estimating a camera pose on the basis of an image input through a camera in a single SLAM scheme and generating a local map, by each of multiple platforms; extracting a non-static feature from the image and managing same; transmitting, to a ground station, the camera pose, the local map, and the non-static feature as platform data; determining, by the ground station, whether a rendezvous situation has occurred between one of the multiple platforms and the remaining platforms, on the basis of the platform data; and, when the rendezvous situation occurs, fusing the local maps received from two or more rendezvoused platforms into a global map.
Inventors
- Hyoun Jin Kim
- Young Seok Jang
Assignees
- SEOUL NATIONAL UNIVERSITY R&DB FOUNDATION
Dates
- Publication Date
- 20260512
- Application Date
- 20210316
- Priority Date
- 20201202
Claims (10)
- 1 . A collaborative visual simultaneous localization and mapping method on a multiple platform system, comprising: estimating, by each of a plurality of platforms providing time-synchronized data associated with movement of the plurality of platforms, a camera pose and generating a local map based on an image input through a camera in a single simultaneous localization and mapping (SLAM) scheme; extracting, managing, and tracking non-static features from images of a dynamic object being imaged; transmitting the camera pose, the local map and the non-static features as platform data to a ground station, which is an integrated management system; calculating with the time-synchronized data, by the ground station, based on the platform data and utilizing the tracking of the non-static features, whether there is a rendezvous situation between one of the plurality of platforms and remaining platforms of the plurality of platforms, wherein the calculating of the rendezvous situation between the one of the plurality of platforms and the remaining platforms comprises: identifying the one of the plurality of platforms as an observing platform, and comparing a movement of the non-static features over time in the image received from the observing platform with motion of the remaining platforms based on the camera pose and local maps received from the remaining platforms, wherein the local maps include the local map transmitted by each of the remaining platforms, and wherein if any of the remaining platforms has a movement whose similarity exceeds a threshold, that platform is assumed to be in the rendezvous situation with the observing platform, and a non-static feature from an image of the dynamic object is matched and utilized; and if there is the rendezvous situation, fusing the local maps received from the observing platform and one or more of the remaining platforms that have rendezvoused into a global map.
- 2 . The method of claim 1 , wherein the comparing the movement of the non-static features over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms comprises: designing an optimization problem; and determining that the non-static feature is an observation value that points to the platform when a convergence error of the optimization problem is small enough.
- 3 . The method of claim 2 , wherein the comparing the movement of the non-static features over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms comprises: finding a convergence solution for the optimization problem by using an alternating minimization algorithm.
- 4 . The method of claim 2 , wherein the comparing the movement of the non-static features over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms comprises: fusing, by a map fusion module, using the non-static feature, the local maps and the camera pose data generated by both the observing platform and an observed platform into a global map.
- 5 . The method of claim 4 , wherein the comparing the movement of the non-static features over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms comprises: estimating a similarity transformation value between the local maps to fuse into the global map.
- 6 . A collaborative visual simultaneous localization and mapping system in which each of a plurality of platforms estimates a camera pose and generates a local map based on an image input through a camera in a single simultaneous localization and mapping (SLAM) scheme, comprising: a ground station configured for receiving platform data from each of a plurality of platforms providing time-synchronized data associated with movement of the plurality of platforms, analyzing the platform data and generating a global map, wherein the ground station comprises: a platform matching module configured for receiving the platform data from each of the plurality of platforms and determining, based on the platform data, whether there is a rendezvous situation between one of the plurality of platforms and remaining platforms of the plurality of platforms; and a map fusion module configured for fusing local maps received from the two or more platforms that have rendezvoused into a global map if there is the rendezvous situation, wherein each of the plurality of platforms extracts, manages, and tracks non-static features from images of a dynamic object being imaged with the movement of the platforms in order to calculate with the time-synchronized data whether there is the rendezvous situation, wherein the platform data comprises the camera pose, the local map and a non-static feature from an image of the dynamic object, and wherein the platform matching module identifies one platform as an observing platform, compares a movement of the non-static feature over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms, and, if any of the remaining platforms has a movement whose similarity exceeds a threshold, assumes that platform to be in the rendezvous situation with the observing platform, and matches the non-static feature.
- 7 . The collaborative visual simultaneous localization and mapping system of claim 6 , wherein the platform matching module is configured for designing an optimization problem and determining that the non-static feature is an observation value that points to the platform when a convergence error of the optimization problem is small enough.
- 8 . The collaborative visual simultaneous localization and mapping system of claim 6 , wherein the map fusion module is configured for estimating a similarity transformation value between the local maps that are generated by the observing platform and an observed platform by using the non-static feature to fuse into the global map.
- 9 . A moving platform for performing localization and mapping, comprising: an image input unit configured for receiving an image of a surrounding area captured by a camera; a camera pose estimation unit configured for extracting and matching a feature from the image, and estimating a camera pose from the matched feature; a local map generation unit configured for generating a local map of an area in which the platform is located and has traveled based on the image and the camera pose; a non-static feature management unit for extracting, managing, and tracking non-static features, among the features, from images of a dynamic object being imaged by the moving platform; and a communication unit configured for transmitting time-synchronized platform data associated with movement of the platform, including the camera pose, the local map and a non-static feature from an image of the dynamic object, to a ground station, wherein the ground station calculates with the time-synchronized platform data, based on the platform data and utilizing the tracking of the non-static features, whether there is a rendezvous situation between one of a plurality of platforms that includes the moving platform and remaining platforms of the plurality of platforms, wherein the calculating of the rendezvous situation between the one of the plurality of platforms and the remaining platforms comprises: identifying the one of the plurality of platforms as an observing platform, and comparing a movement of the non-static features over time in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and local maps received from the remaining platforms, wherein if any of the remaining platforms has a movement whose similarity exceeds a threshold, that platform is assumed to be in the rendezvous situation with the observing platform, and a non-static feature from an image of the dynamic object is matched and utilized.
- 10 . The platform of claim 9 , wherein the image has successive frames, wherein the non-static feature management unit tracks and manages the non-static features in the image in successive frames.
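Claims 1, 6, and 9 all recite the same motion-comparison test: project each candidate platform's reported trajectory into the observing platform's image and compare it against the tracked non-static feature. The sketch below is a minimal illustration of that idea under simplifying assumptions (pinhole camera with known intrinsics, time-aligned samples, candidate platforms represented as single 3D points); the function names, data layout, and the pixel threshold are all hypothetical and are not part of the claims.

```python
import numpy as np

def project(K, T_wc, p_w):
    """Pinhole-project world point p_w into the camera whose
    camera-to-world pose is the 4x4 matrix T_wc."""
    R, t = T_wc[:3, :3], T_wc[:3, 3]
    p_c = R.T @ (p_w - t)          # world -> camera frame
    uv = K @ (p_c / p_c[2])        # perspective division, then intrinsics
    return uv[:2]

def match_feature_to_platform(K, observer_poses, feature_track,
                              platform_tracks, px_thresh=3.0):
    """Compare the observer's tracked non-static feature (2D positions over
    time) against each candidate platform's 3D trajectory; return the id of
    the candidate whose projected motion agrees below px_thresh pixels on
    average, or None. Hypothetical helper; threshold is an assumption."""
    best, best_err = None, np.inf
    for pid, positions in platform_tracks.items():
        errs = [np.linalg.norm(project(K, T, p) - uv)
                for T, p, uv in zip(observer_poses, positions, feature_track)]
        err = float(np.mean(errs))
        if err < px_thresh and err < best_err:
            best, best_err = pid, err
    return best
```

A platform whose reported motion reprojects onto the feature track within the threshold would be treated as the rendezvoused (observed) platform; if no candidate matches, the feature is left unassociated.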
Description
This application is a national stage application of PCT/KR2021/003219 filed on Mar. 16, 2021, which claims priority to Korean Patent Application No. 10-2020-0166198 filed on Dec. 2, 2020. The disclosure of each of the foregoing applications is incorporated herein by reference in its entirety.
FIELD OF INVENTION
The present invention relates to a collaborative visual simultaneous localization and mapping (collaborative SLAM) system and method.
RELATED ARTS
In general, simultaneous localization and mapping (SLAM) refers to a technology that uses a single mobile platform or device to create a map of its surroundings and estimate its own location within that map. SLAM is a key technology for autonomous vehicles and robots and for virtual reality (VR)/augmented reality (AR). When multiple platforms are operated, a collaborative SLAM technology based on information exchange between platforms or terminals can be used to effectively map a large area while estimating the location of each platform. However, in order for a single SLAM system to be extended to collaborative SLAM for multiple platforms, a map fusion technology is required to integrate the information collected by each platform. Map fusion requires either inter-robot loop detection, where platforms look at the same place, or rendezvous, where platforms appear in the field of view of other platforms and observe each other. Existing rendezvous-based map fusion techniques have limitations such as using visual markers to confirm the identity of the observed robot (Korean Registered Patent No. 10-1976241) or requiring distance information from stereo cameras or RGB-D sensors.
SUMMARY
Technical Objectives
The present invention is intended to provide a collaborative visual simultaneous localization and mapping system and method that enable position estimation of multiple platforms and construction of an integrated map using a rendezvous situation in an environment in which multiple platforms, such as autonomous vehicles and virtual/augmented/mixed reality terminals, are operating. The present invention is also intended to provide a collaborative visual simultaneous localization and mapping system and method using rendezvous that do not use any markers or additional information to identify platforms, can solve the problem of rendezvous-based map fusion even with a lightweight and inexpensive monocular camera, and enable efficient collaborative SLAM using non-static features that are usually discarded. Other advantages and objectives will be easily appreciated from the description below.
Technical Solutions
According to one aspect of the present invention, there is provided a collaborative visual simultaneous localization and mapping method on a multiple platform system. The method includes estimating, by each of a plurality of platforms, a camera pose and generating a local map based on an image input through a camera in a single simultaneous localization and mapping (SLAM) scheme; extracting and managing a non-static feature from the image; transmitting the camera pose, the local map and the non-static feature as platform data to a ground station, which is an integrated management system; determining, by the ground station, based on the platform data, whether there is a rendezvous situation between one of the plurality of platforms and the remaining platforms; and, if there is the rendezvous situation, fusing the local maps received from the two or more platforms that have rendezvoused into a global map.
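The map-fusion step described above (and the similarity-transformation estimation recited in claims 5 and 8) corresponds, in monocular SLAM, to aligning two local maps by a similarity transform, since each monocular map has its own unknown scale. One standard closed-form solution for this is Umeyama's method. The sketch below is illustrative only: it assumes point correspondences between the two local maps are already available (which the patent obtains through matched non-static features), and the function name is hypothetical.

```python
import numpy as np

def umeyama_sim3(src, dst):
    """Closed-form similarity transform (scale s, rotation R, translation t)
    such that dst ≈ s * R @ src + t, given corresponding (N, 3) point sets.
    Illustrative sketch of the kind of Sim(3) estimation used in map fusion."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)                 # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                           # guard against reflections
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(src)         # variance of source points
    s = np.trace(np.diag(D) @ S) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Applying the recovered (s, R, t) to every point and keyframe pose of one local map expresses it in the other map's frame, producing a single consistent global map.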
In one embodiment, the determining the rendezvous situation between one of the plurality of platforms and the remaining platforms includes, by identifying one platform as an observing platform, comparing a movement of the non-static feature in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms, wherein if there is a platform with movement similarity above a threshold, it is assumed to be in the rendezvous situation with the observing platform, and the non-static feature is matched and utilized. In one embodiment, the comparing a movement of the non-static feature in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms includes designing an optimization problem and determining that the non-static feature is an observation value that points to the platform when a convergence error of the optimization problem is sufficiently small. In one embodiment, the comparing a movement of the non-static feature in the image received from the observing platform with the motion of the remaining platforms based on the camera pose and the local maps received from the remaining platforms includes finding a convergence solution for the optimization problem by using an alternating minimization algorithm. In one embodiment, the comparing a movement of the non-stat