CN-116030130-B - Hybrid semantic SLAM method in dynamic environment
Abstract
The invention discloses a hybrid semantic SLAM method for dynamic environments, comprising the following steps. First, a vision-based module processes RGB and depth images to obtain semantic information about the dynamic environment, decouples the dynamic and static objects in the environment, and generates static semantic point cloud information. Second, a 2D-radar-based module generates geometric information of the robot motion plane from radar and IMU data. Third, a semantic fusion module fuses the semantic and geometric information to generate a local semantic octree map and a grid map, and completes the construction of a global hybrid semantic map by continuously updating the occupancy probabilities of voxels and grids. The invention effectively builds a hybrid semantic map in a dynamic environment that can be used directly for mobile robot control and navigation, and improves the mobile robot's capacity for semantics-based reasoning, human-machine interaction, and intelligent decision-making.
Inventors
- ZHENG CHEN
- QIN XIANSHENG
- DU YUYANG
- WANG JIAN
- WANG ZHANXI
- SUN TENGFEI
- WANG KANGNING
- AN YUSHU
- WU HAOYU
- BAI JING
Assignees
- Northwestern Polytechnical University (西北工业大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20221229
Claims (3)
- 1. A hybrid semantic SLAM method in a dynamic environment, characterized by comprising the following steps: Step 1, in a vision module, obtaining semantic information about the dynamic environment by processing RGB and depth images, decoupling the dynamic and static objects in the environment, and generating static semantic point cloud information; Step 2, in a 2D radar module, generating the geometric information of the robot motion plane by fusing the 2D radar geometric information with the IMU information; and Step 3, in a semantic fusion module, fusing the static semantic point cloud information generated in step 1 with the geometric information of the robot motion plane obtained in step 2 to generate a local semantic octree map and a grid map, continuously updating the occupancy probabilities of the voxels and grids, and completing the construction of a global hybrid semantic map, wherein step 3 comprises the following sub-steps: Step 3.1, obtaining the relative pose transformation matrix T_CL of the camera coordinate system with respect to the radar coordinate system from the relative pose transformation matrix T_CR of the camera coordinate system with respect to the robot coordinate system and the relative pose transformation matrix T_LR of the 2D radar coordinate system with respect to the robot coordinate system: T_CL = T_LR^(-1) · T_CR; Step 3.2, integrating the static semantic point cloud information generated in step 1 with the geometric information of the robot motion plane obtained in step 2 to generate a local semantic octree map and a grid map, respectively; Step 3.3, aligning the semantic octree map with the grid map based on the relative pose transformation matrix T_CL obtained in step 3.1; Step 3.4, updating in real time the voxel occupancy probabilities in the local semantic octree map and the grid occupancy probabilities in the grid map based on the octree map and grid map generated in step 3.3; the voxel update in the semantic octree map is specifically as follows: let the depth observations of a voxel n up to time t be z_1, ..., z_t; the occupancy probability of the voxel at time t is then P(n | z_1:t) = [1 + ((1 − P(n | z_t)) / P(n | z_t)) · ((1 − P(n | z_1:t−1)) / P(n | z_1:t−1)) · (P(n) / (1 − P(n)))]^(-1); when the occupancy probability P(n | z_1:t) of a voxel is greater than a threshold, the voxel is in the occupied state at the current time; the grid occupancy probability update in the grid map is specifically as follows: the state of a grid at the previous moment is recorded as S_(t−1) and the current lidar observation is z_t; the updated occupancy state of the grid is then S_t = S_(t−1) + ln(p(z_t | Occ) / p(z_t | Free)), wherein Free denotes the idle state and Occ denotes the occupied state; when the updated state S_t of a grid is smaller than a threshold, the grid is considered to be in the idle state; and Step 3.5, generating a global hybrid semantic map based on the updated octree map and grid map obtained in step 3.4.
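The recursive occupancy updates of step 3.4 are equivalent, in log-odds form, to the standard binary Bayes filter used by octree and grid mapping. A minimal Python sketch, assuming an illustrative inverse sensor model (the probabilities p_hit, p_miss, p_occ, and p_free are assumptions for demonstration, not values from the patent):

```python
import math

def logodds(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def occupancy(l):
    """Recover the occupancy probability from a log-odds value."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))

def update_voxel(l_prev, hit, p_hit=0.7, p_miss=0.4):
    """Recursive log-odds update for one semantic-octree voxel.

    l_prev : log-odds of occupancy after the previous observation
    hit    : True if the current depth measurement falls inside this voxel
    """
    return l_prev + (logodds(p_hit) if hit else logodds(p_miss))

def update_grid(l_prev, observed_occupied, p_occ=0.65, p_free=0.35):
    """The same log-odds form applied to a 2D grid cell from one lidar ray."""
    return l_prev + (logodds(p_occ) if observed_occupied else logodds(p_free))

# A voxel observed as occupied in three consecutive depth frames:
l = 0.0  # prior P(n) = 0.5 corresponds to log-odds 0
for _ in range(3):
    l = update_voxel(l, hit=True)
print(occupancy(l) > 0.5)  # True: the voxel crosses the occupied threshold
```

Because the update is additive in log-odds, repeated consistent observations drive a cell monotonically toward the occupied or idle threshold, which is what lets the map recover once a dynamic object has left a region.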
- 2. The hybrid semantic SLAM method in a dynamic environment of claim 1, wherein step 1 specifically comprises the following sub-steps: Step 1.1, calibrating the installation position of the camera on the robot to obtain the relative pose transformation matrix of the camera coordinate system with respect to the robot coordinate system, recorded as T_CR; Step 1.2, in a tracking thread, acquiring each RGB frame and its corresponding depth image from the mobile robot platform, extracting Oriented FAST and Rotated BRIEF (ORB) feature points from the RGB image, obtaining the depth information corresponding to each RGB frame from the depth image, and tracking the camera pose based on the semantic information; Step 1.3, in a semantic acquisition thread, performing semantic segmentation on the key-frame RGB images with Yolact to generate the category, confidence, and semantic label of each corresponding pixel, obtaining a mask of the dynamic objects, and removing the feature points of step 1.2 that lie inside the dynamic mask; Step 1.4, generating and optimizing a local map in a local mapping thread; and Step 1.5, in a semantic generation thread, correcting the depth information of the dynamic regions based on the semantic and depth information obtained in step 1.2 to obtain a corrected depth image, and generating the semantic point cloud information.
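The dynamic-point removal of step 1.3 amounts to a mask test on the extracted feature points. A minimal Python sketch, assuming the segmentation network returns a binary per-pixel mask of dynamic objects (the mask and keypoint representations here are illustrative, not the patent's data structures):

```python
def remove_dynamic_points(keypoints, dynamic_mask):
    """Keep only feature points that do not fall on a dynamic-object pixel.

    keypoints    : list of (u, v) pixel coordinates of ORB features
    dynamic_mask : 2D list, dynamic_mask[v][u] == 1 marks a dynamic pixel
    """
    static_pts = []
    for (u, v) in keypoints:
        in_bounds = 0 <= v < len(dynamic_mask) and 0 <= u < len(dynamic_mask[0])
        if in_bounds and dynamic_mask[v][u] == 0:
            static_pts.append((u, v))
    return static_pts

# Toy 4x4 mask: the right half of the image is covered by a dynamic object.
mask = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
pts = [(0, 0), (1, 2), (3, 1), (2, 3)]
print(remove_dynamic_points(pts, mask))  # [(0, 0), (1, 2)]
```

Only the surviving static points are then used for camera pose tracking, which is what decouples the static scene from the moving objects.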
- 3. The hybrid semantic SLAM method in a dynamic environment of claim 1, wherein step 2 specifically comprises the following sub-steps: Step 2.1, calibrating the installation position of the 2D radar on the robot to obtain the relative pose transformation matrix of the radar coordinate system with respect to the robot coordinate system, recorded as T_LR; Step 2.2, fusing the 2D radar geometric information with the IMU information by an extended Kalman filter to localize the robot and obtain the geometric information of the robot motion plane; and Step 2.3, tracking the radar pose in real time based on the geometric information of the robot motion plane.
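Step 2.2 fuses the radar and IMU with an extended Kalman filter. As a simplified illustration of the predict/update cycle only, the sketch below reduces the state to a single heading angle; the actual filter carries the full planar pose and covariance, and all noise values here are assumed:

```python
import math

def ekf_heading_step(theta, var, gyro_rate, dt, z_theta, q=0.01, r=0.05):
    """One predict/update cycle of a 1-state Kalman filter on heading.

    theta, var : current heading estimate (rad) and its variance
    gyro_rate  : IMU angular rate (rad/s) used in the prediction step
    dt         : time step (s)
    z_theta    : heading observed by 2D-radar scan matching
    q, r       : assumed process and measurement noise variances
    """
    # Predict: integrate the gyro and inflate the uncertainty.
    theta_pred = theta + gyro_rate * dt
    var_pred = var + q
    # Update: blend in the radar observation via the Kalman gain.
    k = var_pred / (var_pred + r)
    innovation = math.atan2(math.sin(z_theta - theta_pred),
                            math.cos(z_theta - theta_pred))  # wrap to (-pi, pi]
    theta_new = theta_pred + k * innovation
    var_new = (1.0 - k) * var_pred
    return theta_new, var_new

theta, var = 0.0, 1.0  # uncertain prior
theta, var = ekf_heading_step(theta, var, gyro_rate=0.5, dt=0.1, z_theta=0.06)
```

After the update, the estimate lies between the gyro prediction and the radar observation, and its variance has shrunk, which is the behavior the fused localization of step 2.2 relies on.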
Description
Hybrid semantic SLAM method in dynamic environment

Technical Field

The invention belongs to the technical field of robotics, and in particular relates to a hybrid semantic SLAM method in a dynamic environment.

Background

Simultaneous localization and mapping (SLAM) is a key technology for the autonomous navigation of mobile robots and self-driving cars in unknown environments, and has been widely applied to the perception of geometric environments by mobile robots. Since semantic information in the environment is key to mobile robot reasoning, decision making, human-machine cooperation, and the execution of advanced tasks, semantic SLAM methods are now being actively researched; they can be classified into vision-based, radar-based, and multi-sensor-fusion-based methods. Vision-based methods extract semantic information from images with deep learning, eliminate the dynamic features in the environment, and then perform localization, so that a geometric map of higher precision is established. Some of these methods also use the semantic information to construct a semantic map, but the maps they construct lack information about the robot motion plane and cannot be used directly for robot control and navigation. Radar-based methods are divided into 2D-radar-based and 3D-radar-based methods; the 2D-radar-based methods can only construct a geometric map of the environment, with no semantic information.
After extraction, segmentation, clustering, and similar processing of the point cloud, a 3D semantic point cloud can be obtained and a semantic point cloud map constructed, but the efficiency of this approach is very low compared with vision-based methods and cannot meet real-time requirements. Methods based on multi-sensor fusion can effectively construct high-precision maps using sensor information from vision, radar, an inertial measurement unit (IMU), and so on, but they focus only on extracting and processing the geometric information of the environment more effectively and do not make full use of the semantic information in the environment to construct a semantic map, and thus fail to improve the robot's context-based reasoning and intelligent decision-making capacity.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention provides a hybrid semantic SLAM method in a dynamic environment. The method screens image frames through the ORB-SLAM3 framework in a vision module to obtain key frames, extracts their semantics, decouples dynamic and static objects, and generates local semantic point cloud information; fuses 2D radar geometric information and IMU information in a 2D radar module to obtain a more accurate robot pose; and fuses the semantic and geometric information in a semantic fusion module to generate a semantic octree map and a grid map, from which a local map is constructed and updated according to the grid occupancy probabilities, finally generating a global hybrid semantic map. The method significantly improves the efficiency and precision with which a mobile robot constructs a semantic map, provides geometric information for the robot's motion control, and provides the critical semantic information needed for the robot's semantics-based reasoning, human-machine interaction, and intelligent decision-making.
The invention solves the technical problem with the following technical scheme, a hybrid semantic SLAM method in a dynamic environment comprising the following steps: Step 1, in a vision module, acquiring semantic information about the environment, decoupling the dynamic and static objects in the environment, and simultaneously generating static semantic point cloud information; Step 2, in a 2D radar module, obtaining the geometric information of the robot motion plane; and Step 3, in a semantic fusion module, constructing a global hybrid semantic map. Further, step 1 comprises the following sub-steps: Step 1.1, calibrating the mounting position of the camera on the robot to obtain the relative pose transformation matrix of the camera coordinate system with respect to the robot coordinate system, recorded as T_CR; Step 1.2, in a tracking thread, acquiring each RGB frame and its corresponding depth image from the mobile robot platform, extracting ORB feature points from the RGB image, extracting the depth information from the corresponding depth image, and tracking the camera pose based on the semantic information; Step 1.3, in a semantic acquisition thread, performing semantic segmentation on the key-frame RGB images with Yolact to generate the category, confidence, and color label of each corresponding pixel, obtaining a mask of the dynamic objects, and removing the corresponding feature points of step 1.2 based on the dynamic mask; Step 1.4, generat