CN-121977549-A - Multi-source fusion navigation method and system based on three-dimensional target level dynamic and static decoupling

CN 121977549 A

Abstract

The invention relates to the technical field of navigation and provides a multi-source fusion navigation method and system based on three-dimensional target-level dynamic and static decoupling. The method guides point cloud clustering with two-dimensional target information extracted from images of the target scene, generating three-dimensional target detection results with spatial positions and categories. It then identifies dynamic targets and, based on dynamic-feature discrimination constraints, removes dynamic points from the visual and lidar observations; from the resulting static observation information it constructs visual feature constraints and lidar point cloud geometric constraints, fuses the motion constraints provided by an inertial measurement unit, and estimates the system pose. Finally, from the pose estimates and the static observations remaining after dynamic-point removal, it constructs and updates a dynamic-static decoupled map under observation consistency constraints. With three-dimensional target-level dynamic and static decoupling at its core, the method realizes visual-semantics-guided point cloud targeting and consistent cross-modal removal of dynamic points, suppressing dynamic interference and map pollution in unified optimization and consistent mapping.

Inventors

  • Chai Dashuai
  • Shang Xuecheng
  • Wang Haoze
  • Wang Kunlin
  • Yan Ruijie
  • Li Jingqi
  • Ning Yipeng
  • Wang Xiqi
  • Sang Wengang
  • Ding Rui
  • Wang Zhiwei

Assignees

  • Shandong Jianzhu University (山东建筑大学)

Dates

Publication Date
2026-05-05
Application Date
2026-04-09

Claims (10)

  1. A multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling, characterized by comprising the following steps: acquiring an image of a target scene and lidar point cloud data, guiding point cloud clustering based on two-dimensional target information in the image, generating a three-dimensional target detection result with spatial positions and categories, and tracking the three-dimensional targets; identifying dynamic targets from the tracking and, based on dynamic-feature discrimination constraints, removing dynamic points from the visual and lidar observations to obtain static observation information; based on the obtained static observation information, constructing visual feature constraints and lidar point cloud geometric constraints, fusing the motion constraints provided by an inertial measurement unit, constructing a multi-source fused nonlinear optimization model, and solving it to obtain the pose estimate of the navigation system; and, according to the pose estimate at the current moment and the static observation information remaining after dynamic-point removal, constructing and updating a dynamic-static decoupled map based on observation consistency constraints.
  2. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 1, wherein generating the three-dimensional target detection result with spatial positions and categories comprises the following steps: acquiring an image of the target scene and lidar point cloud data, and performing target recognition on the image with a target detection network to obtain two-dimensional bounding boxes and two-dimensional visual detection target categories; mapping each two-dimensional bounding box into three-dimensional space according to preset camera intrinsic and extrinsic parameters to form a view frustum region, and clustering the point cloud within the frustum region to obtain candidate three-dimensional targets; and performing semantic consistency screening and target bounding box optimization on the candidate three-dimensional targets based on the two-dimensional visual detection target category information obtained from recognition on the image, generating three-dimensional bounding boxes and target category information as the three-dimensional target detection result.
  3. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 2, wherein clustering the point cloud within the frustum region to obtain candidate three-dimensional targets comprises the following steps: preprocessing the input lidar point cloud data; performing density clustering on the point cloud within the frustum region with the DBSCAN algorithm to form a set of candidate clusters, obtaining candidate three-dimensional targets; and performing semantic consistency screening and target bounding box optimization on the candidate three-dimensional targets based on the target category information obtained from target recognition, generating three-dimensional bounding boxes and target category information as the three-dimensional target detection result.
  4. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 3, wherein generating three-dimensional bounding boxes and target category information as the three-dimensional target detection result through semantic consistency screening and target bounding box optimization comprises the following steps: computing consistency scores between the clusters and the two-dimensional visual detection target categories based on the two-dimensional visual detection target category information, performing semantic consistency screening on the point cloud clustering results, and fitting an initial three-dimensional bounding box from the screened clusters; and optimizing the initial three-dimensional bounding box by adjusting its center position and size so that the box is consistent with the geometric characteristics of the clustered point cloud, and aligning the box with the two-dimensional visual detection target category information to obtain the three-dimensional target detection result.
  5. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 1, wherein removing dynamic points from the visual and lidar observations based on dynamic-feature discrimination constraints according to the tracked and identified dynamic targets to obtain static observation information comprises the following steps: constructing a dynamic-feature discrimination constraint for visual feature points in the image based on consistency changes in three-dimensional depth between the projection region of the dynamic target's three-dimensional detection box and the feature points, and identifying and removing potential dynamic points to obtain static visual feature points; and constructing a motion-feature discrimination constraint for points in the lidar point cloud based on the spatial constraint of the dynamic target's three-dimensional detection box and on consistency changes between the point cloud's radial velocity and the target's predicted velocity, and identifying and removing potential dynamic points to obtain a static lidar point cloud.
  6. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 5, wherein constructing the dynamic-feature discrimination constraint for visual feature points in the image, identifying and removing potential dynamic points, and obtaining static visual feature points comprises the following steps: projecting the dynamic target's three-dimensional detection box onto the current image plane to form a two-dimensional dynamic region mask; judging visual feature points whose pixel coordinates fall inside the two-dimensional dynamic region mask to be dynamic points and removing them; for visual feature points that do not fall inside the mask, computing each feature point's three-dimensional position by a multi-view geometric method as its actual position, and predicting its current position from its position at the previous moment and the current camera pose; and computing the depth residual between the predicted and actual positions: if the residual exceeds a threshold, treating the current feature point as a potential dynamic point and removing it, obtaining static visual feature points that satisfy the static assumption.
  7. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 5, wherein constructing the motion-feature discrimination constraint for points in the lidar point cloud and identifying and removing potential dynamic points comprises the following steps: transforming the current frame's lidar point cloud into the same local coordinate system based on the tracked and identified dynamic targets, and marking a point as a candidate dynamic point if it falls inside any dynamic target detection box; computing the observed radial velocity of each candidate dynamic point and its residual against the predicted target velocity projected onto the line-of-sight direction, and, if the velocity residual is below a threshold, treating the candidate as a dynamic point and removing it; and, for points that do not fall inside any dynamic target's three-dimensional detection box, removing as dynamic points those whose radial velocity deviates from the static assumption.
  8. The multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling of claim 5, wherein the static observation information comprises static visual feature points and a static lidar point cloud; for the static visual feature points remaining after removal, constructing a visual reprojection residual term based on the reprojection error as the visual feature constraint; and, for the static lidar point cloud remaining after removal, constructing a point cloud geometric residual based on geometric consistency as the lidar point cloud geometric constraint.
  9. A multi-source fusion navigation system based on three-dimensional target-level dynamic and static decoupling, characterized by comprising: a three-dimensional target detection module configured to acquire an image of the target scene and lidar point cloud data, guide point cloud clustering based on two-dimensional target information in the image, generate three-dimensional target detection results with spatial positions and categories, and track the three-dimensional targets; a dynamic point removal module configured to remove dynamic points from the visual and lidar observations based on dynamic-feature discrimination constraints according to the tracked and identified dynamic targets, obtaining static observation information; a pose estimation module configured to construct visual feature constraints and lidar point cloud geometric constraints from the obtained static observation information, fuse the motion constraints provided by the inertial measurement unit, construct a multi-source fused nonlinear optimization model, and solve it to obtain the pose estimate of the navigation system; and a map construction and updating module configured to construct and update a dynamic-static decoupled map based on observation consistency constraints, according to the pose estimate at the current moment and the static observation information remaining after dynamic-point removal.
  10. A multi-source fusion navigation system based on three-dimensional target-level dynamic and static decoupling, characterized by comprising a data acquisition device and a processor, wherein the processor is configured to execute the steps of the multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling as claimed in any one of claims 1 to 8.
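
The frustum-guided clustering of claims 2-3 can be sketched in code. The following is not part of the patent: it is a minimal Python illustration in which the intrinsics, thresholds, and all function names are made up, and the DBSCAN step is replaced by a simple greedy region-growing clustering so the sketch stays self-contained.

```python
import numpy as np

def project_to_image(points, K):
    """Project Nx3 camera-frame points to pixel coordinates with intrinsics K."""
    uv = (K @ points.T).T            # homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]    # perspective divide

def frustum_filter(points, K, bbox):
    """Keep points whose image projection falls inside the 2D bounding box,
    i.e. inside the view frustum formed by back-projecting the box."""
    u_min, v_min, u_max, v_max = bbox
    in_front = points[:, 2] > 0
    uv = project_to_image(points, K)
    inside = ((uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) &
              (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    return points[in_front & inside]

def euclidean_cluster(points, eps=0.5, min_pts=5):
    """Greedy region-growing clustering (a simplified stand-in for DBSCAN)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            idx = frontier.pop()
            d = np.linalg.norm(points - points[idx], axis=1)
            neighbors = [j for j in unvisited if d[j] < eps]
            for j in neighbors:
                unvisited.discard(j)
            cluster.extend(neighbors)
            frontier.extend(neighbors)
        if len(cluster) >= min_pts:
            clusters.append(points[cluster])
    return clusters

def fit_aabb(cluster):
    """Fit an axis-aligned 3D bounding box (an initial box before the
    optimization step of claim 4) to a cluster."""
    return cluster.min(axis=0), cluster.max(axis=0)
```

In this sketch, each 2D detection box selects a candidate point subset, the largest surviving cluster becomes a candidate 3D target, and `fit_aabb` stands in for the initial bounding box that claim 4 then refines.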
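
The visual dynamic-point check of claim 6 can likewise be illustrated. Again this is not the patent's implementation: a hypothetical sketch in which the mask is simplified to the axis-aligned extent of the projected box corners, triangulation is assumed to have already produced the observed 3D positions, and the threshold value is invented.

```python
import numpy as np

def project(K, p):
    """Project a single 3D camera-frame point to pixel coordinates."""
    uv = K @ p
    return uv[:2] / uv[2]

def box_mask_2d(K, corners3d):
    """Project the 8 corners of a dynamic target's 3D detection box and take
    the axis-aligned extent of the projections as the 2D dynamic region mask."""
    uvs = np.array([project(K, c) for c in corners3d])
    return uvs.min(axis=0), uvs.max(axis=0)

def classify_feature_points(K, feats_uv, feats_xyz_pred, feats_xyz_obs,
                            mask, depth_thresh=0.2):
    """Claim-6-style check: drop feature points inside the dynamic mask, then
    drop points whose predicted-vs-observed depth residual exceeds the threshold."""
    lo, hi = mask
    static = []
    for uv, p_pred, p_obs in zip(feats_uv, feats_xyz_pred, feats_xyz_obs):
        if np.all(uv >= lo) and np.all(uv <= hi):
            continue                      # inside the dynamic region mask
        if abs(p_pred[2] - p_obs[2]) > depth_thresh:
            continue                      # depth residual too large: potential dynamic point
        static.append(uv)
    return static
```

The surviving points play the role of the static visual feature points that later feed the reprojection residual of claim 8.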
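
The lidar-side check of claim 7 compares each point's observed radial velocity with the predicted target velocity projected onto the line of sight: a small residual means the point moves with the tracked target and is rejected as dynamic. A hedged sketch, with invented thresholds and an axis-aligned box standing in for the detection box:

```python
import numpy as np

def in_box(p, lo, hi):
    """True if point p lies inside the axis-aligned box [lo, hi]."""
    return np.all(p >= lo) and np.all(p <= hi)

def radial_velocity_residual(p, v_point, v_target_pred):
    """Residual between the point's observed radial velocity and the predicted
    target velocity projected onto the line-of-sight direction."""
    los = p / np.linalg.norm(p)            # line-of-sight unit vector from the sensor
    return abs(v_point @ los - v_target_pred @ los)

def split_dynamic_points(points, vels, box, v_target_pred,
                         res_thresh=0.3, static_thresh=0.2):
    """Keep the static point cloud after claim-7-style dynamic rejection."""
    lo, hi = box
    static = []
    for p, v in zip(points, vels):
        if in_box(p, lo, hi):
            # candidate dynamic point: a small residual means it moves with the target
            if radial_velocity_residual(p, v, v_target_pred) < res_thresh:
                continue                   # consistent with the moving target: reject
        elif abs(v @ (p / np.linalg.norm(p))) > static_thresh:
            continue                       # radial velocity violates the static assumption
        static.append(p)
    return static
```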

Description

Multi-source fusion navigation method and system based on three-dimensional target-level dynamic and static decoupling

Technical Field

The invention relates to the technical field of navigation, in particular to a multi-source fusion navigation method and system based on three-dimensional target-level dynamic and static decoupling.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

SLAM (Simultaneous Localization and Mapping) technology is a core supporting means for autonomous navigation and environment perception in intelligent driving, mobile robots, unmanned aerial vehicle mapping, and similar systems. Traditional SLAM systems depend on a single sensor, but in real, complex environments every sensor has inherent limitations, making the dual requirements of high precision and robustness difficult to meet. Vision sensors provide rich image texture and semantic information, which favors feature extraction and scene understanding, but they are easily affected by illumination changes, motion blur, sparse texture, and similar factors, and their stability is poor. Lidar offers accurate ranging and robustness to illumination, an advantage when reconstructing geometry, but it can still produce ranging distortion under glass reflection, rain and fog occlusion, or in sparse regions, and it lacks semantic understanding. An inertial measurement unit (IMU) can provide a high-frequency motion prior suitable for short-term dynamic compensation, but it drifts easily and cannot independently achieve long-term accurate positioning. To overcome these problems, multi-source fusion SLAM techniques have been developed.
Existing lidar/vision/inertial fusion SLAM frameworks can improve positioning accuracy and robustness to some extent, but their core remains joint estimation over geometric features, and they lack efficient, accurate perception of dynamic targets in real scenes. Moving targets such as vehicles, pedestrians, and cyclists are common in dynamic environments; they break the spatio-temporal consistency of laser point clouds and visual features, causing feature matching errors, point cloud degradation, unstable state estimation, and even system divergence. Existing multi-source SLAM systems generally lack spatio-temporal consistency modeling between vision and lidar in dynamic scenes, so dynamic targets are difficult to collaboratively identify and reject across the vision and point cloud modalities, which further destabilizes positioning. In the map updating stage, existing multi-source SLAM systems generally fuse the vision and lidar observations separately, without effectively distinguishing and controlling the dynamic target observations obtained by different sensors; moving targets are thus wrongly written into the map under multi-modal observation, gradually producing map pollution such as "ghosts" or "trails".
Disclosure of Invention

To solve the above problems, the invention provides a multi-source fusion navigation method and system based on three-dimensional target-level dynamic and static decoupling. Two-dimensional target semantic information guides point cloud clustering to generate and track three-dimensional targets with categories; on this basis, cross-modally consistent removal of dynamic points across vision and lidar is realized; the static observations remaining after removal and the IMU motion constraints are integrated into a nonlinear optimization; and, combined with observation consistency constraints, dynamic-static decoupled map construction and updating is completed, thereby improving the stability of positioning and mapping in dynamic scenes and suppressing map pollution.

To achieve the above purpose, the present invention adopts the following technical scheme. The invention provides a multi-source fusion navigation method based on three-dimensional target-level dynamic and static decoupling, comprising the following steps: acquiring an image of a target scene and lidar point cloud data, guiding point cloud clustering based on two-dimensional target information in the image, generating a three-dimensional target detection result with spatial positions and categories, and tracking the three-dimensional targets; identifying dynamic targets from the tracking and, based on dynamic-feature discrimination constraints, removing dynamic points from the visual and lidar observations to obtain static observation information; and, based on the obtained static observation information, constructing a visual feature constraint and a lidar point cloud geometric constraint, fusing the motion constraint provided by an inertial measurement unit, constructing a multi-source fused nonlinear optimization model, and solving it to obtain the pose estimate of the navigation system.
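
The multi-source fused optimization described above can be illustrated on a deliberately reduced toy problem. This is not the patent's optimization model: the sketch below collapses the visual, lidar, and IMU constraints into a linear weighted least-squares estimate of a single translation, with made-up data and weights, purely to show how residual terms from different sensors are combined under one objective.

```python
import numpy as np

def fuse_translation(src, dst, t_imu, w_lidar=1.0, w_imu=0.5):
    """Weighted least-squares fusion of lidar point correspondences
    (src[i] + t ~ dst[i]) with an IMU-predicted translation prior (t ~ t_imu).
    A linear toy stand-in for the multi-source nonlinear optimization model."""
    # Stack one residual equation per correspondence plus one for the prior.
    targets = np.vstack([dst - src, t_imu[None, :]])
    weights = np.concatenate([np.full(len(src), w_lidar), [w_imu]])
    # Normal equations for this linear problem reduce to a weighted mean.
    return (weights[:, None] * targets).sum(axis=0) / weights.sum()
```

In the full method, each sensor would instead contribute nonlinear residuals (reprojection errors, point cloud geometric errors, preintegrated IMU terms) over the pose, solved iteratively; the weighting principle, however, is the same.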