CN-122023526-A - Model-based segmentation and 6D pose estimation iteration enhancement method, device and storage medium

CN122023526A

Abstract

The invention relates to a model-based segmentation and 6D pose estimation iterative enhancement method, device and storage medium, belonging to the technical field of image segmentation and three-dimensional point cloud processing. The method comprises: performing workpiece segmentation on an RGB image of a workpiece to obtain an initial segmentation mask; forming a workpiece point cloud from the current segmentation mask and an initial perceived depth map of the workpiece; obtaining a current estimated 6D pose based on alignment of a workpiece CAD model with the workpiece point cloud; inversely rendering the CAD model of the workpiece using the current estimated 6D pose in the camera coordinate system and the camera intrinsic parameters, and taking the visible region in the rendered depth map as an optimized segmentation mask; and iteratively executing these steps until convergence or a preset number of iterations is reached. Compared with the prior art, the method realizes mutual reinforcement between the segmentation result and the 6D pose estimation through the cycle of segmentation, point cloud extraction, pose estimation, CAD alignment feedback and segmentation enhancement, improving the accuracy with which an industrial robot arm recognizes and localizes the workpiece.
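As a non-authoritative sketch of the cycle described in the abstract (the function and parameter names are hypothetical, not from the patent; the segmentation network, pose estimator and renderer are passed in as callables), the mask/pose iteration might be structured like this in Python with NumPy:

```python
import numpy as np

def refine_mask_and_pose(depth, init_mask, estimate_pose, render_depth,
                         backproject, max_iters=5, tol=1e-3):
    """Iterative enhancement loop (steps S2-S5): the segmentation mask and the
    6D pose refine each other until the rendered depth of the CAD model agrees
    with the observed depth over the visible region.

    depth         : observed depth map (H, W), the initial perceived depth
    init_mask     : boolean mask (H, W) from the initial segmentation (S1)
    estimate_pose : points (N, 3) -> 4x4 pose of the CAD model in camera frame
    render_depth  : 4x4 pose -> rendered depth map (H, W) of the CAD model
    backproject   : (depth, mask) -> (N, 3) point cloud in camera coordinates
    """
    mask, pose = init_mask, np.eye(4)
    for _ in range(max_iters):
        points = backproject(depth, mask)      # S2: masked depth -> point cloud
        pose = estimate_pose(points)           # S3: coarse net + ICP refinement
        rendered = render_depth(pose)          # S4: inverse-render the CAD model
        mask = rendered > 0                    # valid depth pixels = visible region
        # Convergence test: mean depth error over the visible region (claim 8).
        err = np.abs(rendered[mask] - depth[mask]).mean() if mask.any() else np.inf
        if err < tol:
            break
    return mask, pose
```

In a real system, `estimate_pose` would wrap a learned pose network followed by ICP against the CAD point cloud, and `render_depth` would wrap an offscreen depth render of the CAD mesh; here they are deliberately left abstract.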

Inventors

  • Request for anonymity

Assignees

  • 上海优复博智能科技有限公司

Dates

Publication Date
2026-05-12
Application Date
2026-01-29

Claims (10)

  1. A model-based segmentation and 6D pose estimation iterative enhancement method, characterized by comprising the following steps: S1, performing workpiece segmentation on an RGB image of a workpiece by adopting a deep learning segmentation model to obtain an initial segmentation mask; S2, forming a workpiece point cloud according to the current segmentation mask and an initial perceived depth map of the workpiece; S3, aligning a CAD model of the workpiece with the workpiece point cloud, and adopting coarse positioning by a deep learning pose estimation model and fine positioning by the ICP algorithm to obtain a current estimated 6D pose; S4, inversely rendering the CAD model of the workpiece through a three-dimensional rendering library using the current estimated 6D pose of the workpiece in the camera coordinate system and the camera intrinsic parameters to obtain a rendered depth map, determining the visible region of the workpiece from the pixel region with valid depth values in the rendered depth map, and using the visible region as an optimized segmentation mask for subsequent processing; and S5, iteratively executing steps S2-S4, taking the obtained optimized segmentation mask as the input segmentation mask of step S2 in a new iteration and the initial perceived depth map of the workpiece as the input depth map of step S2 in the new iteration, and repeating until convergence or a preset number of iterations is reached.
  2. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S1, the deep learning segmentation model is one or more of SAM, UNet, UNet++ and Mask R-CNN.
  3. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S1, the RGB image of the workpiece is obtained by a camera.
  4. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S2, the initial perceived depth map of the workpiece is obtained by a camera.
  5. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S2, the depth map of the workpiece is segmented according to the current segmentation mask, and back-projection is performed to form the workpiece point cloud from the pixel values of the segmented depth map and the intrinsic parameters of the camera.
  6. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S3, the deep learning pose estimation model is one or more of PoseCNN, PVNet, OVE6D and FoundationPose.
  7. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein in step S4, the three-dimensional rendering library is pyrender; step S4 specifically comprises: generating a rendered depth map from the CAD model of the workpiece through the camera projection algorithm in pyrender, using the currently estimated 6D pose and the camera intrinsic parameters, and taking the visible region in the rendered depth map as the optimized segmentation mask.
  8. The iterative enhancement method for model-based segmentation and 6D pose estimation according to claim 1, wherein after step S5, the estimated pose obtained at convergence or upon reaching the preset number of iterations is taken as the final 6D pose estimation result of the algorithm, wherein the convergence criterion is that the mean depth error between the rendered depth map and the initial perceived depth map of the workpiece over the visible region is smaller than a specified threshold; the mean depth error is calculated as:

     E = (1 / |V|) * Σ_{u ∈ V} | D̂(u) − D(u) |

     wherein V represents the visible region, u represents a pixel, D̂(u) represents the depth value computed from the estimated pose, and D(u) represents the actual depth value.
  9. An electronic device, comprising: a memory storing computer-readable instructions, and a processor executing the computer-readable instructions stored in the memory to implement the model-based segmentation and 6D pose estimation iterative enhancement method according to any one of claims 1 to 8.
  10. A computer-readable storage medium having stored therein computer-readable instructions for execution by a processor in an electronic device to implement the model-based segmentation and 6D pose estimation iterative enhancement method according to any one of claims 1 to 8.
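To make the geometry in claims 5 and 8 concrete, here is a minimal NumPy sketch (function names and the exact array conventions are illustrative assumptions, not from the patent): back-projecting masked depth pixels to a camera-frame point cloud under the pinhole model, and the mean depth error used as the convergence criterion.

```python
import numpy as np

def backproject(depth, mask, K):
    """Claim 5: lift masked depth pixels into 3-D camera coordinates.

    depth : (H, W) depth map in metres
    mask  : (H, W) boolean segmentation mask
    K     : 3x3 camera intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    Returns an (N, 3) point cloud.
    """
    v, u = np.nonzero(mask & (depth > 0))   # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]         # X = (u - cx) * Z / fx
    y = (v - K[1, 2]) * z / K[1, 1]         # Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=1)

def mean_depth_error(rendered, observed, visible):
    """Claim 8: mean |D_hat(u) - D(u)| over the visible region V."""
    return float(np.abs(rendered[visible] - observed[visible]).mean())
```

The pose is considered converged when `mean_depth_error` over the rendered visible region falls below the specified threshold.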

Description

Model-based segmentation and 6D pose estimation iteration enhancement method, device and storage medium

Technical Field

The invention relates to the technical field of image segmentation and three-dimensional point cloud processing, in particular to a model-based segmentation and 6D pose estimation iterative enhancement method, device and storage medium.

Background

In industrial robot operations, precise positioning of workpieces is central to welding, assembly, gripping and intelligent interaction tasks. With the popularization of RGBD cameras and three-dimensional sensors, estimating the 6D pose (position + orientation) of an object through image segmentation and point cloud reconstruction is a common technical route. The prior art mainly comprises three stages: (1) a segmentation stage, in which the workpiece region is segmented from an RGB or RGBD image; (2) a point cloud stage, in which a corresponding point cloud is generated from the depth map and the workpiece point cloud is extracted; and (3) a pose estimation stage, in which the workpiece point cloud is aligned with the workpiece CAD model to obtain the 6D pose. However, existing methods usually perform one-shot forward inference and suffer the limitation that segmentation errors propagate directly into pose estimation, leaving the point cloud incomplete or contaminated with background noise. In industrial scenes, factors such as surface reflection, occlusion, complex weld seam structure and large pose variation make segmentation unstable and point cloud noise difficult to handle, so the 6D pose estimation result struggles to meet the requirements of precise robot operation. That is, workpiece alignment methods in existing industrial vision exhibit the following problems: 1. segmentation errors cannot be corrected by subsequent stages, and point cloud extraction is unstable; 2. the 6D pose estimation is affected by the segmentation result, and the alignment result is inaccurate; 3. the CAD model prior is underutilized and cannot be used to guide segmentation correction. Therefore, a collaborative iterative enhancement mechanism for segmentation and alignment is needed: segmentation is optimized in reverse using the CAD prior and the pose estimation result, and the optimized segmentation in turn improves point cloud quality, further improving pose estimation accuracy.

Disclosure of Invention

The invention aims to overcome the problems in the prior art, realizing mutual reinforcement between the segmentation result and the 6D pose estimation through a cyclic structure of segmentation, point cloud extraction, pose estimation, CAD alignment feedback and segmentation enhancement, thereby significantly improving the accuracy with which an industrial robot arm recognizes and localizes the workpiece. In order to achieve the above purpose, the technical scheme adopted by the invention is as follows. As a first aspect of the present invention, there is provided a model-based segmentation and 6D pose estimation iterative enhancement method, comprising the following steps: S1, initial segmentation to obtain the workpiece region, namely performing workpiece segmentation on the RGB image of the workpiece by adopting a deep learning segmentation model to obtain an initial segmentation mask; S2, generation and extraction of the workpiece point cloud based on the depth map, wherein the workpiece point cloud is formed from the current segmentation mask and the initial perceived depth map of the workpiece; S3, alignment of the CAD model and the workpiece point cloud, namely obtaining the current estimated 6D pose based on alignment of the workpiece CAD model with the workpiece point cloud, adopting coarse positioning by a deep learning pose estimation model and fine positioning by the ICP algorithm; S4, inversely rendering the CAD model of the workpiece through a three-dimensional rendering library using the current estimated 6D pose of the workpiece in the camera coordinate system and the camera intrinsic parameters to obtain a rendered depth map, determining the visible region of the workpiece from the pixel region with valid depth values in the rendered depth map, and using the visible region as an optimized segmentation mask for subsequent processing; and S5, iteratively executing steps S2-S4, taking the obtained optimized segmentation mask as the input segmentation mask of step S2 in a new iteration round and the initial perceived depth map of the workpiece as the input depth map of step S2 in the new round, and repeating until convergence or the preset number of iterations is reached. Further, in step S4, the depth map of the CAD model is inversely rendered using the currently estimated 6D pose of the workpiece in the camera coordinate system and the camera intrinsic parameters, so as to achieve the effect of optimizing the segmentation. Fu