CN-122023928-A - Image and point cloud feature fusion linear high-light overflow small target detection method and electronic equipment

CN 122023928 A

Abstract

The invention discloses a method, and electronic equipment, for detecting linear small targets affected by specular highlight overflow by fusing image and point cloud features. The method comprises: synchronously acquiring a microscopic image and three-dimensional point cloud data of the same position; extracting multi-scale visual features from the image; extracting geometric features of the point cloud based on multi-radius neighborhood analysis; projecting the extracted bimodal features into a unified bird's-eye-view (BEV) space; deeply fusing them through bidirectional cross attention and an adaptive gated fusion mechanism; generating a background suppression mask from the point cloud geometric features to suppress background interference from bonding pads, the substrate and the like; and feeding the processed features into an anchor-free three-dimensional detection head that directly regresses the three-dimensional bounding box parameters of each bonding wire. Through the deep, complementary fusion of image and point cloud, the invention effectively addresses industrial inspection problems such as highlight overflow, weak texture, sparse point clouds and complex background interference, and achieves high-precision, high-robustness three-dimensional localization and measurement of bonding wires.

Inventors

  • WANG YINGZHI
  • CHEN JIONG
  • WANG HONGZHE
  • LI DONGYUE
  • LI YANG
  • ZHANG CHAO
  • SUN JIAYU

Assignees

  • Changchun University of Science and Technology (长春理工大学)
  • Jilin Bohui Technology Co., Ltd. (吉林省博辉科技有限公司)

Dates

Publication Date
2026-05-12
Application Date
2026-02-05

Claims (9)

  1. A method for detecting linear small targets affected by highlight overflow by fusing image and point cloud features, characterized by comprising the following steps: acquiring a microscopic image and three-dimensional point cloud data collected at the same position; extracting multi-scale features from the microscopic image to generate image multi-scale features; extracting multi-scale geometric features from the three-dimensional point cloud data to generate point cloud geometric features; projecting the image multi-scale features into a bird's-eye-view (BEV) space to generate image BEV features; projecting the point cloud geometric features into the BEV space to generate point cloud BEV features; deeply fusing the image BEV features and the point cloud BEV features to generate fused BEV features; generating a background suppression mask based on the point cloud geometric features, and performing background suppression on the fused BEV features by using the mask; and inputting the background-suppressed fused BEV features into an anchor-free three-dimensional detection head and predicting the three-dimensional bounding box parameters of each bonding wire, wherein the three-dimensional bounding box parameters comprise center point coordinates, size, orientation and confidence.
  2. The method of claim 1, wherein the multi-scale feature extraction of the microscopic image comprises: extracting multi-scale image features through a convolutional neural network comprising a four-level residual structure; enhancing and fusing the multi-scale image features through a feature pyramid network; and weighting the fused multi-scale features through a hierarchical attention mechanism to generate the final image multi-scale features.
  3. The method of claim 1, wherein the multi-scale geometric feature extraction of the three-dimensional point cloud data comprises: for each point in the point cloud, searching for neighborhood points at a plurality of preset radius scales, wherein the radii comprise three values: r1 = 50 μm, for capturing the local morphology of the line structure; r2 = 100 μm, for capturing the shape of the bonding wire at a stable scale; and r3 = 200 μm, for capturing the global bending tendency; for the neighborhood point set at each scale, calculating its covariance matrix and carrying out eigenvalue decomposition to obtain eigenvalues, wherein the neighborhood covariance matrix is C = (1/N) Σ_{p_i ∈ N(p)} (p_i − p̄)(p_i − p̄)^T, wherein the summation runs over all points p_i in the neighborhood, N is the number of neighborhood points used to calculate the mean p̄, and (p_i − p̄)(p_i − p̄)^T is the vector outer product; calculating geometric descriptors representing local geometric characteristics based on the eigenvalues, wherein the geometric descriptors comprise linearity and planarity; and concatenating the geometric descriptors at the plurality of scales and fusing them through an attention mechanism to generate the point cloud geometric features.
  4. The method of claim 1, wherein projecting the image multi-scale features into the bird's-eye-view space adopts a dense UV-to-BEV projection method comprising: for each pixel in the image, back-projecting it, according to the camera intrinsic and extrinsic parameters, to its intersection with a reference plane of specified height in the world coordinate system; mapping the intersection point to the grid coordinates of the bird's-eye view; and filling the image features corresponding to the pixel into the bird's-eye-view grid to generate the image BEV features.
  5. The method of claim 1, wherein the deep fusion combines bidirectional cross attention with gated fusion, comprising: taking the image BEV features and the point cloud BEV features respectively as query vectors and key-value vectors, and performing bidirectional cross attention calculation to obtain point cloud enhanced features and image enhanced features; generating an image credibility map and a point cloud credibility map from the image enhanced features and the point cloud enhanced features through a convolutional network; and performing a weighted summation of the image enhanced features and the point cloud enhanced features based on the image credibility map and the point cloud credibility map to generate the fused BEV features.
  6. The method of claim 1, wherein generating the background suppression mask based on the point cloud geometric features comprises: identifying linear structure regions and planar background regions according to the linearity in the point cloud geometric features; generating a mask corresponding to the bird's-eye-view space, wherein mask values corresponding to the linear structure regions are set to an enhancement weight and mask values corresponding to the planar background regions are set to a suppression weight; and weighting the fused BEV features element by element with the mask to realize background suppression.
  7. The method of claim 1, wherein the anchor-free three-dimensional detection head comprises a classification branch and a regression branch; the classification branch is used for predicting, for each BEV grid cell, the probability that it contains a bonding wire center; and the regression branch is used for regressing the parameters of the three-dimensional bounding box, comprising regressing the two-dimensional offset and height of the center point and the size values using a smooth L1 loss, and predicting the orientation angle using a scheme based on bin classification plus residual regression.
  8. The method of claim 1, wherein the method is optimized by end-to-end training with a composite loss function comprising a classification loss for supervising the bonding wire center point prediction, a regression loss for supervising the regression of the three-dimensional bounding box center point and size, and a direction classification loss and a direction residual loss for supervising the orientation angle prediction; the total loss function is the weighted sum of the classification loss, the regression loss, the direction classification loss and the direction residual loss.
  9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 8 when executing the program.
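The multi-radius neighborhood analysis of claim 3 can be sketched as follows. This is an editor's illustration, not code from the patent: the function names, the brute-force radius search, and the default radius values are assumptions (the patent specifies 50/100/200 μm; here the radii are plain parameters in whatever unit the point cloud uses).

```python
import numpy as np

def neighborhood_descriptors(points, center, radius):
    """Linearity/planarity of the neighborhood of `center` within `radius`.

    Follows the claim-3 recipe: gather neighbors, form the covariance
    matrix C = (1/N) * sum_i (p_i - mean)(p_i - mean)^T, eigendecompose,
    and derive the standard eigenvalue-based shape descriptors.
    """
    d = np.linalg.norm(points - center, axis=1)
    nbrs = points[d <= radius]
    if len(nbrs) < 3:                                # degenerate neighborhood
        return 0.0, 0.0
    mean = nbrs.mean(axis=0)
    diff = nbrs - mean
    cov = diff.T @ diff / len(nbrs)                  # sum of outer products / N
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]     # lam1 >= lam2 >= lam3
    lam1 = lam[0] + 1e-12
    linearity = (lam[0] - lam[1]) / lam1             # ~1 for wire-like sets
    planarity = (lam[1] - lam[2]) / lam1             # ~1 for pad/substrate sets
    return linearity, planarity

def multi_scale_descriptor(points, center, radii=(0.05, 0.10, 0.20)):
    """Concatenate the descriptors over the three preset radii (claim 3)."""
    feats = []
    for r in radii:
        feats.extend(neighborhood_descriptors(points, center, r))
    return np.array(feats)
```

A perfectly straight point set yields linearity close to 1 and planarity close to 0, which is the signal the later background-suppression step relies on. A production version would use a k-d tree for the radius queries instead of the O(N) distance scan.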
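The dense UV-to-BEV projection of claim 4 amounts to ray-plane intersection followed by grid quantization. The sketch below is an editor's assumption about the geometry (pinhole model, world-to-camera extrinsics `p_cam = R @ p_world + t`); the patent does not fix these conventions.

```python
import numpy as np

def pixel_to_bev(u, v, K, R, t, plane_z=0.0):
    """Back-project pixel (u, v) to its intersection with the world plane
    z = plane_z, as in the dense UV-to-BEV scheme of claim 4.

    K: 3x3 intrinsics; R, t: world-to-camera rotation/translation.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # pixel's viewing ray
    ray_world = R.T @ ray_cam                           # rotate ray to world
    cam_origin = -R.T @ t                               # camera centre in world
    # Solve cam_origin.z + s * ray_world.z = plane_z for the ray parameter s.
    s = (plane_z - cam_origin[2]) / ray_world[2]
    return cam_origin + s * ray_world

def world_to_bev_cell(p_world, x_range, y_range, grid_shape):
    """Quantize a world point onto integer BEV grid coordinates."""
    gx = int((p_world[0] - x_range[0]) / (x_range[1] - x_range[0]) * grid_shape[0])
    gy = int((p_world[1] - y_range[0]) / (y_range[1] - y_range[0]) * grid_shape[1])
    return gx, gy
```

Filling each pixel's image feature into its computed cell then yields the image BEV feature map; cells hit by several pixels would typically be averaged or max-pooled (the patent does not specify which).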
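The adaptive gated fusion of claim 5 can be sketched in numpy by standing in for the learned convolutional credibility networks with a simple channel-wise dot product; everything below (the 1x1-conv surrogate, the softmax normalization between the two modalities) is the editor's assumption about one plausible realization, not the patent's implementation.

```python
import numpy as np

def gated_fuse(f_img, f_pc, w_img, w_pc):
    """Adaptive gated fusion sketch (claim 5): per-cell credibility maps
    from each enhanced BEV feature map, normalised against each other and
    used for a weighted sum of the two modalities.

    f_img, f_pc: (C, H, W) enhanced BEV features.
    w_img, w_pc: (C,) surrogate weights replacing the learned convolution.
    """
    logit_img = np.tensordot(w_img, f_img, axes=1)   # (H, W) credibility logits
    logit_pc = np.tensordot(w_pc, f_pc, axes=1)
    m = np.maximum(logit_img, logit_pc)              # stable two-way softmax
    e_img = np.exp(logit_img - m)
    e_pc = np.exp(logit_pc - m)
    a_img = e_img / (e_img + e_pc)                   # image credibility map
    a_pc = 1.0 - a_img                               # point cloud credibility map
    fused = a_img[None] * f_img + a_pc[None] * f_pc  # weighted-sum fusion
    return fused, a_img, a_pc
```

The two credibility maps sum to one per cell, so the fusion degrades gracefully: where the point cloud is locally sparse the image branch dominates, and vice versa in highlight-overflow regions, which is the complementarity the abstract emphasizes.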
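Claim 6's background suppression mask follows directly from the eigenvalue descriptors. The thresholds and the enhancement/suppression weights in this sketch are illustrative values chosen by the editor; the patent only states that linear regions get an enhancement weight and planar regions a suppression weight.

```python
import numpy as np

def background_mask(linearity, planarity, lin_thr=0.7, plane_thr=0.7,
                    enhance=1.5, suppress=0.2):
    """Claim-6 sketch: BEV cells whose dominant geometry is linear (bonding
    wires) get an enhancement weight, cells that look planar (pads,
    substrate) get a suppression weight, everything else passes through."""
    mask = np.ones_like(linearity)
    mask[linearity >= lin_thr] = enhance
    mask[(planarity >= plane_thr) & (linearity < lin_thr)] = suppress
    return mask
```

The fused BEV features are then multiplied element-wise by this mask, so pad and substrate responses are attenuated before they reach the detection head.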
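The bin-plus-residual orientation scheme of claim 7 decodes as follows. The bin count of 12 is an editor's assumption; the patent does not fix it.

```python
import numpy as np

def decode_orientation(bin_logits, bin_residuals, num_bins=12):
    """Claim-7 orientation decoding sketch: pick the most likely angle bin
    (classification) and refine it with that bin's regressed residual."""
    b = int(np.argmax(bin_logits))                # classified bin index
    bin_width = 2.0 * np.pi / num_bins
    center = (b + 0.5) * bin_width                # bin-centre angle
    angle = center + bin_residuals[b]             # residual refinement
    return angle % (2.0 * np.pi)
```

Splitting the angle into a coarse class and a small residual avoids the wrap-around discontinuity that makes direct angle regression unstable, which is the usual motivation for this design in 3D detection heads.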
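The composite loss of claim 8 can be sketched as a weighted sum of the four named terms. The choice of binary cross-entropy for the center heatmap, softmax cross-entropy for the direction bins, and the weights `w_reg`/`w_dir` are the editor's assumptions; the patent names the terms but not their exact forms or weighting.

```python
import numpy as np

def composite_loss(cls_pred, cls_tgt, box_pred, box_tgt,
                   dir_logits, dir_bin_tgt, dir_res_pred, dir_res_tgt,
                   w_reg=1.0, w_dir=0.2):
    """Claim-8 sketch: classification + box regression (smooth L1, per
    claim 7) + direction-bin cross-entropy + direction-residual smooth L1."""
    eps = 1e-7
    p = np.clip(cls_pred, eps, 1.0 - eps)
    # Binary cross-entropy on the bonding-wire centre-point prediction.
    l_cls = -np.mean(cls_tgt * np.log(p) + (1 - cls_tgt) * np.log(1 - p))

    def smooth_l1(x):
        a = np.abs(x)
        return np.mean(np.where(a < 1.0, 0.5 * a ** 2, a - 0.5))

    l_reg = smooth_l1(box_pred - box_tgt)         # centre offset/height/size

    # Softmax cross-entropy over the orientation bins.
    z = dir_logits - dir_logits.max(axis=-1, keepdims=True)
    log_soft = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    l_dir_cls = -np.mean(np.take_along_axis(log_soft, dir_bin_tgt[..., None], -1))

    l_dir_res = smooth_l1(dir_res_pred - dir_res_tgt)
    return l_cls + w_reg * l_reg + w_dir * (l_dir_cls + l_dir_res)
```

For near-perfect predictions every term is close to zero, so end-to-end training can drive all four objectives jointly.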

Description

Image and point cloud feature fusion linear high-light overflow small target detection method and electronic equipment

Technical Field

The invention relates to the technical field of integrated circuit inspection, and in particular to a method, and electronic equipment, for detecting linear small targets affected by highlight overflow by fusing image and point cloud features.

Background

Hybrid integrated circuits are widely used in high-reliability scenarios such as aerospace and precision electronics, where internal interconnections typically rely on wire bonding processes. The bonding wire diameter is typically 20-30 microns, which places extremely high requirements on the accuracy and stability of the inspection system. Current inspection relies mainly on: 1. manual microscopic examination, which is inefficient and unstable; 2. algorithms based on single-modality images, which are very sensitive to reflection, highlights and occlusion and carry no three-dimensional information; and 3. traditional point cloud feature algorithms, for which bonding wires are sparse and slender, so geometric information is easily lost after voxelization. The prior art therefore struggles to achieve high precision, high robustness and high efficiency simultaneously in complex industrial environments. The invention provides a bonding wire detection method based on deep image-point cloud fusion, which exploits the geometric complementarity of image texture and point clouds to improve detection robustness and accuracy, and can detect stably in scenes with strong specular reflection, uneven illumination, weak image texture and locally sparse point clouds.
Disclosure of the Invention

The technical scheme adopted by the invention to solve the above technical problems is as follows. The invention provides a method for detecting linear small targets affected by highlight overflow by fusing image and point cloud features, comprising the following steps: acquiring a microscopic image and three-dimensional point cloud data collected at the same position; extracting multi-scale features from the microscopic image to generate image multi-scale features; extracting multi-scale geometric features from the three-dimensional point cloud data to generate point cloud geometric features; projecting the image multi-scale features into a bird's-eye-view (BEV) space to generate image BEV features; projecting the point cloud geometric features into the BEV space to generate point cloud BEV features; deeply fusing the image BEV features and the point cloud BEV features to generate fused BEV features; generating a background suppression mask based on the point cloud geometric features, and performing background suppression on the fused BEV features by using the mask; and inputting the background-suppressed fused BEV features into an anchor-free three-dimensional detection head and predicting the three-dimensional bounding box parameters of each bonding wire, wherein the three-dimensional bounding box parameters comprise center point coordinates, size, orientation and confidence.

Further, the multi-scale feature extraction of the microscopic image includes: extracting multi-scale image features through a convolutional neural network comprising a four-level residual structure; enhancing and fusing the multi-scale image features through a feature pyramid network; and weighting the fused multi-scale features through a hierarchical attention mechanism to generate the final image multi-scale features.
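The hierarchical attention weighting of the multi-scale image features described above can be sketched as a learned re-weighting across scales. The pooling-then-softmax formulation below is an editor's assumption about one minimal realization; the patent only names the mechanism, not its architecture.

```python
import numpy as np

def hierarchical_scale_attention(feats):
    """Sketch of hierarchical attention over multi-scale features: each
    scale's map is summarised by global average pooling, the summaries are
    turned into scale weights with a softmax, and the maps are re-weighted
    and summed.

    feats: list of (C, H, W) maps, assumed resized to a common H x W.
    """
    stack = np.stack(feats)                     # (S, C, H, W)
    energy = stack.mean(axis=(1, 2, 3))         # (S,) pooled scale summaries
    e = np.exp(energy - energy.max())
    w = e / e.sum()                             # softmax scale weights
    return (w[:, None, None, None] * stack).sum(axis=0), w
```

In a trained network the pooled summaries would pass through a small learned head before the softmax; the fixed pooling here is only to make the weighting step concrete.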
Further, the multi-scale geometric feature extraction of the three-dimensional point cloud data includes: for each point in the point cloud, searching for neighborhood points at a plurality of preset radius scales, wherein the radii comprise three values: r1 = 50 μm, for capturing the local morphology of the line structure; r2 = 100 μm, for capturing the shape of the bonding wire at a stable scale; and r3 = 200 μm, for capturing the global bending tendency. For the neighborhood point set at each scale, its covariance matrix is calculated and eigenvalue decomposition is carried out to obtain eigenvalues, wherein the neighborhood covariance matrix is C = (1/N) Σ_{p_i ∈ N(p)} (p_i − p̄)(p_i − p̄)^T, wherein the summation runs over all points p_i in the neighborhood, N is the number of neighborhood points used to calculate the mean p̄, and (p_i − p̄)(p_i − p̄)^T is the vector outer product. Geometric descriptors representing local geometric characteristics, comprising linearity and planarity, are calculated based on the eigenvalues; and the geometric descriptors at the plurality of scales are concatenated and fused through an attention mechanism to generate the point cloud geometric features. Further, projecting the image multi-scale features into the bird's-eye-view space adopts a dense UV-to-BEV projection method, which comprises: for each pixel in the image, back-projecting it, according to the camera intrinsic and extrinsic parameters, to its intersection with a reference plane of specified height in the world coordinate system