CN-121767167-B - Visible-light-image-guided LiDAR depth completion in-memory indexing method and device

CN 121767167 B

Abstract

The invention provides a method and device for in-memory indexing in visible-light-image-guided LiDAR depth completion, in the technical field of digital image processing. The method extracts RGB features and depth features and concatenates them into a feature map. For each depth center point in the concatenated feature map, it obtains the depth neighborhood feature values of a plurality of spatial neighborhood directions and classifies those directions by spatial orientation. It then generates a quantized input index for each direction from the depth center feature value, the RGB center feature value, and the depth neighborhood feature value, queries a pre-built lookup table in parallel according to the classification result and the quantized input indexes to obtain an intermediate fusion result for each direction, and sums the intermediate fusion results of all directions to obtain the fusion output feature. The pre-built lookup table is constructed by precomputing, with a trained RGBD fusion network, the fusion output for every possible index. The computational cost and inference latency are thereby greatly reduced.
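
For illustration, a minimal Python sketch of the core in-memory indexing step described above: quantize the depth center value, the RGB center value, and one depth neighbor value into fixed-bit-width integers, pack them into a single index, and read the precomputed fusion output from a lookup table. The 4-bit width, the [0, 1] value range, and the randomly filled table are assumptions of this sketch; in the invention the table is filled offline by the trained RGBD fusion network.

    import numpy as np

    BITS = 4                 # assumed preset bit width per feature
    LEVELS = 1 << BITS       # 16 quantization levels

    def quantize(x, lo=0.0, hi=1.0, levels=LEVELS):
        """Uniformly quantize a value in [lo, hi] to an integer level."""
        q = np.round((x - lo) / (hi - lo) * (levels - 1))
        return int(np.clip(q, 0, levels - 1))

    def pack_index(d_center, rgb_center, d_neighbor):
        """Concatenate three quantized values into one lookup-table index."""
        return ((quantize(d_center) << (2 * BITS))
                | (quantize(rgb_center) << BITS)
                | quantize(d_neighbor))

    # Hypothetical table: one fusion output per possible index (16^3 = 4096).
    rng = np.random.default_rng(0)
    lut = rng.standard_normal(LEVELS ** 3).astype(np.float32)

    # One center pixel and eight neighborhood directions: query the table
    # per direction and sum the intermediate fusion results.
    d_c, rgb_c = 0.37, 0.62
    neighbors = [0.30, 0.41, 0.35, 0.39, 0.28, 0.44, 0.33, 0.36]
    fused = sum(lut[pack_index(d_c, rgb_c, d_n)] for d_n in neighbors)

Each query thus replaces the fusion network's arithmetic with a single memory read, which is the source of the reduced computational cost and inference latency.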

Inventors

  • LAI RUI
  • LU ZIXUAN
  • GUAN JUNTAO
  • LI DONG
  • MA RUI
  • ZHU ZHANGMING

Assignees

  • Xidian University (西安电子科技大学)

Dates

Publication Date
2026-05-12
Application Date
2026-03-04

Claims (10)

  1. A visible-light-image-guided LiDAR depth completion in-memory indexing method, comprising: performing feature extraction on an RGB image to be completed and a LiDAR depth map, and concatenating the extracted RGB features and the extracted depth features along the channel dimension to obtain a concatenated feature map; for a depth center point in the concatenated feature map, obtaining the depth neighborhood feature values corresponding to a plurality of spatial neighborhood directions, and classifying the plurality of spatial neighborhood directions by spatial orientation to obtain a classification result; generating a quantized input index for each spatial neighborhood direction from the depth center feature value, the RGB center feature value, and the depth neighborhood feature value of the depth center point; querying in parallel, according to the classification result and the corresponding quantized input indexes, a pre-built lookup table matching each category in the classification result to obtain an intermediate fusion result for each spatial neighborhood direction, the pre-built lookup table being constructed in advance by quantizing, with a trained RGBD fusion network, the depth center feature, the RGB center feature, and the depth neighborhood feature into integers of a preset bit width, concatenating the integers into indexes, and precomputing the fusion output corresponding to every index; and summing the intermediate fusion results over all spatial neighborhood directions to obtain the fusion output feature of the depth center point, the fusion output feature being used to generate the final completed depth map.
  2. The method of claim 1, wherein concatenating the extracted RGB features and the extracted depth features along the channel dimension to obtain the concatenated feature map comprises: alternately arranging the RGB features and the depth features along the channel dimension using an interleaved arrangement strategy to obtain the concatenated feature map.
  3. The method of claim 1, wherein obtaining, for the depth center point in the concatenated feature map, the depth neighborhood feature values corresponding to the plurality of spatial neighborhood directions and classifying the directions by spatial orientation to obtain the classification result comprises: dividing the plurality of spatial neighborhood directions of the depth center point into four orthogonal directions and four diagonal directions according to spatial orientation; obtaining the depth neighborhood feature values corresponding to the four orthogonal directions and the four diagonal directions, respectively, through preset translation and concatenation operations (a code sketch follows the claims); stacking the depth neighborhood feature values of the four orthogonal directions along a first new dimension to form an S-type neighborhood feature cube; and stacking the depth neighborhood feature values of the four diagonal directions along a second new dimension to form a C-type neighborhood feature cube, the S-type and C-type neighborhood feature cubes constituting the classification result for the plurality of spatial neighborhood directions.
  4. The method of claim 3, wherein obtaining the depth neighborhood feature values corresponding to the four orthogonal directions and the four diagonal directions through the preset translation and concatenation operations comprises: padding the depth features in the concatenated feature map to obtain a padded feature map; translating the padded feature map according to a first group of translation vectors so that the depth neighborhood feature pixels of the four orthogonal directions coincide with the coordinates of the depth center feature, thereby obtaining the depth neighborhood feature values of the four orthogonal directions; and translating the padded feature map according to a second group of translation vectors so that the depth neighborhood feature pixels of the four diagonal directions coincide with the coordinates of the depth center feature, thereby obtaining the depth neighborhood feature values of the four diagonal directions.
  5. The method of claim 1, wherein generating the quantized input index for each spatial neighborhood direction from the depth center feature value, the RGB center feature value, and the depth neighborhood feature value of the depth center point comprises: quantizing the depth center feature value into a first integer value of a first preset bit width; quantizing the RGB center feature value into a second integer value of the first preset bit width; quantizing the depth neighborhood feature value of each spatial neighborhood direction into a third integer value of the first preset bit width; and concatenating the first, second, and third integer values to generate the quantized input index for each spatial neighborhood direction.
  6. The method of claim 3, wherein the classification result includes an S-type neighborhood category corresponding to the depth neighborhood feature values of the four orthogonal directions and a C-type neighborhood category corresponding to the depth neighborhood feature values of the four diagonal directions, and the pre-built lookup table includes an S-type pre-built lookup table and a C-type pre-built lookup table; and wherein querying in parallel, according to the classification result and the corresponding quantized input indexes, the pre-built lookup table matching each category to obtain the intermediate fusion result of each spatial neighborhood direction comprises: synchronously sending, according to the S-type or C-type neighborhood category, the corresponding quantized input indexes to the corresponding S-type or C-type pre-built lookup table for parallel retrieval; and reading the intermediate fusion results of the corresponding spatial neighborhood directions from the S-type or C-type pre-built lookup table.
  7. The method of claim 6, wherein the training process of the trained RGBD fusion network comprises: computing, by an affinity sensing module, the affinity matrices corresponding to the S-type neighborhood category and the C-type neighborhood category, respectively; and performing, by a propagation computation module, weighted fusion of the depth neighborhood features and the depth center feature using the affinity matrices, to complete the training of the RGBD fusion network.
  8. The method of claim 7, wherein the affinity sensing module comprises a rotation integration module, a masked depthwise convolution module, and a pointwise convolution module connected in sequence, and wherein computing, by the affinity sensing module, the affinity matrices corresponding to the S-type and C-type neighborhood categories respectively comprises: inputting the S-type and the C-type neighborhood feature data respectively into the rotation integration module, which rotates same-category neighborhood feature data to a preset reference direction, computes with shared network parameters, rotates the result back to its original orientation, and outputs direction-corrected intermediate features; inputting the direction-corrected intermediate features into the masked depthwise convolution module, which performs a convolution with a kernel of preset size under a position mask and outputs local convolution features based on three feature pixels, the position mask being configured so that the convolution is effective only at the pixel positions corresponding to the depth center feature, the RGB center feature, and the depth neighborhood feature; and inputting the local convolution features into the pointwise convolution module, which applies a nonlinear transformation through several consecutive pointwise convolution layers with interleaved nonlinear activation functions to generate the affinity matrix for the current neighborhood direction.
  9. The method of claim 7, wherein the weighted fusion of the depth neighborhood features and the depth center feature by the propagation computation module using the affinity matrices comprises: multiplying each affinity matrix with the corresponding depth neighborhood feature value to obtain an affinity neighborhood feature; weighting and adding the affinity neighborhood feature and the depth center feature through a learnable weight parameter to obtain a primary fusion feature (a code sketch follows the claims); and performing a propagation computation on the primary fusion feature, fusing information across channels through several consecutive pointwise convolution layers with interleaved nonlinear activation functions, to output a unidirectional fusion feature.
  10. A visible-light-image-guided LiDAR depth completion in-memory indexing device, comprising: an extraction and concatenation module, configured to perform feature extraction on the RGB image to be completed and the LiDAR depth map, and to concatenate the extracted RGB features and the extracted depth features along the channel dimension to obtain a concatenated feature map; a classification module, configured to obtain, for the depth center point in the concatenated feature map, the depth neighborhood feature values corresponding to the plurality of spatial neighborhood directions, and to classify the plurality of spatial neighborhood directions by spatial orientation to obtain a classification result; a generation module, configured to generate a quantized input index for each spatial neighborhood direction from the depth center feature value, the RGB center feature value, and the depth neighborhood feature value of the depth center point; a parallel query module, configured to query in parallel, according to the classification result and the corresponding quantized input indexes, a pre-built lookup table matching each category in the classification result to obtain the intermediate fusion result of each spatial neighborhood direction, the pre-built lookup table being constructed in advance by quantizing, with a trained RGBD fusion network, the depth center feature, the RGB center feature, and the depth neighborhood feature into integers of a preset bit width, concatenating the integers into indexes, and precomputing the fusion output corresponding to every index; and a summation module, configured to sum the intermediate fusion results over all spatial neighborhood directions to obtain the fusion output feature of the depth center point, the fusion output feature being used to generate the final completed depth map.
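
As a rough illustration of the translation-based neighbor gathering in claims 3 and 4 (referenced there), the following Python/NumPy sketch pads the depth map and translates it so that each direction's neighbor lands on the center coordinate, then stacks the four orthogonal directions into an S-type cube and the four diagonal directions into a C-type cube along a new dimension. The edge padding mode and all variable names are assumptions of this sketch.

    import numpy as np

    def shift(x, dy, dx):
        """Translate a padded map so the neighbor at offset (dy, dx) coincides with the center."""
        H, W = x.shape
        padded = np.pad(x, 1, mode="edge")   # assumed padding mode
        return padded[1 + dy : 1 + dy + H, 1 + dx : 1 + dx + W]

    depth = np.arange(25, dtype=np.float32).reshape(5, 5)

    orthogonal = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # N, S, W, E
    diagonal = [(-1, -1), (-1, 1), (1, -1), (1, 1)]    # NW, NE, SW, SE

    # One slice per direction along a new leading dimension.
    s_cube = np.stack([shift(depth, dy, dx) for dy, dx in orthogonal])  # (4, H, W)
    c_cube = np.stack([shift(depth, dy, dx) for dy, dx in diagonal])    # (4, H, W)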
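
The propagation step of claim 9 can likewise be sketched, here in PyTorch with a single channel, an illustrative softmax-normalized affinity, and a hypothetical scalar blend weight alpha; the shapes and the normalization are assumptions, not taken from the patent text.

    import torch

    H, W = 5, 5
    s_cube = torch.randn(4, H, W)                            # orthogonal depth neighbors
    affinity = torch.softmax(torch.randn(4, H, W), dim=0)    # per-direction affinity matrices
    center = torch.randn(H, W)                               # depth center feature
    alpha = torch.nn.Parameter(torch.tensor(0.5))            # learnable blend weight

    affine_neighbors = affinity * s_cube                     # affinity neighborhood features, (4, H, W)
    fused = alpha * affine_neighbors + (1 - alpha) * center  # per-direction primary fusion features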

Description

Visible-light-image-guided LiDAR depth completion in-memory indexing method and device

Technical Field

The invention relates to the technical field of digital image processing, and in particular to a method and device for in-memory indexing in visible-light-image-guided LiDAR depth completion.

Background

The LiDAR depth completion task aims to reconstruct a high-quality, dense depth map from a sparse point cloud by combining the corresponding high-resolution color (RGB) image. High-quality dense depth information is critical for numerous downstream applications, such as three-dimensional reconstruction, environment perception for autonomous driving, robot navigation and obstacle avoidance, and scene understanding and interaction in augmented or virtual reality.

In a depth completion neural network, the visible-light-depth (RGBD) fusion method is one of the core components that determine overall performance. It extracts the semantic and texture features provided by the RGB image and the depth information of the depth map, and integrates the two modalities through a specific fusion strategy, guiding the network to accurately fill in missing regions of the depth map and to restore fine depth details.

Existing depth completion methods have advanced significantly in completion accuracy and can be broadly classified into methods based on spatial propagation networks, methods based on attention mechanisms, and methods based on multi-scale feature interaction. These methods typically design complex RGBD fusion modules to capture correlations between cross-modal features, for example by repeatedly updating neighborhood features with iterative spatial propagation networks or by introducing global attention mechanisms to strengthen long-range dependencies. However, such high-performance fusion strategies generally suffer from excessive computational complexity. For example, the computation of fusion modules based on multi-scale feature interaction often reaches billions of multiply-accumulate operations (Giga Multiply-Accumulate Operations, GMACs), which a workstation equipped with a high-performance graphics processing unit (GPU) can handle, but which causes severe inference latency on resource-constrained edge devices such as embedded autonomous-driving computing platforms and mobile robot controllers.

Disclosure of the Invention

The embodiments of the invention provide a method and device for in-memory indexing in visible-light-image-guided LiDAR depth completion, which solve the inference-latency problem caused by the large computational cost of the prior art.
To solve the above technical problems, the embodiments of the invention provide the following technical solutions.

A first aspect of the invention provides a visible-light-image-guided LiDAR depth completion in-memory indexing method, comprising: performing feature extraction on the RGB image to be completed and the LiDAR depth map, and concatenating the extracted RGB features and the extracted depth features along the channel dimension to obtain a concatenated feature map; for a depth center point in the concatenated feature map, obtaining the depth neighborhood feature values corresponding to a plurality of spatial neighborhood directions, and classifying the plurality of spatial neighborhood directions by spatial orientation to obtain a classification result; generating a quantized input index for each spatial neighborhood direction from the depth center feature value, the RGB center feature value, and the depth neighborhood feature value of the depth center point; querying in parallel, according to the classification result and the corresponding quantized input indexes, a pre-built lookup table matching each category in the classification result to obtain an intermediate fusion result for each spatial neighborhood direction; and summing the intermediate fusion results over all spatial neighborhood directions to obtain the fusion output feature of the depth center point, the fusion output feature being used to generate the final completed depth map. The pre-built lookup table is constructed in advance by quantizing, with a trained RGBD fusion network, the depth center feature, the RGB center feature, and the depth neighborhood feature into integers of a preset bit width, concatenating the integers into indexes, and precomputing the fusion output corresponding to every index.

A second aspect of the invention provides a visible-light-image-guided LiDAR depth completion in-memory indexing device, comprising: an extraction and concatenation module, configured to perform feature extraction on the RGB image to be completed and the LiDAR depth map, and to concatenate the extracted RGB features and the extracted depth features along the channel dimension to obtain a concatenated feature map
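
A minimal sketch of the offline table construction described in the first aspect, assuming 4-bit quantization and a stand-in fusion function: every quantized (depth center, RGB center, depth neighbor) triple is enumerated, passed through the fusion function, and its output stored at the corresponding index. In the invention the outputs come from the trained RGBD fusion network; the function and file name below are placeholders.

    import numpy as np

    BITS = 4
    LEVELS = 1 << BITS

    def dequantize(q, lo=0.0, hi=1.0, levels=LEVELS):
        """Map an integer quantization level back to a representative value."""
        return lo + q / (levels - 1) * (hi - lo)

    def fusion_fn(d_c, rgb_c, d_n):
        """Placeholder for the trained RGBD fusion network's per-direction output."""
        return 0.5 * d_c + 0.3 * rgb_c + 0.2 * d_n

    # Precompute the fusion output for all LEVELS**3 = 4096 possible indexes.
    lut = np.empty(LEVELS ** 3, dtype=np.float32)
    for qd in range(LEVELS):
        for qr in range(LEVELS):
            for qn in range(LEVELS):
                idx = (qd << (2 * BITS)) | (qr << BITS) | qn
                lut[idx] = fusion_fn(dequantize(qd), dequantize(qr), dequantize(qn))

    np.save("fusion_lut.npy", lut)   # inference then reduces to memory reads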