CN-122023691-A - Underwater long-distance high-precision three-dimensional imaging method and system


Abstract

The invention relates to an underwater long-distance high-precision three-dimensional imaging method and system. The method comprises the following steps: S100, acquiring n gated images with different time gates; S200, inputting the n gated images into n encoders with independent parameters, respectively, for feature extraction to obtain n groups of feature representations; S300, fusing the n groups of feature representations using a cross-level feature fusion mechanism to generate fused features; and S400, decoding the fused features to generate a pixel-level depth map. With this arrangement, feature extraction capability and depth reconstruction precision are significantly improved, and the complementary information of the multiple images is effectively integrated through the cross-level feature fusion mechanism.
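
As an illustration only, the S200 to S400 data flow can be sketched in plain Python. The toy encoders, the averaging decoder and all function names below are hypothetical stand-ins, not the DINOv2 encoders or the decoder named in the claims. Element-wise summation at each level is assumed for the fusion step, as claim 6 describes.

```python
from typing import Callable, List, Sequence

Feature = List[float]   # one flattened feature map (toy stand-in)
Pyramid = List[Feature]  # features from one encoder, one entry per level


def toy_encoder(scale: float) -> Callable[[Sequence[float]], Pyramid]:
    """Return a hypothetical encoder that owns its own parameter `scale`.

    It emits two "levels" so that the fusion step has something to combine.
    """
    def encode(image: Sequence[float]) -> Pyramid:
        level0 = [scale * px for px in image]        # shallow level
        level1 = [scale * px * px for px in image]   # deeper level
        return [level0, level1]
    return encode


def fuse_same_level(pyramids: Sequence[Pyramid]) -> Pyramid:
    """Element-wise sum of the n encoders' feature maps, level by level."""
    n_levels = len(pyramids[0])
    fused: Pyramid = []
    for lvl in range(n_levels):
        maps = [p[lvl] for p in pyramids]
        fused.append([sum(vals) for vals in zip(*maps)])
    return fused


def toy_decoder(fused: Pyramid) -> List[float]:
    """Hypothetical decoder: average the fused levels into one value per pixel."""
    return [sum(vals) / len(fused) for vals in zip(*fused)]


def predict_depth(gated_images: Sequence[Sequence[float]],
                  encoders: Sequence[Callable[[Sequence[float]], Pyramid]]) -> List[float]:
    """Run S200 (encode), S300 (fuse) and S400 (decode) on n gated images."""
    if len(gated_images) != len(encoders):
        raise ValueError("need exactly one encoder per gated image")
    pyramids = [enc(img) for enc, img in zip(encoders, gated_images)]  # S200
    fused = fuse_same_level(pyramids)                                  # S300
    return toy_decoder(fused)                                          # S400


if __name__ == "__main__":
    images = [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]            # n = 3 gated images
    encoders = [toy_encoder(1.0), toy_encoder(2.0), toy_encoder(0.5)]
    print(predict_depth(images, encoders))
```

The point of the sketch is structural: each gated image passes through its own encoder, and fusion only combines feature maps that sit at the same level, so every level of the fused pyramid keeps the shape of a single encoder's output.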

Inventors

ZHOU RUIXIANG
CHEN WEI
HE WEI
SHE RONGBIN
JIAO GUOHUA

Assignees

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)

Dates

Publication Date: 20260512
Application Date: 20251230

Claims (10)

1. An underwater long-distance high-precision three-dimensional imaging method, characterized by comprising the following steps: S100, acquiring n gated images with different time gates; S200, inputting the n gated images with different time gates into n encoders with independent parameters, respectively, for feature extraction to obtain n groups of feature representations; S300, fusing the n groups of feature representations using a cross-level feature fusion mechanism to generate fused features; S400, decoding the fused features to generate a pixel-level depth map.
2. The method according to claim 1, wherein S100 specifically comprises: S110, acquiring an underwater two-dimensional image, and obtaining an initial depth estimation map based on the two-dimensional image; S120, generating n gated images with different time gates from the initial depth estimation map based on laser-gated imaging and/or an m code-channel coding strategy.
3. The method according to claim 2, wherein the laser-gated imaging in S120 specifically comprises: S121, emitting laser pulses toward an underwater scene; S122, adjusting the exposure time window of the gated camera and collecting reflected light at n different time points, respectively, to obtain n gated images corresponding to different distance layers.
4. The method of claim 3, wherein the m code-channel coding strategy in S120 adjusts the gating time so that exposure occurs in different time slices, causing the target to exhibit different intensity relationships across the n images.
5. The method according to any of claims 1-4, wherein the encoder in S200 is a DINOv2 encoder, and/or the fused features are decoded in S400 using a Dense Prediction Transformer (DPT) decoder.
6. The method of claim 5, wherein the cross-level feature fusion mechanism in S300 performs element-wise summation fusion of feature maps output by n encoders at the same level.
7. The method according to any one of claims 1-4 or 6, wherein the method is based on a deep learning model, and the training method of the deep learning model comprises the steps of: constructing a training data set containing n gated images and corresponding real depth labels; inputting the training data set into the deep learning model to obtain a predicted depth map; calculating the loss between the predicted depth map and the real depth label; and updating the parameters of the deep learning model according to the loss.
8. The method according to claim 7, wherein the loss between the predicted depth map and the real depth label is calculated using an L1 loss function, and/or the deep learning model is trained end to end using an adaptive moment estimation (Adam) optimizer.
9. An underwater long-distance high-precision three-dimensional imaging system for implementing the method of any of claims 1 to 8, the system comprising: a data preprocessing module for acquiring n gated images with different time gates; a feature extraction module comprising n independent DINOv2 encoders for extracting the feature representation of each gated image; a feature fusion module for fusing the plurality of feature representations to generate fused features; and a depth prediction module comprising a Dense Prediction Transformer (DPT) decoder for decoding the fused features to generate a pixel-level depth map.
10. The system of claim 9, wherein the data preprocessing module comprises: a pulsed laser for providing a time-modulated light source; a gating device, designed on the basis of a microchannel plate structure, for precisely controlling the exposure time; a synchronization control unit, implemented with an FPGA control board, for ensuring timing matching between laser emission and camera exposure; and an image acquisition and processing unit for acquiring, storing and preliminarily processing the gated images.
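
The gate timing implied by claims 3 and 10 can be illustrated with a short calculation. This is a sketch under stated assumptions: a refractive index of 1.33 for water, equal-width contiguous range slices, and an ideal zero-width laser pulse. The function names are hypothetical, and none of these values come from the patent itself.

```python
from typing import List, Tuple

C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum
N_WATER = 1.33                    # assumed refractive index of water


def round_trip_time_s(distance_m: float, n_medium: float = N_WATER) -> float:
    """Time for a light pulse to travel `distance_m` through the medium and back."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 2.0 * distance_m * n_medium / C_VACUUM_M_PER_S


def gate_schedule(near_m: float, far_m: float, n_gates: int) -> List[Tuple[float, float]]:
    """Split [near_m, far_m] into n equal range slices.

    Returns one (delay_s, width_s) pair per gate: the delay after laser
    emission at which the camera gate opens, and how long it stays open.
    """
    if n_gates < 1:
        raise ValueError("n_gates must be at least 1")
    if far_m <= near_m:
        raise ValueError("far_m must be greater than near_m")
    slice_m = (far_m - near_m) / n_gates
    width_s = round_trip_time_s(slice_m)  # same width for every slice
    schedule = []
    for k in range(n_gates):
        slice_start_m = near_m + k * slice_m
        schedule.append((round_trip_time_s(slice_start_m), width_s))
    return schedule


if __name__ == "__main__":
    for delay_s, width_s in gate_schedule(10.0, 40.0, 3):
        print(f"open gate after {delay_s * 1e9:.1f} ns for {width_s * 1e9:.1f} ns")
```

Under these assumptions a target 10 m away returns light after roughly 89 ns, which indicates the nanosecond-scale timing a synchronization unit has to hold between laser emission and camera exposure. Practical systems often use overlapping gates and must account for the finite pulse width, which this sketch ignores.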

Description

Underwater long-distance high-precision three-dimensional imaging method and system

Technical Field

The invention relates to the technical field of underwater imaging, and in particular to an underwater long-distance high-precision three-dimensional imaging method and system.

Background

Underwater optical three-dimensional imaging is a core technology in marine scientific research, resource exploration, engineering construction, underwater operations and related fields. It aims to accomplish, by optical means, key tasks ranging from macroscopic seabed topographic mapping to microscopic three-dimensional reconstruction of targets. The technology is widely applied in specific scenarios such as visual navigation for underwater robots, inspection of submarine pipelines and facilities, underwater archaeology, and biological observation.

However, the complex underwater environment poses serious challenges for optical imaging. First, water strongly absorbs light (especially in the red band), so light energy attenuates rapidly and the effective imaging distance is shortened; meanwhile, suspended particles in the water scatter light, which blurs the image, reduces contrast, and produces an obvious blue-green color cast. Second, as the detection distance and the turbidity of the water increase, backscattering significantly increases imaging interference, which reduces the amount of valid data, increases errors, and blurs target contours during three-dimensional reconstruction.
Finally, acquiring underwater data sets (particularly gated imaging data sets) is extremely difficult. On the one hand, data acquisition relies on special hardware such as expensive laser-gated cameras and high-power pulsed lasers; a single set of equipment usually costs hundreds of thousands of yuan, must be deployed in an underwater test site or an actual marine environment, and is constrained by factors such as weather, sea conditions and underwater visibility, so a single acquisition cycle can take days to weeks. On the other hand, depth annotation of underwater scenes lacks a reliable reference standard: lidar is easily disturbed by scattering under water, which distorts its data, and manual annotation can hardly cover every pixel of a complex scene. As a result, existing public underwater data sets, such as the Underwater Robot Picking Contest (URPC) underwater robot target detection data set, consist mainly of two-dimensional images and lack matching accurate depth labels and multi-frame gated sequences, so they cannot meet the need of deep learning models for large-scale, high-quality paired training data.

In addition, current underwater gated three-dimensional imaging faces a double dilemma: traditional algorithms have weak noise immunity, and deep learning methods are poorly adapted to the underwater environment. Consequently, existing schemes cannot simultaneously meet the underwater three-dimensional imaging requirements of long distance, high precision and strong robustness.
First, although traditional algorithms (such as the 12T coding method) have a clear physical basis, their noise immunity is weak; in complex underwater environments with significant water scattering and optical attenuation, their depth reconstruction accuracy drops sharply, and it is especially poor under long-distance, high-turbidity imaging conditions. Second, although existing deep learning methods (such as Gated2Gated) improve noise suppression, their model structures are designed mainly for atmospheric environments and do not fully account for the way global underwater scattering destroys the consistency of the image sequence; key mechanisms such as the cycle consistency constraint therefore fail in underwater scenes, and the models' adaptability and robustness are insufficient.

Disclosure of Invention

To address the above problems, the invention provides an underwater long-distance high-precision three-dimensional imaging method and system. The method extracts depth features from multiple gated images in parallel using multiple encoders, combines them with a cross-level fusion mechanism that preserves temporal information and physical constraints, adopts an end-to-end optimization training strategy to improve feature utilization efficiency, and optimizes the data set construction strategy to alleviate data scarcity and high cost, ensuring high-precision reconstruction while maintaining computational efficiency. The method and system can significantly improve the accuracy, robustness and generalization capability of underwater depth reconstruction, and provide reliab