CN-122023163-A - Underwater image enhancement and restoration method based on double-domain self-adaptive fusion module


Abstract

The invention discloses an underwater image enhancement and restoration method based on a dual-domain adaptive fusion module, comprising the following steps: feature extraction, module replacement, frequency decomposition, high-frequency processing, low-frequency modulation, frequency-domain reconstruction, normalization, statistical enhancement, nonlinear enhancement, dual-source gated fusion, multi-scale enhancement, and network reflow with image restoration. The invention offers clear innovations in the choice of position within the network structure, the processing strategy of the frequency modeling branch, the modeling of the normalization branch, and the design of the fusion mechanism, and can effectively improve the fineness, stability, and robustness of underwater image restoration.

Inventors

  • WANG XIAOLIANG
  • CAO LU
  • LIU YUZHEN
  • ZHOU XIAOLAN
  • SHU HONGMEI
  • BAI LIANG
  • CAO ZONGJING
  • WANG MENG
  • LIAO YANQIAO

Assignees

  • Hunan University of Science and Technology (湖南科技大学)

Dates

Publication Date
2026-05-12
Application Date
2026-04-14

Claims (10)

  1. An underwater image enhancement and restoration method based on a dual-domain adaptive fusion module, comprising the following steps:
     S1, feature extraction: extract a multi-scale feature map from an input underwater image to obtain feature representations at different levels of the encoder in a U-Net network;
     S2, module replacement: replace the third-layer downsampling of the U-Net backbone and the corresponding third-layer upsampling with a dual-domain adaptive fusion module D2AFM, forming a symmetric dual-module structure at the encoding and decoding ends, where the D2AFM comprises a frequency modeling branch, a normalization enhancement branch, and a gated fusion unit;
     S3, frequency decomposition: feed the input features of the encoder-side D2AFM into the frequency modeling branch, apply a two-dimensional discrete wavelet transform (DWT), and decompose the features into one low-frequency sub-band and three high-frequency sub-bands;
     S4, high-frequency processing: apply a fixed soft-threshold function to the high-frequency sub-bands to suppress noise, and enhance valid high-frequency details via channel weighting to obtain high-frequency enhanced features;
     S5, low-frequency modulation: introduce a learnable modulation factor into the low-frequency sub-band, dynamically adjusting the contribution of low-frequency information during fusion according to the degradation condition and content of the input image, to obtain the low-frequency features;
     S6, frequency-domain reconstruction: reconstruct the high-frequency enhanced features and the low-frequency features via the inverse wavelet transform (IDWT) and take the absolute value to obtain the output features of the frequency modeling branch;
     S7, normalization: synchronously feed the input features of the encoder-side D2AFM into the normalization enhancement branch to obtain preliminarily normalized features;
     S8, statistical enhancement: apply global average pooling to the preliminarily normalized features, perform residual addition with the encoder-side D2AFM input features, and apply group normalization to obtain the statistically enhanced features;
     S9, nonlinear enhancement: input the statistically enhanced features into a lightweight deep feed-forward network (DFFN) to obtain the output features of the normalization enhancement branch;
     S10, dual-source gated fusion: the gated fusion unit performs weighted fusion of the frequency modeling branch output and the normalization enhancement branch output, adaptively adjusting the contribution of the two branches according to the input feature distribution, to obtain the final output of the encoder-side D2AFM;
     S11, multi-scale enhancement: feed the final output of the encoder-side D2AFM into a bottleneck layer to enhance the multi-scale representation;
     S12, network reflow and image restoration: send the bottleneck layer output to the decoder for processing; the output of the decoder-side D2AFM is sent to the subsequent decoder stages to reconstruct the feature map step by step, outputting a restored underwater image and completing the overall enhancement and restoration process.
  2. The method according to claim 1, wherein in step S3, the frequency modeling branch applies a two-dimensional discrete wavelet transform (DWT) to the encoder-side D2AFM input features X ∈ R^(C×H×W), decomposing them into a low-frequency sub-band X_LL and three high-frequency sub-bands X_LH, X_HL, X_HH ∈ R^(C×(H/2)×(W/2)), where R denotes the real-number domain, C the number of channels, and H and W the height and width of the feature map; X_LL contains the globally smooth information, while X_LH, X_HL, and X_HH capture vertical, horizontal, and diagonal structural details, respectively.
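As an illustration of the decomposition in claim 2, the following is a minimal numpy sketch of a one-level 2D Haar DWT and its inverse (the claim does not name the wavelet basis, so Haar is an assumption; the function names are ours):

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar DWT on a (C, H, W) feature map (H, W even).

    Returns (LL, LH, HL, HH), each of shape (C, H//2, W//2): LL holds the
    smooth approximation; LH/HL/HH hold vertical/horizontal/diagonal detail.
    """
    a = x[:, 0::2, 0::2]  # top-left of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0  # differences across columns: vertical edges
    hl = (a + b - c - d) / 2.0  # differences across rows: horizontal edges
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstructs the (C, H, W) feature map."""
    C, h, w = ll.shape
    x = np.zeros((C, 2 * h, 2 * w))
    x[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[:, 0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[:, 1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

The round trip `haar_idwt2(*haar_dwt2(x))` reproduces `x` exactly, which mirrors the DWT/IDWT pairing used in steps S3 and S6.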
  3. The method according to claim 2, wherein in step S4, the three high-frequency sub-bands are concatenated along the channel dimension, giving X_H = Concat(X_LH, X_HL, X_HH) ∈ R^(3C×(H/2)×(W/2)), where Concat(·) denotes the concatenation operation; a fixed soft-threshold operator is then applied to obtain the sparse features X_S = sign(X_H) · max(|X_H| − τ, 0), where τ is the threshold, sign(·) denotes the sign function, max(·, 0) denotes taking the maximum with zero, and |·| denotes the absolute-value operation; channel weighting is then applied to the sparse features X_S to obtain the high-frequency enhanced features.
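The soft-threshold operator of claim 3 is the standard shrinkage function; a one-line numpy sketch:

```python
import numpy as np

def soft_threshold(x, tau):
    """Fixed soft-threshold operator from claim 3:
    T_tau(x) = sign(x) * max(|x| - tau, 0).
    Coefficients within [-tau, tau] are zeroed (noise suppression);
    larger coefficients are shrunk toward zero by tau.
    """
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)
```

For example, with τ = 1, the inputs (−2, −0.1, 0.5, 3) map to (−1, 0, 0, 2): small high-frequency coefficients (mostly noise) vanish while strong edges survive, shrunk by τ.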
  4. The method according to claim 3, wherein in step S6, the high-frequency enhanced features and the low-frequency features are reconstructed into spatial features by the inverse wavelet transform (IDWT) and the absolute value is taken, giving the output features of the frequency modeling branch F_freq = |IDWT(X_L, X_HE)|, where X_HE denotes the high-frequency enhanced features, X_L the low-frequency features, |·| the absolute-value operation, and IDWT(·) the inverse wavelet transform.
  5. The method according to claim 4, wherein in step S7, the encoder-side D2AFM input features are synchronously fed into the normalization enhancement branch: a 3×3 convolutional layer first extracts local receptive fields, group normalization is then applied to the 3×3 convolution output, and finally a ReLU activation function is introduced to enhance the nonlinear capability, yielding the preliminarily normalized features.
  6. The method according to claim 5, wherein in step S8, the preliminarily normalized features are fed into a global average pooling unit; the pooling result is residually added to the encoder-side D2AFM input features, and group normalization is applied again to the result of the residual addition to obtain the statistically enhanced features.
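The statistics-enhancement step of claim 6 can be sketched in a few lines of numpy; the group count and the absence of learned affine parameters are assumptions for illustration:

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """Group normalization over a (C, H, W) map (no learned affine here)."""
    C, H, W = x.shape
    g = x.reshape(groups, C // groups, H, W)
    mu = g.mean(axis=(1, 2, 3), keepdims=True)
    var = g.var(axis=(1, 2, 3), keepdims=True)
    return ((g - mu) / np.sqrt(var + eps)).reshape(C, H, W)

def statistics_enhance(x_norm, x_in, groups=2):
    """Claim 6 / step S8: global average pooling of the preliminarily
    normalized features, residual addition with the D2AFM input, then a
    second group normalization."""
    gap = x_norm.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1) statistics
    return group_norm(gap + x_in, groups)          # broadcast residual add
```

The pooled (C, 1, 1) statistics broadcast over the spatial grid during the residual addition, so the step injects global channel statistics into every spatial position before renormalizing.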
  7. The method according to claim 6, wherein in step S9, the statistically enhanced features are input into a lightweight deep feed-forward network (DFFN) comprising, connected in sequence, a head pointwise convolution, a depthwise convolution, a GELU activation function, and a tail pointwise convolution; the head pointwise convolution implements cross-channel information interaction, the depthwise convolution convolves each channel independently to capture spatial context, the GELU activation function introduces a smooth nonlinear mapping, and the tail pointwise convolution re-integrates and outputs all features, giving the final output features of the normalization enhancement branch F_norm; the DFFN is expressed as F_norm = Conv_p2(GELU(DConv(Conv_p1(X_SE)))), where Conv_p1 denotes the head pointwise convolution, DConv the depthwise convolution, GELU the Gaussian error linear unit activation function, Conv_p2 the tail pointwise convolution, and X_SE the statistically enhanced input to the DFFN.
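A minimal numpy sketch of the DFFN pipeline of claim 7 (pointwise conv → 3×3 depthwise conv → GELU → pointwise conv); the weight shapes and the tanh-approximate GELU are assumptions for illustration, not the patent's exact parameterization:

```python
import numpy as np

def gelu(x):
    """Gaussian Error Linear Unit (tanh approximation)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def dffn(x, w_in, dw_kernel, w_out):
    """Lightweight deep feed-forward network sketch (claim 7):
    head 1x1 (pointwise) conv -> 3x3 depthwise conv -> GELU -> tail 1x1 conv.
    Assumed weight shapes:
      w_in:      (C_hidden, C)     pointwise mixing across channels
      dw_kernel: (C_hidden, 3, 3)  one spatial kernel per channel
      w_out:     (C, C_hidden)
    """
    C, H, W = x.shape
    h = np.einsum('oc,chw->ohw', w_in, x)       # head pointwise conv
    pad = np.pad(h, ((0, 0), (1, 1), (1, 1)))   # 'same' padding
    dw = np.zeros_like(h)
    for i in range(3):                          # depthwise 3x3 conv
        for j in range(3):
            dw += dw_kernel[:, i, j][:, None, None] * pad[:, i:i+H, j:j+W]
    return np.einsum('co,ohw->chw', w_out, gelu(dw))  # tail pointwise conv
```

The split between cheap pointwise channel mixing and per-channel spatial convolution is what keeps the DFFN lightweight relative to a dense convolution.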
  8. The method according to claim 7, wherein in step S10, the output features F_freq of the frequency modeling branch and the final output features F_norm of the normalization enhancement branch are concatenated along the channel dimension to obtain the concatenated features F_cat = Concat(F_freq, F_norm) ∈ R^(2C×H×W), where Concat(·) denotes the concatenation operation and R the real-number domain; F_cat is input to a 1×1 convolutional layer, and a gating weight matrix G = Sigmoid(Conv_1×1(F_cat)) is generated by the convolution mapping and the Sigmoid activation function, where Conv_1×1 denotes the 1×1 convolutional layer; based on the gating weight matrix G, F_freq and F_norm are fused by weighting to obtain the final output of the encoder-side D2AFM, F_out = G ⊙ F_freq + (1 − G) ⊙ F_norm, where ⊙ denotes element-wise multiplication.
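The dual-source gated fusion of claim 8 reduces to a few numpy operations when the 1×1 convolution is written as a channel-mixing matrix; the complementary (1 − G) weighting of the second branch is a reconstruction from the garbled claim text, not verbatim from the source:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(f_freq, f_norm, w_gate):
    """Dual-source gated fusion (claim 8 / step S10). The two branch
    outputs, each (C, H, W), are concatenated on the channel axis; a 1x1
    convolution (here a plain channel-mixing matrix w_gate of shape
    (C, 2C)) followed by Sigmoid yields a gating map G in (0, 1), and the
    branches are blended as G * f_freq + (1 - G) * f_norm.
    """
    cat = np.concatenate([f_freq, f_norm], axis=0)      # (2C, H, W)
    g = sigmoid(np.einsum('oc,chw->ohw', w_gate, cat))  # gate per channel/pixel
    return g * f_freq + (1.0 - g) * f_norm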
  9. The method according to claim 8, wherein in step S11, the final output of the encoder-side D2AFM is processed by a first 1×1 convolution and a GELU activation function to obtain features F_b; F_b is input to a stripe branch, a square branch, and a global branch to enhance the multi-scale representation; in the square branch, a depthwise convolution with kernel size K×K is applied to enlarge the receptive field, K being a positive integer; in the stripe branch, 1×K and K×1 stripe depthwise convolutions collect stripe context information; the global branch is trained on cropped image patches of dimension 3×256, and the spatial size of the features in the bottleneck layer is 64×64; dual-domain processing is employed in the global branch: frequency channel attention (FA) is first applied to F_b as F_FA = IFFT(FFT(F_b) ⊙ Conv_1×1(GAP(FFT(F_b)))), where FFT denotes the fast Fourier transform, IFFT its inverse, F_FA the output of the frequency channel attention FA, Conv_1×1 a 1×1 convolutional layer in the FA, GAP global average pooling, and ⊙ element-wise multiplication; the output of the FA is further input to a spatial channel attention module SA, expressed as F_g = SA(F_FA), where F_g is the output of the global branch and Conv_1×1 also denotes a 1×1 convolutional layer inside the SA; the results of the stripe, square, and global branches are fused by addition and modulated by a second 1×1 convolution and a ReLU activation function to obtain the bottleneck layer output.
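The frequency channel attention of claim 9 can be sketched with numpy's FFT. The exact composition inside FA is garbled in the source, so the placement of GAP, the magnitude pooling, and the Sigmoid bounding below are all assumptions; the 1×1 convolution is modeled as a (C, C) channel-mixing matrix:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def frequency_channel_attention(x, w):
    """Hedged sketch of the frequency channel attention FA of claim 9:
    FFT over the spatial dimensions, per-channel weights derived from the
    global average of the spectrum magnitude passed through a 1x1 conv
    (matrix w, shape (C, C)) and Sigmoid, element-wise modulation of the
    spectrum, then inverse FFT (real part)."""
    spec = np.fft.fft2(x, axes=(-2, -1))
    gap = np.abs(spec).mean(axis=(-2, -1))   # (C,) spectral channel statistics
    attn = sigmoid(w @ gap)[:, None, None]   # (C, 1, 1) bounded weights
    out = np.fft.ifft2(spec * attn, axes=(-2, -1))
    return out.real
```

Scaling the whole spectrum of a channel by one weight is equivalent to scaling that channel in the spatial domain; richer variants would weight individual frequency bins, which the claim's wording does not settle.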
  10. The method according to claim 9, wherein in step S12, the output of the bottleneck layer is sent to the decoder to participate in the subsequent feature reconstruction; meanwhile, the output of the decoder-side D2AFM is sent into the progressive upsampling path of the subsequent decoder stages. In the decoding stage, the decoder fuses shallow and deep features through skip connections to progressively restore spatial resolution; the third-layer upsampling operation is replaced by the decoder-side D2AFM to further optimize the image restoration process; finally, progressive upsampling and convolutional reconstruction at each decoder layer yield a restored image with the same resolution as the input image.

Description

Underwater image enhancement and restoration method based on double-domain self-adaptive fusion module

Technical Field

The invention relates to the field of image processing, in particular to an underwater image enhancement and restoration method based on a dual-domain adaptive fusion module.

Background

Due to the absorption and scattering of light during propagation, underwater images often suffer from reduced contrast, color shift, blurred texture details, and amplified random noise, which seriously affects their usability and limits their application in fields such as ocean exploration, underwater robot navigation, archaeology, and the military. Conventional restoration methods based on physical models generally rely on simplifying assumptions about the imaging model, and it is difficult to estimate the model parameters accurately in real, complex environments, so the reliability of the restoration results is insufficient. With the development of deep learning, end-to-end methods based on convolutional neural networks have been widely applied to underwater image restoration. The U-Net structure, as a typical encoder-decoder framework, can extract features at multiple scales and preserve local information through skip connections, so it performs well in underwater image restoration tasks. However, such methods still have significant drawbacks when dealing with underwater image degradation. Convolutional neural networks mainly perform local feature extraction in the spatial domain and lack the ability to explicitly model frequency-domain characteristics. The low-frequency component contains the overall structure and contours of the image, while the high-frequency component carries texture and edge details.
In an underwater environment, low-frequency information is attenuated by scattering, while high-frequency information is often disturbed by noise. Existing methods cannot fully exploit frequency information and find it difficult to achieve both structure recovery and noise suppression. On the other hand, existing models usually adopt direct concatenation or fixed weighting in the feature fusion stage. Such static fusion strategies cannot adaptively adjust feature contributions according to the degree of degradation of different images, leading to excessive smoothing of detail in some cases and aggravated noise amplification in others. Meanwhile, the distribution of underwater images varies markedly with acquisition depth, water quality, and illumination; common batch normalization or layer normalization is insufficiently stable across scenes, limiting the generalization ability of the network. Some studies have attempted to introduce attention mechanisms to strengthen feature expression, or to promote model robustness through regularization. But attention mechanisms are mostly confined to the spatial or channel domain and do not incorporate frequency-domain priors, so they remain deficient in handling frequency characteristics. Although regularization can alleviate overfitting to some extent, it adds computational overhead and cannot effectively resolve the tension between detail retention and noise suppression. In practice, these deficiencies lead to networks that perform well on specific datasets but degrade significantly in complex real underwater scenes, with unstable restoration results.
Disclosure of Invention

In order to solve the above technical problems, the invention provides an underwater image enhancement and restoration method based on a dual-domain adaptive fusion module, with a simple algorithm and high efficiency. The technical scheme is an underwater image enhancement and restoration method based on the dual-domain adaptive fusion module, comprising the following steps: S1, feature extraction: extract a multi-scale feature map from an input underwater image to obtain feature representations at different levels of the encoder in a U-Net network; S2, module replacement: replace the third-layer downsampling of the U-Net backbone and the corresponding third-layer upsampling with a dual-domain adaptive fusion module D2AFM, forming a symmetric dual-module structure at the encoding and decoding ends, where the D2AFM comprises a frequency modeling branch, a normalization enhancement branch, and a gated fusion unit; S3, frequency decomposition: feed the input features of the encoding e