CN-122023162-A - Underwater image cross-domain enhancement method based on unsupervised UCycleGAN
Abstract
The invention discloses an underwater image cross-domain enhancement method based on an unsupervised UCycleGAN, comprising the following steps: constructing an improved U-shaped generator; inputting a degraded underwater image into the U-shaped generator to output a fake clear image; feeding a real clear image and the fake clear image into a discriminator together and calculating an adversarial loss from the discrimination probability value; calculating a perceptual loss; forming a final composite loss function as a weighted sum of the adversarial loss, the perceptual loss, the cycle consistency loss and the identity loss; and inputting the underwater image into the trained U-shaped generator to obtain an enhanced clear image. The U-shaped generator designed by the invention can enhance the generator's ability to extract underwater image features and to process details.
Inventors
- ZHOU XIAOLAN
- PU YIWEN
- WANG MENG
- WANG XIAOLIANG
- LIU YUZHEN
- CAO ZONGJING
- SHU HONGMEI
- BAI LIANG
- LIAO YANQIAO
Assignees
- Hunan University of Science and Technology (湖南科技大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260413
Claims (10)
- 1. An underwater image cross-domain enhancement method based on an unsupervised UCycleGAN, characterized by comprising the following steps: Step one, constructing an improved U-shaped generator Pyra-Unet, wherein Pyra-Unet comprises an encoder downsampling module, a decoder upsampling module, a bottleneck layer module, and a multi-scale spatial pyramid attention module MSPA embedded in the skip connections; Step two, inputting the degraded underwater image into Pyra-Unet, which outputs a fake clear image; Step three, feeding a real clear image from the target-domain clear image dataset together with the fake clear image output by Pyra-Unet into a discriminator, the discriminator outputting a discrimination probability value from which the adversarial loss is calculated; Step four, inputting the real clear image and the fake clear image separately into a pre-trained deep convolutional neural network VGG, extracting the corresponding high-dimensional feature maps, and calculating the perceptual loss between the two high-dimensional feature maps; Step five, forming a final composite loss function as a weighted sum of the adversarial loss, the perceptual loss, the cycle consistency loss and the identity loss, and updating the parameters of Pyra-Unet and the discriminator by backpropagation, so that Pyra-Unet minimizes the total loss and the discriminator maximizes its discrimination capability; Step six, repeating steps one to five until Pyra-Unet and the discriminator reach Nash equilibrium to obtain the trained Pyra-Unet, and inputting the underwater image into the trained Pyra-Unet to obtain the enhanced clear image.
- 2. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 1, wherein in step one the encoder downsampling has three layers; the first downsampling layer comprises a first convolution layer, a first BN layer, a first ReLU layer and a first residual block connected in sequence, where BN denotes batch normalization and ReLU denotes the ReLU activation function; the second downsampling layer has the same structure as the first downsampling layer; and the third downsampling layer comprises a second convolution layer, a second BN layer and a second ReLU layer connected in sequence.
- 3. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 2, wherein in step one the bottleneck layer comprises a dilated convolution group, a convolution stacking unit, a dilated convolution operation unit and a feature concatenation unit; the dilated convolution group uses four serially connected convolution layers; the convolution stacking unit comprises dilated convolution, batch normalization and an activation function; and the dilated convolution operation in the dilated convolution operation unit is defined by a formula whose terms denote the position, the input feature map, the output at that position after the dilated convolution operation, the dilation rate, the index of the convolution layer within the dilated convolution group, and the weights of the convolution layer with that index.
- 4. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 3, wherein in step one the decoder upsampling has three layers; the first upsampling layer corresponds to the feature map of the third downsampling layer and comprises a first upsampling, a first skip connection and a first convolution; in the first upsampling, the downsampled features are upsampled after passing through the four dilated convolutions; in the first skip connection, the skip connection features that have passed through the third-layer MSPA are concatenated with the output of the third-layer MSPA, and the multi-scale features are weighted and fused through pyramid pooling and channel attention; the second upsampling layer corresponds to the feature map of the second downsampling layer and comprises a second upsampling, a second skip connection and a second convolution; in the second skip connection, the skip connection features that have passed through the second-layer MSPA are concatenated with the output of the second-layer MSPA; the third upsampling layer corresponds to the feature map of the first downsampling layer and comprises a third upsampling, a third skip connection and a third convolution; and in the third skip connection, the skip connection features that have passed through the first-layer MSPA are concatenated with the output of the first-layer MSPA.
- 5. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 4, wherein in step one the multi-scale spatial pyramid attention module MSPA comprises a hierarchical phantom convolution module HPC and a spatial pyramid recalibration module SPR; the HPC extracts enhanced spatial information through multi-scale features by splitting the input feature map along the channel dimension into a number of subsets and processing each subset through a hierarchical residual convolution chain, so that features with different receptive fields are fused; the HPC comprises a splitting unit, a hierarchical convolution chain and a concatenation unit; the splitting unit splits the input feature map uniformly along the channel dimension into subsets that each have the same number of channels; the hierarchical convolution chain uses convolution groups linked by hierarchical residual connections, the first group processing its subset directly; and the concatenation unit concatenates the outputs of all subsets along the channel dimension to form the multi-scale enhanced feature.
- 6. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 5, wherein in step one the spatial pyramid recalibration module SPR uses a two-component cooperative mechanism comprising a spatial pyramid aggregation block and a channel interaction block; the spatial pyramid aggregation block uses dual-path pooling, namely global-path pooling and local-path pooling, and applies adaptive fusion and weighted summation to the two pooled results to obtain a feature map; the channel interaction block receives this feature map and computes the initial channel attention weight vector through a lightweight bottleneck structure of two 1×1 convolution layers, in which the first 1×1 convolution layer compresses the number of channels of the feature map, a ReLU activation then introduces nonlinearity, and the second 1×1 convolution layer restores the compressed number of channels.
- 7. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 6, wherein in step three the adversarial loss is calculated by a formula whose terms denote, respectively, the expectation, the discriminator, the generator, the input image, and the probability value output by the discriminator after evaluating the image generated by the generator.
- 8. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 7, wherein in step four the perceptual loss is calculated by a formula whose terms denote three weight coefficients and the three quantities to which they respectively correspond: the peak signal-to-noise ratio, the quality index, and the correlation index.
- 9. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 8, wherein in step five the cycle consistency loss is calculated by a formula whose terms denote the feature extraction function and a norm, and the identity loss is calculated by a formula whose terms include the target-domain image.
- 10. The underwater image cross-domain enhancement method based on an unsupervised UCycleGAN according to claim 9, wherein in step five the composite loss function is calculated as a weighted sum with four separate weights: the weight of the adversarial loss, the weight of the perceptual loss, the weight of the cycle consistency loss, and the weight of the identity loss.
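The weighted combination described in claims 1 and 10 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `LossWeights` and `composite_loss` are hypothetical, the default weight values are placeholders (the claims give no numeric values), and each loss term is assumed to have already been reduced to a scalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights for the four loss terms.

    The numeric defaults are placeholders chosen for illustration;
    the claims do not state any values.
    """
    adversarial: float = 1.0
    perceptual: float = 1.0
    cycle: float = 10.0
    identity: float = 5.0


def composite_loss(adv: float, per: float, cyc: float, idt: float,
                   w: LossWeights = LossWeights()) -> float:
    """Return the weighted sum of the four scalar loss terms.

    adv: adversarial loss, per: perceptual loss,
    cyc: cycle consistency loss, idt: identity loss.
    """
    # Assumption of this sketch: weights are non-negative.
    for name in ("adversarial", "perceptual", "cycle", "identity"):
        if getattr(w, name) < 0:
            raise ValueError(f"weight '{name}' must be non-negative")
    return (w.adversarial * adv
            + w.perceptual * per
            + w.cycle * cyc
            + w.identity * idt)


if __name__ == "__main__":
    weights = LossWeights(adversarial=1.0, perceptual=0.5,
                          cycle=10.0, identity=5.0)
    # Example scalar losses from one hypothetical training step.
    print(composite_loss(0.7, 0.2, 0.05, 0.01, weights))
```

In a training loop, the generator would be updated to reduce this total while the discriminator is updated separately on its own objective, as step five of claim 1 describes.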
Description
Underwater image cross-domain enhancement method based on unsupervised UCycleGAN
Technical Field
The invention relates to the field of image processing, in particular to an underwater image cross-domain enhancement method based on an unsupervised UCycleGAN.
Background
Underwater images are affected by scattering and absorption in the water body, so they often suffer from low contrast, color distortion and blurring, which seriously degrades applications such as underwater target recognition and underwater robot navigation. Although some underwater image enhancement methods exist, most of them have the following problems. First, supervised learning requires a large amount of annotated data, and underwater images are costly to annotate and difficult to acquire. Second, existing enhancement methods do not adapt to different underwater environments and different types of underwater images, so their generalization capability is limited. Third, they lack effective cross-domain conversion capability: it is difficult to convert the image style of one underwater environment into that of another, which limits the use of underwater images across scenes. The existing cycle-consistent generative adversarial network CycleGAN supports unsupervised conversion, but its standard generator has a simple structure, does not incorporate underwater physical prior knowledge, and has limited capability for detail recovery; because cross-domain conversion is not constrained by an underwater physical degradation model, the enhanced images are prone to artifacts and detail distortion. Furthermore, it is difficult to optimize color correction, contrast enhancement and texture recovery simultaneously with a single loss function.
Disclosure of Invention
To solve the above technical problems, the invention provides an underwater image cross-domain enhancement method based on an unsupervised UCycleGAN that is algorithmically simple and has strong cross-domain conversion capability.
The technical solution to the above problems is an underwater image cross-domain enhancement method based on an unsupervised UCycleGAN, comprising the following steps:
Step one, constructing an improved U-shaped generator Pyra-Unet, wherein Pyra-Unet comprises an encoder downsampling module, a decoder upsampling module, a bottleneck layer module, and a multi-scale spatial pyramid attention module MSPA embedded in the skip connections;
Step two, inputting the degraded underwater image into Pyra-Unet, which outputs a fake clear image;
Step three, feeding a real clear image from the target-domain clear image dataset together with the fake clear image output by Pyra-Unet into a discriminator, the discriminator outputting a discrimination probability value from which the adversarial loss is calculated;
Step four, inputting the real clear image and the fake clear image separately into a pre-trained deep convolutional neural network VGG, extracting the corresponding high-dimensional feature maps, and calculating the perceptual loss between the two high-dimensional feature maps;
Step five, forming a final composite loss function as a weighted sum of the adversarial loss, the perceptual loss, the cycle consistency loss and the identity loss, and updating the parameters of Pyra-Unet and the discriminator by backpropagation, so that Pyra-Unet minimizes the total loss and the discriminator maximizes its discrimination capability;
Step six, repeating steps one to five until Pyra-Unet and the discriminator reach Nash equilibrium to obtain the trained Pyra-Unet, and inputting the underwater image into the trained Pyra-Unet to obtain the enhanced clear image.
In the above method, in step one, the encoder downsampling has three layers: the first downsampling layer includes a first convolution layer, a first BN layer, a first ReLU layer and a first residual block connected in sequence, where BN denotes batch normalization and ReLU denotes the ReLU activation function; the second downsampling layer has the same structure as the first downsampling layer; and the third downsampling layer includes a second convolution layer, a second BN layer and a second ReLU layer connected in sequence.
In the above method, in step one, the bottleneck layer includes a dilated (expansion) convolution group, a convolution stacking unit, a dilated convolution operation unit and a feature concatenation unit, wherein the dilated convolution group uses four serially connected convolution layers, and the convolution stacking unit includes expa