CN-121998850-A - Underwater image enhancement method, device, terminal and medium based on text-guided degradation generation and curriculum learning

Abstract

The application relates to the technical field of computer vision and image processing, and in particular to an underwater image enhancement method, device, terminal and medium based on text-guided degradation generation and curriculum learning. The method comprises: training a degradation generation model; inputting clear underwater images into the trained degradation generation model and generating synthetic degraded images of different difficulty levels based on text prompt libraries of different difficulty levels; constructing a multi-difficulty enhancement dataset comprising the synthetic degraded images of different difficulty levels together with the corresponding clear underwater images and text prompts; training an image enhancement model on the multi-difficulty enhancement dataset using a progressive curriculum learning strategy; and inputting an underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model to obtain an enhanced image. The application allows the attributes of the generated degraded images to be controlled precisely through natural language, effectively avoids unstable model training, and improves image quality.

Inventors

  • ZHU LEI
  • SU ZONGXIAN
  • TIAN HONGRI
  • LI WENXUE
  • SU QUANKE

Assignees

  • The Hong Kong University of Science and Technology (Guangzhou)

Dates

Publication Date
2026-05-08
Application Date
2026-04-08

Claims (10)

  1. An underwater image enhancement method based on text-guided degradation generation and curriculum learning, the method comprising: constructing a text-guided degradation generation model, and training the degradation generation model on a pre-constructed underwater image dataset to obtain a trained degradation generation model; inputting the clear underwater images in the underwater image dataset into the trained degradation generation model, and generating synthetic degraded images of different difficulty levels based on a pre-constructed text prompt library covering different difficulty levels; constructing a multi-difficulty enhancement dataset, wherein the multi-difficulty enhancement dataset comprises the synthetic degraded images of different difficulty levels together with the corresponding clear underwater images and text prompts; constructing an image enhancement model, and training the image enhancement model on the multi-difficulty enhancement dataset using a progressive curriculum learning strategy to obtain a trained image enhancement model; and inputting an underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model to obtain an enhanced image.
  2. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 1, wherein constructing the underwater image dataset comprises: describing an underwater degraded image using a multimodal large language model to generate a degradation description containing a plurality of preset degradation attributes; and acquiring the clear underwater image corresponding to the underwater degraded image, taking the degradation description as the text prompt, and constructing the underwater image dataset from the underwater degraded image, the clear underwater image, and the text prompt.
  3. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 2, wherein training the degradation generation model on the pre-constructed underwater image dataset to obtain the trained degradation generation model comprises: inputting the underwater degraded image, the clear underwater image, and the text prompt from the underwater image dataset into the degradation generation model; and, conditioned on clear-image features extracted from the clear underwater image and text features extracted from the text prompt, and using noise features extracted from the underwater degraded image as supervision, fine-tuning the degradation generation model through low-rank adaptation so that it learns the vector field corresponding to flow matching, the trained degradation generation model being obtained by optimizing the flow matching loss.
  4. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 1, wherein training the image enhancement model on the multi-difficulty enhancement dataset using the progressive curriculum learning strategy to obtain the trained image enhancement model comprises: dynamically sampling the sample data in the multi-difficulty enhancement dataset, and training the image enhancement model on the sampled data, wherein the sampling distribution transitions continuously and smoothly across the preset difficulty levels as the number of training iterations increases, so as to obtain the trained image enhancement model.
  5. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 4, wherein training the image enhancement model on the sampled data comprises: inputting the synthetic degraded image from the sampled data into the image enhancement model, extracting latent features of the synthetic degraded image through a variational encoder, and estimating a latent transmission map; and concatenating the latent features of the synthetic degraded image with the latent transmission map, inputting the concatenated result into a Stable Diffusion U-Net, performing single-step training in combination with text features extracted from the text prompt corresponding to the synthetic degraded image, and processing the output through a variational decoder to obtain a prediction result.
  6. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 5, wherein multi-scale skip connections are provided between the variational encoder and the variational decoder of the image enhancement model.
  7. The underwater image enhancement method based on text-guided degradation generation and curriculum learning according to claim 1, wherein inputting the underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model to obtain the enhanced image comprises: inputting the underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model, extracting latent features of the underwater image to be enhanced through a variational encoder, and estimating a latent transmission map; and concatenating the latent features of the underwater image to be enhanced with the latent transmission map, processing the concatenated result through a Stable Diffusion U-Net in combination with text features extracted from the text prompt corresponding to the underwater image to be enhanced, and decoding the output through a variational decoder to obtain the enhanced image.
  8. An underwater image enhancement device based on text-guided degradation generation and curriculum learning, the device comprising: a first training module, configured to construct a text-guided degradation generation model and train the degradation generation model on a pre-constructed underwater image dataset to obtain a trained degradation generation model; an image generation module, configured to input the clear underwater images in the underwater image dataset into the trained degradation generation model and generate synthetic degraded images of different difficulty levels based on a pre-constructed text prompt library covering different difficulty levels; a dataset construction module, configured to construct a multi-difficulty enhancement dataset, wherein the multi-difficulty enhancement dataset comprises the synthetic degraded images of different difficulty levels together with the corresponding clear underwater images and text prompts; a second training module, configured to construct an image enhancement model and train the image enhancement model on the multi-difficulty enhancement dataset using a progressive curriculum learning strategy to obtain a trained image enhancement model; and an image enhancement module, configured to input an underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model to obtain an enhanced image.
  9. A terminal comprising a memory, a processor, and an underwater image enhancement program based on text-guided degradation generation and curriculum learning that is stored in the memory and executable on the processor, wherein, when the program is executed by the processor, the steps of the underwater image enhancement method based on text-guided degradation generation and curriculum learning according to any one of claims 1 to 7 are implemented.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program that is executable to implement the steps of the underwater image enhancement method based on text-guided degradation generation and curriculum learning according to any one of claims 1 to 7.
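
The flow matching objective named in claim 3 can be illustrated with a minimal Python sketch. It assumes the common straight-line (rectified-flow) path between a noise sample and the target latent, which the claims do not specify. The names `flow_matching_loss`, `velocity_fn`, `oracle`, and `zero_model`, as well as the toy inputs, are hypothetical and introduced only for illustration; the low-rank adaptation of the actual network and the feature extractors are not shown.

```python
import random


def flow_matching_loss(velocity_fn, target, noise, t, cond=None):
    """Mean squared error between predicted and target velocity at time t.

    Assumption (not stated in the source): a straight-line path
    x_t = (1 - t) * noise + t * target, whose velocity is (target - noise).
    `target` stands in for the latent of a degraded image, and `cond` for the
    clear-image and text features used as conditioning.
    """
    x_t = [(1.0 - t) * n + t * x for n, x in zip(noise, target)]
    v_target = [x - n for n, x in zip(noise, target)]
    v_pred = velocity_fn(x_t, t, cond)
    return sum((p - v) ** 2 for p, v in zip(v_pred, v_target)) / len(v_target)


rng = random.Random(0)
target = [rng.gauss(0.0, 1.0) for _ in range(8)]  # toy "degraded latent"
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]   # Gaussian noise sample
t = rng.random()                                  # time in [0, 1)


def oracle(x_t, t, cond):
    # Hypothetical perfect model: returns the true velocity of the path.
    return [x - n for n, x in zip(noise, target)]


def zero_model(x_t, t, cond):
    # Hypothetical untrained model: always predicts zero velocity.
    return [0.0] * len(x_t)


loss_oracle = flow_matching_loss(oracle, target, noise, t)
loss_zero = flow_matching_loss(zero_model, target, noise, t)
```

In this sketch a model that reproduces the true velocity attains zero loss, while any other prediction is penalized in proportion to its squared error, which is the quantity a fine-tuning procedure would minimize.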

Description

Underwater image enhancement method, device, terminal and medium based on text-guided degradation generation and curriculum learning

Technical Field

The invention relates to the technical field of computer vision and image processing, and in particular to an underwater image enhancement method, device, terminal and medium based on text-guided degradation generation and curriculum learning.

Background

Underwater image enhancement is critical for applications such as marine resource exploration and underwater robot vision. However, underwater images are affected by wavelength-dependent absorption, scattering, and color shift, which seriously degrade imaging quality.

The prior art mainly comprises restoration methods based on physical models and image enhancement methods based on deep learning. Traditional physical model methods (such as the dark channel prior) generalize poorly because they rely on idealized assumptions about water parameters. Since illumination and scattering in real underwater scenes are highly non-uniform, their parameter estimates tend to fail across varying water types, producing enhanced results with color distortion or incomplete dehazing.

Deep learning based image enhancement methods depend heavily on paired data, but real paired underwater images are extremely difficult to acquire. If paired underwater image data are instead synthesized with physical formulas, the generated degraded images look unrealistic and fail to capture complex illumination and scattering effects. As a result, the enhanced images produced by existing underwater image enhancement methods are of poor quality.

Accordingly, the prior art has drawbacks and needs to be improved and developed.
Disclosure of Invention

The application provides an underwater image enhancement method, device, terminal and medium based on text-guided degradation generation and curriculum learning, which address the technical problem in the related art that underwater image enhancement methods produce enhanced images of poor quality.

To achieve the above purpose, the application adopts the following technical solution: an underwater image enhancement method based on text-guided degradation generation and curriculum learning, the method comprising:

constructing a text-guided degradation generation model, and training the degradation generation model on a pre-constructed underwater image dataset to obtain a trained degradation generation model;

inputting the clear underwater images in the underwater image dataset into the trained degradation generation model, and generating synthetic degraded images of different difficulty levels based on a pre-constructed text prompt library covering different difficulty levels;

constructing a multi-difficulty enhancement dataset, wherein the multi-difficulty enhancement dataset comprises the synthetic degraded images of different difficulty levels together with the corresponding clear underwater images and text prompts;

constructing an image enhancement model, and training the image enhancement model on the multi-difficulty enhancement dataset using a progressive curriculum learning strategy to obtain a trained image enhancement model; and

inputting an underwater image to be enhanced and its corresponding text prompt into the trained image enhancement model to obtain an enhanced image.
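
The progressive curriculum learning step above can be sketched in Python as a sampler whose distribution over difficulty levels shifts smoothly from easy to hard as training proceeds. This is only an illustration under stated assumptions: the Gaussian-shaped weighting, the `width` parameter, the record fields (`degraded`, `clear`, `prompt`, `difficulty`), and the function names `curriculum_weights` and `sample_batch` are all hypothetical and are not taken from the source, which does not give the schedule.

```python
import math
import random


def curriculum_weights(step, total_steps, num_levels, width=0.75):
    """Sampling probabilities over difficulty levels 0 (easy) .. num_levels-1 (hard).

    Assumption: a Gaussian-shaped bump whose center moves linearly from the
    easiest to the hardest level as `step` goes from 0 to `total_steps`.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    center = progress * (num_levels - 1)
    scores = [math.exp(-((level - center) ** 2) / (2.0 * width ** 2))
              for level in range(num_levels)]
    total = sum(scores)
    return [s / total for s in scores]


def sample_batch(dataset, step, total_steps, num_levels, batch_size, rng):
    """Draw a training batch according to the current curriculum distribution."""
    level_w = curriculum_weights(step, total_steps, num_levels)
    counts = [0] * num_levels
    for record in dataset:
        counts[record["difficulty"]] += 1
    # Spread each level's probability evenly over the records at that level.
    weights = [level_w[r["difficulty"]] / counts[r["difficulty"]] for r in dataset]
    return rng.choices(dataset, weights=weights, k=batch_size)


# Toy multi-difficulty dataset: placeholder file names and prompts only.
dataset = [
    {"degraded": f"deg_{lvl}_{i}.png", "clear": f"clear_{i}.png",
     "prompt": f"level-{lvl} degradation prompt", "difficulty": lvl}
    for lvl in range(3) for i in range(4)
]

rng = random.Random(0)
w_start = curriculum_weights(0, 1000, 3)
w_end = curriculum_weights(1000, 1000, 3)
batch = sample_batch(dataset, 500, 1000, 3, 8, rng)
```

Because the weights change continuously with `step`, there is no abrupt switch between difficulty stages; early batches are dominated by easy samples and late batches by hard ones, while intermediate levels remain represented throughout.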
In one embodiment of the application, constructing the underwater image dataset comprises: describing an underwater degraded image using a multimodal large language model to generate a degradation description containing a plurality of preset degradation attributes; and acquiring the clear underwater image corresponding to the underwater degraded image, taking the degradation description as the text prompt, and constructing the underwater image dataset from the underwater degraded image, the clear underwater image, and the text prompt.

In one embodiment of the application, training the degradation generation model on the pre-constructed underwater image dataset to obtain the trained degradation generation model comprises: inputting the underwater degraded image, the clear underwater image, and the text prompt from the underwater image dataset into the degradation generation model; and, conditioned on clear-image features extracted from the clear underwater image and text features extracted from the text prompt, and using noise features extracted from the underwater degraded image as supervision, fine-tuning the degradation generation model through low-rank adaptation so that it learns the vector field corresponding to flow matching, the trained degradation generation model being obtained by optimizing the flow matching loss.

In one embodiment of the application, training the image enhancement model using a progressive