CN-122023164-A - Defogging method and device based on semantic guidance and perception fusion

CN 122023164 A

Abstract

The invention provides a defogging method and device based on semantic guidance and perception fusion, relating to the technical field of image processing. A text perception guidance module is introduced to determine a forward prompt word that semantically guides the defogging process of the diffusion defogging main network, improving the quality of the defogged images the network produces. By jointly training the diffusion defogging main network, the physical perception guide parameter estimation module and the perception fusion module, the outputs of the data-driven and physical-model-driven branches are effectively combined during training. This markedly improves the performance of the diffusion defogging main network, reduces residual haze in the defogged images, improves the defogging effect, achieves higher detail retention and color fidelity, reduces dependence on large-scale paired image samples, and provides reliable clear images for subsequent industrial applications.

Inventors

  • GAO YIN
  • LI JUN
  • LI HONGYUN
  • GUO FEIFEI
  • LIN YIBIN

Assignees

  • Quanzhou Vocational and Technical University (泉州职业技术大学)

Dates

Publication Date
2026-05-12
Application Date
2026-04-15

Claims (10)

  1. A defogging method based on semantic guidance and perception fusion, characterized by comprising the following steps: acquiring an image to be defogged; and inputting the image to be defogged into an image defogging model to obtain a defogged image, output by the image defogging model, corresponding to the image to be defogged; wherein the image defogging model comprises a text perception guidance module and a diffusion defogging main network, the text perception guidance module being used for determining a forward prompt word based on the image to be defogged and converting the forward prompt word into a semantic embedding vector; and the diffusion defogging main network is obtained by joint training based on a real hazy image sample, a physical perception guide parameter estimation module and a perception fusion module, wherein the physical perception guide parameter estimation module is used for estimating a target atmospheric light image and a target transmittance image of the real hazy image sample, and the perception fusion module is used for perceptually fusing a defogged image of the real hazy image sample, determined by the diffusion defogging main network, with a reconstructed defogged image obtained based on the target atmospheric light image and the target transmittance image, to obtain a fused image.
  2. The defogging method based on semantic guidance and perception fusion according to claim 1, wherein the joint training comprises: applying an atmospheric scattering model, based on the target atmospheric light image and the target transmittance image, to obtain a reconstructed hazy image, and calculating a scene reconstruction loss based on the real hazy image sample and the reconstructed hazy image; calculating a dark channel prior loss based on a dark channel map of the fused image; calculating a luminance-saturation loss based on a luminance component and a saturation component of the defogged image; and calculating a model training loss based on at least one of the scene reconstruction loss, the dark channel prior loss and the luminance-saturation loss, and jointly training the diffusion defogging main network, the physical perception guide parameter estimation module and the perception fusion module based on the model training loss.
  3. The defogging method based on semantic guidance and perception fusion according to claim 1, wherein the text perception guidance module is specifically configured to: generate a degraded text description of the image to be defogged based on an image caption generator, and determine the forward prompt word based on the degraded text description; and, based on a text encoder, apply a predetermined haze description text to convert the forward prompt word into the semantic embedding vector.
  4. The defogging method based on semantic guidance and perception fusion according to claim 3, wherein the haze description text is obtained by applying a textual inversion method to a plurality of real hazy image samples.
  5. The defogging method based on semantic guidance and perception fusion according to claim 1, wherein the physical perception guide parameter estimation module is specifically configured to: determine a first initial atmospheric light image and a first initial transmittance image of the real hazy image sample based on a dark channel prior, and determine a second initial atmospheric light image and a second initial transmittance image of the real hazy image sample based on a color attenuation prior; based on an atmospheric light fusion network, apply a channel attention mechanism to perform pixel-level fusion of the first initial atmospheric light image and the second initial atmospheric light image to obtain the target atmospheric light image; and, based on a transmittance fusion network, apply a spatial attention mechanism to perform pixel-level fusion of the first initial transmittance image and the second initial transmittance image to obtain the target transmittance image.
  6. The defogging method based on semantic guidance and perception fusion according to claim 5, wherein the atmospheric light fusion network comprises a plurality of downsampling layers, a channel attention layer, a plurality of upsampling layers and a convolution layer which are sequentially connected; and the transmittance fusion network comprises a plurality of downsampling layers, a spatial attention layer, a plurality of upsampling layers and a convolution layer which are sequentially connected.
  7. The defogging method based on semantic guidance and perception fusion according to any one of claims 1-6, wherein the perception fusion module is specifically configured to: respectively calculate a first gradient map of the defogged image and a second gradient map of the reconstructed defogged image, and respectively calculate a first chromaticity map of the defogged image and a second chromaticity map of the reconstructed defogged image; calculate a first global similarity between the defogged image and the image to be defogged based on the first gradient map and the first chromaticity map, and calculate a second global similarity between the reconstructed defogged image and the image to be defogged based on the second gradient map and the second chromaticity map; and calculate fusion weights for the defogged image and the reconstructed defogged image based on the first global similarity and the second global similarity, and perceptually fuse the defogged image and the reconstructed defogged image based on the fusion weights.
  8. A defogging device based on semantic guidance and perception fusion, characterized by comprising: an image acquisition module, used for acquiring an image to be defogged; and an image defogging module, used for inputting the image to be defogged into an image defogging model to obtain a defogged image, output by the image defogging model, corresponding to the image to be defogged; wherein the image defogging model comprises a text perception guidance module and a diffusion defogging main network, the text perception guidance module being used for determining a forward prompt word based on the image to be defogged and converting the forward prompt word into a semantic embedding vector; and the diffusion defogging main network is obtained by joint training based on a real hazy image sample, a physical perception guide parameter estimation module and a perception fusion module, wherein the physical perception guide parameter estimation module is used for estimating a target atmospheric light image and a target transmittance image of the real hazy image sample, and the perception fusion module is used for perceptually fusing a defogged image of the real hazy image sample, determined by the diffusion defogging main network, with a reconstructed defogged image obtained based on the target atmospheric light image and the target transmittance image, to obtain a fused image.
  9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the defogging method based on semantic guidance and perception fusion according to any one of claims 1-7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the defogging method based on semantic guidance and perception fusion according to any one of claims 1-7.
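Claim 5 names the dark channel prior and the color attenuation prior as the two hand-crafted estimators whose outputs are later fused. The patent's actual module is a learned attention-based fusion network; as an illustration only, a minimal NumPy sketch of the dark-channel-prior branch (dark channel, atmospheric light, and transmission estimates, with assumed patch sizes and thresholds) might look like:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB, then a minimum filter over a local patch."""
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    h, w = mins.shape
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def estimate_atmospheric_light(img, dark, top_frac=0.001):
    """Average the pixels with the brightest dark-channel values as A."""
    h, w = dark.shape
    n = max(1, int(h * w * top_frac))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)

def estimate_transmission(img, A, omega=0.95, patch=15):
    """t = 1 - omega * dark_channel(I / A), clamped to a sane range."""
    norm = img / np.maximum(A, 1e-6)
    return np.clip(1.0 - omega * dark_channel(norm, patch), 0.05, 1.0)
```

These per-pixel estimates would correspond to the "first initial" atmospheric light and transmittance images; the color-attenuation-prior branch and the attention-based fusion producing the "target" images are not sketched here.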
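Claim 7 describes the perception fusion module: gradient and chromaticity maps of each candidate output are compared against the input image, the comparisons yield a global similarity per candidate, and the similarities become fusion weights. The exact similarity measure and weighting scheme are not given in this excerpt; the sketch below uses a mean-absolute-difference similarity and a softmax over the two scores as one plausible reading:

```python
import numpy as np

def gradient_map(gray):
    """Magnitude of the spatial gradient of a single-channel image."""
    gy, gx = np.gradient(gray)
    return np.sqrt(gx ** 2 + gy ** 2)

def chromaticity(img):
    """Per-pixel RGB chromaticity (channel value over channel sum)."""
    s = img.sum(axis=2, keepdims=True)
    return img / np.maximum(s, 1e-6)

def global_similarity(candidate, hazy):
    """Negated mean absolute difference of gradient and chromaticity maps
    (higher = more similar); an assumed measure, not the patent's."""
    g = -np.abs(gradient_map(candidate.mean(2)) - gradient_map(hazy.mean(2))).mean()
    c = -np.abs(chromaticity(candidate) - chromaticity(hazy)).mean()
    return g + c

def perceptual_fuse(dehazed, reconstructed, hazy, temperature=10.0):
    """Softmax the two global similarities into fusion weights."""
    s = np.array([global_similarity(dehazed, hazy),
                  global_similarity(reconstructed, hazy)])
    w = np.exp(temperature * s)
    w = w / w.sum()
    return w[0] * dehazed + w[1] * reconstructed, w
```

The `temperature` parameter is a hypothetical knob controlling how sharply the fusion prefers the more similar candidate.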
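Claim 2 lists three training losses: a scene reconstruction loss against the hazy image rebuilt with the atmospheric scattering model, a dark channel prior loss on the fused image, and a luminance-saturation loss on the defogged image. The patent gives no formulas; the sketch below uses common formulations (L1 reconstruction, mean dark channel, and the luminance-saturation gap) as assumptions:

```python
import numpy as np

def dark_channel(img, patch=7):
    """Min over RGB then over a local square patch."""
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    h, w = mins.shape
    return np.array([[padded[i:i + patch, j:j + patch].min()
                      for j in range(w)] for i in range(h)])

def scene_reconstruction_loss(hazy, dehazed, t, A):
    """L1 distance between the real hazy image and the one rebuilt with
    the atmospheric scattering model I = J*t + A*(1 - t)."""
    t3 = t[..., None]
    rebuilt = dehazed * t3 + A * (1.0 - t3)
    return np.abs(hazy - rebuilt).mean()

def dark_channel_prior_loss(fused, patch=7):
    """A haze-free image should have a near-zero dark channel."""
    return dark_channel(fused, patch).mean()

def luminance_saturation_loss(dehazed):
    """Hazy regions show high luminance and low saturation; penalize the
    gap between the two HSV components (one plausible form)."""
    v = dehazed.max(axis=2)                                  # HSV value
    s = (v - dehazed.min(axis=2)) / np.maximum(v, 1e-6)      # HSV saturation
    return np.abs(v - s).mean()
```

A model training loss per claim 2 would then be some weighted combination of these terms; the weights are not disclosed in this excerpt.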

Description

Defogging method and device based on semantic guidance and perception fusion

Technical Field

The invention relates to the technical field of image processing, in particular to a defogging method and device based on semantic guidance and perception fusion.

Background

In hazy weather, micron-sized particles suspended in the atmosphere scatter light, severely reducing the clarity, contrast and color fidelity of images and in turn degrading the performance of downstream computer vision tasks. Existing image defogging models mostly rely on paired hazy and haze-free images for supervised training. Because such paired images are extremely difficult to acquire in real scenes, existing methods mostly train on synthetic hazy images generated with an atmospheric scattering model. However, synthetic hazy images cannot fully reproduce the non-uniform haze distribution and optical scattering characteristics of complex real hazy images, so the trained image defogging model exhibits an obvious domain gap in practice: the data distribution of the synthetic hazy images learned during the training stage is inconsistent with that of the real hazy images encountered at deployment, and the defogging performance of the model degrades.

Disclosure of the Invention

The invention provides a defogging method and device based on semantic guidance and perception fusion to address the above shortcomings of the related art.
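The background refers to the atmospheric scattering model used to synthesize hazy training images. In its common form, I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)), where J is the clear scene, A the global atmospheric light, β the scattering coefficient and d the scene depth. A minimal sketch of both directions of the model (synthesis and inversion), assuming float RGB images in [0, 1]:

```python
import numpy as np

def synthesize_haze(clear, depth, beta=1.0, A=(0.9, 0.9, 0.9)):
    """Render a synthetic hazy image from a clear image and a depth map:
        I(x) = J(x) * t(x) + A * (1 - t(x)),  t(x) = exp(-beta * d(x))."""
    t = np.exp(-beta * depth)[..., None]       # transmission map, (H, W, 1)
    A = np.asarray(A, dtype=clear.dtype)
    return clear * t + A * (1.0 - t)

def invert_haze(hazy, t, A, t_min=0.1):
    """Recover J from I once t and A are known (model inversion);
    t is floored at t_min to avoid amplifying noise in dense haze."""
    t3 = np.maximum(t, t_min)[..., None]
    return (hazy - A) / t3 + A
```

Round-tripping a clear image through `synthesize_haze` and `invert_haze` with the same t and A recovers it exactly (up to the t_min floor), which is why the fit between synthetic and real haze statistics, not the model algebra, is the limiting factor discussed above.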
The invention provides a defogging method based on semantic guidance and perception fusion, which comprises the following steps: acquiring an image to be defogged; and inputting the image to be defogged into an image defogging model to obtain a defogged image, output by the image defogging model, corresponding to the image to be defogged. The image defogging model comprises a text perception guidance module and a diffusion defogging main network, wherein the text perception guidance module is used for determining a forward prompt word based on the image to be defogged and converting the forward prompt word into a semantic embedding vector. The diffusion defogging main network is obtained by joint training based on a real hazy image sample, a physical perception guide parameter estimation module and a perception fusion module, wherein the physical perception guide parameter estimation module is used for estimating a target atmospheric light image and a target transmittance image of the real hazy image sample, and the perception fusion module is used for perceptually fusing a defogged image of the real hazy image sample, determined by the diffusion defogging main network, with a reconstructed defogged image obtained based on the target atmospheric light image and the target transmittance image, to obtain a fused image.
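The inference path just described (hazy image → forward prompt word → semantic embedding → guided diffusion network) can be sketched structurally. The two callables below are placeholders for the text perception guidance module and the diffusion defogging main network, not the patent's actual modules:

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class DefoggingModel:
    """Hypothetical wiring of the inference path: an embedder stands in
    for the text perception guidance module, and a network callable for
    the diffusion defogging main network."""
    prompt_embedder: Callable[[np.ndarray], np.ndarray]            # hazy -> semantic embedding
    diffusion_net: Callable[[np.ndarray, np.ndarray], np.ndarray]  # (hazy, embedding) -> dehazed

    def defog(self, hazy: np.ndarray) -> np.ndarray:
        # 1. Derive a semantic embedding from the hazy input.
        embedding = self.prompt_embedder(hazy)
        # 2. Let the embedding guide the diffusion main network.
        return self.diffusion_net(hazy, embedding)
```

At training time (but not inference), the physical perception guide parameter estimation module and the perception fusion module would sit alongside `diffusion_net` to form the joint losses described later.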
According to the defogging method based on semantic guidance and perception fusion provided by the invention, the joint training comprises: applying an atmospheric scattering model, based on the target atmospheric light image and the target transmittance image, to obtain a reconstructed hazy image, and calculating a scene reconstruction loss based on the real hazy image sample and the reconstructed hazy image; calculating a dark channel prior loss based on a dark channel map of the fused image; calculating a luminance-saturation loss based on a luminance component and a saturation component of the defogged image; and calculating a model training loss based on at least one of the scene reconstruction loss, the dark channel prior loss and the luminance-saturation loss, and jointly training the diffusion defogging main network, the physical perception guide parameter estimation module and the perception fusion module based on the model training loss.

According to the defogging method based on semantic guidance and perception fusion provided by the invention, the text perception guidance module is specifically used for: generating a degraded text description of the image to be defogged based on an image caption generator, and determining the forward prompt word based on the degraded text description; and, based on a text encoder, applying a predetermined haze description text to convert the forward prompt word into the semantic embedding vector.

According to the defogging method based on semantic guidance and perception fusion, the haze description text is obtained by applying a textual inversion method to a plurality of real hazy image samples.

According to the defogging method based on semantic guidance and perception fusion provided by the invention, the physical perception guide parameter estimation module is specifically used for: determining a first initial atmospheric light image and a fi