CN-119887557-B - Knowledge-embedded diffusion model-based infrared image stripe removal method

Abstract

The invention discloses an infrared image stripe removal method based on a knowledge-embedded diffusion model, comprising: S1, acquiring a stripe-noise infrared image to be processed; S2, inputting the stripe-noise infrared image into a trained stripe removal model and outputting the corresponding destriped infrared image, wherein, during training of the stripe removal model, the parameters of a denoising network are jointly optimized using a noise loss function and a prior loss function that introduces the direction prior of stripe noise; and S3, outputting the destriped infrared image as the stripe removal result for the stripe-noise infrared image to be processed. On the basis of a diffusion model, the invention introduces a directional wavelet convolution module as the knowledge prior of the model. The module fuses the stripe direction prior with semantic information and supplements the model with a knowledge prior loss function, so that directional constraints guide the generation direction of the diffusion model, forming a composite infrared stripe removal scheme driven by both knowledge and data.

Inventors

LI LINGXIAO
HUANG DAN
WANG XIN
LIU LINLIN
ZHAO JIANBO
TAN XIN
ZHENG HAONAN

Assignees

Chongqing University of Technology (重庆理工大学)

Dates

Publication Date: 20260512
Application Date: 20241230

Claims (8)

1. An infrared image stripe removal method based on a knowledge-embedded diffusion model, characterized by comprising the following steps:
S1, acquiring an infrared image with stripe noise to be processed;
S2, inputting the infrared image with stripe noise to be processed into a trained stripe removal model, and outputting the corresponding destriped infrared image;
the stripe removal model is built on a latent diffusion model, and is trained as follows:
S201, acquiring an infrared image y with stripe noise serving as a training sample and the corresponding clear infrared image x;
S202, conditionally encoding the infrared image y with stripe noise to obtain a conditional feature map c, and encoding the clear infrared image x to obtain an initial latent-space feature vector $z_0$;
S203, successively adding standard Gaussian noise $\epsilon_t \sim N(0, I)$ to the initial latent-space feature vector $z_0$ at T discrete time nodes in sequence, to obtain the latent feature vector $z_T$ at time t=T, wherein N(·) denotes a Gaussian distribution and I denotes the identity matrix;
S204, using the T trained denoising networks together with the conditional feature map c, performing reverse diffusion denoising on the latent feature vector $z_t$ in sequence from time t=T to time t=0, to obtain a latent feature estimate;
S205, restoring the latent feature estimate through a decoder to obtain the destriped infrared image;
S206, calculating a noise loss function from the difference between the standard Gaussian noise $\epsilon_t$ added at each time node and the predicted noise $\epsilon_\theta(z_t, t, c)$ output by the denoising network at the corresponding time, wherein $z_t$ denotes the intermediate latent feature vector at time t;
S207, calculating a prior loss function from the clear infrared image x and the destriped infrared image, introducing the direction prior of stripe noise;
in step S207, the prior loss function is calculated as follows:
S2071, performing multi-scale wavelet decomposition on the clear infrared image x and on the destriped infrared image, respectively, through a directional wavelet convolution module, to obtain multi-scale wavelet decomposition results in different directions;
S2072, extracting the wavelet vertical components of the clear infrared image x ($x_{HL}$) and of the destriped infrared image at each scale;
S2073, calculating the prior loss function from the wavelet vertical component $x_{HL}$ and the corresponding component of the destriped infrared image; the formula is: [formula omitted] wherein $L_{prior}$ is the prior loss;
in S2071, the directional wavelet convolution module comprises 4 convolution filters, defined as follows: [filter definitions omitted] wherein $f_{LL}$ is a low-pass filter corresponding to the low-frequency information of the input image, and $f_{LH}$, $f_{HL}$ and $f_{HH}$ are three high-pass filters corresponding to the high-frequency information of the input image in the horizontal, vertical and diagonal directions, respectively;
for a given infrared image I, the convolution filters are used to obtain multi-scale wavelet decomposition results in different directions, expressed as: [expression omitted] wherein Conv(·) denotes the convolution operation, corresponding to 4 discrete wavelet convolutions, and is used to further decompose the low-frequency component at scale (i-1) into 4 components at scale i; the outputs are the results of the infrared image at the i-th scale in the wavelet low-frequency, horizontal, vertical and diagonal directions, respectively;
S208, calculating a total loss function from the noise loss function and the prior loss function, and optimizing the parameters of the denoising network by backpropagation;
S209, repeating steps S201 to S208 to train the denoising network iteratively until convergence or until a preset number of iterations is reached;
after training, the stripe removal model processes an image as follows:
S211, encoding the infrared image with stripe noise to be processed to obtain the latent feature vectors of all time nodes;
S212, conditionally encoding the infrared image with stripe noise to be processed to obtain a conditional feature map;
S213, using the T trained denoising networks together with the conditional feature map, performing reverse diffusion denoising on the latent feature vectors in sequence from time t=T to time t=0, to obtain a latent feature estimate;
S214, restoring the latent feature estimate through a decoder to obtain the destriped infrared image;
and S3, outputting the destriped infrared image as the stripe removal result for the infrared image with stripe noise to be processed.
2. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S202 and step S211 the encoding is performed by the encoder of a variational autoencoder (VAE).
3. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S202 and step S212 the infrared image with stripe noise is conditionally encoded by a conditional encoder.
4. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S203, T discrete time nodes are sampled, standard Gaussian noise $\epsilon_t$ is randomly generated at each time node, and the standard Gaussian noise corresponding to each time is added to the latent-space vector $z_0$ step by step, from time t=0 until time t=T, generating T intermediate latent feature vectors $z_1$ to $z_T$.
5. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S204, T denoising networks are provided in one-to-one correspondence with the T time nodes; at time t=T, the first denoising network takes the intermediate latent feature vector $z_T$ and the conditional feature map c as inputs, performs reverse diffusion denoising, and outputs the corresponding latent feature estimate; each subsequent denoising network in turn takes the latent feature estimate output by the previous denoising network and the conditional feature map c as inputs and performs reverse diffusion denoising, until time t=0, finally yielding the latent feature estimate.
6. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S205 and step S214 the restoration is performed by a VAE decoder.
7. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S206 the noise loss function is expressed as: $L_{noise} = \|\epsilon_t - \epsilon_\theta(z_t, t, c)\|^2$; where $L_{noise}$ denotes the noise loss, $\epsilon_t$ denotes the added Gaussian noise, $\epsilon_\theta(z_t, t, c)$ denotes the predicted noise output by the denoising network, $z_t$ denotes the intermediate latent feature vector at time t, and c denotes the conditional feature map.
8. The infrared image stripe removal method based on a knowledge-embedded diffusion model of claim 1, wherein in step S208 the total loss function is calculated as: $L_{total} = L_{noise} + \lambda L_{prior}$; where $L_{total}$ denotes the total loss function, $\lambda$ is an adjustable hyperparameter, $L_{prior}$ is the prior loss, and $L_{noise}$ denotes the noise loss.
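
The loss terms named in claims 1, 7 and 8 can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the filter definitions and the prior loss formula are not reproduced in this text, so the sketch assumes Haar filters, a stride-2 block correlation, and a mean absolute difference between vertical (HL) components summed over scales. The squared norm in the noise loss is likewise one reading of the claim 7 formula. The function names (`haar_step`, `wavelet_decompose`, `prior_loss`, `noise_loss`, `total_loss`) and the defaults `scales=3` and `lam=0.1` are hypothetical.

```python
import numpy as np

# Assumption: Haar analysis filters. The source names f_LL, f_LH, f_HL, f_HH
# but does not reproduce their values.
_LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)
_HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)
# np.outer(a, b): factor a acts down the rows, factor b acts across the columns.
FILTERS = {
    "LL": np.outer(_LOW, _LOW),    # low-frequency content
    "LH": np.outer(_HIGH, _LOW),   # responds to horizontal structures
    "HL": np.outer(_LOW, _HIGH),   # responds to vertical structures (column stripes)
    "HH": np.outer(_HIGH, _HIGH),  # diagonal detail
}


def haar_step(img):
    """One decomposition level: correlate each 2x2 filter with stride 2."""
    h, w = img.shape
    blocks = img[: h - h % 2, : w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return {k: np.einsum("iajb,ab->ij", blocks, f) for k, f in FILTERS.items()}


def wavelet_decompose(img, scales=3):
    """Repeatedly split the low-frequency band of scale i-1 into 4 bands at scale i."""
    results, low = [], img
    for _ in range(scales):
        bands = haar_step(low)
        results.append(bands)
        low = bands["LL"]
    return results


def prior_loss(x_clean, x_destriped, scales=3):
    """Assumed form: mean |HL difference| per scale, summed over scales."""
    pairs = zip(wavelet_decompose(x_clean, scales),
                wavelet_decompose(x_destriped, scales))
    return float(sum(np.mean(np.abs(a["HL"] - b["HL"])) for a, b in pairs))


def noise_loss(eps, eps_pred):
    """Squared L2 distance between added and predicted noise (claim 7)."""
    return float(np.sum((eps - eps_pred) ** 2))


def total_loss(l_noise, l_prior, lam=0.1):
    """L_total = L_noise + lambda * L_prior (claim 8); lam is a hypothetical default."""
    return l_noise + lam * l_prior
```

Under these assumptions, residual column-wise stripes in the destriped image raise `prior_loss`, whereas purely row-wise differences leave the HL term near zero.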

Description

Knowledge-embedded diffusion model-based infrared image stripe removal method

Technical Field

The invention relates to the technical fields of infrared imaging and of artificial intelligence and big data, and in particular to an infrared image stripe removal method based on a knowledge-embedded diffusion model.

Background

Infrared (IR) imaging technology has important applications in military guidance, aerospace, remote sensing, security monitoring and other fields, owing to its outstanding advantages such as all-weather operation, high concealment and strong anti-interference capability. However, limited by the manufacturing level and processing accuracy of infrared focal plane arrays (IRFPA), the actual responses of the individual detector units of an infrared imaging system are difficult to make fully uniform. As a result, even under uniform radiation, the output image often contains considerable non-uniform stripe noise, known as the fixed pattern noise (FPN) of the infrared focal plane array. FPN seriously degrades the visual quality of the output image, greatly reduces the temperature resolution and detection response capability of the infrared imaging system, and ultimately restricts further application of infrared imaging technology. Non-uniform stripe noise in acquired infrared images must therefore be corrected so that subsequent image target recognition tasks can be performed well. At present, research on infrared image destriping algorithms can be roughly divided into two categories: 1) methods based on traditional model optimization and 2) methods based on deep learning.
Methods of category 1), such as filtering methods, minimized energy constraint methods and response statistics methods, at their core separate stripe noise from the image scene with a single-frame or multi-frame iterative optimization algorithm, relying on the visual difference between the scene and the stripe noise and on hand-designed feature extraction operators. These algorithms are simple in design, but they rely too heavily on coarse hand-designed features to describe image attributes, so they adapt poorly and struggle to achieve acceptable correction for stripe noise of different intensities and types at the same time.

Methods of category 2) at their core use the strong feature representation capability of convolutional neural networks (CNNs) to learn the spatio-temporal distribution characteristics of stripe noise from a large amount of sample data, thereby eliminating the stripe noise. In recent years, some methods have been combined with GAN-based deep generative models to achieve self-supervised or unsupervised learning. Although deep-learning-based stripe removal methods show strong feature learning and modeling capabilities, they still have shortcomings that have so far been difficult to overcome. For example, in CNN-based methods the limited receptive field of ordinary convolution kernels restricts prediction accuracy for complex semantic information, so the model cannot fully distinguish stripe noise from background texture; GAN-based generative models can face challenges such as mode collapse, difficult optimization and vanishing gradients, all of which adversely affect the infrared image stripe removal task.
In recent years, a new architecture among deep generative models, the diffusion model, has shown great application potential across computer vision tasks. The applicant has found that, compared with GAN models, diffusion models can generate more realistic detail textures and output better image quality in low-level vision tasks such as image denoising, restoration and super-resolution, while avoiding the unstable training and difficult optimization that affect GAN models. The diffusion model adopts a progressive-refinement feature learning strategy and can model the feature distribution of the target image domain through iterative sampling, which is consistent with the physical process of infrared image destriping. How to design an infrared image stripe removal method based on a diffusion model is therefore a technical problem to be solved.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention aims to provide an infrared image stripe removal method based on a knowledge-embedded diffusion model, in which a directional wavelet convolution module is introduced on the basis of the diffusion model as the knowledge prior of the model; the module can fuse the stripe direction prior with semantic information, supplement a knowledge prior loss functio