CN-116977188-B - Infrared image enhancement method based on depth full convolution neural network

CN116977188B

Abstract

The invention discloses an infrared image enhancement method based on a deep fully convolutional neural network, relating to the technical field of infrared images. The method comprises the following steps: collecting and producing high/low-gain infrared image pairs as the data set required for deep learning; dividing the data set proportionally into a training sample set and a test sample set; applying data augmentation to the training sample set to obtain a richer training data set; designing, in view of the low contrast and blurred detail of low-quality infrared images, a fully convolutional infrared image enhancement network that introduces a residual structure; training the network under supervision with the training sample set to obtain an infrared image enhancement model applicable to infrared images of different resolutions; and finally inputting the low-quality infrared image under test into the model to obtain the enhanced infrared image.

Inventors

  • Liu Guihua
  • Li Guilin
  • Pang Zhongxiang
  • Xu Feng
  • Chen Chunmei
  • Pu Wei
  • Gong Yinjun

Assignees

  • Southwest University of Science and Technology (西南科技大学)

Dates

Publication Date
2026-05-08
Application Date
2022-04-15

Claims (5)

  1. An infrared image enhancement method based on a deep fully convolutional neural network, characterized by comprising the following steps: S1, acquiring and producing infrared high/low-gain image pairs to obtain rich training sample sets and test sets; S2, applying, according to the acquisition method used, the corresponding data preprocessing and data augmentation to the training sample set obtained in S1; S3, constructing an infrared image enhancement network by introducing convolution blocks and a residual structure; S4, feeding the training sample set obtained in S2 into the network designed in S3 for training, to obtain an infrared image enhancement model; S5, inputting the infrared test image obtained in S1 into the infrared image enhancement model trained in S4 to obtain an enhanced infrared image. The infrared image enhancement network constructed in S3 comprises the following: the image feature extraction part extracts infrared image features using 8 convolution layers, yielding 8 feature maps, with a LeakyReLU activation function after each convolution layer; the mathematical expression of the LeakyReLU activation function is given by formula 1: f(x) = x for x > 0, and f(x) = αx for x ≤ 0, where α is a small positive slope (formula 1). The feature maps are then fused by means of the residual structure so that they contain rich detail and semantic information, two skip connections among the 8 convolution layers fully fusing the features of each layer; finally, 1 convolution layer, likewise followed by a LeakyReLU activation function, outputs the result. The convolution block structure in S3 is: for the first to fourth convolution layers, the convolution kernel size is set to 7×7, the stride to 1, and the padding to 3; for the fifth convolution layer, the kernel size is set to 5×5, the stride to 1, and the padding to 2; for the sixth to ninth convolution layers, the kernel size is set to 3×3, the stride to 1, and the padding to 3; all convolution blocks use zero padding. The mathematical expression of the residual structure in S3 is given by formula 2: y = f(x) + x (formula 2), where x and y denote the input and output respectively and f(x) denotes the feature extraction layers; features extracted by the first convolution layer of the infrared image enhancement model are connected to the fourth layer by a skip connection, and features extracted by the fifth convolution layer are connected to the eighth layer, so as to avoid vanishing and exploding gradients; at the same time, the residual structure reduces training complexity and facilitates back-propagation.
  2. The method for enhancing an infrared image based on a deep fully convolutional neural network according to claim 1, wherein the infrared high/low-gain image pairs in S1 are acquired and produced as follows: first, grayscale infrared pictures are synthesized by graying visible-light images from a public data set to obtain high-gain grayscale pictures; second, pseudo-color infrared pictures are used: a commercial infrared-core camera with a high/low-gain mode acquires infrared image pairs in different scenes, and all the infrared image pairs form a data set.
  3. The method for enhancing an infrared image based on a deep fully convolutional neural network according to claim 1, wherein the data preprocessing and augmentation operations in S2 are as follows: S31, cropping the original image pairs with a sliding-window cropping algorithm so that the image length and width are equal, thereby expanding the training sample pairs; S32, augmenting the training samples by image flipping, image contrast transformation, and image scaling.
  4. The method for enhancing an infrared image based on a deep fully convolutional neural network according to claim 1, wherein the hyper-parameters of the training process in S4 are set as follows: (1) the initial learning rate lr is 1e-3, and the learning rate is decayed from 1e-3 to 1e-9 by cosine annealing; the cosine annealing formula is given by formula 3: η = η_min + (1/2)(η_max − η_min)(1 + cos(π·T_cur/T_max)) (formula 3), where η is the current learning rate, η_max the maximum learning rate, η_min the minimum learning rate, T_cur the current iteration number, and T_max the maximum iteration number; (2) the optimization algorithm is the Adam gradient descent method with momentum parameter 0.9, the loss function is the mean-square loss, and the optimization uses Adam with the usual back-propagation scheme; PSNR (peak signal-to-noise ratio) is measured during training to check the progress of model training; the Adam update can be expressed as formula 4: m_t = β1·m_{t−1} + (1 − β1)·g_t, v_t = β2·v_{t−1} + (1 − β2)·g_t², m̂_t = m_t/(1 − β1^t), v̂_t = v_t/(1 − β2^t), W_{t+1} = W_t − lr·m̂_t/(√v̂_t + ε) (formula 4), where m_t and v_t are the first- and second-order momentum terms, β1 and β2 are the exponential decay rates, of size 0.9 and 0.999 respectively, m̂_t and v̂_t are the corresponding bias-corrected values, W_t denotes the model parameters at time t, i.e. at the t-th iteration, g_t = ∇J(W_t) denotes the gradient of the cost function at the t-th iteration with respect to W, and ε is a small number, typically 1e-8, that prevents the denominator from being 0; (3) the batch size batch_size is set to 64, the total number of epochs EPOCH to 500, the total number of training samples to S, and the maximum number of iterations N is given by formula 5: N = (S/batch_size)·EPOCH (formula 5).
  5. The infrared image enhancement method based on the deep fully convolutional neural network according to claim 1, wherein the specific process of S4 is as follows: S41, inputting an infrared image of the set size; after normalization, each convolution layer keeps the image resolution unchanged; S42, further extracting deep features with three convolution modules and fusing their output with the features extracted by the first convolution layer, effectively retaining the texture information in shallow features while avoiding the degradation problem; S43, continuing feature extraction after the fifth convolution; S44, performing three more convolution operations and fusing the output with the features extracted by the fifth convolution layer through the residual structure, to obtain a feature map rich in semantic and detail information as the final output; S45, computing the loss of the enhanced infrared image against the reference infrared image, the loss function being the mean-square-error function, whose mathematical expression is given by formula 6: L = (1/(m·W·H)) Σ_batch Σ_{i=1..W} Σ_{j=1..H} (f_ij − f'_ij)² (formula 6), where the outer sum runs over the m images of the batch, m is the batch size, W and H are the image length and width, f_ij is the pixel value of the reference infrared image at coordinate point (i, j), and f'_ij is the pixel value of the predicted enhanced infrared image at (i, j); S46, back-propagating the computed loss to update the parameters of the infrared image enhancement network by gradient descent; S47, repeating the above steps until the number of training iterations reaches the set maximum N, then stopping training and saving the trained infrared enhancement model.
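The training schedule of claim 4 can be sketched numerically. Below is a minimal Python illustration using the values stated in the claim (lr decayed from 1e-3 to 1e-9 by cosine annealing, batch size 64, 500 epochs); the dataset size used in the example is a made-up placeholder, not a figure from the patent:

```python
import math

def cosine_annealed_lr(t_cur, t_max, eta_max=1e-3, eta_min=1e-9):
    """Formula 3: cosine-annealed learning rate at iteration t_cur."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_max))

def max_iterations(num_samples, batch_size=64, epochs=500):
    """Formula 5: maximum number of iterations N = (S / batch_size) * EPOCH."""
    return (num_samples // batch_size) * epochs

N = max_iterations(num_samples=6400)   # hypothetical dataset size S
print(N)                               # 50000 iterations in total
print(cosine_annealed_lr(0, N))        # starts at eta_max = 1e-3
print(cosine_annealed_lr(N, N))        # decays to eta_min = 1e-9
```

The schedule starts exactly at η_max (cos 0 = 1) and ends exactly at η_min (cos π = −1), matching the endpoints given in the claim.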
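The mean-square loss of claim 5 (formula 6) and the PSNR metric monitored during training (claim 4) can be sketched together. This is a minimal NumPy version, assuming 8-bit images so the PSNR peak value is 255; the patent does not specify the PSNR formulation:

```python
import numpy as np

def mse_loss(ref, pred):
    """Formula 6: mean squared error averaged over batch and pixels (m, H, W)."""
    return np.mean((ref.astype(np.float64) - pred.astype(np.float64)) ** 2)

def psnr(ref, pred, peak=255.0):
    """Peak signal-to-noise ratio in dB, used to track training progress."""
    mse = mse_loss(ref, pred)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

ref = np.full((1, 4, 4), 100, dtype=np.uint8)   # toy reference batch
pred = ref + 10                                 # uniform error of 10 gray levels
print(mse_loss(ref, pred))   # 100.0
print(psnr(ref, pred))       # 10*log10(255**2 / 100) ~ 28.13 dB
```

A uniform pixel error of 10 gives an MSE of exactly 100, which makes the PSNR value easy to verify by hand.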

Description

Infrared image enhancement method based on depth full convolution neural network

Technical Field

The invention relates to the technical field of infrared images, in particular to an infrared image enhancement method based on a deep fully convolutional neural network.

Background

With the spread of commercial autonomous mobile devices, recognition increasingly has to work under extreme conditions such as night-time and unstable illumination. This need has motivated methods that use multi-modal sensors able to complement one another. A thermal camera provides a rich source of temperature information and is little affected by constantly changing illumination or background clutter. However, the resolution of existing thermal cameras is small compared to RGB cameras, which makes it difficult to exploit this information fully in recognition tasks. Enhancing infrared images to obtain high-quality infrared images is therefore of great significance for recognition tasks based on infrared thermal-imaging cameras. Conventional image enhancement methods such as histogram equalization (HE) enhance the target but also amplify background noise, with poor results. Contrast-limited adaptive histogram equalization (CLAHE) improves on this by suppressing background noise, but edges are easily blurred. To alleviate these problems, the present invention, after extensive analysis of existing methods, implements a low-resolution thermal-image enhancement method based on a deep fully convolutional neural network.
Disclosure of Invention

In view of the above technical shortcomings, the invention provides an infrared image enhancement method based on a deep fully convolutional neural network, which comprises the following steps: S1, acquiring and producing infrared high/low-gain image pairs to obtain rich training sample sets and test sets; S2, applying, according to the acquisition method used, the corresponding data preprocessing and data augmentation to the training sample set obtained in S1; S3, constructing an infrared image enhancement network by introducing convolution blocks and a residual structure; S4, feeding the training sample set obtained in S2 into the network designed in S3 for training, to obtain an infrared image enhancement model; S5, inputting the infrared test image obtained in S1 into the infrared image enhancement model trained in S4 to obtain an enhanced infrared image. Preferably, the two methods for obtaining the infrared high/low-gain image pairs in S1 are: the first is to synthesize grayscale infrared pictures, graying visible-light images from a public data set to obtain high-gain grayscale pictures; a random contrast function with a contrast-factor range of [0.5, 0.51] is then applied to lower the contrast of each high-gain grayscale picture and obtain the corresponding low-gain grayscale picture. The second is to use pseudo-color infrared pictures: a commercial infrared-core camera with a high/low-gain mode acquires high/low-gain infrared image pairs in different scenes, and the resulting image pairs form a data set.
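The first acquisition method above (synthesizing a low-gain image by reducing the contrast of a high-gain grayscale image with a random factor in [0.5, 0.51]) can be sketched as follows. This is a minimal NumPy illustration; scaling the contrast about the image mean is an assumption here, since the patent does not spell out the exact contrast function:

```python
import numpy as np

def synth_low_gain(high_gain, rng=None, lo=0.5, hi=0.51):
    """Scale contrast about the image mean by a random factor in [lo, hi]
    (the range given in the disclosure) to mimic a low-gain image."""
    if rng is None:
        rng = np.random.default_rng(0)
    factor = rng.uniform(lo, hi)
    mean = high_gain.astype(np.float32).mean()
    low_gain = mean + factor * (high_gain.astype(np.float32) - mean)
    return np.clip(low_gain, 0, 255).astype(np.uint8)

# Toy high-gain grayscale image spanning the full 0..255 range.
hg = np.linspace(0, 255, 64).reshape(8, 8).astype(np.uint8)
lg = synth_low_gain(hg)
# The synthesized image spans a narrower intensity range than the original.
assert int(lg.max()) - int(lg.min()) < int(hg.max()) - int(hg.min())
```

Halving the deviation from the mean roughly halves the intensity range, which is exactly the low-contrast degradation the network is later trained to undo.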
Preferably, the data preprocessing operation in S2 is: first, cropping the original image pairs with a sliding-window cropping algorithm, fixing and reducing the infrared image size so that the image length and width are equal; second, augmenting the cropped images by image flipping, image contrast transformation, image scaling, and similar operations. Preferably, the infrared image enhancement network constructed in S3 comprises the following: the image feature extraction part extracts infrared image features using 8 convolution layers, yielding 8 feature maps, with a LeakyReLU activation function after each convolution layer, the mathematical expression of the LeakyReLU activation function being f(x) = x for x > 0 and f(x) = αx for x ≤ 0, where α is a small positive slope; the feature maps are then fused by means of the residual structure so that they contain rich detail and semantic information, two skip connections among the 8 convolution layers fully fusing the features of each layer; finally, 1 convolution layer, likewise followed by a LeakyReLU activation function, outputs the result. Preferably, the structure of the convolution blocks in S3 is: for the first to fourth convolution layers, the convolution kernel size is set to 7×7, the stride to 1, and the padding to 3; for the fifth convolution layer, the kernel size is set to 5×5, the stride to 1, and the padding to 2; for the sixth to ninth convolution layers, the kernel size is set to 3×3, the stride to 1, and the padding to 3; all convolutions use zero padding. Preferably, the mathematical expression of the residual structure in S3 is y = f(x) + x, where x and y denote the input and output respectively and f(x) denotes the feature extraction layers.
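The activation and residual fusion used throughout the network (formula 1 and formula 2 in the claims) can be sketched in NumPy. The slope alpha and the toy feature extractor f below are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """Formula 1: identity for x > 0, alpha * x otherwise (alpha assumed here)."""
    return np.where(x > 0, x, alpha * x)

def residual_block(x, f):
    """Formula 2: y = f(x) + x, the skip connection that fuses shallow
    and deep features (e.g. layer 1 -> layer 4, layer 5 -> layer 8)."""
    return f(x) + x

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x))                   # negative inputs scaled by alpha
y = residual_block(x, lambda v: leaky_relu(v) * 0.1)   # toy feature extractor
assert np.allclose(y, leaky_relu(x) * 0.1 + x)
```

Because the skip connection adds the input back to the extracted features, the block only has to learn a residual correction, which is what the claims credit with avoiding vanishing/exploding gradients and easing back-propagation.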