
CN-121999326-A - Weak and small infrared target time-space domain feature fusion method and system

CN121999326A

Abstract

The invention provides a method and a system for fusing time-space domain features of weak and small infrared targets. The method comprises: inputting continuous multi-frame infrared weak and small target images into a feature fusion model for training, and obtaining a normalized continuous multi-frame infrared weak and small target image sequence during training; obtaining multi-scale space-time feature codes; obtaining enhanced feature codes; obtaining decoded multi-frame continuous feature maps; obtaining infrared weak and small target space-time fusion features; obtaining an updated feature fusion model; judging whether the preset number of training iterations is reached or the total loss of the feature fusion model has converged, and if so taking the currently updated model as the trained feature fusion model, otherwise training the updated model again; and determining the type of infrared target to be fused according to the actual feature fusion requirements, and inputting the infrared target to be fused together with its type into the trained feature fusion model to realize space-time feature fusion of different types of infrared targets.

Inventors

  • Song Zizhuang
  • Zhou Jie
  • Zhang Qiang
  • Lou Yaxin
  • Hou Qiwen

Assignees

  • China Aerospace Science and Industry Corporation Intelligent Technology Research Institute Co., Ltd. (航天科工集团智能科技研究院有限公司)

Dates

Publication Date
2026-05-08
Application Date
2024-11-05

Claims (8)

  1. A weak and small infrared target time-space domain feature fusion method, characterized by comprising the following steps: inputting continuous multi-frame infrared weak and small target images into a feature fusion model for training, and, during training, normalizing the images with a normalization module in the feature fusion model to obtain a normalized continuous multi-frame infrared weak and small target image sequence; compressing the space-time feature scale of the normalized sequence layer by layer with a residual downsampling module in the feature fusion model, and extracting space-time features with the multi-layer 3D convolution modules in the residual downsampling module to obtain multi-scale space-time feature codes; performing temporal causal modeling on the multi-scale space-time feature codes with a spatio-temporal convolutional gated recurrent unit in the feature fusion model to obtain enhanced feature codes; performing multi-scale space-time feature decoding on the enhanced feature codes with a decoding module in the feature fusion model to obtain decoded multi-frame continuous feature maps, the decoded feature maps having the same scale as the continuous multi-frame infrared weak and small target images; performing dimension constraint on the decoded multi-frame continuous feature maps with a dimension constraint module in the feature fusion model to obtain the infrared weak and small target space-time fusion features; computing the cross entropy loss and the soft intersection-over-union loss of the feature fusion model from those space-time fusion features, taking their sum as the total loss of the feature fusion model, and updating the model with the total loss to obtain an updated feature fusion model; judging whether the preset number of training iterations is reached or the total loss has converged: if so, taking the currently updated model as the trained feature fusion model, otherwise training the updated model again; and, once training of the feature fusion model is complete, determining the type of infrared target to be fused according to the actual feature fusion requirements, and inputting the infrared target to be fused together with its type into the trained feature fusion model to realize space-time feature fusion of different types of infrared targets.
  2. The method of claim 1, wherein the normalization is gray max-min normalization, mean-variance normalization, histogram-statistics normalization, or pixel-maximum normalization.
  3. The method according to claim 1 or 2, characterized in that the enhanced feature codes are obtained by:
     z_t = σ(W_z * X_t + U_z * H_(t-1) + b_z)
     r_t = σ(W_r * X_t + U_r * H_(t-1) + b_r)
     h̃_t = tanh(W_h * X_t + U_h * (r_t ⊙ H_(t-1)) + b_h)
     h_t = (1 - z_t) ⊙ H_(t-1) + z_t ⊙ h̃_t
     F_ehc = Concat_t(h_1, ..., h_T)
     wherein z_t is the output of the update gate; σ(·) is the sigmoid activation function; W_z and U_z are the first and second convolution weights of the update gate; X_t is the feature map at the current time in the multi-scale space-time feature code; H_(t-1) is the feature map state at the previous time; b_z is the convolution bias of the update gate; r_t is the output of the reset gate; W_r and U_r are the first and second convolution weights of the reset gate; b_r is the convolution bias of the reset gate; h̃_t is the candidate feature map state; W_h and U_h are the first and second convolution weights of the candidate gate; b_h is the convolution bias of the candidate gate; h_t is the current feature map state; ⊙ denotes element-wise multiplication; Concat_t(·) denotes stitching along the temporal dimension; and F_ehc is the enhanced feature encoding.
  4. The method of claim 1, wherein performing multi-scale space-time feature decoding on the enhanced feature codes with the decoding module in the feature fusion model to obtain feature maps of the same scale as the continuous multi-frame infrared weak and small target images comprises: performing nearest-neighbour upsampling on the enhanced feature codes with the decoding module to obtain upsampled enhanced coding features, and splicing the upsampled features with the enhanced feature codes of the corresponding scale along the channel dimension of the feature map by side connection, obtaining spliced enhanced feature codes; and performing 3D convolution on the spliced enhanced feature codes with a 3D convolution module in the decoding module to obtain the multi-scale space-time decoding features.
  5. The method of claim 1, wherein the number of dimension constraints is the number of infrared weak and small target categories plus one.
  6. The method of claim 1, wherein the total loss of the feature fusion model is obtained by:
     L_Total = L_CE(GT, F_head) + L_SIoU(GT, F_head)
     L_CE(GT, F_head) = -(1/(t·x·y)) Σ_t Σ_x Σ_y Σ_(c=1..C) y_c · log(p_c)
     L_SIoU(GT, F_head) = 1 - (1/C) Σ_(c=1..C) |GT_c ∩ F_head,c| / |GT_c ∪ F_head,c|
     wherein L_Total is the total loss; L_CE(GT, F_head) is the cross entropy loss; GT is the area of all true-value class pixels in the image; F_head is the weak and small infrared target space-time fusion feature; L_SIoU(GT, F_head) is the soft intersection-over-union loss; t, x, y and C are respectively the time, width, length and dimension constraint numbers of the feature map; y_c is the one-hot encoding; p_c is the probability that the current pixel belongs to class c; GT_c is the area of pixels with true-value class c in the image; and F_head,c is the area of pixels with predicted class c in the image.
  7. A weak and small infrared target time-space domain feature fusion system, the system comprising: a training module, configured to input continuous multi-frame infrared weak and small target images into a feature fusion model for training, and, during training, to normalize the images with a normalization module in the feature fusion model to obtain a normalized continuous multi-frame infrared weak and small target image sequence; an encoding module, configured to compress the space-time feature scale of the normalized sequence layer by layer with a residual downsampling module in the feature fusion model, and to extract space-time features with the multi-layer 3D convolution modules in the residual downsampling module to obtain multi-scale space-time feature codes; an enhancement module, configured to perform temporal causal modeling on the multi-scale space-time feature codes with a spatio-temporal convolutional gated recurrent unit in the feature fusion model to obtain enhanced feature codes; a decoding module, configured to perform multi-scale space-time feature decoding on the enhanced feature codes to obtain decoded multi-frame continuous feature maps of the same scale as the continuous multi-frame infrared weak and small target images; a dimension constraint module, configured to perform dimension constraint on the decoded multi-frame continuous feature maps to obtain the infrared weak and small target space-time fusion features; an updating module, configured to compute the cross entropy loss and the soft intersection-over-union loss of the feature fusion model from those fusion features, take their sum as the total loss, and update the feature fusion model with the total loss to obtain an updated feature fusion model; a judgment module, configured to judge whether the preset number of training iterations is reached or the total loss has converged, and if so to take the currently updated model as the trained feature fusion model, otherwise to train the updated model again; and a reasoning module, configured, once training of the feature fusion model is complete, to determine the type of infrared target to be fused according to the actual feature fusion requirements, and to input the infrared target to be fused together with its type into the trained feature fusion model to realize space-time feature fusion of different types of infrared targets.
  8. A computer device comprising a memory, a processor, and a weak and small infrared target time-space domain feature fusion program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 6 when executing the program.
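
The gray max-min normalization named in claim 2 can be sketched minimally as follows. The function names and the nested-list layout of a gray-level frame are assumptions of this illustration, not taken from the patent.

```python
# Hedged sketch of per-frame gray max-min normalization (one of the
# options in claim 2). A "frame" here is assumed to be a 2D list of
# gray values; a sequence is a list of such frames.

def max_min_normalize(frame):
    """Scale one gray-level frame to [0, 1] by its own min and max."""
    flat = [v for row in frame for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a flat frame
    return [[(v - lo) / span for v in row] for row in frame]

def normalize_sequence(frames):
    """Apply per-frame normalization to a multi-frame sequence."""
    return [max_min_normalize(f) for f in frames]

seq = [[[10, 20], [30, 40]], [[0, 50], [100, 50]]]
norm = normalize_sequence(seq)
```

Per-frame normalization keeps each frame's contrast independent of its neighbours; the other options in claim 2 (mean-variance, histogram-statistics, pixel-maximum) would replace only the body of max_min_normalize.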
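
The update/reset gating of claim 3 reads as a ConvGRU-style recurrence. Below is a scalar sketch of one step per frame; in the patent the products are convolutions over feature maps, so the scalar weights and inputs here are illustrative assumptions only.

```python
# Hedged scalar sketch of the gated recurrence in claim 3.
# Real use replaces each scalar product with a convolution; all
# weight values below are made up for illustration.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x_t, h_prev, w):
    z = sigmoid(w["Wz"] * x_t + w["Uz"] * h_prev + w["bz"])  # update gate
    r = sigmoid(w["Wr"] * x_t + w["Ur"] * h_prev + w["br"])  # reset gate
    h_cand = math.tanh(w["Wh"] * x_t + w["Uh"] * (r * h_prev) + w["bh"])
    return (1 - z) * h_prev + z * h_cand  # blend old state and candidate

w = dict(Wz=0.5, Uz=0.1, bz=0.0, Wr=0.5, Ur=0.1, br=0.0,
         Wh=1.0, Uh=0.2, bh=0.0)
h = 0.0
states = []
for x in [1.0, 0.5, -0.3]:  # one scalar "feature" per frame
    h = gru_step(x, h, w)
    states.append(h)  # stitching the states over time gives the
                      # enhanced encoding F_ehc of claim 3
```

The reset gate decides how much of the previous frame's state enters the candidate; the update gate decides how far the state moves toward that candidate, which is what lets the unit model temporal causality across frames.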
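
The decoder step of claim 4 (nearest-neighbour upsampling followed by channel concatenation with the encoder feature of matching scale, the "side connection") can be sketched on plain nested lists. The data layout and helper names are assumptions of this sketch.

```python
# Hedged sketch of the claim 4 decoder step: 2x nearest-neighbour
# upsampling of a feature grid, then stacking decoder and encoder
# channels. A channel is assumed to be a 2D list of values.

def nearest_upsample2x(grid):
    """Repeat each value 2x along both spatial axes."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))  # duplicate the widened row vertically
    return out

def concat_channels(decoder_channels, encoder_channels):
    """Side connection: stack channels from both paths."""
    return decoder_channels + encoder_channels

up = nearest_upsample2x([[1, 2], [3, 4]])
fused = concat_channels([up], [[[0] * 4 for _ in range(4)]])
```

In the patent the fused stack is then passed through a 3D convolution module; that step is omitted here since its kernel weights are learned.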
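
The total loss of claim 6 is the sum of a pixel-wise cross entropy and a soft IoU term; a hand-computable sketch is below. The exact reductions (mean over pixels, foreground-only soft IoU) are assumptions, and the tiny masks and probabilities are made up for illustration.

```python
# Hedged sketch of the claim 6 total loss: cross entropy plus soft
# intersection-over-union, summed. Inputs are tiny hand-made labels
# and probabilities, not real model output.
import math

def cross_entropy(gt_onehot, probs):
    """Mean over pixels of -sum_c y_c * log(p_c)."""
    total = 0.0
    for y, p in zip(gt_onehot, probs):
        total += -sum(yc * math.log(max(pc, 1e-12)) for yc, pc in zip(y, p))
    return total / len(gt_onehot)

def soft_iou_loss(gt, pred):
    """1 - soft intersection-over-union of foreground probabilities."""
    inter = sum(g * p for g, p in zip(gt, pred))
    union = sum(g + p - g * p for g, p in zip(gt, pred))
    return 1.0 - inter / max(union, 1e-12)

gt_onehot = [[1, 0], [0, 1], [0, 1]]              # per-pixel one-hot labels
probs = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]      # per-pixel class probs
gt_fg = [0, 1, 1]                                  # foreground mask
pred_fg = [0.1, 0.8, 0.7]                          # predicted foreground
total_loss = cross_entropy(gt_onehot, probs) + soft_iou_loss(gt_fg, pred_fg)
```

Pairing the two terms is a common design for small-target segmentation: cross entropy supervises every pixel, while the soft IoU term counteracts the foreground/background imbalance that tiny targets create.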

Description

Weak and small infrared target time-space domain feature fusion method and system

Technical Field

The invention relates to the technical field of computer vision image recognition, in particular to a weak and small infrared target time-space domain feature fusion method and system.

Background

Owing to constraints of manufacturing processes and materials, infrared focal plane arrays suffer from non-uniform noise and blind pixels. As a result, weak and small target features in a single-frame infrared image are drowned in noise, misled by similar blind-pixel features, and disturbed by complex backgrounds. Conventional single-frame feature extraction methods are highly susceptible to these problems: the dimensionality of the extracted weak and small target features is limited, and the features are inaccurate or even misleading, which harms subsequent detection and identification of weak and small targets and therefore the whole system.

Disclosure of Invention

The invention provides a weak and small infrared target time-space domain feature fusion method and system, which can solve the technical problem of low accuracy of single-frame feature extraction and fusion for weak and small infrared targets in the prior art.
According to one aspect of the invention, a weak and small infrared target time-space domain feature fusion method is provided, comprising the following steps: inputting continuous multi-frame infrared weak and small target images into a feature fusion model for training, and, during training, normalizing the images with a normalization module in the feature fusion model to obtain a normalized continuous multi-frame infrared weak and small target image sequence; compressing the space-time feature scale of the normalized sequence layer by layer with a residual downsampling module in the feature fusion model, and extracting space-time features with the multi-layer 3D convolution modules in the residual downsampling module to obtain multi-scale space-time feature codes; performing temporal causal modeling on the multi-scale space-time feature codes with a spatio-temporal convolutional gated recurrent unit in the feature fusion model to obtain enhanced feature codes; performing multi-scale space-time feature decoding on the enhanced feature codes with a decoding module in the feature fusion model to obtain decoded multi-frame continuous feature maps of the same scale as the input images; performing dimension constraint on the decoded feature maps with a dimension constraint module to obtain the infrared weak and small target space-time fusion features; computing the cross entropy loss and the soft intersection-over-union loss of the feature fusion model from those fusion features, taking their sum as the total loss, and updating the model with the total loss to obtain an updated feature fusion model; judging whether the preset number of training iterations is reached or the total loss has converged: if so, taking the currently updated model as the trained feature fusion model, otherwise training the updated model again; and, once training is complete, determining the type of infrared target to be fused according to the actual feature fusion requirements, and inputting the infrared target to be fused together with its type into the trained feature fusion model to realize space-time feature fusion of different types of infrared targets.

Preferably, the normalization is performed by gray max-min normalization, mean-variance normalization, histogram-statistics normalization, or pixel-maximum normalization.

Preferably, the enhanced feature codes are obtained by:

z_t = σ(W_z * X_t + U_z * H_(t-1) + b_z)
r_t = σ(W_r * X_t + U_r * H_(t-1) + b_r)

wherein z_t is the output of the update gate; σ(·) is the sigmoid activation function; W_z and U_z are the first and second convolution weights of the update gate; X_t is the feature map at the current time in the multi-scale space-time feature code; H_(t-1) is the feature map state at the previous time; b_z is the convolution bias of the update gate; r_t is the output of the reset gate; W_r and U_r are the first and second convolution weights of the reset gate; b_r is the convolution bias of the reset gate; and h̃_t is the candidate