CN-121999200-A - Vehicle change detection method and device based on multi-view visible light-thermal infrared

Abstract

The invention provides a vehicle change detection method and device based on multi-view visible light-thermal infrared, belonging to the technical field of artificial intelligence. The method comprises: predicting spatial transformation parameters based on a coarse alignment visible light feature vector of a visible light image to be detected and a coarse alignment thermal infrared feature vector of a thermal infrared image to be detected; deforming the coarse alignment visible light feature vector based on a preset spatial parameter transformation matrix and the spatial transformation parameters to obtain a fine alignment visible light feature vector; fusing the coarse alignment thermal infrared feature vector and the fine alignment visible light feature vector to obtain a multi-modal fusion feature vector; and inputting the multi-modal fusion feature vector to a detection head of a vehicle change detection model to obtain a vehicle change detection result, output by the vehicle change detection model, for a region to be detected. The invention improves the target-level vehicle change detection precision and generalization capability of the vehicle change detection model on multi-view visible light-thermal infrared images.

Inventors

SUN HE
GAO LIANRU
CAI LUYANG
SUN XU

Assignees

Aerospace Information Research Institute, Chinese Academy of Sciences (中国科学院空天信息创新研究院)

Dates

Publication Date: 20260508
Application Date: 20260109

Claims (10)

1. A multi-view visible light-thermal infrared based vehicle change detection method, comprising: obtaining a coarse alignment visible light feature vector of a visible light image to be detected and a coarse alignment thermal infrared feature vector of a thermal infrared image to be detected in a region to be detected, wherein the time phases of the visible light image to be detected and the thermal infrared image to be detected are different; predicting spatial transformation parameters based on the coarse alignment visible light feature vector and the coarse alignment thermal infrared feature vector; deforming the coarse alignment visible light feature vector based on a preset spatial parameter transformation matrix and the spatial transformation parameters to obtain a fine alignment visible light feature vector; fusing the coarse alignment thermal infrared feature vector and the fine alignment visible light feature vector to obtain a multi-modal fusion feature vector; and inputting the multi-modal fusion feature vector to a detection head of a vehicle change detection model to obtain a vehicle change detection result in the region to be detected output by the vehicle change detection model.
2. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 1, wherein the obtaining of the coarse alignment visible light feature vector of the visible light image to be detected and the coarse alignment thermal infrared feature vector of the thermal infrared image to be detected in the region to be detected comprises: inputting the visible light image to be detected to a first feature extraction layer of a first backbone network branch of the vehicle change detection model to obtain an initial visible light feature vector output by the first feature extraction layer; inputting the thermal infrared image to be detected to a second feature extraction layer of a second backbone network branch of the vehicle change detection model to obtain an initial thermal infrared feature vector output by the second feature extraction layer; inputting the initial visible light feature vector to a first self-attention layer of the first backbone network branch to obtain a visible light intermediate feature vector output by the first self-attention layer; inputting the initial thermal infrared feature vector to a second self-attention layer of the second backbone network branch to obtain a thermal infrared intermediate feature vector output by the second self-attention layer; inputting a first partial vector of the visible light intermediate feature vector and a second partial vector of the thermal infrared intermediate feature vector to a first cross-attention layer of the first backbone network branch to obtain the coarse alignment visible light feature vector output by the first cross-attention layer; and inputting a third partial vector of the visible light intermediate feature vector and a fourth partial vector of the thermal infrared intermediate feature vector to a second cross-attention layer of the second backbone network branch to obtain the coarse alignment thermal infrared feature vector output by the second cross-attention layer.
3. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 1, wherein the spatial transformation parameters include a translation parameter, a scale parameter and a rotation parameter, and the predicting of spatial transformation parameters based on the coarse alignment visible light feature vector and the coarse alignment thermal infrared feature vector comprises: concatenating the coarse alignment visible light feature vector and the coarse alignment thermal infrared feature vector to obtain a concatenated feature representation; performing a dimension reduction operation on the concatenated feature representation to obtain a low-dimensional feature representation; and performing a global average pooling operation and a convolution operation on the low-dimensional feature representation to predict the spatial transformation parameters; and wherein the deforming of the coarse alignment visible light feature vector based on a preset spatial parameter transformation matrix and the spatial transformation parameters to obtain a fine alignment visible light feature vector comprises: adjusting the preset spatial parameter transformation matrix according to the translation parameter, the scale parameter and the rotation parameter to generate an affine transformation network; and deforming the coarse alignment visible light feature vector using the affine transformation network to obtain the fine alignment visible light feature vector.
4. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 1, wherein the fusing of the coarse alignment thermal infrared feature vector and the fine alignment visible light feature vector to obtain a multi-modal fusion feature vector comprises: acquiring a thermal infrared channel attention weight of the coarse alignment thermal infrared feature vector and a visible light channel attention weight of the fine alignment visible light feature vector; performing attention enhancement on the coarse alignment thermal infrared feature vector based on the thermal infrared channel attention weight to obtain a channel-decoupled thermal infrared first enhancement feature vector; performing attention enhancement on the fine alignment visible light feature vector based on the visible light channel attention weight to obtain a channel-decoupled visible light first enhancement feature vector; acquiring a thermal infrared spatial attention weight of the thermal infrared first enhancement feature vector and a visible light spatial attention weight of the visible light first enhancement feature vector; performing attention enhancement on the thermal infrared first enhancement feature vector based on the thermal infrared spatial attention weight to obtain a spatially decoupled thermal infrared second enhancement feature vector; performing attention enhancement on the visible light first enhancement feature vector based on the visible light spatial attention weight to obtain a spatially decoupled visible light second enhancement feature vector; and fusing the thermal infrared second enhancement feature vector and the visible light second enhancement feature vector to obtain the multi-modal fusion feature vector.
5. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 4, wherein the acquiring of the thermal infrared channel attention weight of the coarse alignment thermal infrared feature vector and the visible light channel attention weight of the fine alignment visible light feature vector comprises: performing global average pooling on the coarse alignment thermal infrared feature vector to obtain a first pooled vector, performing max pooling on the coarse alignment thermal infrared feature vector to obtain a second pooled vector, performing global average pooling on the fine alignment visible light feature vector to obtain a third pooled vector, and performing max pooling on the fine alignment visible light feature vector to obtain a fourth pooled vector; concatenating the first pooled vector, the second pooled vector, the third pooled vector and the fourth pooled vector to obtain a first concatenated vector; and inputting the first concatenated vector to a shared-weight multi-layer perceptron layer in the vehicle change detection model to obtain the thermal infrared channel attention weight and the visible light channel attention weight output by the multi-layer perceptron layer.
6. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 4, wherein the acquiring of the thermal infrared spatial attention weight of the thermal infrared first enhancement feature vector and the visible light spatial attention weight of the visible light first enhancement feature vector comprises: concatenating the thermal infrared first enhancement feature vector and the visible light first enhancement feature vector to obtain a second concatenated vector; and inputting the second concatenated vector to a lightweight convolutional network of the vehicle change detection model to obtain the thermal infrared spatial attention weight and the visible light spatial attention weight output by the lightweight convolutional network.
7. The multi-view visible light-thermal infrared based vehicle change detection method according to claim 1, wherein the vehicle change detection model is obtained by end-to-end joint optimization based on a multi-task loss function; the multi-task loss function is determined based on a target detection loss, a change classification loss and a spatial alignment loss; the target detection loss is determined based on a bounding box regression loss, a target confidence loss and a vehicle class prediction loss; the change classification loss is determined based on a vehicle change type loss, which is determined based on a predicted vehicle change type and a vehicle change type label; and the vehicle class prediction loss is determined based on an alignment structure similarity loss and an L1 distance loss, the alignment structure similarity loss being determined based on the fine alignment visible light feature vector and the coarse alignment thermal infrared feature vector.
8. A multi-view visible light-thermal infrared based vehicle change detection device, comprising: a first alignment module, configured to acquire a coarse alignment visible light feature vector of a visible light image to be detected and a coarse alignment thermal infrared feature vector of a thermal infrared image to be detected in a region to be detected, wherein the time phases of the visible light image to be detected and the thermal infrared image to be detected are different; a parameter prediction module, configured to predict spatial transformation parameters based on the coarse alignment visible light feature vector and the coarse alignment thermal infrared feature vector; a second alignment module, configured to deform the coarse alignment visible light feature vector based on a preset spatial parameter transformation matrix and the spatial transformation parameters to obtain a fine alignment visible light feature vector; a feature fusion module, configured to fuse the coarse alignment thermal infrared feature vector and the fine alignment visible light feature vector to obtain a multi-modal fusion feature vector; and a target detection module, configured to input the multi-modal fusion feature vector to a detection head of a vehicle change detection model to obtain a vehicle change detection result in the region to be detected output by the vehicle change detection model.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the multi-view visible light-thermal infrared based vehicle change detection method according to any one of claims 1 to 7 when executing the computer program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the multi-view visible light-thermal infrared based vehicle change detection method according to any one of claims 1 to 7.
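
As an illustration of the fine alignment step described in claim 3, the following minimal Python sketch composes a 2x3 affine matrix from translation, scale and rotation parameters and uses it to resample a single-channel feature map. It is a sketch under stated assumptions, not the claimed implementation: the function names (`affine_matrix`, `warp`), the identity default for the preset matrix, the normalized [-1, 1] coordinate convention, bilinear sampling and zero padding are all choices made here for illustration and are not specified in the claims.

```python
import math

# Hypothetical default for the "preset spatial parameter transformation matrix":
# a 2x3 identity affine matrix (an assumption; the claims do not give its values).
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))


def affine_matrix(tx, ty, scale, theta, preset=IDENTITY):
    """Adjust the preset 2x3 matrix with predicted translation, scale and rotation."""
    c, s = math.cos(theta), math.sin(theta)
    pred = ((scale * c, -scale * s, tx), (scale * s, scale * c, ty))
    # Compose preset * pred, treating both as 3x3 homogeneous matrices.
    return tuple(
        tuple(
            preset[i][0] * pred[0][j] + preset[i][1] * pred[1][j]
            + (preset[i][2] if j == 2 else 0.0)
            for j in range(3)
        )
        for i in range(2)
    )


def _bilinear(feat, px, py):
    """Bilinearly sample feat at pixel position (px, py); zero outside the map."""
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(px), math.floor(py)
    fx, fy = px - x0, py - y0
    value = 0.0
    for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= xx < w and 0 <= yy < h:
                value += wx * wy * feat[yy][xx]
    return value


def warp(feat, matrix):
    """Deform a 2D feature map: each output pixel samples the input at A @ (x, y, 1)."""
    h, w = len(feat), len(feat[0])
    sx, sy = max(w - 1, 1), max(h - 1, 1)
    out = []
    for i in range(h):
        y = 2.0 * i / sy - 1.0  # normalized row coordinate in [-1, 1]
        row = []
        for j in range(w):
            x = 2.0 * j / sx - 1.0  # normalized column coordinate in [-1, 1]
            xs = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
            ys = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
            row.append(_bilinear(feat, (xs + 1.0) * sx / 2.0, (ys + 1.0) * sy / 2.0))
        out.append(row)
    return out


coarse = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
same = warp(coarse, affine_matrix(0.0, 0.0, 1.0, 0.0))     # identity parameters
shifted = warp(coarse, affine_matrix(1.0, 0.0, 1.0, 0.0))  # one-pixel shift in x
```

With identity parameters the feature map is returned unchanged. On a map three pixels wide, a normalized translation of 1.0 corresponds to a one-pixel shift, and positions sampled outside the map are filled with zeros. In a trained model the four parameters would come from the parameter prediction step rather than being set by hand.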

Description

Vehicle change detection method and device based on multi-view visible light-thermal infrared

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a vehicle change detection method and device based on multi-view visible light-thermal infrared.

Background

Object-level (target-level) change detection is a technology for identifying changes in objects of interest by comparing images of the same region acquired at different time phases, and is a key technology in fields such as remote sensing, urban monitoring and urban management. In practice, the multi-temporal images used for target-level change detection are often acquired from highly mobile platforms such as unmanned aerial vehicles, so the earlier and later images differ in imaging angle, and they are often of different modalities, for example a visible light image and a thermal infrared image. Commonly used target-level change detection methods rely on image-level registration by geometric transformation: pixel control points are selected in the earlier and later images to be detected, a geometric transformation yields pre-registered images to be detected, and target detection is then performed on the pre-registered images.
However, the features of corresponding (same-name) points differ greatly across images to be detected that have different time phases, different modalities and multiple view angles. Pixel-based image-level alignment does not take the deep features of targets and background into account, which makes vehicle target-level changes difficult to detect. Moreover, images of different modalities, such as visible light images and thermal infrared images, have different imaging mechanisms; even after multi-view images are spatially aligned, their feature distributions still differ significantly, and these differences are difficult to measure directly. As a result, vehicle target-level change detection across multiple view angles and different modalities suffers from low detection precision and poor generalization capability.

Disclosure of Invention

The invention provides a vehicle change detection method and device based on multi-view visible light-thermal infrared, to overcome the low detection precision and poor generalization capability of vehicle target-level change detection under different modalities and multiple view angles in the prior art.
The invention provides a vehicle change detection method based on multi-view visible light-thermal infrared, comprising: obtaining a coarse alignment visible light feature vector of a visible light image to be detected and a coarse alignment thermal infrared feature vector of a thermal infrared image to be detected in a region to be detected, wherein the time phases of the visible light image to be detected and the thermal infrared image to be detected are different; predicting spatial transformation parameters based on the coarse alignment visible light feature vector and the coarse alignment thermal infrared feature vector; deforming the coarse alignment visible light feature vector based on a preset spatial parameter transformation matrix and the spatial transformation parameters to obtain a fine alignment visible light feature vector; fusing the coarse alignment thermal infrared feature vector and the fine alignment visible light feature vector to obtain a multi-modal fusion feature vector; and inputting the multi-modal fusion feature vector to a detection head of a vehicle change detection model to obtain a vehicle change detection result in the region to be detected output by the vehicle change detection model.
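
The fusion step can be pictured with a small channel-attention example in the spirit of claims 4 and 5. The following Python sketch is illustrative only and rests on several assumptions that are not in the source: feature vectors are represented as lists of 2D channels, the shared-weight multi-layer perceptron is replaced by a fixed stand-in (a sigmoid of the sum of the average-pooled and max-pooled values, computed per modality rather than on the concatenated pooled vectors), the spatial attention stage is omitted, and the two modalities are fused by element-wise addition. The function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_weights(feat):
    """One attention weight per channel from global average and max pooling."""
    weights = []
    for channel in feat:
        values = [v for row in channel for v in row]
        avg_pool = sum(values) / len(values)
        max_pool = max(values)
        # Stand-in for the shared-weight MLP (assumption): sigmoid of the pooled sum.
        weights.append(sigmoid(avg_pool + max_pool))
    return weights


def enhance(feat, weights):
    """Scale every channel by its attention weight."""
    return [[[w * v for v in row] for row in ch] for ch, w in zip(feat, weights)]


def fuse(tir_feat, vis_feat):
    """Channel-enhance both modalities, then fuse by element-wise addition (assumed)."""
    tir = enhance(tir_feat, channel_weights(tir_feat))
    vis = enhance(vis_feat, channel_weights(vis_feat))
    return [
        [[a + b for a, b in zip(row_t, row_v)] for row_t, row_v in zip(ch_t, ch_v)]
        for ch_t, ch_v in zip(tir, vis)
    ]


# Two channels of size 1x2 per modality (toy values).
tir = [[[0.0, 0.0]], [[1.0, 3.0]]]  # coarse alignment thermal infrared features
vis = [[[2.0, 2.0]], [[0.0, 0.0]]]  # fine alignment visible light features
fused = fuse(tir, vis)
```

Channels with stronger pooled responses receive weights closer to 1, so they contribute more to the fused result. A learned multi-layer perceptron would replace the fixed stand-in in an actual model.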
According to the multi-view visible light-thermal infrared based vehicle change detection method provided by the invention, obtaining the coarse alignment visible light feature vector of the visible light image to be detected and the coarse alignment thermal infrared feature vector of the thermal infrared image to be detected in the region to be detected comprises: inputting the visible light image to a first feature extraction layer of a first backbone network branch of a vehicle change detection model to obtain an initial visible light feature vector output by the first feature extraction layer; inputting the thermal infrared image to a second feature extraction layer of a second backbone network branch of the vehicle change detection model to obtain an initial thermal infrared feature vector output by the second feature extraction layer; inputting the initial visible light feature vector to a first self-attention layer of the first backbone network branch to obtain a visible light intermediate feature vector output by the first self-attention layer; inputting the initial thermal infrared feature vector to a second s