CN-116863245-B - Lightweight RGB-IR fusion small target detection method

Abstract

A lightweight RGB-IR fusion small target detection method comprises the following steps: S100, preprocessing an image to be detected, wherein the preprocessing refers to aligning the IR and RGB images by homography transformation to obtain a homography transformation matrix, and taking the RGB image as the first three channels and the IR image as the fourth channel to form a four-channel image tensor; and S200, inputting the preprocessed image tensor into a lightweight RGB-IR fusion small target detection network to perform target detection, obtaining the type, position, width and height of the targets in the image to be detected. The method offers robust detection and a small computational load, can run in real time on embedded devices with limited computing power, and can be widely used in the field of monitoring and security.
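
By way of illustration only, the preprocessing of step S100 can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the patented implementation: the helper names `warp_homography`, `adjust_translation` and `make_four_channel` are hypothetical, the homography is assumed to be already known and normalized so that its bottom-right element is 1, and nearest-neighbour sampling is used for brevity. The `adjust_translation` helper mirrors the manual correction of the translation part in the X and Y directions recited in claim 1.

```python
import numpy as np

def warp_homography(image, H, out_shape):
    """Warp a single-channel image into the reference frame.

    H is a 3x3 homography mapping source (IR) pixel coordinates to
    reference (RGB) pixel coordinates. Nearest-neighbour sampling is an
    assumption made for brevity; pixels with no source stay zero.
    """
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dst = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    # Inverse mapping: find the source pixel for every output pixel.
    src = np.linalg.inv(H) @ dst.astype(np.float64)
    src_x = np.rint(src[0] / src[2]).astype(int)
    src_y = np.rint(src[1] / src[2]).astype(int)
    valid = ((src_x >= 0) & (src_x < image.shape[1]) &
             (src_y >= 0) & (src_y < image.shape[0]))
    out = np.zeros(out_h * out_w, dtype=image.dtype)
    out[valid] = image[src_y[valid], src_x[valid]]
    return out.reshape(out_h, out_w)

def adjust_translation(H, dx, dy):
    """Manually shift the translation part of H (assumes H[2, 2] == 1)."""
    H_adj = np.asarray(H, dtype=np.float64).copy()
    H_adj[0, 2] += dx  # X-direction correction
    H_adj[1, 2] += dy  # Y-direction correction
    return H_adj

def make_four_channel(rgb, ir, H):
    """Align IR to RGB and stack: channels 0-2 are RGB, channel 3 is IR."""
    aligned_ir = warp_homography(ir, H, rgb.shape[:2])
    return np.dstack([rgb, aligned_ir])

# Toy data: a black RGB frame and an IR frame with distinct pixel values.
rgb = np.zeros((6, 8, 3), dtype=np.uint8)
ir = np.arange(48, dtype=np.uint8).reshape(6, 8)
H = adjust_translation(np.eye(3), dx=2, dy=1)
tensor = make_four_channel(rgb, ir, H)
print(tensor.shape)  # (6, 8, 4)
```

The sketch keeps a height-width-channel layout; a detection network would typically expect the channel axis first, which is a separate transpose not shown here.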

Inventors

ZHOU YANHUI
YANG XU
MA WENBIAO
GE CHENYANG

Assignees

西安交通大学

Dates

Publication Date: 20260512
Application Date: 20230728

Claims (4)

1. A lightweight RGB-IR fusion small target detection method, comprising the following steps: S100, preprocessing an image to be detected, wherein the preprocessing refers to aligning the IR and RGB images by homography transformation to obtain a homography transformation matrix, and taking the RGB image as the first three channels and the IR image as the fourth channel to form a four-channel image tensor; S200, inputting the preprocessed image tensor into a lightweight RGB-IR fusion small target detection network to perform target detection, obtaining the type, position, width and height of the targets in the image to be detected; wherein the homography transformation depends on the homography transformation matrix, whose translation part deviates as the detection distance changes, and the translation deviation is corrected by manually adjusting the translation part in the X direction and the Y direction; wherein the lightweight RGB-IR fusion small target detection network in step S200 is improved on the basis of the YOLOv framework, and the network comprises an Upsample upsampling module, a Concat module, a GCR module, an SPPF module, a MaxPool max pooling module, a small target detection layer, and a C3SE module improved by using the SE attention mechanism; the C3SE module comprises a GCR module and an SE-BottleNeck module, wherein the SE-BottleNeck module is formed by serially combining the GCR module and the SE attention module, and the SE attention module comprises a squeeze operation module, an excitation operation module and a scaling operation module; the small target detection layer is obtained as follows: a GCR module is added after the C3 module of the 6th layer of the reference network YOLOv-n to adjust the number of channels, an Upsample module then upsamples the result, the upsampled output is spliced with the same-size output of the 2nd-layer GCR module, and finally the small target feature layer is obtained through a C3 module; the SPPF module comprises a GCR module and a MaxPool max pooling module, wherein GCR denotes the combination of a Ghost convolution with a ReLU activation function.
2. The method of claim 1, wherein the preprocessing in step S100 specifically includes performing a homography transformation on the IR image with the RGB image as the reference, so that the transformed IR image is aligned with the RGB image, and forming a four-channel tensor from the IR image and the RGB image, wherein the RGB image is used as the first three channels, the transformed IR image is used as the fourth channel, and the combined four-channel tensor is used as the input of the lightweight RGB-IR fusion small target detection network.
3. The method of claim 1, wherein the squeeze operation module converts the input feature map from the C3 module, through global average pooling, into a vector whose length equals the number of channels of the feature map, wherein the excitation operation module adjusts the obtained vector by convolution, and wherein the scaling operation module multiplies the input feature, channel by channel, by the adjusted vector used as channel weights, thereby completing the channel attention mechanism and realizing weight distribution at the channel level.
4. The method according to claim 1, wherein the training process of the lightweight RGB-IR fusion small target detection network in step S200 comprises the following steps: Step 1, randomly initializing the weights of the fusion small target detection network, inputting training set pictures into the fusion small target detection network, and obtaining prediction results for the input training set pictures; Step 2, calculating a loss from the prediction results and the labels contained in the training set, and performing back propagation on the fusion small target detection network according to the calculated loss by using the SGD gradient descent method; Step 3, repeatedly executing Step 1 and Step 2, finishing training when the fusion small target detection network converges, and increasing the number of training iterations until convergence if the fusion small target detection network has not converged.
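
For illustration only, the squeeze, excitation and scaling operations recited in claim 3 can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the claimed network: the function name `se_attention`, the two-stage excitation with a ReLU and a sigmoid, and the reduction ratio `r` follow the commonly used SE design and are assumptions, since claim 3 only states that the vector is adjusted by convolution. A 1x1 convolution applied to a C x 1 x 1 vector is equivalent to a matrix product, which is how it is written here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_attention(feature, w_reduce, w_expand):
    """Channel attention on a feature map of shape (C, H, W).

    w_reduce has shape (C // r, C) and w_expand has shape (C, C // r);
    both stand in for 1x1 convolution kernels (an assumption).
    """
    # Squeeze: global average pooling gives one value per channel.
    squeezed = feature.mean(axis=(1, 2))
    # Excitation: two 1x1 convolutions, written as matrix products.
    hidden = np.maximum(w_reduce @ squeezed, 0.0)  # ReLU
    weights = sigmoid(w_expand @ hidden)           # channel weights in (0, 1)
    # Scaling: multiply every channel of the input by its weight.
    return feature * weights[:, None, None], weights

# Toy example with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
feature = rng.standard_normal((C, H, W))
w_reduce = rng.standard_normal((C // r, C)) * 0.1
w_expand = rng.standard_normal((C, C // r)) * 0.1
scaled, weights = se_attention(feature, w_reduce, w_expand)
print(scaled.shape, weights.shape)
```

The output keeps the shape of the input feature map, so a block of this kind can be placed in series with other modules without changing the surrounding tensor sizes.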

Description

Lightweight RGB-IR fusion small target detection method

Technical Field

The invention belongs to the fields of computer vision, pattern recognition and artificial intelligence, and particularly relates to a lightweight RGB-IR fusion small target detection method.

Background

With the advent of the intelligent age, the application scenarios of vision systems have become more diverse. The visible light camera is particularly sensitive to illumination and therefore has certain limitations in low-brightness or backlit environments, while the infrared camera has low resolution and blurred edges, and its imaging details are inferior to those of the visible light camera under ideal illumination conditions. RGB-IR fusion detection therefore shows great advantages and value, and has important applications in unmanned aerial vehicles, smart homes, robots, medicine, national defense and the like.

Target detection generally comprises the following steps: performing frame-by-frame target detection on the images in an input video stream; outputting a feature map of each image through a feature extraction module; fusing the extracted features through a feature fusion module to obtain a feature map that combines low-dimensional and high-dimensional information; performing regression prediction on the feature map to obtain the coordinate parameters of the detection boxes and the class confidence of the detected targets; and finally mapping the result back onto the input image.

On one hand, mainstream target detection algorithms such as YOLO have no special design aimed at the characteristics of RGB-IR fusion detection and cannot use infrared and visible light information simultaneously. On the other hand, current fusion detection models have a large number of parameters and a large computational load, so real-time detection at a high frame rate cannot be achieved on embedded devices.

Disclosure of Invention

In order to solve the above technical problems, the invention discloses a lightweight RGB-IR fusion small target detection method, which comprises the following steps:

S100, preprocessing an image to be detected, wherein the preprocessing refers to aligning the IR and RGB images by homography transformation to obtain a homography transformation matrix, and taking the RGB image as the first three channels and the IR image as the fourth channel to form a four-channel image tensor;

S200, inputting the preprocessed image tensor into a lightweight RGB-IR fusion small target detection network to perform target detection, obtaining the type, position, width and height of the targets in the image to be detected.

Through the above technical solution, small targets can be detected in complex environments, and the method features accurate detection and localization, strong robustness, high precision and support for real-time detection. The method is suitable not only for images but also for dynamic small target detection in RGB-IR video streams. The method can be widely applied in the field of monitoring and security; in this application scenario, distant person and vehicle targets occupy few pixels in the image and are referred to as small targets. The method has a small computational load and can perform real-time fusion and small target detection on embedded devices with limited computing power.

Drawings

FIG. 1 is a flow chart of a lightweight RGB-IR fusion small target detection method provided in one embodiment of the invention;
FIG. 2 is a schematic diagram of a lightweight RGB-IR fusion small target detection network provided in one embodiment of the present disclosure;
FIG. 3 is a schematic illustration of the Ghost convolution principle referenced in one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of an SE attention mechanism referenced in one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a C3SE module referenced in one embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a GCR module referenced in one embodiment of the present disclosure;
FIG. 7 is a schematic diagram of an SE-BottleNeck module referenced in one embodiment of the present disclosure;
FIG. 8 is a schematic diagram of an SE attention mechanism referenced in one embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an SPPF module referenced in one embodiment of the present disclosure;
FIG. 10 is a schematic diagram of an application device used in the field of monitoring security according to one embodiment of the present disclosure;
FIG. 11 is a graph comparing the effect of using RGB-IR fusion detection with pure RGB detection on a practical application device according to one embodiment of the present disclosure.

Detailed Description

In order for those skilled in the art to understand the technical solutions disclosed in the present invention, the technical solutions of the various embodiments will be de