CN-116863321-B - SSE-YOLO deep learning model-based forward-looking sonar image small target recognition method

Abstract

The invention discloses a method for recognizing small targets in forward-looking sonar images based on an SSE-YOLO deep learning model, and belongs to the field of deep learning. Building on the classical YOLOv3 model, the method designs a lightweight feature extraction module, SSE, and uses it to build the backbone network; fuses shallower-layer features with deep-layer features to construct a new prediction layer, which strengthens the representation of small targets; and applies the convolutional block attention module (CBAM) in the neck network to enhance the features of detection targets and improve resistance to background interference. The method balances performance against resource consumption, enabling fast, real-time detection of sonar image targets on equipment with limited computing capacity. In evaluation, both the average precision and the detection speed of the method are higher than those of the existing mainstream lightweight algorithm YOLOv n, so that embedded deployment can be better realized.

Inventors

ZHOU TIAN
YANG HUILING
YU XIAOYANG
ZHANG TENGFEI

Assignees

Harbin Engineering University (哈尔滨工程大学)

Dates

Publication Date: 20260512
Application Date: 20230705

Claims (8)

1. A method for identifying a small target of a forward-looking sonar image based on an SSE-YOLO deep learning model, characterized by comprising the following steps: S1, collecting sonar images and constructing an underwater small-target dataset; S2, constructing an initial feature extraction module PREConv and a lightweight feature extraction module SSE, constructing a backbone feature extraction network, and inputting the dataset into the improved SSE-YOLO model for training to extract target features; S3, improving the neck network, constructing a YOLO prediction head aimed at detecting smaller targets, introducing the CBAM attention mechanism, and finally obtaining an optimized SSE-YOLO deep learning model; in step S2, the ordinary convolution in the initial feature extraction module CBL of the original algorithm's network is replaced by a depthwise separable convolution to construct the initial feature extraction module PREConv, the PREConv convolution block being composed as follows: 1 depthwise separable convolution layer, divided into 1 channel-by-channel (depthwise) convolution and 1 point-by-point (pointwise) convolution with kernel sizes of 3×3 and 1×1 respectively and a padding of 1; 1 batch normalization layer; and 1 LeakyReLU activation function; in step S2, the lightweight feature extraction module SSE is constructed as a multi-path structure using ideas including residual connection, depthwise separable convolution, the SE attention mechanism and proportional channel splitting; the SSE module applies a 1×1 ordinary convolution to the input feature map to preprocess it and adjust its number of channels to twice the original number, and splits the feature map into two branches through 1×1 ordinary convolution with the split ratio set to 0.25, wherein one branch performs no operation on the data and the other branch further extracts features; a Dense block keeps the number of input channels equal to the number of output channels: the input feature map is first split by channel so that each of the two branches receives 1/2 of the channels, the right branch is left unchanged, and the left branch consists of 2 convolutions with a stride of 1, namely a 1×1 ordinary convolution and a 3×3 depthwise convolution; a Concat operation is performed on the two branches so that the channel numbers are added and the features are fused, and channel shuffling is used to exchange information between the different groups so that the channels are fully fused; and the SE attention mechanism module is embedded into the SSE module, the SE attention mechanism module calculating the importance of each feature channel by obtaining a weight factor for each channel's feature layer, so that the network model pays more attention to the feature channels related to the target and its noise resistance is effectively improved.
2. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model according to claim 1, wherein a multi-beam forward-looking sonar is adopted as the acoustic image acquisition equipment, and the targets to be detected in the acoustic images are annotated with the annotation software LabelImg, so that an underwater small-target dataset is constructed.
3. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model, characterized in that the sonar operates in a high-frequency working mode with a working frequency of 1.2 MHz to improve the imaging resolution of the sonar images; the dataset is annotated with LabelImg software and the annotation label files are stored in xml format; the dataset contains 17346 sonar images in total with 3 classes of labels, the targets being an oxygen cylinder about 40 cm long and two spheres of different sizes; each sonar image contains all three classes of targets, and the oxygen cylinder appears in different poses obtained by horizontal rotation in steps of 6 degrees; the collected sonar images are divided into a training set, a validation set and a test set in a ratio of 8:1:1; and data augmentation is performed by random scaling, flipping, brightness enhancement, contrast enhancement and Mosaic.
4. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model as set forth in claim 1, wherein the 3 convolution blocks are composed as follows: the ordinary convolution block comprises 1 ordinary convolution layer with a 1×1 convolution kernel and a padding of 0, 1 batch normalization layer and 1 ReLU activation function; the depth convolution block comprises 1 ordinary convolution layer with a 3×3 convolution kernel and a padding of 0; and the SE attention mechanism module comprises a global maximum pooling layer and 2 fully connected layers, wherein the global maximum pooling layer converts an input feature map of height and width H×W into a feature map of height and width 1×1 while keeping the number of channels unchanged, and the two fully connected layers add nonlinearity so as to better fit the complex correlations among channels.
5. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model of claim 4, wherein in step S2 the backbone feature extraction network is built from the initial feature extraction module PREConv, a maximum pooling layer Maxpool and the SSE module; PREConv extracts initial features from the input feature map; Maxpool reduces the resolution of the feature map, which reduces the amount of computation and the complexity of the model while better preserving texture and contour features; and the SSE module extracts deeper features.
6. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model according to claim 1, wherein in step S3 the neck network structure is improved by adopting the multi-scale fusion method of the PANet structure: a bottom-up feature fusion layer is added on the basis of FPN, and the shallow and deep features of the network are fused by up-sampling to obtain feature prediction maps at different scales.
7. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model according to claim 1, wherein in step S3 the YOLO detection head is improved: a YOLO prediction layer gamma 1 fused with shallower-layer information is constructed so that the semantic information of small targets flows from the deep layers to the shallower layers of the network, and a new gamma 1 prediction layer with a scale of 104×104 and a gamma 2 prediction layer with a scale of 52×52 are constructed, so that the original network of 3 prediction layers becomes a network of 2 prediction layers.
8. The method for identifying the small target of the forward-looking sonar image based on the SSE-YOLO deep learning model according to claim 1, wherein in the step S3, CBAM is added into a multi-scale detection layer, so that the expression capacity of the model to the target feature is improved under the condition that the size of the model is hardly increased, and the detection performance of the model is improved.
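
The channel split, Concat and channel shuffle recited for the Dense block in claim 1 can be illustrated with a minimal plain-Python sketch. It is not the patented implementation: channels are represented by string labels rather than feature maps, the `transform` argument is a hypothetical stand-in for the 1×1 ordinary convolution and 3×3 depthwise convolution of the left branch, and the use of 2 shuffle groups is an assumption (the claim does not state the group count).

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (the usual ShuffleNet-style shuffle)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # Treat the flat list as a (groups, per_group) grid, transpose it, and
    # flatten again so that neighbouring channels come from different groups.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def dense_block_channel_flow(channels, transform):
    """Split channels 1/2 : 1/2, transform the left half, concat, then shuffle."""
    half = len(channels) // 2
    left, right = channels[:half], channels[half:]
    # Hypothetical stand-in for the left branch's two stride-1 convolutions.
    left = [transform(c) for c in left]
    # Concat: channel counts add up, so input and output counts match.
    merged = left + right
    # Assumption: 2 groups, one per branch.
    return channel_shuffle(merged, groups=2)


labels = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]
out = dense_block_channel_flow(labels, transform=str.upper)
print(out)  # ['A0', 'b0', 'A1', 'b1', 'A2', 'b2', 'A3', 'b3']
```

Upper-cased labels mark channels that passed through the left branch; after the shuffle they alternate with the untouched right-branch channels, which is the mixing between groups that the claim describes.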

Description

SSE-YOLO deep learning model-based forward-looking sonar image small target recognition method

Technical Field

The invention belongs to the field of target detection in sonar images, and particularly relates to a method for recognizing small targets in forward-looking sonar images based on an SSE-YOLO deep learning model.

Background

Sonar-image-based target detection analyzes and processes the sonar image obtained after echo imaging and then applies target detection techniques to it. It is widely used in civil and military fields such as underwater terrain detection, fish school track monitoring and torpedo detection. Underwater target detection follows two main approaches: traditional machine learning and the more recent deep learning based on convolutional neural networks (CNNs). Traditional sonar image target detection first manually extracts the contour, texture and color features of the sonar image and then classifies and locates the target with techniques such as support vector machines, singular value decomposition and independent component analysis. The inherent defect of the traditional approach is that manually extracted image features serve as the criterion for judging the target object: when the image contains few pixels, feature extraction is inaccurate and the deep features of the sonar image cannot be used for decision making, so that the sonar image data is learned ineffectively and the recognition results are unsatisfactory.
In recent years, deep-learning-based target detection methods have been introduced into sonar image target detection, but most mainstream detection algorithms, such as the YOLO series and SSD, are applied to sonar images directly, without adaptation to the practical application or to the characteristics of acoustic images. Sonar images have few effective pixels and small target imaging areas, so an overly complex network architecture is not suited to learning their features. Moreover, when the network model is too large, too deep and has too many parameters, the algorithm cannot be deployed on embedded equipment with limited performance. An excessive model size and redundant parameters waste storage resources and slow down detection, and an excessive number of network layers easily causes overfitting on the simple features of sonar images, which degrades detection accuracy and model robustness. Deep-learning-based target detection for sonar images therefore needs to be adapted accordingly.

Disclosure of Invention

To solve the technical problems described in the background, the invention provides a method for recognizing small targets in forward-looking sonar images based on an SSE-YOLO deep learning model. Acoustic images are acquired with a dual-frequency multi-beam forward-looking sonar set to its 1.2 MHz high-frequency working mode, which improves the imaging resolution and imaging quality of the sonar images. Lightweight feature extraction modules PREConv and SSE are constructed and used to build a new lightweight backbone network, alleviating the model overfitting caused by an overly complex network structure.
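
The parameter saving that motivates replacing an ordinary convolution with a depthwise separable one, as PREConv does, can be illustrated by a simple count. The following is a minimal sketch under stated assumptions: the channel sizes 32 and 64 are hypothetical and do not come from the patent, and bias and batch normalization parameters are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical channel sizes, chosen only for illustration.
c_in, c_out = 32, 64
std = standard_conv_params(c_in, c_out)
sep = depthwise_separable_params(c_in, c_out)
print(std, sep, round(sep / std, 3))  # 18432 2336 0.127
```

For a 3×3 kernel the ratio equals 1/c_out + 1/9, so under these assumptions the separable block needs roughly one eighth of the weights of the ordinary convolution.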
To address the low detection precision for small targets in forward-looking sonar images, a YOLO detection head for detecting smaller targets is constructed and the feature fusion of the target is improved, so that the semantic information of small targets flows from the deep layers to the shallower layers and diversified target features are extracted. The CBAM attention mechanism is introduced to address the missed detection and inaccurate positioning of small targets during feature extraction.

In order to solve the technical problems, the technical scheme of the invention is as follows: a method for identifying a small target of a forward-looking sonar image based on an SSE-YOLO deep learning model, the method comprising: S1, collecting sonar images and constructing an underwater small-target dataset; S2, constructing an initial feature extraction module PREConv and a lightweight feature extraction module SSE, constructing a backbone feature extraction network, and inputting the dataset into the improved SSE-YOLO model for training to extract target features; and S3, improving the neck network, constructing a YOLO prediction head aimed at detecting smaller targets, introducing the CBAM attention mechanism, and finally obtaining an optimized SSE-YOLO deep learning model.

Further, the multi-beam forward-looking sonar is adopted as acoustic image acquisition equipment, and the target to be detected in the acoustic image is marked by marking soft