CN-121982290-A - Container lock hole detection method based on HSLD-YOLO
Abstract
The invention discloses a container lock hole detection method based on HSLD-YOLO, which comprises the steps of: obtaining container lock hole images and constructing a lock hole small-target training set and test set; preprocessing and enhancing the container lock hole images; inputting the images into a backbone network and extracting multi-scale feature maps; performing convolution to compress channels; inputting the enhanced feature maps into the Neck network and performing cross-scale fusion through a USFF module and a C2F_GCSA module; inputting the fused features into a decoupling-Free detection head; training the model with the WIoU regression loss function; and detecting container lock holes with the trained model. The invention improves the recognition and positioning precision of container lock holes.
Inventors
- ZHA YAN
- ZHANG FENGFENG
- WU PEIFENG
- WANG YUANBO
- SONG YUXIANG
- WANG HUI
Assignees
- 江苏润邦工业装备有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260128
Claims (7)
- 1. The container lock hole detection method based on HSLD-YOLO is characterized by comprising the following steps: S1, acquiring container lock hole images under different distance, illumination and pose conditions, and constructing a lock hole small-target training set and test set; S2, preprocessing and enhancing the container lock hole images; S3, inputting the container lock hole images into a backbone network, and extracting multi-scale feature maps P1-P5; S4, performing a 1×1 convolution on each scale feature map for channel compression; S5, adding a DPCA module to each scale feature map to enhance local texture and contour information of the lock hole; S6, inputting the enhanced scale feature maps into the Neck network, and performing cross-scale fusion through a USFF module and a C2F_GCSA module to generate fused features containing shallow details and deep semantics; S7, inputting the fused features into a decoupling-Free detection head, and outputting the category probability of the lock hole, the position of the center point and the size of the bounding box; S8, training the model with the WIoU regression loss function, and deciding whether to continue iterating according to the evaluation indices until the preset performance requirement is met; S9, detecting container lock holes with the trained model.
- 2. The method for detecting lock holes in a container based on HSLD-YOLO according to claim 1, wherein said preprocessing and enhancement in step S2 comprises size normalization, brightness correction and noise enhancement.
- 3. The method for detecting a lock hole of a container based on HSLD-YOLO as claimed in claim 1, wherein said step S5 is specifically: 5.1, the DPCA module performs global average pooling and global max pooling on each scale feature map to obtain two channel description vectors: $F_{avg} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X_c(i, j)$ (1) $F_{max} = \mathrm{Max}_{i,j}\, X_c(i, j)$ (2) wherein $F_{avg}$ is the global average pooling branch, $F_{max}$ is the global max pooling branch, $C$ is the number of channels and $X_c$ is channel $c$ of the input ($c = 1, \dots, C$), $H$ and $W$ are the height and width of the pooling window, respectively, $i$ and $j$ are the row index and column index of the pooling window, respectively, and $\mathrm{Max}(\cdot)$ is the global max pooling function; 5.2, two convolutional layers $W_1$ and $W_2$ are used to construct a bottleneck structure, in which convolutional layer $W_1$ compresses the channel dimension to reduce computation and convolutional layer $W_2$ restores the original dimension, implicitly learning the nonlinear relation among the channels: $F'_{avg} = W_2(\delta(W_1(F_{avg})))$ (3) $F'_{max} = W_2(\delta(W_1(F_{max})))$ (4) wherein $F'_{avg}$ is the output of the global average pooling branch, $F'_{max}$ is the output of the global max pooling branch, $W_1$ and $W_2$ are 1×1 convolution kernels, and $\delta$ is an activation function; 5.3, double-branch fusion, namely combining the extracted context information and local salient features, generating attention weights with a Sigmoid function, and multiplying the attention weights by the input feature map to obtain the feature-enhanced feature map: $F_{fuse} = F'_{avg} + F'_{max}$ (5) $Y = \sigma(F_{fuse}) \otimes X$ (6) wherein $F_{fuse}$ is the feature after double-branch fusion, $\sigma$ is the Sigmoid function, $X$ is the input feature map, and $Y$ is the feature map after feature enhancement.
- 4. The method for detecting the lock hole of the container based on the HSLD-YOLO as claimed in claim 1, wherein the USFF module in step S6 fuses high-level features and low-level features of different scales to obtain fused features carrying both semantic information and spatial details, and the specific process is as follows: A1, given a high-level input feature $F_{high}$ and a low-level input feature $F_{low}$, the resolution of the high-level feature map is enlarged by a factor of 2 through transposed convolution, and features with 256 channels are output: $F'_{high} = \mathrm{TransConv}(F_{high})$ (7) wherein $\mathrm{TransConv}(\cdot)$ is the transposed convolution function; A2, the DPCA module converts the high-level features into corresponding attention weights, which filter the low-level features to obtain features of consistent dimensionality; the filtered low-level features are then fused with the high-level features to enhance the feature representation of the model and obtain $F_{out}$: $F_{out} = \mathrm{DPCA}(F'_{high}) \otimes F_{low} + F'_{high}$ (8) wherein $\mathrm{DPCA}(\cdot)$ is the DPCA module function and $F_{out}$ is the feature after fusion of the low-level feature and the high-level feature.
- 5. The method for detecting the lock hole of the container based on the HSLD-YOLO according to claim 1, wherein in step S6 the C2F_GCSA module introduces GCSA, a module combining dilated convolution with channel-wise self-attention; the dilated convolution enlarges the receptive field to capture long-range dependencies in the image, and GCSA dynamically focuses on the lock hole region through a multi-head attention mechanism, strengthening feature extraction in the lock hole region and the association between local lock hole details and global position; the specific contents are as follows: B1, the input feature map $X \in \mathbb{R}^{C \times H \times W}$ is raised in dimension by a 1×1 convolution, where $C$ is the number of input channels and $H$, $W$ are the feature map height and width, and a depthwise separable convolution with a dilation rate of 2 is applied for feature enhancement, expanding the receptive field without reducing resolution so as to cover the lock hole and its surrounding structure: $QKV = \mathrm{DWConv}_{d=2}(\mathrm{Conv}_{1 \times 1}(X))$ (9) wherein $QKV$ is divided into query $Q$, key $K$ and value $V$: $Q, K, V = \mathrm{Split}(QKV)$ (10) wherein $\mathrm{Split}(\cdot)$ is the segmentation function; B2, $Q$, $K$, $V$ are split and reshaped into a multi-head structure, and $Q$ and $K$ are normalized along the channel dimension to improve stability: $Q', K', V' = \mathrm{Reshape}(Q, K, V) \in \mathbb{R}^{h \times C_h \times N}$ (11) $\hat{Q} = \mathrm{Norm}(Q')$ (12) $\hat{K} = \mathrm{Norm}(K')$ (13) wherein $Q'$, $K'$, $V'$ are $Q$, $K$, $V$ after division, $\mathrm{Reshape}(\cdot)$ is the reshaping function, $C_h$ is the number of channels per head, $h$ is the number of heads, $N$ is the number of spatial locations, and $\hat{Q}$, $\hat{K}$ are the normalized $Q'$, $K'$; B3, the transposed matrix product of $\hat{Q}$ and $\hat{K}$ is computed and scaled by a learnable scaling factor $\alpha$, attention weights $A$ are generated by Softmax after scaling, and $A$ is multiplied by $V'$ to achieve the attention-weighted aggregation $O$: $S = \alpha \cdot \hat{Q} \hat{K}^{T}$ (14) $A = \mathrm{Softmax}(S, \mathrm{dim})$ (15) $O = A V'$ (16) wherein $T$ is the matrix transpose symbol, $\alpha$ is a learnable scale factor, $S$ is the scaled value of the transposed matrix product, $A$ is the attention weight, $\mathrm{Softmax}(\cdot)$ is the normalization function, dim is the normalization direction, and $O$ is the attention-weighted aggregate value; B4, the multi-head output is recombined into a spatial feature map, and a feature map $Y$ of the same size as the input is output through 1×1 convolution dimension-reduction fusion: $Y = \mathrm{Conv}_{1 \times 1}(\mathrm{Reshape}(O))$ (17).
- 6. The method for detecting the lock hole of the container based on the HSLD-YOLO as claimed in claim 1, wherein the WIoU regression loss function in step S8 dynamically adjusts the weight according to the size of the target by introducing a size inverse ratio, so that small targets receive a larger loss weight, and the WIoU regression loss function is calculated as follows: $IoU = \frac{|B \cap B_{gt}|}{|B \cup B_{gt}|}$ (18) wherein $B$ is the predicted box, $B_{gt}$ is the ground-truth box, $|B \cap B_{gt}|$ is the area of the overlapping region of the two boxes, and $|B \cup B_{gt}|$ is the total area covered by the two boxes; (19) wherein the terms are the penalty term coefficient, the center point coordinates of the annotated box, the center point coordinates of the predicted box, and the width and height of the predicted box.
- 7. The method for detecting a lock hole of a container based on HSLD-YOLO as claimed in claim 1, wherein the evaluation indices in step S8 comprise precision P, recall R, mAP, mAP50-95 and parameter count: $P = \frac{TP}{TP + FP}$ (20) $R = \frac{TP}{TP + FN}$ (21) $AP = \int_0^1 P(R)\, dR$ (22) $mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i$ (23) wherein precision $P$ is the proportion of samples predicted as positive that are actually positive, and recall $R$ is the proportion of actually positive samples that are predicted as positive; $TP$ is the number of samples predicted positive and actually positive; $FP$ is the number of samples predicted positive but actually negative; $FN$ is the number of samples predicted negative but actually positive; the average precision $AP$ is the area under the precision-recall curve; the mean average precision $mAP$ is the mean of $AP$ over all $N$ sample classes and measures the detection performance of the model across all classes; mAP50 denotes the mean average precision when the IoU threshold of the detection model is set to 0.5, mAP50-95 denotes the mean average precision over IoU thresholds in the range 0.5 to 0.95, and parameter count refers to the number of parameters in the model and indicates its consumption of compute and memory resources.
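The channel attention described in claim 3 (steps 5.1 to 5.3) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `dpca`, the use of ReLU as the activation function, the sharing of the two 1×1 convolutions between both branches, the element-wise sum as the double-branch fusion, and the toy weights are all choices made for the example.

```python
import math


def matvec(weights, vec):
    """A 1x1 convolution on a pooled C-vector is a matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def relu(vec):
    # Assumed activation; the claim only says "an activation function".
    return [max(0.0, x) for x in vec]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dpca(feature, w1, w2):
    """Hypothetical DPCA sketch.

    feature: list of C channels, each an H x W nested list.
    w1: (C/r) x C weights that compress the channel dimension.
    w2: C x (C/r) weights that restore the channel dimension.
    Returns the enhanced feature map and the per-channel attention weights.
    """
    # Step 5.1: global average and global max pooling, one scalar per channel.
    f_avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    f_max = [max(map(max, ch)) for ch in feature]
    # Step 5.2: bottleneck applied to both branches.
    g_avg = matvec(w2, relu(matvec(w1, f_avg)))
    g_max = matvec(w2, relu(matvec(w1, f_max)))
    # Step 5.3: fuse the branches (assumed: sum), squash with Sigmoid,
    # then rescale every channel of the input by its attention weight.
    weights = [sigmoid(a + m) for a, m in zip(g_avg, g_max)]
    enhanced = [[[w * x for x in row] for row in ch]
                for w, ch in zip(weights, feature)]
    return enhanced, weights


# Toy input: 2 channels of size 2 x 2, bottleneck width 1.
feature = [[[1.0, 2.0], [3.0, 4.0]],
           [[0.0, 0.0], [0.0, 2.0]]]
w1 = [[1.0, 1.0]]
w2 = [[1.0], [-1.0]]
enhanced, weights = dpca(feature, w1, w2)
print(weights)
```

With these toy weights the first channel is kept almost unchanged and the second is suppressed, which is the intended effect of channel attention: channels judged informative are passed through and the rest are attenuated.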
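To illustrate why claim 6 gives small targets a larger loss weight, the following sketch computes the standard intersection over union of equation (18) for two axis-aligned boxes. It covers only the IoU term; the size-weighted penalty of equation (19) is not reproduced here. The corner-coordinate box format `(x1, y1, x2, y2)` and the function name `iou` are assumptions for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The same 10-pixel diagonal offset applied to a small and a large box.
small = iou((0, 0, 20, 20), (10, 10, 30, 30))
large = iou((0, 0, 200, 200), (10, 10, 210, 210))
print(round(small, 3), round(large, 3))
```

A 10-pixel localization error leaves a 200×200 box with an IoU above 0.8 but drops a 20×20 box to about 0.14, so an unweighted IoU loss treats small lock hole targets far more harshly for the same pixel error; a size-dependent weight is one way to compensate for this.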
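The evaluation indices of claim 7 can be worked through with a short sketch. The function names and the sample detections below are hypothetical, and the area under the precision-recall curve is approximated with a simple rectangle rule over confidence-ranked detections rather than the interpolated variants used by common benchmark toolkits.

```python
def precision_recall(tp, fp, fn):
    """Equations (20) and (21): precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(detections, num_gt):
    """Rectangle-rule approximation of equation (22).

    detections: list of (confidence, is_true_positive) pairs for one class.
    num_gt: number of ground-truth boxes of that class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        # Add the strip of the curve gained at this detection.
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Equation (23): mean of AP over all classes."""
    return sum(ap_per_class) / len(ap_per_class)


p, r = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
m_ap = mean_average_precision([ap, 0.5])
print(p, r, round(ap, 4), round(m_ap, 4))
```

Whether a detection counts as a true positive depends on the IoU threshold, which is how the same routine yields mAP50 (threshold 0.5) or, averaged over several thresholds, mAP50-95.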
Description
Container lock hole detection method based on HSLD-YOLO
Technical Field
The invention relates to a container lock hole detection method, in particular to a container lock hole detection method based on HSLD-YOLO, and belongs to the technical field of ship intelligent detection.
Background
With the continuous growth of global trade and the advance of intelligent port construction, automating container handling has become an important way to improve port operating efficiency. Identifying the position of the container lock hole is a key step for an automatic lock pin system to complete alignment and pin insertion. However, in actual operation the lock hole is usually imaged from a distance of 5-20 meters, so its effective pixel area in the image is very small, often less than 20×20 pixels, which makes it a typical long-range tiny target. Such a target is strongly affected by imaging distance and viewing angle: its edges blur, its texture weakens, and its features are easily masked by background noise, so existing detection methods struggle to produce stable recognition results. In addition, the outdoor port environment is complex and changeable; illumination varies with time and weather, and image quality drops under strong light, backlight, weak light at night, rain and fog. Meanwhile, mechanical vibration, equipment shaking and stain occlusion occur while automated quay cranes, rail-mounted gantry cranes and container trucks operate, which further degrades lock hole imaging. In such an environment, traditional target detection methods based on image processing or deep learning are easily disturbed by noise, so their detection performance is unstable and they can hardly meet the real-time and high-precision requirements of an automatic loading and unloading system.
The main problems are as follows: 1. long-distance imaging leaves the lock hole with very few effective pixels, so shallow features are easily lost during network downsampling and stable positioning is difficult; 2. illumination changes, weather interference and mechanical vibration in the complex port environment reduce image quality, and existing models struggle to remain robust; 3. shallow details and high-level semantics are insufficiently combined during multi-scale feature fusion, so the small-target features of the lock hole are easily submerged by background features; 4. existing models have a large parameter count and high computational cost, which hinders deployment on the edge computing platforms of automatic loading and unloading equipment and makes real-time monitoring requirements hard to meet.
Disclosure of Invention
The invention aims to provide a container lock hole detection method based on HSLD-YOLO, which improves the recognition and positioning accuracy of container lock holes.
In order to solve the above technical problems, the invention adopts the following technical scheme: a container lock hole detection method based on HSLD-YOLO, comprising the following steps:
S1, acquiring container lock hole images under different distance, illumination and pose conditions, and constructing a lock hole small-target training set and test set;
S2, preprocessing and enhancing the container lock hole images;
S3, inputting the container lock hole images into a backbone network, and extracting multi-scale feature maps P1-P5;
S4, performing a 1×1 convolution on each scale feature map for channel compression;
S5, adding a DPCA module to each scale feature map to enhance local texture and contour information of the lock hole;
S6, inputting the enhanced scale feature maps into the Neck network, and performing cross-scale fusion through a USFF module and a C2F_GCSA module to generate fused features containing shallow details and deep semantics;
S7, inputting the fused features into a decoupling-Free detection head, and outputting the category probability of the lock hole, the position of the center point and the size of the bounding box;
S8, training the model with the WIoU regression loss function, and deciding whether to continue iterating according to the evaluation indices until the preset performance requirement is met;
S9, detecting container lock holes with the trained model.
Further, in step S2, the preprocessing and enhancement include size normalization, brightness correction and noise enhancement.
Further, step S5 specifically includes: 5.1, the DPCA module respectively executes global average pooling and global maximum poolin