CN-121982651-A - Safety helmet detection method, system, equipment and medium
Abstract
The invention discloses a safety helmet detection method, system, equipment and medium in the technical field of target detection. The method comprises: acquiring a construction site image to be detected and inputting it into an improved RT-DETR detection model for processing. An MLE module fuses adaptive pooling with frequency-domain edge enhancement to improve localization accuracy for small-size safety helmet targets; an SDSA-Fusion module realizes feature fusion under occlusion and background noise through global-local attention and content-guided pixel-level gating; and an SGR-gConvC module combines gated convolution with a learnable residual coefficient, suppressing invalid information while enhancing semantic feature expression. An RT-DETR decoder then localizes and classifies safety helmet targets based on the processed features. The method improves detection precision and model robustness for small targets and confusable regions in complex construction site environments.
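The frequency-domain edge enhancement attributed to the MLE module (low-pass smoothing by average pooling, a high-frequency difference residual, and Sigmoid gating added back in residual form) can be sketched in NumPy. This is an illustrative reconstruction under stated assumptions: `avg_pool3x3`, `edge_enhance`, and the scalar `gate_weight` standing in for the learned convolution mapping are hypothetical names, not the patent's implementation.

```python
import numpy as np

def avg_pool3x3(x):
    """3x3 average pooling, stride 1, edge padding (acts as a low-pass filter)."""
    h, w = x.shape
    padded = np.pad(x, 1, mode="edge")
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def edge_enhance(feat, gate_weight=1.0):
    """Residual edge enhancement in the spirit of the MLE module."""
    low = avg_pool3x3(feat)             # local average smoothing (low-frequency component)
    high = feat - low                   # difference residual (high-frequency edge component)
    gate = sigmoid(gate_weight * high)  # stand-in for the learned conv mapping + Sigmoid gate
    return feat + gate * high           # add the gated edges back in residual form

feat = np.random.rand(8, 8).astype(np.float32)
print(edge_enhance(feat).shape)  # (8, 8)
```

Note that on a perfectly flat region the high-frequency component is zero, so the input passes through unchanged; only edge regions receive the gated boost.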
Inventors
- LI JIAQI
- ZHANG HAO
- CHEN ZHUO
- ZHANG HE
- LIU ZHENG
Assignees
- University of Science and Technology Liaoning (辽宁科技大学)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2026-01-31
Claims (10)
- 1. A method for detecting the wearing state of a safety helmet, characterized by comprising the following steps: acquiring a construction site image to be detected; inputting the construction site image into an improved RT-DETR detection model for processing to obtain a safety helmet target positioning and classification result, wherein the improved RT-DETR detection model is based on the RT-DETR-r18 model, an MLE backbone network module is embedded in the CSP network structure of the RT-DETR-r18 model, an SGR-gConvC module is used in place of the RepC network structure of the RT-DETR-r18 model, and a cooperative attention fusion module SDSA-Fusion is introduced; extracting multi-scale features of the construction site image through the MLE backbone network module, and performing edge information processing on the multi-scale features to obtain edge-information-enhanced features; performing adaptive weighted fusion on the edge-information-enhanced features through the cross-scale feature interaction and adaptive weighting of the SDSA-Fusion module to obtain fused features; performing enhanced characterization on the fused features through the dynamic feature selection and residual intensity coefficients of the SGR-gConvC module to obtain target features with background interference suppression; and decoding the target features with background interference suppression by using the decoder of the RT-DETR to obtain the safety helmet target positioning and classification result.
- 2. The safety helmet detection method according to claim 1, wherein extracting the multi-scale features of the construction site image through the MLE backbone network module and performing edge information processing on the multi-scale features to obtain edge-information-enhanced features comprises: extracting a feature map of the construction site image; adjusting the feature map to a plurality of predefined scales by adaptive average pooling to obtain feature representations containing different receptive fields; performing local average smoothing on the feature representations of the different receptive fields by 3×3 average pooling to obtain low-frequency components, and extracting high-frequency edge components of the feature representations of the different receptive fields by a difference residual operation based on the low-frequency components; and activating the high-frequency edge components through convolution mapping and Sigmoid gating to obtain an adaptive edge enhancement, and adding the adaptive edge enhancement back to the feature map of the construction site image in residual form to obtain the edge-information-enhanced feature map.
- 3. The safety helmet detection method according to claim 2, wherein the plurality of predefined scales includes 3×3, 6×6, 9×9 and 12×12.
- 4. The safety helmet detection method according to claim 1, wherein the adaptive weighted fusion of the edge-information-enhanced features is performed specifically as follows: performing element-wise addition on the edge-information-enhanced features to obtain an initial fusion feature; performing convolution-enhanced attention extraction on the initial fusion feature to obtain a first-stage context attention response, and generating pixel-level gating weights based on the initial fusion feature and the first-stage context attention response; and performing weighted summation on the edge-information-enhanced features using the pixel-level gating weights to obtain the fused features.
- 5. The safety helmet detection method according to claim 1, wherein performing enhanced characterization on the fused features to obtain target features with background interference suppression comprises the following steps: projecting the fused features through two 1×1 branches to generate a main branch feature and a bypass branch feature respectively; inputting the main branch feature into a cross-layer gated residual stacking link formed by connecting n gated convolution units in series, and introducing a learnable residual intensity coefficient to perform weighted adjustment on the output of the current-layer gated convolution unit at each stage of the stacking link; and performing residual addition of the weighted output and the current-layer input, adding and fusing the final output of the residual addition with the bypass branch feature, and obtaining the target features with background interference suppression through a 1×1 convolution.
- 6. The safety helmet detection method according to claim 5, wherein before performing enhanced characterization on the fused features, the method further comprises: inputting the fused features into a feature branch and a gating branch for parallel processing; in the feature branch, sequentially performing channel mixing through a 1×1 convolution and extracting local spatial semantics through a depthwise separable convolution; in the gating branch, generating pixel-wise and channel-wise selection weights through a 1×1 convolution and mapping them to (0, 1) through a Sigmoid; and performing element-wise multiplication of the output of the feature branch and the output of the gating branch, and obtaining the final fused features through a 1×1 projection.
- 7. The safety helmet detection method according to claim 5, wherein the learnable residual intensity coefficients adjust the contribution of each gated layer during residual stacking.
- 8. A safety helmet detection system, comprising: a data acquisition module for acquiring a construction site image to be detected; a model input module for inputting the construction site image into an improved RT-DETR detection model for processing to obtain a safety helmet target positioning and classification result, wherein the improved RT-DETR detection model is based on the RT-DETR-r18 model, an MLE backbone network module is embedded in the CSP network structure of the RT-DETR-r18 model, an SGR-gConvC module is used in place of the RepC network structure of the RT-DETR-r18 model, and a cooperative attention fusion module SDSA-Fusion is introduced; a feature extraction module for extracting multi-scale features of the construction site image through the MLE backbone network module, and performing edge information processing on the multi-scale features to obtain edge-information-enhanced features; a fusion module for performing adaptive weighted fusion on the edge-information-enhanced features through the cross-scale feature interaction and adaptive weighting of the SDSA-Fusion module to obtain fused features; an enhancement module for performing enhanced characterization on the fused features through the dynamic feature selection and residual intensity coefficients of the SGR-gConvC module to obtain target features with background interference suppression; and a detection module for decoding the target features with background interference suppression using the decoder of the RT-DETR to obtain the safety helmet target positioning and classification result.
- 9. A computer device comprising a memory and a processor, the memory storing a program which, when executed by the processor, causes the processor to perform the steps of the safety helmet detection method as claimed in any one of claims 1 to 7.
- 10. A storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the safety helmet detection method as claimed in any one of claims 1 to 7.
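The pixel-level gated fusion of claim 4 and the gated residual stacking with learnable residual intensity coefficients of claim 5 can be sketched together in NumPy. This is a minimal illustration of the two mechanisms under assumptions: `gated_fusion`, `gated_residual_stack`, and the scalar `w_gate` standing in for the learned convolutions are hypothetical names, not the patented implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(feat_a, feat_b, w_gate=1.0):
    """Pixel-level gated fusion in the spirit of claim 4: an element-wise
    sum forms the initial fusion feature, which drives a per-pixel gate
    in (0, 1) that weights the two inputs."""
    init = feat_a + feat_b         # initial fusion feature
    gate = sigmoid(w_gate * init)  # pixel-level gating weight
    return gate * feat_a + (1.0 - gate) * feat_b

def gated_residual_stack(x, alphas, w_gate=1.0):
    """Cross-layer gated residual stacking in the spirit of claim 5: each
    stage applies a gated transform, scales it by a residual intensity
    coefficient alpha, and adds it back to the stage input."""
    out = x
    for alpha in alphas:                    # one coefficient per gated unit
        gated = sigmoid(w_gate * out) * out # stand-in for a gated conv unit
        out = out + alpha * gated           # weighted residual addition
    return out

a = np.random.rand(4, 4)
b = np.random.rand(4, 4)
fused = gated_fusion(a, b)
refined = gated_residual_stack(fused, alphas=[0.1, 0.1, 0.1])
print(refined.shape)  # (4, 4)
```

Two sanity properties follow from the construction: fusing a feature map with itself returns it unchanged, and setting every residual intensity coefficient to zero makes the stack an identity mapping.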
Description
Safety helmet detection method, system, equipment and medium
Technical Field
The present invention relates to the field of target detection technologies, and in particular to a method, a system, an apparatus, and a medium for detecting a safety helmet.
Background
In the construction field, complex work environments and heavy labor tasks are often accompanied by potential safety risks. Various risk factors, such as falling objects from height, equipment misoperation, and sudden structural failures, can cause serious casualties. Thus, safety management of construction sites has always been a major issue in the industry. Against this background, personal protective equipment plays a non-negligible role. The safety helmet is the most basic and common protective article: it can effectively protect the head from external impact and falling objects, and is a key protective measure for preventing serious head injury to workers during overhead operations and other sudden accidents. At present, detection of the wearing condition of safety helmets mainly depends on manual inspection or monitoring video; these methods rely on the experience and judgment of safety officers, are inefficient, and are prone to omissions. Early researchers attempted to use accelerometers, altitude sensors, radio-frequency identification sensors, and pressure sensors to detect whether protective equipment such as helmets is worn. Although these sensors can assist in determining the wearing state to some extent, there are two main problems. First, these sensors cannot accurately determine whether the helmet is actually worn or simply placed in a specific location. Second, invasive sensors may cause some workers to refuse to wear the helmet because of discomfort or privacy concerns.
Furthermore, the equipment costs of these sensors are high, maintenance is complex, and worker cooperation is required, which makes them challenging to popularize on a large scale. In actual construction scenarios, detection of helmet wearing still faces several challenges. Surveillance cameras are currently the main deployment platform for detection models. However, in surveillance video, the helmet target is often small and frequently occluded by other objects. This can result in information loss, making it difficult for the system to accurately determine whether a worker is wearing the helmet correctly. In addition, model deployment on construction sites must balance cost and detection speed, so the detection model not only needs high precision but also needs to be lightweight to meet real-time application requirements. With the continuous development of artificial intelligence technology, non-contact helmet wearing detection methods based on computer vision have attracted wide attention; their advantages of non-contact operation, strong real-time performance, and flexible deployment have made them a research hotspot in the field of helmet detection. A great deal of research has been conducted on automatic helmet detection using Convolutional Neural Network (CNN) structures such as Faster R-CNN, SSD, and YOLO.
The emergence of the Transformer architecture has brought new breakthroughs to target detection. Its powerful multi-head self-attention mechanism has excellent global modeling capability and shows stronger adaptability in small-target recognition, complex backgrounds, and global context modeling. DETR (DEtection TRansformer), the first Transformer-based end-to-end detector, removes components such as anchor box generation and non-maximum suppression, realizing a more concise detection pipeline. Meanwhile, to solve the problems of slow training convergence and low inference efficiency that limit practical application, researchers proposed a more efficient real-time Transformer detection framework, RT-DETR (Real-Time DEtection TRansformer), which significantly improves inference speed while maintaining high precision, making Transformers feasible for real-time detection scenarios. Although RT-DETR combines precision and real-time performance in target detection tasks, certain limitations remain in scene applications such as safety helmet detection: RT-DETR depends on a complex Transformer structure with a large computational cost, and especially for helmet detection with small targets and complex background scenes, the extracted feature representation is insufficient, leading to missed or false detections and poor detection performance.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a safety helmet detection method, a sa