CN-122022454-A - Potential safety hazard risk rating method and device, electronic equipment and storage medium


Abstract

The invention relates to a potential safety hazard risk rating method and device, electronic equipment, and a storage medium, and belongs to the technical field of safety monitoring. The method comprises: acquiring multi-modal potential hazard information of a target area, wherein the multi-modal potential hazard information comprises visual image information, visual text information, wind speed information, and temperature information; extracting text features of the visual text information; and performing fusion recognition on the text features and the multi-modal fusion feature (obtained by interactively fusing the single-modal features of the visual image, wind speed, and temperature information) to obtain the risk level of the potential safety hazard in the target area. The invention enables accurate identification of potential safety hazards at the target site.

Inventors

LEI PENG
CHEN MINGZI
DU YUTING
CHEN XIAOQIU
LIU YIBO
LV WEI

Assignees

Wuhan University of Technology (武汉理工大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-31

Claims (10)

1. A potential safety hazard risk rating method, comprising: acquiring multi-modal potential hazard information of a target area, wherein the multi-modal potential hazard information comprises visual image information, visual text information, wind speed information, and temperature information; extracting single-modal features of the visual image information, the wind speed information, and the temperature information, and interactively fusing the single-modal features to obtain a multi-modal fusion feature; and extracting text features of the visual text information, and performing fusion recognition on the text features and the multi-modal fusion feature to obtain the risk level of the potential safety hazard in the target area.
2. The potential safety hazard risk rating method according to claim 1, wherein acquiring the multi-modal potential hazard information of the target area comprises: acquiring a video of the target area through a vision acquisition device, and acquiring the wind speed information and the temperature information of the target area through a sensor; and processing the video frame by frame, extracting visual image information from a video frame when the video frame does not contain text information, and extracting visual text information from the video frame when the video frame contains text information.
3. The potential safety hazard risk rating method according to claim 1, wherein extracting the single-modal features of the visual image information, the wind speed information, and the temperature information, and interactively fusing the single-modal features to obtain the multi-modal fusion feature, comprises: extracting a single-modal time-series feature vector for each modality based on the temporal relationship of that modality's information, wherein the single-modal time-series feature vectors comprise an image time-series feature vector, a wind speed time-series feature vector, and a temperature time-series feature vector; extracting interaction feature vectors between the single-modal time-series feature vectors based on a directional attention channel, and concatenating each single-modal time-series feature vector with its corresponding interaction feature vector to obtain a single-modal enhanced feature vector; and performing weighted fusion on the single-modal enhanced feature vectors to obtain the multi-modal fusion feature.
4. The potential safety hazard risk rating method according to claim 1, wherein extracting the text features of the visual text information comprises: calculating the pixel-value sharpness of a text region in the visual text information, and when the pixel-value sharpness is greater than or equal to a preset sharpness threshold, performing optical character recognition on the text region to obtain the text information of the text region; when the pixel-value sharpness is less than the preset sharpness threshold, determining the position of each text stroke in the text region based on the pixel values of the text region in all video frames containing the text region, and determining the text information of the text region based on the positions of the text strokes; and extracting a clause vector of the text information, and determining the text features based on the clause vector.
5. The potential safety hazard risk rating method according to claim 4, wherein calculating the pixel-value sharpness of the text region in the visual text information comprises: extracting the image coordinates of the text region in an image, and determining the pixel values of the pixels in the text region based on the image coordinates; and calculating the mean squared error between the pixel value of each pixel in the text region and the mean pixel value of the image to determine the pixel-value sharpness of the text region.
6. The potential safety hazard risk rating method according to claim 4, wherein performing fusion recognition on the text features and the multi-modal fusion feature to obtain the risk level of the potential safety hazard in the target area comprises: performing weighted fusion of the text features and the multi-modal fusion feature, based on the degree of association of each with the risk level of the potential safety hazard in the target area, to obtain a weighted fusion feature; and performing risk identification on the weighted fusion feature to obtain a raw score for each risk level, determining a probability distribution over the risk levels based on the raw scores, and determining the risk level with the highest probability as the risk level of the potential safety hazard in the target area.
7. The potential safety hazard risk rating method according to claim 6, wherein performing risk identification on the weighted fusion feature to obtain the raw score for each risk level comprises: constructing, based on the weighted fusion feature, a feature sequence and a position encoding for each feature in the feature sequence, to obtain embedding vectors in sequence form; and performing risk feature aggregation on the embedding vectors to obtain a global risk feature vector, and mapping the dimension of the global risk feature vector to the number of risk levels to obtain the raw score for each risk level.
8. A potential safety hazard risk rating device, comprising: an information acquisition module, configured to acquire multi-modal potential hazard information of a target area, wherein the multi-modal potential hazard information comprises visual image information, visual text information, wind speed information, and temperature information; a feature fusion module, configured to extract single-modal features of the visual image information, the wind speed information, and the temperature information, and to interactively fuse the single-modal features to obtain a multi-modal fusion feature; and a risk level assessment module, configured to extract text features of the visual text information, and to perform fusion recognition on the text features and the multi-modal fusion feature to obtain the risk level of the potential safety hazard in the target area.
9. An electronic device, comprising a memory and a processor, wherein the memory is configured to store a program; and the processor, coupled to the memory, is configured to execute the program stored in the memory to implement the steps of the potential safety hazard risk rating method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer-readable program or instructions which, when executed by a processor, implement the steps of the potential safety hazard risk rating method according to any one of claims 1 to 7.
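
The sharpness test of claims 4 and 5 and the level selection of claim 6 can be illustrated with a minimal sketch. It is not the applicants' implementation. The function names, the `(top, left, bottom, right)` box layout, the threshold value, and the level labels are all hypothetical, and softmax is assumed as the way to turn raw scores into a probability distribution, since the claims do not name a specific function.

```python
import math


def text_region_sharpness(image, box):
    """Mean squared error between the region's pixels and the whole-image mean.

    `image` is a 2-D list of grayscale values; `box` is a hypothetical
    (top, left, bottom, right) tuple with exclusive bottom/right bounds.
    """
    top, left, bottom, right = box
    all_pixels = [p for row in image for p in row]
    image_mean = sum(all_pixels) / len(all_pixels)
    region = [p for row in image[top:bottom] for p in row[left:right]]
    return sum((p - image_mean) ** 2 for p in region) / len(region)


def recognition_path(sharpness, threshold):
    """Sharp regions go to OCR; blurry ones to multi-frame stroke localization."""
    return "ocr" if sharpness >= threshold else "multi_frame_strokes"


def softmax(scores):
    """Assumed normalization of raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rate_risk(raw_scores, levels):
    """Return the level with the highest probability, plus all probabilities."""
    probs = softmax(raw_scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return levels[best], probs


# Made-up 4x4 frames: a uniform frame and a frame with a bright 2x2 text patch.
flat = [[100] * 4 for _ in range(4)]
sign = [[0, 0, 0, 0], [0, 255, 255, 0], [0, 255, 255, 0], [0, 0, 0, 0]]
box = (1, 1, 3, 3)
for frame in (flat, sign):
    score = text_region_sharpness(frame, box)
    print(score, recognition_path(score, threshold=1000.0))

level, probs = rate_risk([0.1, 2.0, -1.0], ["low", "medium", "high"])
print(level, probs)
```

In this sketch the uniform frame falls below the hypothetical threshold and is routed to the multi-frame stroke path, while the high-contrast patch is routed to optical character recognition.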

Description

Potential safety hazard risk rating method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of safety monitoring technologies, and in particular to a potential safety hazard risk rating method and device, electronic equipment, and a storage medium.

Background

With the rapid development of technology, productivity continues to improve; however, potential safety hazards keep emerging in every aspect of production and daily life. Traditional manual inspection is time-consuming and labor-intensive, and struggles to capture hazard information in real time, so potential risks are difficult to discover and address promptly. It is therefore necessary to introduce new technical means that improve the efficiency and accuracy of hazard identification, enable accurate risk grading and rapid early warning, and ensure the safety and reliability of production operations and daily life.

In traditional industrial safety monitoring and intelligent security, potential safety hazards are generally monitored along two independent paths. The first identifies hazard information in images based on computer vision, for example workers not wearing safety helmets or cracks appearing in pipelines. The second extracts text information based on optical character recognition, reading warning signs, equipment meter readings, or operating instructions in the picture, for example "No approach" and "Beware of falling rocks".
Visual models are often disturbed by external environmental factors such as illumination changes and object occlusion, and text recognition is likewise affected by factors such as image quality and contextual complexity. Both paths can therefore produce false alarms and missed alarms, and their recognition results can be wrong. Consequently, prior-art approaches that recognize data of a single modality do not identify potential hazards at the target site accurately enough.

Disclosure of Invention

In view of the foregoing, it is necessary to provide a potential safety hazard risk rating method and device, electronic equipment, and a storage medium, to solve the prior-art problem that identifying potential hazards at a target site from data of a single modality is not accurate enough.

To solve the above problem, in a first aspect, the present invention provides a potential safety hazard risk rating method, comprising: acquiring multi-modal potential hazard information of a target area, wherein the multi-modal potential hazard information comprises visual image information, visual text information, wind speed information, and temperature information; extracting single-modal features of the visual image information, the wind speed information, and the temperature information, and interactively fusing the single-modal features to obtain a multi-modal fusion feature; and extracting text features of the visual text information, and performing fusion recognition on the text features and the multi-modal fusion feature to obtain the risk level of the potential safety hazard in the target area.
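
The interactive fusion step can be pictured with a minimal sketch, assuming each modality has already been reduced to a fixed-length feature vector of the same dimension. Scaled dot-product attention stands in here for the directional attention channel, whose exact form the text does not specify. The function names, the example vectors, and the fusion weights are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def interaction_vector(query, others):
    """Attend from one modality (the query) to the other modalities.

    Assumption: scaled dot-product attention; the direction is query -> others.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, other) / scale for other in others])
    return [
        sum(w * other[k] for w, other in zip(weights, others))
        for k in range(len(query))
    ]


def fuse_modalities(features, modality_weights):
    """Concatenate each vector with its interaction vector, then weight and sum."""
    enhanced = {}
    for name, vec in features.items():
        others = [v for other_name, v in features.items() if other_name != name]
        # List concatenation: own features followed by the interaction features.
        enhanced[name] = vec + interaction_vector(vec, others)
    dim = len(next(iter(enhanced.values())))
    return [
        sum(modality_weights[name] * enhanced[name][k] for name in enhanced)
        for k in range(dim)
    ]


# Hypothetical per-modality feature vectors and fusion weights.
features = {
    "image": [0.9, 0.1],
    "wind_speed": [0.2, 0.8],
    "temperature": [0.5, 0.5],
}
weights = {"image": 0.5, "wind_speed": 0.3, "temperature": 0.2}
print(fuse_modalities(features, weights))
```

The fused vector is twice as long as each input because every modality keeps its own features alongside what it attended to in the other modalities. In a trained system the fusion weights would be learned rather than fixed by hand.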
In one possible implementation, acquiring the multi-modal potential hazard information of the target area comprises: acquiring a video of the target area through a vision acquisition device, and acquiring wind speed information and temperature information of the target area through a sensor; and processing the video frame by frame, extracting visual image information from a video frame when the video frame does not contain text information, and extracting visual text information from the video frame when the video frame contains text information.

In one possible implementation, extracting the single-modal features of the visual image information, the wind speed information, and the temperature information, and interactively fusing the single-modal features to obtain the multi-modal fusion feature, comprises: extracting a single-modal time-series feature vector for each modality based on the temporal relationship of that modality's information, wherein the single-modal time-series feature vectors comprise an image time-series feature vector, a wind speed time-series feature vector, and a temperature time-series feature vector; extracting interaction feature vectors between the single-modal time-series feature vectors based on a directional attention channel, and concatenating each single-modal time-series feature vector with its corresponding interaction feature vector to obtain a single-modal enhanced feature vector; and performing weighted fusion on the single-modal enhanced feature vectors to obtain the multi-modal fusion feature.

In one possible implementation, extracting