
CN-121981245-A - Hydraulic engineering accident cause association analysis method based on improved Apriori algorithm


Abstract

The invention discloses a hydraulic engineering accident cause association analysis method based on an improved Apriori algorithm. The method comprises: collecting production safety accident case data in the field of hydraulic engineering construction; constructing a standardized accident cause factor set; normalizing the cause text by means of synonym replacement and semantic mapping; constructing a causal relationship graph of the cause factors; mining association rules with the improved Apriori algorithm, in which accident severity and influence range are introduced to construct a weighted evaluation model and the minimum weighted support, confidence and lift are set dynamically; and finally visualizing the cause network with Gephi software. The method addresses two problems of the traditional Apriori algorithm, namely that it ignores the differing characteristics of accidents and that its cause analysis accuracy is low. It makes association mining of hydraulic engineering accident causes more rigorous and provides accurate data support for safety management.

Inventors

  • WU ZENGGUANG
  • LIU HAO
  • MA CHENGFANG
  • QI SANHONG
  • HUANG PEIJIE
  • ZUO XUJUN
  • FU WANWAN

Assignees

  • 黄河勘测规划设计研究院有限公司 (Yellow River Engineering Consulting Co., Ltd.)

Dates

Publication Date
2026-05-05
Application Date
2026-01-26

Claims (10)

  1. A hydraulic engineering accident cause association analysis method based on an improved Apriori algorithm, characterized by comprising the following steps: S1, collecting production safety accident investigation report data in the field of hydraulic engineering construction, and extracting standardized accident cause factor association relationships; S2, constructing a causal relationship graph based on the standardized accident cause factor association relationships; S3, mining association rules among the cause factors with an improved Apriori algorithm, specifically comprising: S3.1, assigning values to accident severity and accident influence range according to assignment rules, and determining the weight of each cause factor based on the HArank algorithm; S3.2, dynamically setting the association rule mining indexes by determining the minimum weighted support, the minimum weighted confidence and the minimum weighted lift of the Apriori algorithm; S3.3, generating frequent itemsets: generating candidate 1-itemsets by scanning the causal relationship graph, calculating the weighted support of each itemset, and retaining the itemsets whose support is greater than or equal to 0.2 to form the frequent 1-itemsets; and S3.4, screening association rules among the cause factors: calculating the weighted confidence and weighted lift of each rule in the frequent itemsets, screening out the strong association rules whose weighted confidence is greater than or equal to 0.7 and whose weighted lift is greater than or equal to 1, and extracting a safety accident cause list from the strong association rules.
  2. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein step S1 comprises: S1.1, constructing a standardized accident cause factor set, in which cause factors are divided along the 4 dimensions of people, objects, environment and management into 4 types, namely individual behavior, equipment and facilities, external environment, and organization and management, comprising at least 26 cause factors; S1.2, normalizing the cause text in the production safety accident investigation report data using the standardized accident cause factor set, and, guided by accident causation theory and hydraulic engineering safety management experience, converting cause texts that have the same meaning but different wording into a unified format through synonym replacement, rule matching and semantic mapping; S1.3, constructing an accident dictionary from the normalized production safety accident investigation reports, the dictionary comprising 8 core dimensions, namely time of occurrence, project in which the accident occurred, course of the accident, accident type, number of deaths, economic loss, accident cause and determination of responsibility, and being used to extract the standardized accident cause factor association relationships.
  3. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein step S2 comprises: S2.1, constructing an initial causal relationship graph from the extracted standardized cause factor association relationships, with cause factors as nodes and causal relationships among factors as directed edges; S2.2, obtaining an optimized causal relationship graph by applying a pruning algorithm that incorporates random forest feature evaluation.
  4. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 3, wherein step S2.2 comprises: S2.2.1, converting the causal relationship graph into an undirected graph, and dividing the undirected graph into 8 subgraphs with the Louvain community detection algorithm, each subgraph comprising 3-5 closely related cause factors; S2.2.2, constructing a random forest classifier for each subgraph, using whether the subgraph contains a cause factor that directly causes the accident as the binary label, and calculating the feature importance score of each subgraph with the classifier; S2.2.3, retaining the subgraphs whose feature importance score is greater than or equal to 0.6, eliminating the redundant subgraphs whose score is less than 0.6, comparing the retained subgraphs with the initial causal relationship graph, and restoring the directions of the directed edges to obtain the optimized causal relationship graph.
  5. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 4, wherein the construction parameters of the random forest classifier in step S2.2.2 are: 100 decision trees, a maximum tree depth of 10, a minimum of 2 samples to split a node, and a minimum of 1 sample per leaf node, with the Gini coefficient adopted as the feature selection criterion.
  6. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, further comprising a visual display step: inputting the safety accident cause list into Gephi software for visual display, with node size and color determined jointly by node degree and cause factor weight, so as to show the association strength and network structure of the cause factors intuitively and provide a visual decision basis for safety management.
  7. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein, in the frequent itemset generation of step S3.3, a hash tree is used to store the candidate itemsets, which reduces computational complexity and improves the efficiency of calculating the weighted support; the branching factor of the hash tree is set to 5, and the tree depth is consistent with the itemset length.
  8. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein in step S3.2 the minimum weighted support is 0.2, the minimum weighted confidence is 0.7 and the minimum weighted lift is 1.
  9. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein in S3.4, confidence = (number of cases containing both the antecedent and the consequent) / (number of cases containing the antecedent), and lift = confidence / (support of the consequent).
  10. The hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm according to claim 1, wherein in S3.1.2, weight = (severity score + influence range score) / 2, and the higher the weight, the higher the importance of the cause factor.
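
The formulas stated in claims 9 and 10 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the factor labels and the four sample cases are hypothetical, and the HArank weighting of S3.1 is not reproduced because the claims do not define it. Each accident case is modeled as the set of cause factors it contains.

```python
from typing import FrozenSet, List

Case = FrozenSet[str]  # one accident case = the set of cause factors it contains


def factor_weight(severity_score: float, scope_score: float) -> float:
    """Claim 10: weight = (severity score + influence range score) / 2."""
    return (severity_score + scope_score) / 2


def count(cases: List[Case], items: Case) -> int:
    """Number of cases that contain every factor in `items`."""
    return sum(1 for case in cases if items <= case)


def confidence(cases: List[Case], antecedent: Case, consequent: Case) -> float:
    """Claim 9: cases with antecedent and consequent / cases with antecedent."""
    n_antecedent = count(cases, antecedent)
    if n_antecedent == 0:
        return 0.0
    return count(cases, antecedent | consequent) / n_antecedent


def lift(cases: List[Case], antecedent: Case, consequent: Case) -> float:
    """Claim 9: confidence / support of the consequent."""
    consequent_support = count(cases, consequent) / len(cases)
    if consequent_support == 0:
        return 0.0
    return confidence(cases, antecedent, consequent) / consequent_support


# Hypothetical sample data, for illustration only.
sample_cases: List[Case] = [
    frozenset({"unsafe_operation", "inadequate_training"}),
    frozenset({"unsafe_operation", "inadequate_training", "equipment_defect"}),
    frozenset({"inadequate_training"}),
    frozenset({"equipment_defect"}),
]
ante = frozenset({"inadequate_training"})
cons = frozenset({"unsafe_operation"})

print(factor_weight(4, 2))                   # (4 + 2) / 2
print(confidence(sample_cases, ante, cons))  # 2 of 3 antecedent cases
print(lift(sample_cases, ante, cons))        # confidence / (2 / 4)
```

In this sample the rule "inadequate_training implies unsafe_operation" has a lift above 1, meaning the two factors co-occur more often than they would if independent, but its confidence of about 0.67 falls short of the 0.7 threshold in claim 8.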

Description

Hydraulic engineering accident cause association analysis method based on improved Apriori algorithm

Technical Field

The invention relates to the technical field of hydraulic engineering production safety management, and in particular to a hydraulic engineering accident cause association analysis method based on an improved Apriori algorithm.

Background

Hydraulic engineering construction is characterized by diverse project types, complex construction environments and numerous risk factors, which makes production safety management extremely difficult. Traditional accident cause analysis relies mostly on manual experience, lacks systematic mining of the association relationships among cause factors, and struggles to reveal the underlying patterns of accident occurrence. In the prior art, the Apriori algorithm has been applied to accident cause analysis in fields such as road traffic, aviation and coal mining, but rarely in hydraulic engineering, and a comprehensive scheme that combines visualization technology with hidden-hazard topic analysis is lacking. In addition, the existing Apriori algorithm does not consider the severity and influence of accidents during association rule mining, so the accuracy of the mined rules needs improvement. In view of the above, conventional cause analysis of hydraulic engineering safety accidents still faces many difficulties and challenges.

Disclosure of Invention

The invention aims to provide a hydraulic engineering accident cause association analysis method based on an improved Apriori algorithm, which addresses the low accuracy of hydraulic engineering safety accident cause analysis and provides data support for accident prevention and hidden-hazard investigation.
To achieve the above purpose, the hydraulic engineering accident cause association analysis method based on the improved Apriori algorithm provided by the invention comprises the following steps: S1, collecting production safety accident investigation report data in the field of hydraulic engineering construction, and extracting standardized accident cause factor association relationships; S2, constructing a causal relationship graph based on the standardized accident cause factor association relationships; S3, mining association rules among the cause factors with an improved Apriori algorithm, specifically comprising: S3.1, assigning values to accident severity and accident influence range according to assignment rules, and determining the weight of each cause factor based on the HArank algorithm; S3.2, dynamically setting the association rule mining indexes by determining the minimum weighted support, the minimum weighted confidence and the minimum weighted lift of the Apriori algorithm; S3.3, generating frequent itemsets: generating candidate 1-itemsets by scanning the causal relationship graph, calculating the weighted support of each itemset, and retaining the itemsets whose support is greater than or equal to 0.2 to form the frequent 1-itemsets; and S3.4, screening association rules among the cause factors: calculating the weighted confidence and weighted lift of each rule in the frequent itemsets, screening out the strong association rules whose weighted confidence is greater than or equal to 0.7 and whose weighted lift is greater than or equal to 1, and extracting a safety accident cause list from the strong association rules.
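
Steps S3.3 and S3.4 can be sketched in Python as a level-wise search followed by rule screening. This is a hedged illustration under stated assumptions, not the method as patented. The text does not define weighted support, so the sketch assumes it is the summed weight of the cases containing an itemset divided by the total weight of all cases. It also assumes one weight per accident case, and it scans a case list instead of the causal relationship graph. All names and sample data are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Rule = Tuple[Itemset, Itemset, float, float]  # antecedent, consequent, conf, lift


def weighted_support(items: Itemset, cases: List[Itemset], weights: List[float]) -> float:
    """ASSUMED definition: weight of cases containing `items` / total weight."""
    total = sum(weights)
    hit = sum(w for case, w in zip(cases, weights) if items <= case)
    return hit / total if total else 0.0


def frequent_itemsets(cases: List[Itemset], weights: List[float],
                      min_wsup: float = 0.2) -> Dict[Itemset, float]:
    """Level-wise (Apriori) search keeping itemsets with weighted support >= min_wsup."""
    level = [frozenset([f]) for f in sorted({f for case in cases for f in case})]
    frequent: Dict[Itemset, float] = {}
    k = 1
    while level:
        kept = {}
        for items in level:
            wsup = weighted_support(items, cases, weights)
            if wsup >= min_wsup:
                kept[items] = wsup
        frequent.update(kept)
        k += 1
        # Join: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent


def strong_rules(frequent: Dict[Itemset, float], min_wconf: float = 0.7,
                 min_wlift: float = 1.0) -> List[Rule]:
    """Keep rules with weighted confidence >= min_wconf and weighted lift >= min_wlift."""
    rules: List[Rule] = []
    for items, wsup in frequent.items():
        for r in range(1, len(items)):
            for antecedent in map(frozenset, combinations(sorted(items), r)):
                consequent = items - antecedent
                wconf = wsup / frequent[antecedent]
                wlift = wconf / frequent[consequent]
                if wconf >= min_wconf and wlift >= min_wlift:
                    rules.append((antecedent, consequent, wconf, wlift))
    return rules


# Hypothetical cases and per-case weights, for illustration only.
cases = [
    frozenset({"inadequate_training", "unsafe_operation"}),
    frozenset({"inadequate_training", "unsafe_operation", "equipment_defect"}),
    frozenset({"unsafe_operation"}),
    frozenset({"equipment_defect"}),
]
weights = [3.0, 4.0, 1.0, 2.0]

for antecedent, consequent, wconf, wlift in strong_rules(frequent_itemsets(cases, weights)):
    print(sorted(antecedent), "->", sorted(consequent), round(wconf, 3), round(wlift, 3))
```

The default thresholds 0.2, 0.7 and 1 come from the text; the source describes them as dynamically set, which the sketch models only by exposing them as parameters. Because the weights are non-negative, every subset of a frequent itemset is also frequent, so the lookups `frequent[antecedent]` and `frequent[consequent]` always succeed.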
Further, step S1 includes: S1.1, constructing a standardized accident cause factor set, in which cause factors are divided along the 4 dimensions of people, objects, environment and management into 4 types, namely individual behavior, equipment and facilities, external environment, and organization and management, comprising at least 26 cause factors; S1.2, normalizing the cause text in the production safety accident investigation report data using the standardized accident cause factor set, and, guided by accident causation theory and hydraulic engineering safety management experience, converting cause texts that have the same meaning but different wording into a unified format through synonym replacement, rule matching and semantic mapping; S1.3, constructing an accident dictionary from the normalized production safety accident investigation reports, the dictionary comprising 8 core dimensions, namely time of occurrence, project in which the accident occurred, course of the accident, accident type, number of deaths, economic loss, accident cause and determination of responsibility, and being used to extract the standardized accident cause factor association relationships. Further, step S2 includes: S2.1, constructing an initial causal relationship graph, taking causal relationships among factors as directed edges, based on the extracted standardized causal fac