CN-121981218-A - Method for constructing a knowledge graph completion model that fuses entity descriptions and graph structure in the electricity safety field
Abstract
The invention provides a method for constructing a knowledge graph completion model that fuses entity descriptions and graph structure in the electricity safety field, belonging to the field of combined applications of electric power information technology and artificial intelligence. The method comprises: obtaining a knowledge graph of the electricity safety field; obtaining entity description information of a target entity from the knowledge graph and encoding the description text with a pre-trained SBERT model to obtain a description text vector of the target entity; encoding the structural relations of the target entity in the knowledge graph with a TransE model to obtain a triplet-based graph structure vector of the target entity; inputting the description text vector and the graph structure vector into a weighted fusion model GAT, where they are concatenated and a multi-layer attention mechanism captures information and integrates features to obtain a fused entity representation; and inputting the fused entity representation into a decoder ConKB for link prediction to obtain an identification result for potential electricity safety hazards.
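The two encoders named in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the toy token vectors stand in for the outputs of a real SBERT model, the function names `mean_pool`, `cosine_similarity`, and `transe_distance` are hypothetical, and the distance uses the standard TransE assumption that a valid triplet satisfies h + r ≈ t. Average pooling and cosine similarity are among the options listed in the claims.

```python
import math

def mean_pool(token_vectors):
    """Average equal-length token vectors into one fixed-size sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def transe_distance(h, r, t):
    """L2 norm of h + r - t; a smaller value means a more plausible triplet."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy token vectors (assumption): placeholders for the two BERT towers' outputs.
tokens_a = [[1.0, 0.0], [0.0, 1.0]]
tokens_b = [[2.0, 0.0], [0.0, 2.0]]
sent_a = mean_pool(tokens_a)   # [0.5, 0.5]
sent_b = mean_pool(tokens_b)   # [1.0, 1.0]
similarity = cosine_similarity(sent_a, sent_b)  # parallel vectors, so close to 1.0

# Toy embeddings (assumption) chosen so that h + r equals t exactly.
h = [1.0, 2.0]
r = [0.5, -1.0]
t = [1.5, 1.0]
distance = transe_distance(h, r, t)  # 0.0
```

In a trained model the embeddings would be learned so that true triplets have a small distance and corrupted triplets a large one; the sketch only shows how the score is computed.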
Inventors
- WANG XIAOMING
- BAI YUNLONG
- ZHAO WENGUANG
- WANG YUHANG
- XU BIN
- NI JINGYI
Assignees
- Electric Power Research Institute of State Grid Anhui Electric Power Co., Ltd. (国网安徽省电力有限公司电力科学研究院)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2025-11-20
Claims (10)
- 1. A method for constructing a knowledge graph completion model that fuses entity descriptions and graph structure in the electricity safety field, characterized by comprising the following steps: step 1, acquiring an existing knowledge graph of the electricity safety field; step 2, obtaining entity description information of a target entity from the acquired electricity safety knowledge graph, encoding the entity description text with a pre-trained SBERT model to generate a target entity description vector, and encoding the structural relations of the target entity in the electricity safety knowledge graph with a TransE model to generate a triplet-based initial structure vector of the target entity; step 3, inputting the generated target entity description vector and the triplet-based initial structure vector into a weighted fusion model GAT for concatenation, and capturing information and integrating features with a multi-layer attention mechanism to obtain a fused entity representation; and step 4, inputting the fused entity representation into a decoder ConKB for link prediction to obtain an identification result for potential safety hazards.
- 2. The method of claim 1, wherein in step 1 the electricity safety knowledge graph includes entity types of the electricity safety field and relationships between the entity types, wherein: the entity types comprise industry information, equipment information, production processes, fault hazard/abnormality information, area information, and handling measure information; and the relationships between the entity types comprise containment, existence, and location relationships between different entity types, as well as sequential or associative relationships among equipment, production processes, fault hazards, and handling measures.
- 3. The method of claim 2, wherein each type of entity extracted from unstructured text is stored in tabular form, wherein: the industry information table comprises industry load characteristics, industry regulation priority, industry adjustable capacity, and industry electricity consumption; the key equipment table comprises the electricity-related attributes of the production equipment in each production process, including the load grade, load type, load characteristics, adjustable capacity, and shutdown consequences of the electrical equipment; the production process table comprises the processes of the industry's production, the electrical equipment of each process, the load characteristics of the production process, its regulation priority, its electricity consumption, and its constraints; the fault hazard table comprises information on faults that may occur during production in the corresponding industry, including the fault type, the equipment and process where the fault occurs, the responsible department, the fault cause, and the handling method; the workshop area table comprises the floor, area, area voltage, adjustable capacity, and fire protection grade of each electricity-use area, together with the key production-process equipment located there; and the handling suggestion table corresponds to the fault hazard table and records how each fault is handled, including the responsible personnel and the operation details.
- 4. The method of claim 2, wherein the relationships between entity types are stored in the form of tables.
- 5. The method of claim 1, wherein in step 2 the SBERT model includes two weight-sharing BERT models; the two BERT models each process an input sentence and pool their respective outputs to obtain fixed-dimension sentence vectors representing the entity description information; a similarity is calculated between the two sentence vectors to output the similarity of the two sentences; and the entity description vector is generated when the similarity is greater than a predetermined threshold.
- 6. The method of claim 5, wherein the similarity calculation uses cosine similarity, Manhattan distance, or Euclidean distance.
- 7. The method of claim 5, wherein the pooling operation employs an average pooling strategy.
- 8. The method of claim 1, wherein in step 2 the BERT model is optimized using a mean squared error loss function.
- 9. The method of claim 1, wherein in the TransE model of step 2 the triplet (h, r, t) satisfies h + r ≈ t, where h represents a head entity, t represents a tail entity, and r represents the relationship between the head entity h and the tail entity t.
- 10. The method of claim 1, wherein step 3 comprises: step 31, using a concatenation function to fuse a target node in the electricity safety knowledge graph with the generated entity description vector and the initial structure vector, forming an entity joint vector; step 32, performing a weighted summation of the entity relationship features and the entity joint vectors through the graph attention layer to obtain the feature vectors of the neighboring nodes; step 33, calculating the importance of each neighboring node's feature vector to the target node; step 34, performing a weighted summation of the feature vectors of the neighboring nodes through the graph attention layer; and step 35, applying a linear transformation to the relation embedding vectors to obtain the fused entity representation.
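The fusion steps of claim 10 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed model: it uses the common GAT-style scoring (a LeakyReLU over the dot product of an attention vector with the concatenated target and neighbor vectors), the attention vector `attn_vec` is a fixed hypothetical value rather than a learned parameter, and the relation features and the final linear transformation of step 35 are omitted. All function names are hypothetical.

```python
import math

def concat(description_vec, structure_vec):
    """Step 31 (simplified): join the text vector and the structure vector."""
    return list(description_vec) + list(structure_vec)

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(target, neighbors, attn_vec):
    """Steps 33-34 (simplified): weight each neighbor by its importance to the
    target node, then return the weights and the weighted sum of neighbors."""
    scores = [
        leaky_relu(sum(a * x for a, x in zip(attn_vec, target + nb)))
        for nb in neighbors
    ]
    weights = softmax(scores)
    fused = [
        sum(w * nb[i] for w, nb in zip(weights, neighbors))
        for i in range(len(target))
    ]
    return weights, fused

# Toy joint vectors (assumption): 2-d description part + 2-d structure part.
target = concat([1.0, 0.0], [0.0, 1.0])
neighbors = [
    concat([1.0, 0.0], [0.0, 1.0]),  # neighbor resembling the target
    concat([0.0, 1.0], [1.0, 0.0]),  # neighbor unlike the target
]
# Hypothetical attention vector of length 2 * len(target); learned in practice.
attn_vec = [0.1, 0.1, 0.1, 0.1, 0.5, 0.0, 0.0, 0.5]

weights, fused = attention_aggregate(target, neighbors, attn_vec)
```

With this particular attention vector the first neighbor receives the larger weight, so the fused vector leans toward it. A multi-layer, multi-head variant would repeat the same aggregation with separate learned parameters per head and layer.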
Description
Method for constructing a knowledge graph completion model that fuses entity descriptions and graph structure in the electricity safety field
Technical Field
The invention belongs to the technical field of combined applications of electric power information technology and artificial intelligence, and particularly relates to a method for constructing a knowledge graph completion model that fuses entity descriptions and graph structure in the electricity safety field.
Background
In practical decision-making about electricity safety hazards, electricity safety inspection must take many factors into account. Identification by manually observing data suffers from single-source inspection data, blind spots in hazard detection, and incomplete decisions. Existing deep learning models generally include only a few features, have difficulty producing a complete solution, and do not adequately support the need for intelligent, comprehensive hazard investigation. In addition, operating on the graph in downstream knowledge graph tasks is complex, and traditional reasoning methods are limited in efficiency and accuracy, which limits the overall effectiveness of predicting electricity safety hazards. For the specific problem of detecting electricity safety hazards, an entity and relationship model of an electricity safety knowledge graph is established, covering safety regulations, operating procedures, electrical safety logic, fault hazard characteristics, and the like. This lays a foundation for intelligent data applications; the model is ultimately connected to a digital integrated management platform, improving the efficiency of data management work and enriching the available management means.
Constructing a high-quality knowledge graph for the electricity safety field mainly addresses the following two difficulties in electricity safety inspection:
(1) A large amount of expertise in electricity safety inspection currently exists as scattered documents. No unified knowledge system or database has been formed, so staff lack basic support such as convenient, efficient data query and analysis, which makes intelligent transformation and upgrading difficult. For example, daily management, standardized operation, and fault investigation of a power distribution network involve massive numbers of fragmented electronic documents and data; professional, accurate data has not yet been properly mined and integrated, and the specialized domain knowledge further increases the complexity of intelligent cognition of the power system.
(2) In electricity safety management, the data on electrical and non-electrical safety hazards differ markedly, forming a typical data gap. Electrical hazards usually originate from structured data such as equipment sensors and monitoring systems and have clear numerical characteristics and logical relationships, whereas non-electrical hazards usually involve unstructured or semi-structured information such as operating procedures, environmental factors, and personnel behavior, which often exists as text and lacks unified data standards and modeling methods. These differences at the structural and semantic levels make the two kinds of hazard data hard to connect and fuse in practice, creating information silos that severely restrict improvements in comprehensive perception, joint analysis, and intelligent decision-making for safety risks.
Therefore, how to effectively integrate electrical and non-electrical hazard data has become one of the core challenges for intelligent management in the electricity safety field. An existing invention discloses a deep-learning-based method for discovering relationships in a power-domain knowledge graph: power-domain news data is acquired with a web crawler; a corpus is built by combining a public-domain corpus with power-industry vocabulary; a BERT model is incrementally trained to extract text features; entities and relationships are embedded into a low-dimensional space by a stacked convolutional neural network, whose extracted features are used to complete the knowledge graph; and candidate triples are then re-ranked by a student model obtained through knowledge distillation, thereby completing the knowledge graph and predicting relationships between entities. This method has the following disadvantages: 1. The corpus construction lacks professional diversity. The technique relies mainly on news text collected by a web crawler as the corpus source for knowledge graph construction, the news text ha