CN-121980342-A - Multi-modal power defect identification method and system integrating spatial reasoning learning
Abstract
The invention provides a multi-modal power defect identification method and system integrating spatial reasoning learning, and relates to the technical field of power system automation. By modeling target entities and their spatial relationships as a graph structure, the model can explicitly learn the geometric topology and interactions among devices in a power scene, which strengthens its understanding of complex spatial logic and improves identification accuracy for defect types that depend on relative position. Inference is then performed on the spatial relationship graph through the message-passing mechanism of a graph neural network, generating enhanced features that fuse the geometric topology and spatial constraint relationships among targets; this improves identification under interference from visually similar targets, as well as interference resistance and scene adaptability. Finally, in the identification stage, interference filtering is performed in combination with the spatial relationship graph so that false defect targets are removed and the false alarm rate is reduced. The invention thereby addresses the insufficient ability of traditional defect identification techniques to identify composite defects of power equipment and lines.
Inventors
- XU XING
- ZHAO JIANBIN
- ZHANG BO
- ZHANG PENGFEI
- LIU YADUO
- ZHAO XIAOXIANG
- GENG MINGXI
- YANG CHEN
- WANG SHAOYING
Assignees
- State Grid Hebei Electric Power Co., Ltd., Information and Communication Branch (国网河北省电力有限公司信息通信分公司)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-12
Claims (10)
- 1. A multi-modal power defect identification method integrating spatial reasoning learning, characterized by comprising: acquiring a power inspection image and text description data; extracting visual features and text semantic features based on the power inspection image and the text description data, and obtaining multi-modal features through cross-modal alignment and semantic mapping; identifying a plurality of target entities based on the multi-modal features, and constructing a spatial relationship graph with the target entities as nodes and the spatial relationships among the target entities as edges; performing spatial reasoning learning on the spatial relationship graph using a graph neural network to generate enhanced features that fuse the geometric topology and spatial constraint relationships among targets; and performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain a power defect identification result.
- 2. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein extracting visual features and text semantic features based on the power inspection image and the text description data, and obtaining multi-modal features through cross-modal alignment and semantic mapping, comprises: performing block (patch) encoding on the power inspection image using a visual encoder to extract a high-dimensional visual feature sequence; performing word segmentation and encoding on the text description data using a text encoder to extract a text semantic feature sequence; and inputting the high-dimensional visual feature sequence and the text semantic feature sequence into a multi-modal feature fusion module, performing feature interaction through learnable query vectors, performing cross-modal feature fusion using a cross-attention mechanism, and mapping the visual and text features into a shared semantic space to obtain the aligned multi-modal features.
- 3. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein identifying a plurality of target entities based on the multi-modal features and constructing a spatial relationship graph with the target entities as nodes and the spatial relationships among the target entities as edges comprises: inputting the multi-modal features into a target detection network, and predicting bounding boxes and local visual features of a plurality of target entities in the power inspection image; constructing an initial graph by taking each target entity as a graph node and the local visual feature of each target entity as its initial feature vector; establishing connecting edges between nodes of the initial graph according to the relative spatial positions of the target entities to obtain a secondary graph; and, based on the geometric relationships among the bounding boxes of the plurality of target entities, assigning to each edge in the secondary graph an initial weight representing a spatial relationship type to obtain a structured spatial relationship graph, wherein the spatial relationship type comprises one or more of above-below, left-right, contact, and winding.
- 4. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein performing spatial reasoning learning on the spatial relationship graph using a graph neural network to generate enhanced features that fuse the geometric topology and spatial constraint relationships among targets comprises: extracting the visual feature vector of each node and the spatial relationship type code of each edge from the spatial relationship graph to form node-edge feature pairs; inputting the node-edge feature pairs into a graph attention network, wherein the visual feature vectors serve as initial node embeddings and the spatial relationship type codes serve as relationship biases in the attention calculation; in each graph reasoning layer of the graph attention network, adopting a relationship-aware attention mechanism that adjusts the attention weights between nodes according to the edge features, thereby realizing message passing guided by spatial relationships; taking the global semantic vector in the text semantic features as a context condition for graph reasoning, and fusing the global semantic vector with the neighbor features of each node in each layer of node updating; and, after multiple rounds of iterative updating, extracting the final feature vector of each node as an enhanced feature that fuses the global spatial context and the local topological relationships.
- 5. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain a power defect identification result comprises: inputting the enhanced features of each target entity into a multi-head self-attention layer, and aggregating them to generate a scene-level global context vector; concatenating and fusing the scene-level global context vector with the enhanced features of each target entity to obtain a fusion vector; inputting the fusion vector into a classifier for defect category prediction to obtain a prediction result for each target entity, wherein the prediction result comprises whether a defect exists and the defect category; based on the prediction results, checking, for each defect entity among the target entities, whether its spatial relationships satisfy the preset spatial logic constraint of its defect category, to obtain a verification result for each defect entity; if the verification result is that the preset spatial logic constraint is not satisfied, judging the defect entity to be an interference target and filtering it out, so as to obtain the true defect entities; and generating the power defect identification result based on the true defect entities and the defect category and spatial position information of each true defect entity.
- 6. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein, after performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain the power defect identification result, the method further comprises: determining a risk level and a handling suggestion for the power equipment based on the power defect identification result; determining, from the power inspection images, a target inspection image associated with a true defect entity based on the power defect identification result; generating a structured inspection report based on the defect category, spatial position information, and confidence information in the power defect identification result, the risk level and handling suggestion, and the target inspection image; and adding a timestamp and an equipment identifier to the structured inspection report, and storing the structured inspection report in an inspection database.
- 7. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein, after performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain the power defect identification result, the method further comprises: determining a risk level of the power equipment based on the power defect identification result; setting a push priority based on the risk level, wherein high-risk defects are pushed in real time, and medium-risk and low-risk defects are pushed in batches at scheduled times; and sending an inspection data packet to an inspection management platform based on the push priority, wherein the inspection data packet comprises the power defect identification result and a structured inspection report.
- 8. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, wherein, after performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain the power defect identification result, the method further comprises: extracting a key spatial relationship triple according to the defect category and spatial position information in the power defect identification result, wherein the key spatial relationship triple comprises a subject, a relation, and an object; comparing the key spatial relationship triple with a historical spatial relationship knowledge base to determine a comparison result; if the comparison result indicates a new relationship pattern, adding the key spatial relationship triple to the historical spatial relationship knowledge base to obtain an updated knowledge base; and, based on the updated knowledge base, adjusting the weight distribution of samples in the spatial relationship data set to reinforce relational reasoning in model training.
- 9. The multi-modal power defect identification method integrating spatial reasoning learning according to claim 1, further comprising: recording key intermediate data of the reasoning process, wherein the key intermediate data comprises a target detection result, a spatial relationship graph structure, node attention weights, a classification confidence, and a logic filtering decision path; associating the key intermediate data with the power defect identification result to generate a structured reasoning log; and storing the structured reasoning log in a log database.
- 10. A multi-modal power defect identification system integrating spatial reasoning learning, characterized in that the system comprises an electronic device comprising a memory storing a computer program and a processor for invoking and running the computer program stored in the memory to perform the method according to any one of claims 1 to 9.
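By way of non-limiting illustration, the edge typing of claim 3 and the spatial logic check of claim 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Entity` class, the `relation`, `build_graph` and `filter_defects` functions, the `CONSTRAINTS` rule table, the labels, and the box coordinates are all hypothetical. Relation types are derived from axis-aligned bounding boxes only, the "winding" type is omitted because it cannot be inferred from boxes alone, and edges carry a type label rather than a learned weight.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Entity:
    label: str   # hypothetical class name, e.g. "bird_nest"
    box: tuple   # (x_min, y_min, x_max, y_max) in pixels


def relation(a: Entity, b: Entity) -> str:
    """Coarse spatial relation type between two bounding boxes."""
    ax0, ay0, ax1, ay1 = a.box
    bx0, by0, bx1, by1 = b.box
    # Boxes that overlap or touch on both axes are treated as "contact".
    if min(ax1, bx1) >= max(ax0, bx0) and min(ay1, by1) >= max(ay0, by0):
        return "contact"
    # Otherwise compare the offsets between the box centers.
    dx = (bx0 + bx1) / 2 - (ax0 + ax1) / 2
    dy = (by0 + by1) / 2 - (ay0 + ay1) / 2
    return "up_down" if abs(dy) >= abs(dx) else "left_right"


def build_graph(entities):
    """Nodes are entity indices; each edge (i, j), i < j, carries a relation type."""
    return {(i, j): relation(entities[i], entities[j])
            for i, j in combinations(range(len(entities)), 2)}


# Hypothetical rule table: a defect label is kept only if the entity has the
# given relation to at least one entity carrying one of the listed labels.
CONSTRAINTS = {"bird_nest": ("contact", {"wire", "tower"})}


def filter_defects(entities, edges, predicted):
    """Keep predicted defect indices whose spatial relations satisfy their rule."""
    kept = []
    for i in predicted:
        rule = CONSTRAINTS.get(entities[i].label)
        if rule is None:  # no rule defined for this label: keep the prediction
            kept.append(i)
            continue
        rel, partners = rule
        for (a, b), r in edges.items():
            if i in (a, b) and r == rel:
                other = b if a == i else a
                if entities[other].label in partners:
                    kept.append(i)
                    break
    return kept


entities = [
    Entity("wire", (0, 100, 400, 110)),
    Entity("bird_nest", (150, 80, 200, 105)),   # box overlaps the wire
    Entity("bird_nest", (300, 300, 350, 340)),  # box is far from the wire
]
edges = build_graph(entities)
true_defects = filter_defects(entities, edges, predicted=[1, 2])
```

In this toy scene only the nest whose box overlaps the wire survives the check; the second candidate has no "contact" edge to a wire or tower and is treated as an interference target. A hard rule table is used here only for readability; the claims leave the form of the preset spatial logic constraints open.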
Description
Multi-modal power defect identification method and system integrating spatial reasoning learning
Technical Field
The invention relates to the technical field of power system automation, and in particular to a multi-modal power defect identification method and system integrating spatial reasoning learning.
Background
Regular inspection of power transmission lines is a basic and important link in ensuring the safe and stable operation of the power system. With the deep integration of unmanned aerial vehicle technology and artificial intelligence algorithms, intelligent inspection based on computer vision has gradually replaced traditional manual inspection and become the mainstream approach in the industry. Early intelligent recognition techniques relied mainly on single-modal visual models such as convolutional neural networks and focused on feature extraction and detection of preset targets such as insulator breakage, bird nests, or foreign objects on wires. They are usually limited by a closed-set setting, can recognize only predefined defect types, lack the ability to understand unstructured scenes and complex text descriptions, and are prone to false alarms and missed alarms under complex working conditions with strong background interference or where defect targets closely resemble normal components. Current intelligent power defect identification techniques likewise rely mainly on visual feature matching models, detecting and classifying preset targets such as insulator breakage, bird nests, or foreign objects on wires through single-modal methods such as convolutional neural networks. While such approaches have achieved some success in structured scenes, their recognition mechanism is essentially closed-set matching of visual features such as appearance texture, color, and shape, and does not model the interaction relationships between objects or the spatial causal logic.
In real, complex power inspection scenarios, whether something is a defect often depends not only on the target itself but also on its spatial relationships and interaction patterns with surrounding equipment. For example, a bird nest constitutes a potential hazard because it is wound around or built on a wire or tower, not merely because it looks like a place where birds roost; likewise, the risk posed by a hanging foreign object arises from its contact or hanging relationship with the wire. Existing methods cannot understand such spatial topology and causal constraints and can judge only by apparent similarity, so the false alarm and missed alarm rates increase markedly in scenes involving visually similar interference, complex background superposition, and multi-target interaction, and the ability to recognize composite and hidden defects is seriously insufficient.
Disclosure of Invention
The invention provides a multi-modal power defect identification method and system integrating spatial reasoning learning, which address the insufficient ability of existing defect identification techniques to identify composite defects.
The invention provides a multi-modal power defect identification method integrating spatial reasoning learning, which comprises: acquiring a power inspection image and text description data; extracting visual features and text semantic features based on the power inspection image and the text description data, and obtaining multi-modal features through cross-modal alignment and semantic mapping; identifying a plurality of target entities based on the multi-modal features, and constructing a spatial relationship graph with the target entities as nodes and the spatial relationships among the target entities as edges; performing spatial reasoning learning on the spatial relationship graph using a graph neural network to generate enhanced features that fuse the geometric topology and spatial constraint relationships among targets; and performing defect category identification and interference target filtering according to the enhanced features and the spatial relationship graph to obtain a power defect identification result.
The invention provides a multi-modal power defect identification device integrating spatial reasoning learning, which comprises a communication module, a processing module, a multi-modal identification module, a graph neural network and a power defect identification module, wherein the communication module is used for acquiring a power inspection image and text description data, and the processing module is used for extracting visual features and text semantic features based on the power inspection image and the text description data, obtaining multi-modal features through cross-modal alignment and semantic mapping, identifying a plurality of target entities based on the multi-modal features, constructing a spatial relationship graph ta