CN-121093277-B - Hierarchical architecture-based power knowledge processing method and device, computer equipment and storage medium
Abstract
The invention relates to the field of data processing and discloses a hierarchical architecture-based power knowledge processing method, device, medium and equipment. The method comprises: acquiring multi-source heterogeneous power knowledge data and performing standardized processing on the structured data, unstructured data and real-time stream data respectively; classifying and labeling the standardized structured knowledge data based on a multi-dimensional classification system to obtain labeling data; converting the standardized unstructured text into triplet data; constructing a dynamic knowledge graph based on the triplet data, the labeling data and the standardized real-time stream data; and generating a processing report based on the power business scene type and the dynamic knowledge graph. The invention addresses the shortcomings of the related technology in processing multi-source heterogeneous power knowledge data, classification labeling, knowledge graph construction and processing report generation.
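The graph construction step summarized in the abstract can be pictured with a small sketch. It is an illustration under stated assumptions, not the patented implementation: the class names `Triplet`, `StreamRecord` and `DynamicKnowledgeGraph`, the relation names, and the convention that a stream record's `device_id` equals the equipment entity name are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Triplet:
    """(subject entity, relation type, object entity) from unstructured text."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class StreamRecord:
    """One standardized monitoring record, keyed by equipment id and time."""
    device_id: str      # assumed to equal the equipment entity name
    timestamp: float    # seconds since the epoch
    value: float        # a single extracted characteristic parameter


class DynamicKnowledgeGraph:
    """Toy in-memory graph combining triplets, labels and stream data."""

    def __init__(self) -> None:
        self.edges: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
        self.labels: Dict[str, Dict[str, str]] = defaultdict(dict)
        self.readings: Dict[str, List[StreamRecord]] = defaultdict(list)

    def add_triplet(self, t: Triplet) -> None:
        self.edges[t.subject].add((t.relation, t.obj))

    def add_labels(self, entity: str, labels: Dict[str, str]) -> None:
        # labels map a dimension code (LX, YW, ZY, SB, JG) to a category
        self.labels[entity].update(labels)

    def attach_reading(self, rec: StreamRecord) -> None:
        # associate by equipment id, keep each series ordered by timestamp
        series = self.readings[rec.device_id]
        series.append(rec)
        series.sort(key=lambda r: r.timestamp)

    def related(self, entity: str, relation: str) -> List[str]:
        return sorted(o for r, o in self.edges.get(entity, ()) if r == relation)

    def latest_reading(self, device_id: str) -> Optional[StreamRecord]:
        series = self.readings.get(device_id)
        return series[-1] if series else None


# Hypothetical example data
kg = DynamicKnowledgeGraph()
kg.add_triplet(Triplet("transformer T1", "has_fault", "oil overheating"))
kg.add_triplet(Triplet("oil overheating", "governed_by",
                       "transformer operation regulation"))
kg.add_labels("transformer T1", {"SB": "transformer"})
kg.add_labels("transformer operation regulation", {"LX": "regulation"})
kg.attach_reading(StreamRecord("transformer T1", 1700000060.0, 78.5))
kg.attach_reading(StreamRecord("transformer T1", 1700000000.0, 71.2))
```

In this sketch, equipment, fault and regulation entities share one edge store, while monitoring data lives in per-equipment time series so that new readings can be attached without rebuilding the graph.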
Inventors
- WU SHEN
- YU YIFAN
- SHI YADONG
- WANG GUORUI
- PEI QIUGEN
- JIANG JIANG
- WANG XUXIAN
- ZENG LIANGBO
- SU HUAQUAN
- CHEN LILI
Assignees
- Guangdong Power Grid Co., Ltd. (广东电网有限责任公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2025-09-04
Claims (8)
- 1. A hierarchical architecture-based power knowledge processing method, the method comprising: acquiring multi-source heterogeneous power knowledge data, wherein the multi-source heterogeneous power knowledge data comprises structured data, unstructured data and real-time stream data; performing standardized processing on the structured data, the unstructured data and the real-time stream data respectively; classifying and labeling the standardized structured knowledge data based on a multi-dimensional classification system to obtain labeling data, wherein the multi-dimensional classification system comprises a main dimension and an auxiliary dimension; converting the standardized unstructured text into triplet data; constructing a dynamic knowledge graph based on the triplet data, the labeling data and the standardized real-time stream data, wherein the knowledge graph takes equipment, faults and regulations as core entities; and generating a processing report based on the power business scene type and the dynamic knowledge graph; wherein the structured data comprises ledger information of the equipment asset dimension and process form data of the business dimension, the unstructured data comprises policy files of the knowledge type dimension and technical documents of the professional dimension, and the real-time stream data comprises equipment running state monitoring data; wherein performing standardized processing on the structured data, the unstructured data and the real-time stream data respectively comprises: performing unified field mapping on the structured data based on a preset metadata standard, and performing compliance verification on the structured data by associating the corresponding professional technical standards; parsing and extracting the unstructured data by multi-modal document parsing, converting the unstructured data into a parsable format, and performing semantic cleaning; and performing noise filtering and characteristic parameter extraction on the real-time stream data at an edge node, and performing standardized encapsulation of the processed real-time stream data based on time sequence and equipment identifier; wherein the main dimension comprises a knowledge type LX, a business YW and a professional ZY, and the auxiliary dimension comprises an equipment asset SB and an organization JG; and wherein classifying and labeling the standardized structured knowledge data based on the multi-dimensional classification system to obtain the labeling data comprises: performing semantic analysis on the structured knowledge data using a pre-trained large model for the electric power field, and extracting business theme features, technical field features and core object features of the structured knowledge data; matching the knowledge type LX, the business YW and the professional ZY based on the business theme features, the technical field features and the core object features, and enforcing an association ordering in the sequence knowledge type LX, then business YW, then professional ZY; dynamically screening the equipment asset SB and the organization JG based on knowledge attributes, and sorting the equipment asset SB and the organization JG by relevance priority, wherein the knowledge attributes are determined by business scene features, the business theme features, the technical field features and the core object features; for the ordered main dimension, calling the hierarchical category knowledge graph under the main dimension, matching step by step down to the finest-granularity category through similarity calculation between feature words and category attributes, and labeling to obtain main dimension labeling data; for the ordered auxiliary dimension, calling the hierarchical category knowledge graph under the auxiliary dimension, matching step by step down to the finest-granularity category through similarity calculation between feature words and category attributes, and labeling to obtain auxiliary dimension labeling data; and forming the labeling data based on the main dimension labeling data and the auxiliary dimension labeling data.
- 2. The method of claim 1, wherein converting the standardized unstructured text into triplet data comprises: performing word segmentation and entity recognition on the unstructured text using natural language processing to obtain core entities, wherein the core entities comprise equipment, faults, regulations and business activities, and the entity recognition is implemented with a pre-trained model for the electric power field; analyzing the degree of semantic association between the core entities using a semantic relation extraction algorithm, and determining the relation type between a subject entity and an object entity; and constructing triplets in the format (subject entity, relation type, object entity), wherein the subject entity, the relation type and the object entity are all associated with dimension information in the multi-dimensional classification system.
- 3. The method according to claim 2, wherein constructing a dynamic knowledge graph based on the triplet data, the labeling data and the standardized real-time stream data comprises: constructing a basic graph framework with equipment entities, fault entities and regulation entities as core entities, and establishing initial association relations among the core entities through the triplet data, wherein the equipment entities are associated with the labeling data of the equipment asset SB, and the regulation entities are associated with the labeling data of the knowledge type LX; performing time sequence feature extraction on the standardized real-time stream data, and associating the real-time stream data with the corresponding core entities based on timestamp and equipment identifier; mining relations between the core entities using a graph neural network algorithm, and using entity vector similarity to compute and complete implicit association paths of the form equipment, associated fault, associated regulation; detecting, through a cross-dimension consistency check mechanism that uses the graph neural network, whether a logical conflict exists between the classification systems of different dimensions; and if a logical conflict exists, marking the differing nodes, triggering a traceability verification process based on the labeling data, and updating the dynamic knowledge graph.
- 4. The method of claim 3, wherein generating a processing report based on the power business scene type and the dynamic knowledge graph comprises: determining the power business scene type, wherein the power business scene type comprises an equipment operation and maintenance scene and a policy compliance scene; invoking, in the dynamic knowledge graph, the category system of the business YW corresponding to the power business scene type, and determining the content framework and data extraction range of the processing report; retrieving entity-associated data related to the power business scene type from the dynamic knowledge graph to form a basic report data set, wherein the entity-associated data comprises equipment assets SB, historical fault cases, regulation files in the knowledge type LX and technical standards in the professional ZY; performing in-depth analysis on the entity-associated data using a graph neural network and, in combination with retrieval-augmented generation based on a generative large model, generating structured analysis content comprising a scene adaptation conclusion, traceability of the supporting basis, and optimization suggestions; and integrating the structured analysis content into the processing report according to the output specification of the power business scene type, wherein the processing report output for the equipment operation and maintenance scene comprises a fault diagnosis report and historical case matching results, and the processing report output for the policy compliance scene comprises a scheme and policy suitability report.
- 5. The method according to claim 4, wherein classifying and labeling the standardized structured knowledge data based on the multi-dimensional classification system to obtain the labeling data further comprises: adding dimension attribute labels to the core entities based on the labeling information of the knowledge type LX, the business YW, the professional ZY, the equipment asset SB and the organization JG in the labeling data; and adjusting the weights of the association relations between the core entities using a time-decay algorithm based on a weight adjustment rule, wherein the weight of the real-time data of the equipment asset SB increases dynamically over time, and the weight adjustment rule is associated with the process node requirements of the business YW.
- 6. A hierarchical architecture-based power knowledge processing apparatus, the apparatus comprising: an acquisition module for acquiring multi-source heterogeneous power knowledge data, wherein the multi-source heterogeneous power knowledge data comprises structured data, unstructured data and real-time stream data; a standardization module for performing standardized processing on the structured data, the unstructured data and the real-time stream data respectively; a labeling module for classifying and labeling the standardized structured knowledge data based on a multi-dimensional classification system to obtain labeling data, wherein the multi-dimensional classification system comprises a main dimension and an auxiliary dimension; a conversion module for converting the standardized unstructured text into triplet data; a construction module for constructing a dynamic knowledge graph based on the triplet data, the labeling data and the standardized real-time stream data, wherein the knowledge graph takes equipment, faults and regulations as core entities; and a generation module for generating a processing report based on the power business scene type and the dynamic knowledge graph; wherein the structured data comprises ledger information of the equipment asset dimension and process form data of the business dimension, the unstructured data comprises policy files of the knowledge type dimension and technical documents of the professional dimension, and the real-time stream data comprises equipment running state monitoring data; wherein the standardization module is further configured to: perform unified field mapping on the structured data based on a preset metadata standard, and perform compliance verification on the structured data by associating the corresponding professional technical standards; parse and extract the unstructured data by multi-modal document parsing, convert the unstructured data into a parsable format, and perform semantic cleaning; and perform noise filtering and characteristic parameter extraction on the real-time stream data at an edge node, and perform standardized encapsulation of the processed real-time stream data based on time sequence and equipment identifier; wherein the main dimension comprises a knowledge type LX, a business YW and a professional ZY, and the auxiliary dimension comprises an equipment asset SB and an organization JG; and wherein classifying and labeling the standardized structured knowledge data based on the multi-dimensional classification system to obtain the labeling data comprises: performing semantic analysis on the structured knowledge data using a pre-trained large model for the electric power field, and extracting business theme features, technical field features and core object features of the structured knowledge data; matching the knowledge type LX, the business YW and the professional ZY based on the business theme features, the technical field features and the core object features, and enforcing an association ordering in the sequence knowledge type LX, then business YW, then professional ZY; dynamically screening the equipment asset SB and the organization JG based on knowledge attributes, and sorting the equipment asset SB and the organization JG by relevance priority, wherein the knowledge attributes are determined by business scene features, the business theme features, the technical field features and the core object features; for the ordered main dimension, calling the hierarchical category knowledge graph under the main dimension, matching step by step down to the finest-granularity category through similarity calculation between feature words and category attributes, and labeling to obtain main dimension labeling data; for the ordered auxiliary dimension, calling the hierarchical category knowledge graph under the auxiliary dimension, matching step by step down to the finest-granularity category through similarity calculation between feature words and category attributes, and labeling to obtain auxiliary dimension labeling data; and forming the labeling data based on the main dimension labeling data and the auxiliary dimension labeling data.
- 7. A computer device, comprising: a memory and a processor communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the hierarchical architecture-based power knowledge processing method of any one of claims 1 to 5.
- 8. A computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the hierarchical architecture-based power knowledge processing method of any one of claims 1 to 5.
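The step-by-step category matching recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claims do not name a similarity measure, so Jaccard overlap between feature words and category attributes is assumed here, and `Category`, `match_finest`, `label_record`, the relevance scores and the toy category trees are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Forced ordering of the main dimension, as recited in claim 1
MAIN_ORDER = ("LX", "YW", "ZY")


@dataclass
class Category:
    name: str
    attributes: Set[str] = field(default_factory=set)
    children: List["Category"] = field(default_factory=list)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure: |a & b| / |a | b| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_finest(root: Category, features: Set[str]) -> List[str]:
    """Descend level by level, picking the most similar child until a leaf."""
    path, node = [root.name], root
    while node.children:
        node = max(node.children, key=lambda c: jaccard(features, c.attributes))
        path.append(node.name)
    return path


def label_record(features: Set[str], trees: Dict[str, Category],
                 aux_relevance: Dict[str, float]) -> Dict[str, Dict[str, List[str]]]:
    # Main dimension: always LX, then YW, then ZY
    main = {dim: match_finest(trees[dim], features) for dim in MAIN_ORDER}
    # Auxiliary dimension: screen out irrelevant entries, sort by relevance
    kept = [d for d, score in aux_relevance.items() if score > 0.0]
    kept.sort(key=lambda d: aux_relevance[d], reverse=True)
    aux = {dim: match_finest(trees[dim], features) for dim in kept}
    return {"main": main, "auxiliary": aux}


# Hypothetical toy category trees, one per dimension
TREES = {
    "LX": Category("LX", children=[
        Category("regulation", {"regulation", "standard"}, [
            Category("operation regulation", {"operation", "regulation"}),
            Category("safety regulation", {"safety", "regulation"}),
        ]),
        Category("case", {"fault", "case"}),
    ]),
    "YW": Category("YW", children=[
        Category("operation and maintenance", {"operation", "maintenance"}),
        Category("marketing", {"customer", "tariff"}),
    ]),
    "ZY": Category("ZY", children=[
        Category("substation", {"transformer", "substation"}),
        Category("transmission", {"line", "tower"}),
    ]),
    "SB": Category("SB", children=[
        Category("transformer", {"transformer"}),
        Category("circuit breaker", {"breaker"}),
    ]),
    "JG": Category("JG", children=[
        Category("maintenance department", {"maintenance"}),
        Category("dispatch center", {"dispatch"}),
    ]),
}

features = {"transformer", "operation", "maintenance", "regulation"}
result = label_record(features, TREES, {"SB": 0.9, "JG": 0.2})
```

With these toy inputs, the knowledge type LX is matched through "regulation" down to "operation regulation", and the auxiliary dimension is labeled in the order SB, then JG, because SB carries the higher assumed relevance score. A greedy descent is the simplest reading of "matching step by step"; a real system might instead keep several candidate branches per level.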
Description
Hierarchical architecture-based power knowledge processing method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of data processing, and in particular to a hierarchical architecture-based power knowledge processing method, device, medium and equipment.
Background
With the rapid development of the power industry, power systems keep growing in scale and complexity. From the various generating units at the generation end, through the lines and substations of the transmission network, to the diverse user demands on the consumption side, the whole power ecosystem involves massive amounts of data and knowledge.
In terms of structured data, power enterprises have accumulated a large amount of ledger information in the equipment asset dimension, such as detailed parameters, purchase dates and maintenance records of power generation equipment, transmission lines and substation equipment, stored in tabular and similar forms. Meanwhile, process form data in the business dimension covers the forms used in every link from power planning, construction, and operation and maintenance to marketing, such as engineering progress reports, equipment maintenance work orders and customer electricity application forms.
Unstructured data is equally important in the power domain. Policy files in the knowledge type dimension include power industry policies issued at national and local levels and documents on the constraints that environmental protection policies place on the power industry. Technical documents in the professional dimension, such as academic journal papers, internal enterprise technical reports and equipment operation manuals, contain a large amount of valuable industry knowledge.
As for real-time stream data, with the construction of the smart grid, the power system generates a large amount of equipment running state monitoring data, such as real-time parameters of generators (rotating speed, temperature, voltage and current) and state data of transmission lines (sag, icing and galloping). Real-time stream data is characterized by large volume, high velocity and strong timeliness.
Traditional knowledge classification often focuses on a single dimension or a few dimensions, for example classifying only by equipment type or by business process, and therefore struggles to fit the complex knowledge structure of the power industry. As a result, knowledge cannot be located accurately during storage and retrieval, which reduces the efficiency of knowledge utilization.
Conventional techniques also lack in-depth understanding and targeted analysis of power business scenes when generating processing reports. In the equipment operation and maintenance scene, they cannot combine equipment asset information, historical fault cases, professional technical standards and other knowledge to generate an accurate, actionable fault diagnosis report and effective historical case matching results. In the policy compliance scene, they cannot quickly and accurately analyze the suitability between schemes and policies to generate a high-quality scheme and policy suitability report.
In summary, the related technology has clear shortcomings in processing multi-source heterogeneous power knowledge data, classification labeling, knowledge graph construction and processing report generation, and cannot meet the power industry's requirements for efficient management and utilization of knowledge.
Disclosure of Invention
In view of the above, the invention provides a hierarchical architecture-based power knowledge processing method, device, medium and equipment to address these shortcomings of the related technology.
The invention provides a hierarchical architecture-based power knowledge processing method, comprising: acquiring multi-source heterogeneous power knowledge data; performing standardized processing on the structured data, the unstructured data and the real-time stream data respectively; classifying and labeling the standardized structured knowledge data based on a multi-dimensional classification system to obtain labeling data; converting the standardized unstructured text into triplet data; constructing a dynamic knowledge graph based on the triplet data, the labeling data and the standardized real-time stream data; and generat