
CN-121998105-A - Fault root cause positioning method, device, equipment and readable storage medium


Abstract

The application discloses a fault root cause positioning method, device, equipment and readable storage medium. The method comprises: determining multi-source abnormal features; obtaining a domain knowledge graph and retrieving an analysis basis from it; using a preliminary screening layer, in combination with the analysis basis, to trace the root cause and generate a root cause reasoning process and a reasoning complexity score; when the reasoning complexity score is lower than a complexity threshold, using a conclusion generation layer to generate a root cause analysis report; and when the reasoning complexity score is not lower than the complexity threshold, invoking a deep reasoning layer to perform deep causal reasoning and generate a line of reasoning and a root cause propagation path, then using the conclusion generation layer to generate the root cause analysis report based on the line of reasoning and the root cause propagation path. Through the hierarchical reasoning architecture and the domain knowledge graph, the application improves reasoning efficiency and root cause locating speed while clearly presenting the line of reasoning for faults of differing difficulty, which improves the credibility and verifiability of the resulting root cause conclusions.

Inventors

  • WANG CHIMING
  • LIN ZIHAO
  • Hong Qiaona
  • Zhuo Chuhan
  • ZHENG DONGKE
  • ZHOU YIYING
  • BAO ZHILIANG
  • CHEN CAISEN
  • WU BINGKUN
  • LI ZHENJUN
  • ZHANG JING
  • LI YANAN
  • LIU LING
  • RAN FEIPENG
  • YAO FENG

Assignees

  • Xiamen University of Technology (厦门理工学院)

Dates

Publication Date
20260508
Application Date
20260410

Claims (10)

  1. A method for locating a root cause of a fault, comprising: determining multi-source abnormal features; acquiring a domain knowledge graph and a hierarchical reasoning architecture comprising a preliminary screening layer, a deep reasoning layer and a conclusion generation layer; retrieving, from the domain knowledge graph, an analysis basis matched with the multi-source abnormal features; using the preliminary screening layer, in combination with the analysis basis, to trace the root cause of the multi-source abnormal features and generate a root cause reasoning process and a reasoning complexity score; when the reasoning complexity score is lower than a preset complexity threshold, using the conclusion generation layer to combine the root cause reasoning process and the analysis basis and generate a root cause analysis report containing a visualized reasoning logic chain; and when the reasoning complexity score is not lower than the complexity threshold, invoking the deep reasoning layer to perform deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis, generating a line of reasoning and a root cause propagation path, and using the conclusion generation layer to generate, based on the line of reasoning and the root cause propagation path, a root cause analysis report containing a visualized reasoning logic chain.
  2. The fault root cause locating method according to claim 1, wherein obtaining the domain knowledge graph comprises: aligning historical log data, time-series metric data, historical fault cases, structured standard data, transmission link data and extended rare fault data by combining semantic matching, node mapping and timestamp matching, and determining, for different domain nodes, the operating characteristics in the normal operating state and the fault characteristics and handling methods under different fault types; for each domain node, generating a plurality of entities, a plurality of association relations and a plurality of attributes matched with the domain node, based on the operating characteristics of the domain node and its fault characteristics and handling methods under different fault types; and constructing the domain knowledge graph based on each analysis triplet.
  3. The fault root cause locating method according to claim 2, wherein constructing the domain knowledge graph based on each analysis triplet comprises: converting each entity and each association relation corresponding to that entity into a vector representation, merging the corresponding knowledge association information, and generating structured triplets; determining a weight coefficient for each source based on the degree to which the different sources contribute to fault root cause locating; and determining the weight coefficients corresponding to each structured triplet and each analysis triplet, and constructing the domain knowledge graph.
  4. The fault root cause locating method according to claim 1, wherein the domain knowledge graph comprises a plurality of triplets derived from multi-source heterogeneous data, each triplet comprising a corresponding entity, a corresponding relation and corresponding attribute data; and wherein retrieving, from the domain knowledge graph, the analysis basis matched with the multi-source abnormal features comprises: performing task analysis on the multi-source abnormal features to generate a fault analysis prompt containing the fault analysis task requirements and causal reasoning rules; retrieving, from the domain knowledge graph, a plurality of target triplets matched with the multi-source abnormal features; and cleaning and fusing the target triplets and the fault analysis prompt to generate the analysis basis.
  5. The fault root cause locating method according to claim 1, wherein invoking the deep reasoning layer to perform deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis, and generating a line of reasoning and a root cause propagation path, comprises: predicting, based on the reasoning complexity score and the currently remaining computing resources, the reasoning latency of the multi-source abnormal features under different numbers of reasoning layers; determining a target number of reasoning layers for the multi-source abnormal features by combining the response speed requirement of the multi-source abnormal features with each predicted reasoning latency; and invoking, within the deep reasoning layer, a number of reasoning layers matching the target number of reasoning layers, and performing deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis to generate the line of reasoning and the root cause propagation path.
  6. The fault root cause locating method according to claim 5, wherein the analysis basis comprises a plurality of triplet bodies, each triplet body having a weight coefficient that represents the degree to which the triplet body contributes to fault root cause locating; and wherein invoking, within the deep reasoning layer, a number of reasoning layers matching the target number of reasoning layers, and performing deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis to generate the line of reasoning and the root cause propagation path, comprises: invoking, within the deep reasoning layer, a number of reasoning layers matching the target number of reasoning layers, and analyzing each triplet body together with the multi-source abnormal features in descending order of weight coefficient to generate an abnormal main body containing abnormal nodes and/or abnormal types; based on the multi-source abnormal features, combining the abnormal main body, the analysis basis and the domain knowledge graph, and performing causal reasoning to generate a root cause distribution probability and a propagation link; verifying the root cause distribution probability based on the root cause reasoning process; and, after the verification passes, integrating the analysis basis, the abnormal main body, the root cause distribution probability and the propagation link to form the line of reasoning and the root cause propagation path.
  7. The fault root cause locating method according to claim 1, wherein using the conclusion generation layer to generate, based on the line of reasoning and the root cause propagation path, a root cause analysis report containing a visualized reasoning logic chain comprises: converting the root cause propagation path and the line of reasoning into a visual graph; determining a handling scheme and a specification basis based on the root cause propagation path and the domain knowledge graph; and processing the visual graph, the handling scheme and the specification basis to generate a visualized root cause analysis report.
  8. A fault root cause locating device, comprising: a determining module, configured to determine multi-source abnormal features; an acquisition module, configured to acquire a domain knowledge graph and a hierarchical reasoning architecture comprising a preliminary screening layer, a deep reasoning layer and a conclusion generation layer; a retrieval module, configured to retrieve, from the domain knowledge graph, an analysis basis matched with the multi-source abnormal features; a generation module, configured to use the preliminary screening layer, in combination with the analysis basis, to trace the root cause of the multi-source abnormal features and generate a root cause reasoning process and a reasoning complexity score; a combination module, configured to use the conclusion generation layer to combine the root cause reasoning process and the analysis basis and generate a root cause analysis report containing a visualized reasoning logic chain; and a reasoning module, configured to invoke the deep reasoning layer to perform deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis, generate a line of reasoning and a root cause propagation path, and use the conclusion generation layer to generate, based on the line of reasoning and the root cause propagation path, a root cause analysis report containing a visualized reasoning logic chain.
  9. A fault root cause locating device, comprising a memory and a processor; the memory is configured to store a program; and the processor is configured to execute the program to implement the steps of the fault root cause locating method according to any one of claims 1 to 7.
  10. A readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the fault root cause locating method according to any one of claims 1 to 7.
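The layer-count selection described in claim 5 can be illustrated with a small sketch. The claims do not disclose a latency model or a selection rule, so everything below is an assumption made for illustration: the function names `predict_latency` and `choose_layer_count`, the linear latency formula, and the rule of picking the largest layer count whose predicted latency still meets the response deadline.

```python
def predict_latency(layers: int, complexity_score: float, free_compute: float) -> float:
    """Hypothetical latency model (an assumption, not from the application):
    latency grows with the number of layers and the complexity score,
    and shrinks as more computing resources are available."""
    if free_compute <= 0:
        raise ValueError("free_compute must be positive")
    return layers * (1.0 + complexity_score) / free_compute


def choose_layer_count(
    complexity_score: float,
    free_compute: float,
    deadline_s: float,
    max_layers: int,
) -> int:
    """Pick the target number of reasoning layers.

    Assumed rule: use the largest layer count whose predicted latency
    fits within the response deadline; fall back to one layer if none fits.
    """
    target = 1
    for layers in range(1, max_layers + 1):
        if predict_latency(layers, complexity_score, free_compute) <= deadline_s:
            target = layers
    return target
```

For example, with a complexity score of 1.0 and 2.0 units of free compute, each layer is predicted to cost one second under this toy model, so a 3.5-second deadline yields three layers.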

Description

Fault root cause positioning method, device, equipment and readable storage medium

Technical Field

The present application relates to the field of fault handling technologies, and in particular to a fault root cause positioning method, device, equipment and readable storage medium.

Background

With the rapid development of industrial digitization and cloud-native IT architectures, industrial equipment is becoming larger, more complex and more intelligent, and IT systems are evolving toward microservices, distributed deployment and cloud-edge-device collaboration. Under this trend, operation monitoring and fault root cause analysis of equipment and systems have become a core link in guaranteeing production continuity and service stability.

Currently, fault root cause analysis mainly combines manual investigation with large model analysis. However, a large model reasons as a black box, so the analysis of a fault's root cause lacks traceability. To avoid delaying fault handling because of a locating error, operation and maintenance personnel must verify that the large model's reasoning logic is sound and adopt its conclusion only after the verification passes. This additional verification step significantly reduces overall fault locating efficiency.

Therefore, how to generate a traceable fault root cause analysis conclusion has become a technical problem that those skilled in the art urgently need to solve.

Disclosure of Invention

In view of the above, the present application provides a fault root cause positioning method, device, equipment and readable storage medium, which address the lack of traceability in existing fault root cause analysis techniques.
To achieve the above object, the following solution is proposed: a fault root cause locating method, comprising: determining multi-source abnormal features; acquiring a domain knowledge graph and a hierarchical reasoning architecture comprising a preliminary screening layer, a deep reasoning layer and a conclusion generation layer; retrieving, from the domain knowledge graph, an analysis basis matched with the multi-source abnormal features; using the preliminary screening layer, in combination with the analysis basis, to trace the root cause of the multi-source abnormal features and generate a root cause reasoning process and a reasoning complexity score; when the reasoning complexity score is lower than a preset complexity threshold, using the conclusion generation layer to combine the root cause reasoning process and the analysis basis and generate a root cause analysis report containing a visualized reasoning logic chain; and when the reasoning complexity score is not lower than the complexity threshold, invoking the deep reasoning layer to perform deep causal reasoning on the multi-source abnormal features in combination with the root cause reasoning process and the analysis basis, generating a line of reasoning and a root cause propagation path, and using the conclusion generation layer to generate, based on the line of reasoning and the root cause propagation path, a root cause analysis report containing a visualized reasoning logic chain.
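The threshold-based routing between the three layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the names `ScreeningResult` and `locate_root_cause`, the callable signatures, and the three `toy_*` stand-ins for the layers are all hypothetical, and the toy complexity score is invented purely so the example runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class ScreeningResult:
    reasoning_process: List[str]  # steps produced by the preliminary screening layer
    complexity_score: float       # reasoning complexity score


def locate_root_cause(
    features: List[str],
    evidence: List[str],
    screen: Callable[[List[str], List[str]], ScreeningResult],
    deep_reason: Callable[[List[str], List[str], List[str]], Tuple[List[str], List[str]]],
    conclude: Callable[[List[str], Optional[List[str]], List[str]], Dict[str, Any]],
    threshold: float,
) -> Dict[str, Any]:
    """Route a fault through the hierarchy based on its complexity score."""
    screening = screen(features, evidence)
    if screening.complexity_score < threshold:
        # Simple fault: report directly from the screening layer's reasoning.
        return conclude(screening.reasoning_process, None, evidence)
    # Complex fault (score not lower than the threshold): deep causal reasoning first.
    reasoning, path = deep_reason(features, screening.reasoning_process, evidence)
    return conclude(reasoning, path, evidence)


# Hypothetical stand-ins for the three layers, for illustration only.
def toy_screen(features: List[str], evidence: List[str]) -> ScreeningResult:
    score = min(1.0, len(features) / 10)  # invented scoring rule
    return ScreeningResult([f"matched {len(evidence)} evidence items"], score)


def toy_deep_reason(features, process, evidence):
    return process + ["deep causal analysis"], ["db", "api", "gateway"]


def toy_conclude(reasoning, path, evidence):
    return {"reasoning": reasoning, "propagation_path": path, "evidence": evidence}
```

Passing the layers in as callables keeps the routing rule separate from how each layer is realized. Note that a score exactly equal to the threshold takes the deep reasoning branch, matching the "not lower than" wording.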
Optionally, obtaining the domain knowledge graph comprises: aligning historical log data, time-series metric data, historical fault cases, structured standard data, transmission link data and extended rare fault data by combining semantic matching, node mapping and timestamp matching, and determining, for different domain nodes, the operating characteristics in the normal operating state and the fault characteristics and handling methods under different fault types; for each domain node, generating a plurality of entities, a plurality of association relations and a plurality of attributes matched with the domain node, based on the operating characteristics of the domain node and its fault characteristics and handling methods under different fault types; and constructing the domain knowledge graph based on each analysis triplet.

Optionally, constructing the domain knowledge graph based on each analysis triplet comprises: converting each entity and each association relation corresponding to that entity into a vector representation, merging the corresponding knowledge association information, and generating structured triplets; determining a weight coefficient for each source based on the degree to which the different sources contribute to fault root cause locating; and determining the weight coefficients corresponding to each structured triplet and each analysis triplet, and constructing the domain knowledge graph.

Optionally, the domain knowledge graph comprises a plurality of triplets derived from multi-source heterogeneous data, and each triplet comprises a corresponding entity, a corresponding relation and corresponding attribute