
CN-119696991-B - Fault network analysis method and device, electronic equipment and computer storage medium

CN119696991B

Abstract

The application provides a fault network analysis method and device, electronic equipment and a computer storage medium. Alarm data is collected in real time. Each item of alarm data is preprocessed to obtain target alarm information, which comprises the time window to which the alarm data belongs, the application system to which the alarm data belongs, and the entity involved in the alarm data. For each application system, association analysis is performed on the alarm data according to the target alarm information, and the alarm data is combined to form an association scene, which comprises at least the triggering sequence among a plurality of alarm data. Finally, for each application system, fault analysis is performed according to the association scene and an important node link map to obtain an analysis result, wherein the important node link map is constructed according to key entities and an operation and maintenance knowledge map. The fault network can thereby be analyzed quickly and effectively.

Inventors

  • LIU YUNHAN
  • Li shining
  • SUN YONGJING
  • HAN JIUXUE
  • LIU BOSONG
  • LIU DONGYANG

Assignees

  • China Construction Bank Corporation (中国建设银行股份有限公司)

Dates

Publication Date
2026-05-12
Application Date
2024-12-23

Claims (6)

  1. A method for analyzing a fault network, comprising: collecting alarm data in real time; for each item of alarm data, preprocessing the alarm data to obtain target alarm information, wherein the target alarm information comprises a time window to which the alarm data belongs, an application system to which the alarm data belongs, and an entity involved in the alarm data; for each application system, performing association analysis on the alarm data according to the target alarm information and combining the alarm data to form an association scene, wherein the association scene comprises at least a triggering sequence among a plurality of alarm data; and for each application system, performing fault analysis according to the association scene and an important node link map to obtain an analysis result, wherein the important node link map is constructed according to key entities and an operation and maintenance knowledge map; wherein performing fault analysis according to the association scene and the important node link map to obtain the analysis result comprises: acquiring the intersection of the entities involved in the alarm data in the association scene and the important node link map to obtain target entities; and constructing a minimum fault subgraph according to the node relation between the target entities and the important node link map; wherein the important node link map comprises leaf nodes and non-leaf nodes, and constructing the minimum fault subgraph according to the node relation between the target entities and the important node link map comprises: if all the target entities are located at leaf nodes, constructing the minimum fault subgraph according to a first processing method; if all the target entities are located at non-leaf nodes, constructing the minimum fault subgraph according to a second processing method; and if some of the target entities are located at leaf nodes, constructing a first minimum fault subgraph from the target entities located at leaf nodes according to the first processing method, and constructing a second minimum fault subgraph from the target entities located at non-leaf nodes according to the second processing method, wherein the first minimum fault subgraph has higher priority than the second minimum fault subgraph in the root cause localization process; wherein constructing the minimum fault subgraph according to the first processing method if all the target entities are located at leaf nodes comprises: for each target entity, selecting the link between the target entity and other key elements in the application system as a first link; and constructing the minimum fault subgraph according to all the first links; and wherein constructing the minimum fault subgraph according to the second processing method if all the target entities are located at non-leaf nodes comprises: determining, for each target entity, whether an intersection exists between the target entity and each important node link, and the minimum distance between the target entity and a leaf node; taking, as a main link, the link of the target entity that has the largest intersection with the important node links and is closest to a leaf node; for each target entity to be determined, taking, as a branch link, the link that has the largest intersection between the main link and the important node link where that target entity is located and that has the greatest length, the target entities to be determined being the target entities other than those on the main link; and constructing the minimum fault subgraph according to the main link and all the branch links.
  2. The method for analyzing a fault network according to claim 1, wherein the alarm data includes alarm information and an alarm time, and preprocessing the alarm data for each item of alarm data to obtain target alarm information comprises: dividing the alarm data according to its occurrence time and a preset time window to obtain the time window to which the alarm data belongs; dividing the alarm data by application system according to the alarm information in the alarm data to obtain the application system to which the alarm data belongs; for alarm information related to the basic platform and the network, associating the alarm information with the corresponding application system through the related configuration relation in the configuration management system; determining the entity information to which the alarm information belongs through key information in the alarm information, wherein the entity information comprises at least an instance; and determining the entity involved in the alarm data according to the instance.
  3. The method for analyzing a fault network according to claim 1, wherein the alarm data includes an alarm time, an alarm type and an alarm source, and performing association analysis on the alarm data according to the target alarm information for each application system and combining the alarm data to form an association scene comprises: for each application system, determining the association relations among the alarm data according to the alarm time, the alarm type, the alarm source and an association rule network, wherein the association rule network is generated according to support and confidence; and, based on the association relations among the alarm data, combining the association relations among a plurality of alarm data caused by the same problem or event to obtain the association scene.
  4. A device for analyzing a fault network, comprising: an acquisition unit configured to collect alarm data in real time; a preprocessing unit configured to preprocess, for each item of alarm data, the alarm data to obtain target alarm information, wherein the target alarm information comprises a time window to which the alarm data belongs, an application system to which the alarm data belongs, and an entity involved in the alarm data; an association analysis unit configured to perform, for each application system, association analysis on the alarm data according to the target alarm information and to combine the alarm data to form an association scene, wherein the association scene comprises at least a triggering sequence among a plurality of alarm data; and a fault analysis unit configured to perform, for each application system, fault analysis according to the association scene and an important node link map to obtain an analysis result, wherein the important node link map is constructed according to key entities and an operation and maintenance knowledge map; wherein the fault analysis unit comprises an intersection unit, a construction unit and a fault analysis subunit; the intersection unit is configured to acquire the intersection of the entities involved in the alarm data in the association scene and the important node link map to obtain target entities; the construction unit is configured to construct a minimum fault subgraph according to the node relation between the target entities and the important node link map; the fault analysis subunit is configured to analyze according to the minimum fault subgraph to obtain the analysis result; if the important node link map comprises leaf nodes and non-leaf nodes, the construction unit comprises a first construction subunit, a second construction subunit and a third construction subunit; the first construction subunit is configured to construct the minimum fault subgraph according to a first processing method if all the target entities are located at leaf nodes; the second construction subunit is configured to construct the minimum fault subgraph according to a second processing method if all the target entities are located at non-leaf nodes; the third construction subunit is configured to, if some of the target entities are located at leaf nodes, construct a first minimum fault subgraph from the target entities located at leaf nodes according to the first processing method and construct a second minimum fault subgraph from the target entities located at non-leaf nodes according to the second processing method, wherein the first minimum fault subgraph has higher priority than the second minimum fault subgraph in the root cause localization process; if all the target entities are located at leaf nodes, the first construction subunit comprises a link determining unit and a first minimum fault subgraph generation unit; the link determining unit is configured to select, for each target entity, the link between the target entity and other key elements in the application system as a first link; the first minimum fault subgraph generation unit is configured to construct the minimum fault subgraph according to all the first links; if all the target entities are located at non-leaf nodes, the second construction subunit comprises a fourth determining unit, a fifth determining unit, a sixth determining unit, a seventh determining unit and a second minimum fault subgraph generation unit; the fourth determining unit is configured to determine, for each target entity, whether an intersection exists between the target entity and each important node link, and the minimum distance between the target entity and a leaf node; the fifth determining unit is configured to take, as a main link, the link of the target entity that has the largest intersection with the important node links and is closest to a leaf node; the sixth determining unit is configured to take, for each target entity to be determined, as a branch link, the link that has the largest intersection between the main link and the important node link where the target entity to be determined is located and that has the greatest length; the seventh determining unit is configured to take, if there is no intersection between the main link and the important node link where the target entity to be determined is located, the shortest path between the target entity to be determined and the main link as a branch link; and the second minimum fault subgraph generation unit is configured to construct the minimum fault subgraph according to the main link and all the branch links.
  5. An electronic device, comprising: one or more processors; and a storage device having one or more programs stored thereon; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for analyzing a fault network according to any one of claims 1 to 3.
  6. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method for analyzing a fault network according to any one of claims 1 to 3.
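The target-entity and main-link steps recited in claims 1 and 4 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the patented implementation: each important node link is modeled as an ordered list of entity names ending at a leaf node, and the function names `target_entities`, `split_by_position` and `choose_main_link` are hypothetical. The main-link rule is interpreted as "most target entities on the link, ties broken by the smallest distance from a target entity to the link's leaf"; the claims may intend a different measure.

```python
from typing import List, Set, Tuple

# Assumption: an important node link is an ordered path of entity names
# whose last element is a leaf node.
Link = List[str]


def target_entities(scene_entities: Set[str], links: List[Link]) -> Set[str]:
    """Intersect the entities of an association scene with the link map."""
    map_nodes = {node for link in links for node in link}
    return scene_entities & map_nodes


def split_by_position(targets: Set[str], links: List[Link]) -> Tuple[Set[str], Set[str]]:
    """Split targets into (leaf, non_leaf).

    A node counts as non-leaf if it has a successor on at least one link.
    """
    non_leaf_nodes = {node for link in links for node in link[:-1]}
    leaf = {t for t in targets if t not in non_leaf_nodes}
    return leaf, targets - leaf


def choose_main_link(targets: Set[str], links: List[Link]) -> Link:
    """Pick the link with the most targets; break ties by closeness to the leaf."""

    def score(link: Link) -> Tuple[int, float]:
        hits = [i for i, node in enumerate(link) if node in targets]
        if not hits:
            return (0, float("-inf"))  # links without targets rank last
        distance_to_leaf = len(link) - 1 - max(hits)
        return (len(hits), -distance_to_leaf)

    return max(links, key=score)
```

With the hypothetical links `gateway -> order-service -> order-db`, `gateway -> pay-service -> pay-db` and `gateway -> order-service -> cache`, a scene involving `order-service`, `order-db` and an unmapped `batch-job` yields two target entities, one at a leaf and one at a non-leaf node, so the mixed case of claim 1 applies and the leaf-based subgraph would take priority.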

Description

Fault network analysis method and device, electronic equipment and computer storage medium

Technical Field

The present application relates to the field of anomaly analysis technologies, and in particular to a method and device for analyzing a fault network, an electronic device, and a computer storage medium.

Background

At present, as banking systems grow more complex in architecture and larger in business scale, the traditional alarm management mode struggles to cope with the volume, diversity and dynamic nature of alarm information, so fault localization is slow and processing is inefficient.

Disclosure of Invention

In view of the above, the present application provides a method and device for analyzing a fault network, an electronic device and a computer storage medium, which can analyze the fault network quickly and effectively.

The first aspect of the present application provides a method for analyzing a fault network, including:
collecting alarm data in real time;
for each item of alarm data, preprocessing the alarm data to obtain target alarm information, wherein the target alarm information comprises a time window to which the alarm data belongs, an application system to which the alarm data belongs, and an entity involved in the alarm data;
for each application system, performing association analysis on the alarm data according to the target alarm information and combining the alarm data to form an association scene, wherein the association scene comprises at least a triggering sequence among a plurality of alarm data;
and for each application system, performing fault analysis according to the association scene and an important node link map to obtain an analysis result, wherein the important node link map is constructed according to key entities and an operation and maintenance knowledge map.
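As an illustration of the target alarm information described above, the following Python sketch maps one raw alarm to its time window, application system and entity. It is a sketch under stated assumptions: the alarm is a dictionary with hypothetical `time` (seconds) and `instance` fields, the two lookup tables stand in for the configuration management system, and the 300-second window is an arbitrary choice. None of these names or values come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TargetAlarmInfo:
    """Hypothetical container for the three fields of target alarm information."""
    window: int                 # index of the time window the alarm falls into
    application: Optional[str]  # application system the alarm belongs to
    entity: Optional[str]       # entity involved in the alarm


def preprocess(
    alarm: Dict[str, object],
    instance_to_app: Dict[str, str],
    instance_to_entity: Dict[str, str],
    window_seconds: int = 300,
) -> TargetAlarmInfo:
    """Derive target alarm information from one raw alarm.

    The lookup tables are assumed to be exported from a configuration
    management system; unknown instances map to None.
    """
    window = int(alarm["time"]) // window_seconds
    instance = alarm.get("instance")
    return TargetAlarmInfo(
        window=window,
        application=instance_to_app.get(instance),
        entity=instance_to_entity.get(instance),
    )
```

Bucketing by integer division keeps the window assignment deterministic, so alarms raised within the same preset window share the same index and can be grouped before association analysis.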
Optionally, the alarm data includes alarm information and an alarm time, and preprocessing the alarm data for each item of alarm data to obtain target alarm information includes:
dividing the alarm data according to its occurrence time and a preset time window to obtain the time window to which the alarm data belongs;
dividing the alarm data by application system according to the alarm information in the alarm data to obtain the application system to which the alarm data belongs;
for alarm information related to the basic platform and the network, associating the alarm information with the corresponding application system through the related configuration relation in the configuration management system;
determining the entity information to which the alarm information belongs through key information in the alarm information, wherein the entity information comprises at least an instance;
and determining the entity involved in the alarm data according to the instance.

Optionally, the alarm data includes an alarm time, an alarm type and an alarm source, and performing association analysis on the alarm data according to the target alarm information for each application system and combining the alarm data to form an association scene includes:
for each application system, determining the association relations among the alarm data according to the alarm time, the alarm type, the alarm source and an association rule network, wherein the association rule network is generated according to support and confidence;
and, based on the association relations among the alarm data, combining the association relations among a plurality of alarm data caused by the same problem or event to obtain the association scene.
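The text states only that the association rule network is generated from support and confidence. A minimal Python sketch of how pairwise rules could be mined from windowed alarms is given below. The standard definitions are assumed: the support of a pair is the fraction of time windows containing both alarm types, and the confidence of a rule A -> B is the number of windows containing both divided by the number containing A. The function name `association_rules`, the thresholds and the window length are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple


def association_rules(
    alarms: Iterable[Tuple[float, str]],
    window_seconds: int = 300,
    min_support: float = 0.5,
    min_confidence: float = 0.8,
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return {(a, b): (support, confidence)} for rules a -> b.

    `alarms` is an iterable of (timestamp_seconds, alarm_type) pairs.
    """
    # Group alarm types by the time window they occur in.
    windows = defaultdict(set)
    for timestamp, alarm_type in alarms:
        windows[int(timestamp // window_seconds)].add(alarm_type)

    total = len(windows)
    single_count = defaultdict(int)
    pair_count = defaultdict(int)
    for types in windows.values():
        for alarm_type in types:
            single_count[alarm_type] += 1
        for a, b in combinations(sorted(types), 2):
            pair_count[(a, b)] += 1

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / total
        if support < min_support:
            continue
        # Evaluate the rule in both directions.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / single_count[antecedent]
            if confidence >= min_confidence:
                rules[(antecedent, consequent)] = (support, confidence)
    return rules
```

Rules that pass both thresholds can serve as edges of a rule network; alarms connected through such edges within one application system and time window are candidates for merging into one association scene.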
Optionally, for each application system, performing fault analysis according to the association scene and the important node link map to obtain an analysis result includes:
acquiring the intersection of the entities involved in the alarm data in the association scene and the important node link map to obtain target entities;
constructing a minimum fault subgraph according to the node relation between the target entities and the important node link map;
and analyzing according to the minimum fault subgraph to obtain the analysis result.

Optionally, the important node link map includes leaf nodes and non-leaf nodes, and constructing a minimum fault subgraph according to the node relation between the target entities and the important node link map includes:
if all the target entities are located at leaf nodes, constructing a minimum fault subgraph according to a first processing method;
if all the target entities are located at non-leaf nodes, constructing a minimum fault subgraph according to a second processing method;
if some target entities are located at leaf nodes, constructing the target entities located at the leaf nodes according to a first processing method to obtain a first minimum fault subgraph, and constructing the target entities located at non-leaf nodes according to a second process