CN-121998403-A - Alarm noise reduction method and system based on large model of power system

Abstract

The application provides an alarm noise reduction method and system based on a large model of a power system. The method comprises: acquiring multi-source heterogeneous alarm data in the power system and constructing a preprocessing data set; constructing a heterogeneous information network based on the preprocessing data set, generating node sequences containing alarm semantics by using the heterogeneous information network, and aggregating a plurality of node sequences meeting a similarity requirement into an alarm cluster; generating an alarm cluster characteristic data structure based on the alarm cluster; acquiring a knowledge graph and historical alarm data of the power system and constructing a power threat detection large model; and inputting the alarm cluster characteristic data structure into the power threat detection large model to generate an alarm judging report of the power system. This addresses the problem that high-risk events in existing power systems cannot be found in time when they are hidden among massive numbers of alarms.

Inventors

  • LIU JIAXI
  • CHEN HONGYOU
  • LU PENG
  • FU YUNXI
  • GUO GUANGLAI
  • GAO LIYUAN
  • LI ZHIQI
  • TIAN SHUANGPENG
  • YUAN TENG
  • FAN JINQIANG
  • GU ZHIQI

Assignees

  • 国网信息通信产业集团有限公司
  • 国网思极网安科技(北京)有限公司

Dates

Publication Date
2026-05-08
Application Date
2025-12-05

Claims (10)

  1. An alarm noise reduction method based on a large model of a power system, characterized by comprising the following steps: acquiring multi-source heterogeneous alarm data in the power system, and constructing a preprocessing data set; constructing a heterogeneous information network based on the preprocessing data set, generating a node sequence containing alarm semantics by using the heterogeneous information network, and aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster; generating an alarm cluster characteristic data structure based on the alarm cluster; acquiring a knowledge graph and historical alarm data of the power system, and constructing a power threat detection large model; and inputting the alarm cluster characteristic data structure into the power threat detection large model to generate an alarm judging report of the power system.
  2. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the constructing a preprocessing data set comprises: standardizing the multi-source heterogeneous alarm data to generate standard format data; extracting characteristic fields from the standard format data, and constructing a regular expression; and collecting the regular expression into the preprocessing data set.
  3. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the constructing a heterogeneous information network based on the preprocessing data set and generating a node sequence containing alarm semantics by using the heterogeneous information network comprises: acquiring a basic information network, wherein the basic information network comprises a plurality of nodes and a plurality of edges, the nodes correspond to multi-type information, and the edges correspond to the relations among the multi-type information; importing the data in the preprocessing data set into the nodes of the basic information network according to type to generate the heterogeneous information network; and generating the node sequence corresponding to the alarm semantics by using a random walk strategy.
  4. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster comprises: processing the plurality of node sequences to generate a plurality of vector representations corresponding to the plurality of node sequences; and calculating Euclidean distances among the plurality of vector representations by using a density-based clustering algorithm, and dividing the vector representations into alarm clusters and outliers.
  5. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the generating an alarm cluster characteristic data structure based on the alarm cluster comprises: based on the alarm cluster, generating abstract features structurally represented in a target format by using a clustering algorithm, and using the abstract features as the alarm cluster characteristic data structure.
  6. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the acquiring a knowledge graph and historical alarm data of the power system and constructing a power threat detection large model comprises: selecting a base large model; and freezing the pre-trained weights of the base large model, injecting the knowledge graph and the historical alarm data, and fine-tuning the base large model.
  7. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein the inputting the alarm cluster characteristic data structure into the power threat detection large model to generate an alarm judging report of the power system comprises: inputting the alarm cluster characteristic data structure into the power threat detection large model, and generating a judging-result analysis document in a specified format as the alarm judging report; and wherein, after the alarm judging report of the power system is generated, the method comprises: judging the correctness of the alarm judging report, and inputting the alarm judging report and the result of that judgment into the power threat detection large model as a training set.
  8. The alarm noise reduction method based on a large model of a power system according to claim 1, wherein after the aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster, the method comprises: spot-checking the risk levels of a specific number of node sequences in a single alarm cluster; and, in response to the risk levels of the spot-checked node sequences being the same, determining that risk level to be the risk level of the alarm cluster.
  9. The alarm noise reduction method based on a large model of a power system according to claim 8, wherein the spot-checking the risk levels of a specific number of node sequences in a single alarm cluster comprises: detecting the highest risk level of the alarm cluster; extracting node sequences from the alarm cluster, and calculating the probability of drawing a node sequence of the highest risk level in the extraction according to the following formula: [formula not reproduced in the source text]; wherein P_threshold is the probability, q is the spot-check number, U is the set of all node sequences in the alarm cluster, |U| is the total number of node sequences in the alarm cluster, H is the set of node sequences of the highest risk level in the alarm cluster, and |H| is the total number of node sequences of the highest risk level in the alarm cluster; and selecting the extraction numbers for which the probability meets the requirement, and taking the smallest of the selected extraction numbers as the specific number.
  10. An alarm noise reduction system based on a large model of a power system, the system comprising: a first acquisition module, used for acquiring multi-source heterogeneous alarm data in the power system and constructing a preprocessing data set; an aggregation module, used for constructing a heterogeneous information network based on the preprocessing data set, generating a node sequence containing alarm semantics by using the heterogeneous information network, and aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster; a first generation module, used for generating an alarm cluster characteristic data structure based on the alarm cluster; a first construction module, used for acquiring a knowledge graph and historical alarm data of the power system and constructing a power threat detection large model; and a second generation module, used for inputting the alarm cluster characteristic data structure into the power threat detection large model to generate an alarm judging report of the power system.
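
The spot check described in claims 8 and 9 can be illustrated with a short sketch. The formula of claim 9 is not reproduced in this text, so the code assumes one plausible form: node sequences are drawn without replacement, and the probability that q draws include at least one node sequence of the highest risk level is 1 - C(|U| - |H|, q) / C(|U|, q). That form, the function names, and the example numbers are assumptions made for illustration and are not taken from the application.

```python
from math import comb


def prob_hit_highest(total: int, highest: int, q: int) -> float:
    """Probability that q draws without replacement contain at least one
    highest-risk node sequence (assumed hypergeometric form, see lead-in).

    total   -- |U|, number of node sequences in the alarm cluster
    highest -- |H|, number of node sequences at the highest risk level
    q       -- spot-check number
    """
    if not 0 <= highest <= total:
        raise ValueError("highest must lie between 0 and total")
    if not 0 <= q <= total:
        raise ValueError("q must lie between 0 and total")
    # comb(n, k) returns 0 when k > n, so the probability becomes 1.0 once
    # q exceeds the number of non-highest-risk node sequences.
    return 1.0 - comb(total - highest, q) / comb(total, q)


def min_sample_size(total: int, highest: int, threshold: float):
    """Smallest q whose hit probability meets the threshold, or None."""
    for q in range(1, total + 1):
        if prob_hit_highest(total, highest, q) >= threshold:
            return q
    return None


def cluster_risk_level(sampled_levels):
    """Claim 8 rule: if every spot-checked node sequence has the same risk
    level, that level is assigned to the cluster; otherwise return None."""
    levels = set(sampled_levels)
    return levels.pop() if len(levels) == 1 else None


if __name__ == "__main__":
    # Hypothetical cluster: 10 node sequences, 2 at the highest risk level.
    print(min_sample_size(10, 2, 0.9))
    print(cluster_risk_level(["low", "low", "low"]))
```

Under this assumed form, a cluster dominated by low-risk node sequences needs a larger spot-check number before a rare high-risk member is likely to be drawn, which matches the intent of taking the smallest q that satisfies the probability requirement.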

Description

Alarm noise reduction method and system based on large model of power system

Technical Field

The application relates to the technical field of computers, in particular to an alarm noise reduction method and system based on a large model of a power system.

Background

With the rapid construction of new-type power systems, the large-scale integration of distributed energy sources such as wind power and photovoltaics, and the wide deployment of intelligent terminals and Internet of Things equipment, the network architecture of the power system is becoming increasingly complex. Meanwhile, the network attack surface of the power system is gradually expanding, so that the network security situation awareness platform must process massive alarm data every day from multiple sources such as firewalls, intrusion detection systems, and host protection software. These alarm data are not only voluminous but also markedly heterogeneous, short-text, and multi-modal.

Faced with massive alarm data, traditional analysis methods struggle to judge alarms efficiently and handle them quickly. On one hand, traditional alarm analysis techniques, which rely on rules designed by security experts and on simple feature matching, adapt poorly to novel and unknown attack patterns. High-risk events are therefore submerged in massive numbers of low-risk and invalid alarms, security analysts suffer from alarm fatigue, responses to alarms are delayed, and enterprise network security is threatened. On the other hand, although some existing machine learning or deep learning techniques can automatically extract part of the effective features from text data, alarm data in the network security domain are highly specialized and have complex association relations, so the prior art has difficulty producing an accurate and highly interpretable alarm report.
Disclosure of Invention

In view of the above, the application aims to provide an alarm noise reduction method and system based on a large model of a power system, which solve the problem that high-risk events in existing power systems cannot be found in time when they are hidden among massive numbers of alarms.

To achieve the above object, the present application provides an alarm noise reduction method based on a large model of a power system, the method comprising: acquiring multi-source heterogeneous alarm data in the power system, and constructing a preprocessing data set; constructing a heterogeneous information network based on the preprocessing data set, generating a node sequence containing alarm semantics by using the heterogeneous information network, and aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster; generating an alarm cluster characteristic data structure based on the alarm cluster; acquiring a knowledge graph and historical alarm data of the power system, and constructing a power threat detection large model; and inputting the alarm cluster characteristic data structure into the power threat detection large model to generate an alarm judging report of the power system.

As a further improvement of an embodiment of the present application, the constructing a preprocessing data set comprises: standardizing the multi-source heterogeneous alarm data to generate standard format data; extracting characteristic fields from the standard format data, and constructing a regular expression; and collecting the regular expression into the preprocessing data set.
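
The preprocessing step can be pictured with a minimal sketch. The application does not specify the alarm formats, the characteristic fields, or how the regular expressions are stored, so everything below is hypothetical: the two source names, the log line layouts, the field names, and the choice to keep one compiled pattern per source in a dictionary and use it to produce uniform records.

```python
import re

# Hypothetical building block: a loose IPv4 matcher (does not validate ranges).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# One regular expression per alarm source. The named groups play the role of
# the characteristic fields extracted from the standardized data.
PATTERNS = {
    "firewall": re.compile(
        rf"(?P<ts>\S+) DENY src=(?P<src_ip>{IPV4}) "
        rf"dst=(?P<dst_ip>{IPV4}) dport=(?P<dst_port>\d+)"
    ),
    "ids": re.compile(
        rf'\[(?P<ts>[^\]]+)\] sig="(?P<signature>[^"]+)" '
        rf"(?P<src_ip>{IPV4})->(?P<dst_ip>{IPV4})"
    ),
}


def normalize(source, raw):
    """Turn one raw alarm line into a flat record, or None if it cannot be
    parsed with the pattern registered for its source."""
    pattern = PATTERNS.get(source)
    match = pattern.search(raw) if pattern else None
    if match is None:
        return None
    return {"source": source, **match.groupdict()}


def build_dataset(alerts):
    """Collect the records of all parseable (source, raw_line) pairs."""
    records = []
    for source, raw in alerts:
        record = normalize(source, raw)
        if record is not None:
            records.append(record)
    return records


if __name__ == "__main__":
    sample = [
        ("firewall", "2025-12-05T10:00:00Z DENY src=10.0.0.5 dst=10.0.1.9 dport=445"),
        ("ids", '[2025-12-05T10:00:03Z] sig="SMB exploit attempt" 10.0.0.5->10.0.1.9'),
        ("unknown", "free text that matches no pattern"),
    ]
    for record in build_dataset(sample):
        print(record)
```

Records from different sources share field names such as src_ip and dst_ip, which is what would later allow them to be attached to typed nodes of an information network.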
As a further improvement of an embodiment of the present application, the constructing a heterogeneous information network based on the preprocessing data set and generating a node sequence containing alarm semantics by using the heterogeneous information network comprises: acquiring a basic information network, wherein the basic information network comprises a plurality of nodes and a plurality of edges, the nodes correspond to multi-type information, and the edges correspond to the relations among the multi-type information; importing the data in the preprocessing data set into the nodes of the basic information network according to type to generate the heterogeneous information network; and generating the node sequence corresponding to the alarm semantics by using a random walk strategy.

As a further improvement of an embodiment of the present application, the aggregating a plurality of node sequences meeting the similarity requirement into an alarm cluster comprises: processing the plurality of node sequences to generate a plurality of vector representations corresponding to the plurality of node sequences; and calculating Euclidean distances among the plurality of vector representations by using a density-based clustering algorithm, and dividing the vector representations into alarm clusters and outliers.

As a further improvement of an embodiment of the