CN-122027184-A - Unsupervised APT detection method based on temporal graph attention network and deviation network


Abstract

The application provides an unsupervised APT detection method based on a temporal graph attention network and a deviation network. The method comprises: acquiring system audit logs in fixed-width time windows by using a sliding time window; extracting the entities involved in each event and the interaction relations among those entities, and building a dynamic provenance graph by combining the occurrence time of each event; modeling, with the temporal graph attention network, the interaction behavior between each node and its adjacent nodes in the time dimension to extract a time-dependent feature representation of the node; predicting the type of the edge between each pair of adjacent nodes with a multi-layer perceptron to obtain a prediction error; evaluating, with the deviation network, the degree of deviation between the predicted edge type and the corresponding reference standard; and identifying the time windows under APT attack and the corresponding attack chain based on the prediction error and the degree of deviation. The method significantly outperforms traditional detection methods in detection precision and robustness, and provides an efficient and scalable solution for APT detection.

Inventors

DAI HONG
SUN JIWANG
SUN BAIPING
WEI XI

Assignees

University of Science and Technology Liaoning (辽宁科技大学)

Dates

Publication Date: 2026-05-12
Application Date: 2025-10-10

Claims (9)

1. An unsupervised APT detection method based on a temporal graph attention network and a deviation network, the method comprising: continuously acquiring system audit logs in fixed-width time windows by using a sliding time window; extracting, from the system audit log of the current time window, the entities involved in each event and the interaction relations among those entities, and constructing a dynamic provenance graph by combining the occurrence time of each event; for each node in the dynamic provenance graph, modeling the interaction behavior between the node and each of its adjacent nodes in the time dimension by using a multi-layer stacked temporal graph attention network, so as to extract a time-dependent feature representation of the node; predicting, based on the time-dependent feature representations of the node and of any node adjacent to it, the type of the edge between the pair of adjacent nodes by using a multi-layer perceptron, and obtaining a prediction error; evaluating, by using a deviation network, the degree of deviation between the predicted type of the edge between the pair of adjacent nodes and a corresponding reference standard; and identifying the time windows under APT attack and the corresponding attack chain based on the prediction error and the degree of deviation of the edges between the pairs of adjacent nodes.
2. The method of claim 1, wherein the dynamic provenance graph is constructed by: taking the entities involved in each event as nodes of the dynamic provenance graph, and taking the interaction relations among the entities as edges of the dynamic provenance graph; extracting a node ID, a node type and an event timestamp; applying one-hot encoding to the node types, and embedding the edge types into fixed-dimension vectors by using a hash function, to obtain their respective feature representations, wherein the node types comprise a Subject node associated with a command line, a File node corresponding to a file path, and a Netflow node associated with an IP address and a port number; and, according to the event timestamps, using the feature representations of the nodes and edges to fully describe the course of each event, to obtain the dynamic provenance graph.
3. The method of claim 1, wherein each temporal graph attention network layer extracts the time-dependent feature representation of the node by: obtaining an aggregated feature of the nodes adjacent to the node based on the importance weight of each adjacent node, the features of the edge between the node and each adjacent node, and the high-dimensional vector of each node that interacts with the node; and obtaining the feature representation of the node at the current temporal graph attention network layer based on the feature representation of the node at the previous temporal graph attention network layer and the aggregated feature of the adjacent nodes.
4. The method of claim 3, wherein the importance weight of each node adjacent to the node is obtained by: encoding the time difference between interactions of the nodes by using sine and cosine functions, so as to map the time difference into a high-dimensional vector; and adaptively assigning, by using an attention mechanism, the importance weight of each adjacent node based on the features of each adjacent node and the high-dimensional vector of each node that interacts with the node.
5. The method of claim 1, wherein predicting the type of the edge between the pair of adjacent nodes by using a multi-layer perceptron comprises: concatenating the time-dependent feature representations of the node and of any node adjacent to it, and then obtaining a probability distribution over the types of the edge between the pair of adjacent nodes by using a nonlinear mapping function.
6. The method of claim 1, wherein the prediction error is obtained by: acquiring the features of the edge between the pair of adjacent nodes from the dynamic provenance graph; and calculating, by using a cross-entropy loss function, the error between the predicted type of the edge between the pair of adjacent nodes and the acquired features of that edge, to obtain the prediction error.
7. The method of claim 1, wherein the reference standard is an expected representation of the type of the edge between the pair of adjacent nodes, obtained by the deviation network learning the feature distribution of historical normal events in a self-supervised training manner.
8. The method of claim 1, wherein the time windows under APT attack and the corresponding attack chain are identified by: for each time window, computing an anomaly score for the time window according to the rarity of each node in the time window and the prediction error and degree of deviation of the edges between the pairs of adjacent nodes; determining the time window with the highest anomaly score as the core time window under APT attack; expanding the core time window outward based on a Jaccard similarity calculation; and merging the time segments in the expanded contiguous time windows whose nodes are highly correlated, to obtain a continuous trajectory of the APT attack in the time dimension.
9. The method of claim 8, wherein expanding the core time window outward based on the Jaccard similarity calculation comprises: step one, taking the core time window as a target window, and acquiring a time window adjacent to the target window; step two, obtaining, by filtering, a low-frequency node set from each of the target window and the adjacent time window; step three, calculating the ratio of the intersection to the union of the two low-frequency node sets to obtain the similarity between the target window and the adjacent time window; and step four, if the similarity is greater than a set threshold, taking that adjacent time window as the new target window and repeating from step one.
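
As an illustration of the window expansion described in claims 8 and 9, the following is a minimal Python sketch, not the applicant's implementation. The function names (`low_frequency_nodes`, `jaccard`, `expand_core_window`), the rule that a node is "low-frequency" when it appears in at most `max_count` windows, and the default threshold of 0.3 are all assumptions made for this example; the claims do not specify them.

```python
from collections import Counter


def low_frequency_nodes(window_nodes, window_counts, max_count):
    """Keep the nodes that appear in at most max_count windows (assumed rule)."""
    return {n for n in window_nodes if window_counts[n] <= max_count}


def jaccard(a, b):
    """Intersection size divided by union size; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def expand_core_window(windows, core, threshold=0.3, max_count=2):
    """Grow a contiguous index range [lo, hi] outward from the core window.

    windows: list of node-name collections, one per time window.
    core: index of the window with the highest anomaly score.
    """
    # Count in how many windows each node occurs.
    window_counts = Counter(n for w in windows for n in set(w))
    rare = [low_frequency_nodes(set(w), window_counts, max_count) for w in windows]

    lo = hi = core
    # Each accepted neighbor becomes the new target window (step four).
    while lo > 0 and jaccard(rare[lo], rare[lo - 1]) > threshold:
        lo -= 1
    while hi < len(windows) - 1 and jaccard(rare[hi], rare[hi + 1]) > threshold:
        hi += 1
    return lo, hi
```

With five hypothetical windows in which `evil.sh` and one remote endpoint appear only around the core window, calling `expand_core_window(windows, core=2, max_count=3)` grows the range until the low-frequency node sets of neighboring windows stop overlapping.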
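
The sine and cosine time encoding mentioned in claim 4 can be sketched as follows. This is an assumed form only: the function name `encode_time_delta`, the output dimension, and the fixed geometric frequency schedule controlled by `base` are hypothetical choices for the example, since this section does not state how the frequencies are chosen or whether they are learned.

```python
import math


def encode_time_delta(delta_t, dim=8, base=10000.0):
    """Map a time difference to a dim-dimensional vector of sine/cosine pairs."""
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    vec = []
    for i in range(dim // 2):
        # Assumed fixed frequency schedule: high frequencies first, then lower.
        freq = 1.0 / (base ** (2 * i / dim))
        vec.append(math.sin(delta_t * freq))
        vec.append(math.cos(delta_t * freq))
    return vec
```

Because every component is a sine or cosine, the encoding stays bounded in [-1, 1] however large the time difference is, which makes it convenient to feed into an attention layer alongside node and edge features.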

Description

Unsupervised APT detection method based on temporal graph attention network and deviation network

Technical Field

The application belongs to the technical field of advanced persistent threat detection, and particularly relates to an unsupervised APT detection method based on a temporal graph attention network and a deviation network.

Background

An advanced persistent threat (APT) is a long-term, latent and deliberate form of attack that is highly concealed and highly targeted. Intrusion and control are usually achieved step by step through multi-stage infiltration, which seriously threatens the security of governments, enterprises, scientific research institutions and critical information systems. As APT attacks become increasingly complex and covert, traditional signature-based or rule-based intrusion detection systems struggle to identify their attack paths and behavioral characteristics.

Recently, graph neural networks (GNN) have been widely used for APT attack detection. In the static provenance graph setting, early research mostly relied on graph neural networks to extract node and neighborhood structure. For example, ThreaTrace uses GraphSAGE to detect node anomalies and identifies suspicious behavior by learning from neighbor nodes, but it does not consider time. The R-CAID model (Embedding Root Cause Analysis within Provenance-based Intrusion Detection) further introduces root-cause node embeddings and strengthens detection by combining a graph attention network (GAT), but it has a relatively high false alarm rate. Flash improves model performance through a graph autoencoder and semantic analysis. These methods, however, generally ignore the dynamic evolution of APT attacks, so low-frequency, latent attacks are difficult to identify.
To date, most graph-based research has targeted static graphs, and dynamic graphs have been explored relatively little. Unicorn monitors deviations of real-time behavior from historical patterns through streaming-graph snapshots to identify anomalies, but it is not sensitive enough to minor disturbances hidden in normal system logs. TBDetector models events with a Transformer and evaluates anomalies with similarity and isolation scores, but it underuses time attributes and lacks interpretability. NodLink is an online system that achieves real-time detection and investigation of APT attacks through a fine-grained analysis association network, but it may face performance bottlenecks when processing extremely large-scale data, which affects its real-time response capability. Kairos combines a temporal graph network (TGN) with a memory module to track node state and detects anomalies through edge reconstruction, but its false alarm rate is relatively high in APT scenarios.

In summary, existing methods generally ignore the temporal dynamics of the graph structure when constructing the provenance graph, which makes real attack behavior hard to distinguish effectively, and they rely excessively on surface-level correlations in the data, which limits detection accuracy.

Disclosure of Invention

In view of the above, the application aims to provide an unsupervised APT detection method based on a temporal graph attention network and a deviation network, which performs well in detection precision and robustness, significantly outperforms conventional detection methods, and provides an efficient and scalable solution for APT detection.
The application provides an unsupervised APT detection method based on a temporal graph attention network and a deviation network, which comprises the following steps: continuously acquiring system audit logs in fixed-width time windows by using a sliding time window; extracting, from the system audit log of the current time window, the entities involved in each event and the interaction relations among those entities, and constructing a dynamic provenance graph by combining the occurrence time of each event; for each node in the dynamic provenance graph, modeling the interaction behavior between the node and each of its adjacent nodes in the time dimension by using a multi-layer stacked temporal graph attention network, so as to extract a time-dependent feature representation of the node; predicting, based on the time-dependent feature representations of the node and of any node adjacent to it, the type of the edge between the pair of adjacent nodes by using a multi-layer perceptron, and obtaining a prediction error; evaluating, by using a deviation network, the degree of deviation between the predicted type of the edge between the pair of adjacent nodes and a corresponding reference standard; and identifyi