CN-122027252-A - Network intrusion detection method and system based on heterogeneous information and graph neural network


Abstract

The invention discloses a network intrusion detection method and system based on heterogeneous information and a graph neural network. The method comprises: collecting multi-source heterogeneous data and fusing it into an entity-relation structure through association keys such as timestamps and IP addresses to construct a multi-domain security knowledge base; defining a graph schema, generating time-series graph snapshots, and dynamically constructing a heterogeneous security graph; triggering local subgraph extraction based on suspicious nodes; adopting a heterogeneous feature attention mechanism and a spatio-temporal joint modeling module to analyze the spatial correlation and temporal evolution of nodes, and adapting to topology changes through a dynamic aggregation strategy; designing cross-domain federated graph learning that aggregates global knowledge to generate common anomaly patterns while protecting privacy; and making real-time intrusion decisions based on the anomaly scores of nodes/subgraphs, with an attack classifier outputting the attack type and traceback path. The invention effectively improves the detection of unknown attacks, the processing efficiency and real-time performance on large-scale graphs, and reduces the false-negative and false-positive rates.
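
As an illustration of the fusion idea summarized above, the following minimal Python sketch maps records from two of the data sources onto typed edges of an entity-relation structure, joined on shared keys. It is only a sketch under stated assumptions: the names `Edge` and `fuse_records`, the record fields (`src_ip`, `dst_ip`, `host_id`, `pid`, `user`, `ts`) and the relation labels are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One typed, timestamped relation between two typed entities (hypothetical layout)."""
    src: tuple        # (node_type, identifier), e.g. ("host", "h1")
    relation: str     # edge type label (assumed names, not from the patent)
    dst: tuple
    timestamp: float


def fuse_records(flow_logs, host_logs):
    """Map heterogeneous log records to typed edges joined on shared keys."""
    edges = []
    # Network traffic records become IP -> IP edges.
    for r in flow_logs:
        edges.append(Edge(("ip", r["src_ip"]), "connects_to",
                          ("ip", r["dst_ip"]), r["ts"]))
    # Host behavior records become user -> process -> host edges.
    for r in host_logs:
        proc = ("process", f'{r["host_id"]}:{r["pid"]}')
        edges.append(Edge(("user", r["user"]), "runs", proc, r["ts"]))
        edges.append(Edge(proc, "runs_on", ("host", r["host_id"]), r["ts"]))
    # Order by timestamp so the result can be replayed as an event stream.
    return sorted(edges, key=lambda e: e.timestamp)


flow = [{"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "ts": 2.0}]
host = [{"host_id": "h1", "pid": 42, "user": "alice", "ts": 1.0}]
fused = fuse_records(flow, host)
```

Qualifying the process identifier with the host ID is a design choice of this sketch: process IDs are only unique per host, so the combined key keeps process nodes from different hosts distinct.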

Inventors

ZENG NIANYIN
CHEN LIANG
WANG SIMENG
CHEN XI
QIN HONGXIAN
WU PEISHU
CHENG ZERUI

Assignees

Xiamen University (厦门大学)
National Industrial Information Security Development Research Center (国家工业信息安全发展研究中心)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-28

Claims (10)

1. A network intrusion detection method based on heterogeneous information and a graph neural network, characterized by comprising the following steps:
a multi-source heterogeneous data fusion step: collecting network traffic logs, host behavior logs and system call sequences as multi-source data, uniformly mapping the multi-source data into an entity-relation structure through the association keys of timestamp, IP address, host ID, process ID and user name, and constructing a multi-domain security knowledge base covering hosts, processes, users and IPs;
a dynamic heterogeneous security graph construction and spatio-temporal subgraph extraction step: defining a graph schema, mapping entity types to node types and interaction event relations among entities to edge types, generating time-series graph snapshots in real time, dynamically adding/updating nodes and edges according to the security event stream, dividing the global graph sequence by time window, and triggering local spatio-temporal subgraph extraction based on suspicious nodes, wherein the node types comprise host, process, user and IP;
a spatio-temporal graph neural network (GNN) analysis step: adopting a heterogeneous feature attention mechanism to assign learnable weights to the attributes of different node/edge types and, through a spectral-temporal heterogeneous Transformer structure, fusing neighbor information with a graph attention network in the spatial dimension and capturing the evolution of node behavior with a gated recurrent unit in the temporal dimension;
a cross-domain federated graph learning step: each domain locally trains a spatio-temporal GNN model on its heterogeneous subgraph and uploads the GNN model parameters to a server;
and a real-time intrusion decision-making step: constructing a real-time intrusion detection and decision-making mechanism for cross-domain federated learning and privacy protection, calculating an anomaly score from the GNN output vector of a node/subgraph, marking the node/subgraph as an intrusion if the anomaly score exceeds a threshold, and outputting the attack type and traceback path in combination with an attack classifier.
2. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the dynamic heterogeneous security graph construction and spatio-temporal subgraph extraction step, the local spatio-temporal subgraph extraction satisfies the following condition: when the anomaly score of a node exceeds a dynamic threshold, a neighbor subgraph within K hops centered on that node is extracted, and the historical interaction events of the neighbor subgraph within a time window Δt are associated with it, wherein K denotes the sampling depth and Δt denotes the length of the retrospective historical time window.
3. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein the edge attributes of the dynamic heterogeneous security graph comprise communication port, protocol type, process privilege level and file operation type, and these edge attributes participate in the aggregation calculation as edge features in the message passing of the spatio-temporal GNN.
4. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the spatio-temporal GNN analysis step, the GNN model supports unsupervised training, comprising: learning normal behavior patterns with a one-class neural network, and calculating the anomaly scores of nodes from reconstruction error or representation distance.
5. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the spatio-temporal GNN analysis step, the time-series graph snapshots are compressed by a graph summarization technique, comprising: aggregating stable subnetworks into super nodes while dynamic substructures retain their original topology, thereby reducing the complexity of model processing.
6. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the spatio-temporal GNN analysis step, topology changes include connection surges and node deactivation.
7. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein the cross-domain federated graph learning step further comprises training a global structure guidance model at the server, injecting common anomaly patterns into the local models through knowledge distillation, and dynamically fusing global guidance signals with local features.
8. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the real-time intrusion decision-making step, the real-time intrusion detection and decision-making mechanism adopts an edge-collaborative architecture, comprising: the central platform receives edge alarms, performs global correlation analysis, and issues updated GNN model parameters to the edge nodes.
9. The network intrusion detection method based on heterogeneous information and a graph neural network according to claim 1, wherein in the real-time intrusion decision-making step, the constructed real-time intrusion detection and decision-making mechanism is a combination of joint Bayesian inference and multi-agent reinforcement decision-making.
10. A network intrusion detection system based on heterogeneous information and a graph neural network, comprising:
a multi-source heterogeneous data fusion module, configured to collect network traffic logs, host behavior logs and system call sequences as multi-source data, uniformly map the multi-source data into an entity-relation structure through the association keys of timestamp, IP address, host ID, process ID and user name, and construct a multi-domain security knowledge base covering hosts, processes, users and IPs;
a dynamic heterogeneous security graph construction and spatio-temporal subgraph extraction module, configured to define a graph schema, map entity types to node types and interaction event relations among entities to edge types, generate time-series graph snapshots in real time, dynamically add/update nodes and edges according to the security event stream, divide the global graph sequence by time window, and trigger local spatio-temporal subgraph extraction based on suspicious nodes, wherein the node types comprise host, process, user and IP;
a spatio-temporal graph neural network (GNN) analysis module, configured to assign learnable weights to the attributes of different node/edge types by a heterogeneous feature attention mechanism and, through a spectral-temporal heterogeneous Transformer structure, fuse neighbor information with a graph attention network in the spatial dimension and capture the evolution of node behavior with a gated recurrent unit in the temporal dimension;
a cross-domain federated graph learning processing module, a server side, a feature-structure decoupling design and privacy protection module, a feature-structure decoupling module and a user interface module, wherein the cross-domain federated graph learning processing module is configured so that each domain locally trains a spatio-temporal GNN model on its heterogeneous subgraph and uploads the GNN model parameters to the server side;
and a real-time intrusion decision-making module, configured to construct a real-time intrusion detection and decision-making mechanism for cross-domain federated learning and privacy protection, calculate an anomaly score from the GNN output vector of a node/subgraph, mark the node/subgraph as an intrusion if the anomaly score exceeds a threshold, and output the attack type and traceback path in combination with an attack classifier.
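
The K-hop, Δt-window subgraph extraction recited in claim 2 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `extract_subgraph`, the `(u, v, timestamp)` edge tuples, and the choices to filter events by the time window before the breadth-first search and to treat interactions as undirected are all assumptions made for the sketch.

```python
from collections import defaultdict, deque


def extract_subgraph(edges, center, k, t_now, delta_t):
    """Return the events within k hops of `center` that fall in [t_now - delta_t, t_now].

    `edges` is an iterable of (u, v, timestamp) tuples (assumed representation).
    """
    # Keep only interaction events inside the retrospective time window.
    recent = [(u, v, t) for (u, v, t) in edges if t_now - delta_t <= t <= t_now]

    # Build an undirected adjacency map for neighborhood sampling (an assumption).
    adj = defaultdict(set)
    for u, v, _ in recent:
        adj[u].add(v)
        adj[v].add(u)

    # Breadth-first search limited to depth k around the suspicious node.
    depth = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond the sampling depth
        for nb in adj[node]:
            if nb not in depth:
                depth[nb] = depth[node] + 1
                queue.append(nb)

    # Keep events whose two endpoints both lie inside the k-hop neighborhood.
    return [(u, v, t) for (u, v, t) in recent if u in depth and v in depth]


events = [("a", "b", 10.0), ("b", "c", 9.0), ("c", "d", 8.0), ("a", "e", 1.0)]
sub = extract_subgraph(events, center="a", k=2, t_now=10.0, delta_t=5.0)
```

In this example the event `("a", "e", 1.0)` is dropped because it lies outside the 5-second window, and `("c", "d", 8.0)` is dropped because node `d` is three hops from the center.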

Description

Network intrusion detection method and system based on heterogeneous information and graph neural network

Technical Field

The application belongs to the technical field of network security, and particularly relates to a network intrusion detection method and system based on heterogeneous information and a graph neural network.

Background

With the continuous expansion of the Internet and the Internet of Things, network attack techniques are becoming increasingly complex, and multi-source heterogeneous log data have become commonplace in network security monitoring. Traditional intrusion detection systems generally fall into two categories: misuse detection based on signatures and anomaly detection based on behavioral deviation. Misuse detection efficiently discovers known threats by matching against a library of known attack signatures, but cannot identify unknown (zero-day) attacks. Anomaly detection discovers anomalies by learning normal behavior patterns and can detect unknown attacks, but suffers from a high false alarm rate if it lacks sufficient training and tuning. In recent years, machine learning and deep learning techniques have been introduced into network intrusion detection, for example classifying network traffic or host logs with models such as support vector machines, random forests, convolutional neural networks (CNNs) and recurrent neural networks (RNNs). However, these conventional schemes can generally process only a single type of data, have difficulty fusing heterogeneous information such as network traffic, host behavior and system calls, and lack sufficient correlation analysis capability for complex multi-stage attacks. In addition, when modeling sequence data, models such as CNNs and RNNs have difficulty representing the relationships among different entities and cannot make full use of network topology and context information, which limits their detection accuracy and robustness.

The use of graph-structured data in network security is attracting increasing interest. Hosts, processes, communication traffic and the like in a network system naturally form nodes and edges, yielding rich graph relations. If heterogeneous security logs can be converted into a graph representation and analyzed, attack chains and hidden patterns spanning hosts and networks can be expected to be captured. The graph neural network (GNN), a deep learning model for graph data, has in recent years shown excellent performance in network intrusion detection (NID). By converting network traffic, system logs and the like into graphs and applying a GNN model, anomaly patterns can be mined from the correlations among nodes and edges, improving the detection of complex attacks. However, existing intrusion detection methods based on graph neural networks still face several challenges. First, regarding how to construct a graph that effectively fuses multi-source heterogeneous information, existing schemes often target only a single source or build the graph with fixed rules, which may lead to lost or missing information. Second, most GNN models target static graphs or single-batch data and have difficulty adapting to the dynamically evolving behavior in a network. In addition, some methods do not fully consider the differing importance of the various features of heterogeneous data and simply process all information together, so that key attack indicators may not be highlighted.

Disclosure of Invention

To solve the technical problems that traditional intrusion detection methods have difficulty fusing multi-source heterogeneous data, model dynamic network behavior insufficiently, have weak cross-domain collaboration capability and poor real-time performance, the invention provides an intrusion detection scheme that fuses heterogeneous security data, dynamically models the network structure and adopts an improved graph neural network model. The scheme improves robustness to unknown attacks and the processing and real-time detection capability on large-scale graphs, and reduces the false alarm rate.

The invention adopts the following technical scheme:

In one aspect, a network intrusion detection method based on heterogeneous information and a graph neural network includes:
a multi-source heterogeneous data fusion step: collecting network traffic logs, host behavior logs and system call sequences as multi-source data, uniformly mapping the multi-source data into an entity-relation structure through the association keys of timestamp, IP address, host ID, process ID and user name, and constructing a multi-domain security knowledge base covering hosts, processes, users and IPs;
a dynamic heterogeneous security graph construction and spatio-temporal subgraph extraction step: defining a graph schema, mapping entity types