CN-122027239-A - Abnormal network node detection method based on GraphSAGE and edge attention mechanism

Abstract

The invention belongs to the technical field of network behavior anomaly detection and analysis, and discloses an abnormal network node detection method based on GraphSAGE and an edge attention mechanism. Traditional network traffic analysis methods have limited detection effect in network attack scenarios with complex structures, and traditional graph neural network models mine the edge information in the graph insufficiently and have limited ability to model node communication behavior. To address these problems, the invention implements a dual-view graph node feature extraction method that fuses information from the IP graph and the IP-port graph, and constructs an abnormal network node detection model based on GraphSAGE and an edge attention mechanism. The model performs weighted aggregation of each node's edge traffic features through the edge attention mechanism and improves the GraphSAGE layer so that it aggregates neighborhood node embeddings together with traffic edge features. The resulting node embedding fuses graph topology information with edge traffic information, and abnormal node detection is finally completed by a shallow classifier. The invention has good generalization capability and high detection accuracy.
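The dual-view feature extraction summarized above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the flow record layout, the `FLOWS` sample data, the `"ip:port"` node identifier format, and the function names are hypothetical, and only out-degree and in-degree stand in for the fuller set of structural features.

```python
from collections import defaultdict

# Hypothetical flow records: (src_ip, src_port, dst_ip, dst_port).
FLOWS = [
    ("10.0.0.1", 40000, "10.0.0.2", 80),
    ("10.0.0.1", 40001, "10.0.0.3", 22),
    ("10.0.0.3", 50000, "10.0.0.2", 80),
]


def build_views(flows):
    """Return the directed edge lists of the IP graph and the IP-port graph."""
    ip_edges = [(s, d) for s, _, d, _ in flows]
    port_edges = [(f"{s}:{sp}", f"{d}:{dp}") for s, sp, d, dp in flows]
    return ip_edges, port_edges


def degree_features(edges):
    """Return {node: [out_degree, in_degree]} for a list of directed edges."""
    feats = defaultdict(lambda: [0, 0])
    for src, dst in edges:
        feats[src][0] += 1
        feats[dst][1] += 1
    return dict(feats)


def align(ip_feats, port_feats):
    """Append the IP-graph features to each matching IP-port node.

    The IP part is parsed from the assumed "ip:port" identifier and the
    two feature vectors are concatenated into one enhanced vector.
    """
    enhanced = {}
    for node, feats in port_feats.items():
        ip = node.rsplit(":", 1)[0]
        enhanced[node] = feats + ip_feats[ip]
    return enhanced


ip_edges, port_edges = build_views(FLOWS)
enhanced = align(degree_features(ip_edges), degree_features(port_edges))
```

Each entry of `enhanced` holds the IP-port node's own degrees followed by the degrees of its parent IP node, so a port-level node carries structural context from both abstraction levels.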

Inventors

Ye Xiaoming
Li Xin
Ou Lujin
Kong Tenglong

Assignees

Chengdu University of Information Technology (成都信息工程大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-21

Claims (8)

1. An abnormal network node detection method based on GraphSAGE and an edge attention mechanism, characterized by comprising the following steps: constructing, from network traffic data of the same time window, a first communication view and a second communication view, wherein the first communication view takes communication entities as nodes, the second communication view takes the combined identifier of a communication entity and a port as nodes, and both views take network traffic as directed edges; extracting the structural topological features of the nodes in the first communication view graph and the second communication view graph respectively, and aligning and mapping the node features of the first communication view graph to the corresponding nodes of the second communication view graph to generate enhanced node feature vectors containing dual-view communication structure features; calculating edge attention weights based on the outgoing-edge behaviors of each node and the features of the corresponding target nodes, the weights characterizing the contribution of different communication behaviors to node abnormality; performing weighted aggregation of each node's outgoing communication behaviors based on the edge attention weights to obtain a first-stage communication behavior representation of the node; aggregating the behavior representations of the node's neighborhood nodes together with the corresponding communication behavior features, and performing full neighborhood aggregation to obtain a second-stage node representation of the node; and judging, based on the second-stage node representation, whether the node is an abnormal node through a shallow classifier; wherein the dual-view feature alignment mechanism fuses structural information from different levels of communication abstraction, the edge attention computation mechanism highlights the key communication relationships corresponding to abnormal node behaviors, the full neighborhood aggregation completely preserves the local communication structure patterns of nodes, and the three mechanisms together form a collaborative anomaly detection mechanism.
2. The method of claim 1, wherein the node identifier of the first communication view is an IP address and the node identifier of the second communication view is a combination of an IP address and a port number.
3. The method of claim 1, wherein the node feature vectors of the two views include at least the out-degree, in-degree, degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and PageRank value of the node.
4. The method of claim 1, wherein the feature alignment is performed by parsing the communication entity identifier from the node identifier of the second communication view, looking up the features of the corresponding node in the first communication view, and then concatenating and fusing them.
5. The method of claim 1, wherein the edge attention weights are obtained by applying a linear mapping and a nonlinear transformation to the outgoing-edge communication features of the central node and the features of the outgoing-edge target nodes, and are normalized to obtain the relative weights among the edges.
6. The method of claim 1, wherein the first-stage node behavior representation is obtained by summing the communication behavior features of all outgoing edges of a node, weighted by the corresponding attention weights.
7. The method of claim 1, wherein the full neighborhood aggregation mechanism merges the first-stage node behavior representations corresponding to all outgoing edges of a node with the communication behavior features, performs no sampling or filtering of the neighborhood, and applies an aggregation function over all neighborhood information to update the embedded representation of the central node.
8. The method of claim 1, wherein the abnormal node is determined according to the classification result that the classification network outputs for the node representation.
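The edge attention and full neighborhood aggregation steps in the claims above can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the LeakyReLU nonlinearity, the softmax normalization, the mean aggregator, the use of concatenation to merge vectors, the zero vector for nodes without outgoing edges, and all function names and toy data are assumptions chosen for illustration. The claims themselves only require a linear mapping, a nonlinear transformation, normalization, and an aggregation function.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def edge_attention(edge_feats, target_feats, w):
    """Softmax-normalized weights over one node's outgoing edges.

    Assumed score: LeakyReLU(w . [edge_features || target_node_features]).
    """
    scores = []
    for e, t in zip(edge_feats, target_feats):
        z = e + t  # list concatenation
        scores.append(leaky_relu(sum(wi * zi for wi, zi in zip(w, z))))
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(a * v[i] for a, v in zip(weights, vectors)) for i in range(dim)]


def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def first_stage(graph, node_feats, w, edge_dim):
    """Attention-weighted sum of each node's outgoing-edge features."""
    h1 = {}
    for v, out_edges in graph.items():
        if not out_edges:
            h1[v] = [0.0] * edge_dim  # assumption: no outgoing behavior
            continue
        e = [feat for _, feat in out_edges]
        t = [node_feats[dst] for dst, _ in out_edges]
        h1[v] = weighted_sum(edge_attention(e, t, w), e)
    return h1


def second_stage(graph, h1, edge_dim):
    """Aggregate every neighbor (no sampling) and append to the node's own h1."""
    h2 = {}
    for v, out_edges in graph.items():
        msgs = [h1[dst] + feat for dst, feat in out_edges]
        agg = mean(msgs) if msgs else [0.0] * (2 * edge_dim)
        h2[v] = h1[v] + agg
    return h2


# Toy graph: {node: [(target, edge_features), ...]} with 2-dim edge features.
GRAPH = {
    "a": [("b", [1.0, 0.0]), ("c", [0.0, 1.0])],
    "b": [("c", [2.0, 2.0])],
    "c": [],
}
NODE_FEATS = {"a": [0.0], "b": [1.0], "c": [0.0]}
W = [1.0, 0.0, 1.0]  # hypothetical weights; learned in a real model

h1 = first_stage(GRAPH, NODE_FEATS, W, edge_dim=2)
h2 = second_stage(GRAPH, h1, edge_dim=2)
```

In a trained model the vector `W` and the transformations applied after aggregation would be learned, and `h2` would be passed to the shallow classifier. Here they are fixed so that the data flow of the two stages is easy to follow.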

Description

Abnormal network node detection method based on GraphSAGE and edge attention mechanism

Technical Field

The invention belongs to the technical field of network behavior anomaly detection and analysis, and particularly relates to an abnormal network node detection method based on GraphSAGE and an edge attention mechanism.

Background

With the wide application of information technology, network systems have become key infrastructure in fields such as the global economy, industrial control, and social management. Network applications are increasingly abundant, network structures are increasingly complex, and the forms of attack that networks face keep changing. Network attack methods have gradually evolved from eavesdropping, tampering, and the like into more concealed and more distributed forms. A network security situation report of December 2024 indicated that more than 413.92 million Web application attacks and about 253,308 distributed denial of service (DDoS) attacks were detected in China in November. The Qrator Labs report on DDoS attack statistics and observations states that in the first quarter of 2025 Qrator Labs discovered a large-scale DDoS botnet of 1.33 million devices, a sharp increase over the large-scale botnet of 227,000 devices discovered in 2024. Once attacks such as DDoS, denial of service (DoS), botnets, Web application attacks, brute-force cracking, and advanced persistent threats (APT) succeed, they can cause system paralysis, data leakage, theft of confidential information, and other harms, seriously affecting the normal and stable operation of society.
Therefore, accurately discovering network attacks, taking defensive measures in time, and reducing the losses they cause have become necessary requirements for guaranteeing network security and stability. From the perspective of the attack chain model, Web application attacks and brute-force cracking are common means in the exploitation stage and are used to obtain control of a host; the attacker then completes the installation and control stages by implanting malware on the victim host, forming a botnet. In the subsequent stages of the attack chain, these botnets are often used to carry out network attacks such as DDoS, distributed port scanning, and DoS.

Network traffic analysis, as a key technical means of ensuring the continued security of cyberspace, can detect and block these threats and protect key infrastructure, data, and networks. Existing network traffic analysis techniques are mainly based on machine learning and deep learning. These methods can detect some complex attacks, but they usually model normal behavior and detect anomalies from the features of the traffic data in isolation, without considering the device interaction relationships and network topology information contained in the traffic. As network attack methods grow more complex, such methods have limited effect in detecting network attacks with complex structures, such as botnet attacks and distributed port scanning.

Because packets transmitted in a network not only have statistical properties in time and content but also reflect the spatial relationships between network entities, researchers often use graphs to model network communications: nodes represent hosts or devices, and edges represent communications between nodes.
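As a small illustration of modeling communications as a graph, the sketch below builds a directed graph for one time window from flow records, with hosts as nodes and flows as edges that carry traffic features. The record layout, the `FLOWS` sample, the window parameters, and the choice of byte and packet counts as edge features are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical flow records: (timestamp_s, src_ip, dst_ip, n_bytes, n_packets).
FLOWS = [
    (5, "10.0.0.1", "10.0.0.2", 1200, 10),
    (12, "10.0.0.1", "10.0.0.3", 300, 4),
    (70, "10.0.0.3", "10.0.0.2", 900, 7),
]


def build_graph(flows, window_start, window_len):
    """Build a directed multigraph from the flows of one time window.

    Returns {src: [(dst, [n_bytes, n_packets]), ...]}. Every host seen in
    the window appears as a key, even if it has no outgoing edge.
    """
    graph = defaultdict(list)
    for ts, src, dst, n_bytes, n_packets in flows:
        if window_start <= ts < window_start + window_len:
            graph[src].append((dst, [n_bytes, n_packets]))
            graph.setdefault(dst, [])  # keep receive-only hosts as nodes
    return dict(graph)


graph = build_graph(FLOWS, window_start=0, window_len=60)
```

With a 60-second window starting at 0, the third record falls outside the window, so `10.0.0.3` appears only as a receiving node with an empty outgoing edge list.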
This modeling approach provides a structural basis for graph representation learning tasks using graph neural networks. Graph representation learning based on graph neural networks is therefore becoming an important research direction in the field of traffic analysis. However, most existing node classification research focuses on the structural information of the graph and does not fully combine the topological features of nodes with edge traffic features, which limits the model's ability to model node communication behavior patterns.

Through the above analysis, the problems and defects of the prior art are as follows:

(1) Existing network traffic analysis techniques based on machine learning and deep learning generally consider only the features of the traffic data itself in isolation and do not consider network topology information. As network attack methods grow more complex, such methods have limited effect in detecting abnormal nodes that launch network attacks with complex structures, such as botnet attacks and distributed port scanning.

(2) The existing graph-based representation learning model for detecting abnormal nodes usually only pays atten