CN-121984782-A - Attack identification method, system, computer equipment and medium of operating system


Abstract

The invention provides an attack identification method, system, computer equipment and medium for an operating system, belonging to the technical field of network security. The method comprises: collecting system logs from the operating system, application programs and network equipment; after preprocessing, generating high-dimensional semantic vectors of the log text using a pre-trained unsupervised semantic encoding model, and reducing them to low-dimensional features with an autoencoder; mapping system entities to nodes and log interaction relations to directed edges, and combining the low-dimensional features to construct a system provenance graph; learning high-dimensional node representations with an enhanced graph attention network (EGAT); serializing the node representations in time order and inputting them into a bidirectional gated recurrent unit to capture the temporal dependencies of behavior and output node anomaly classification probabilities; and finally generating a global model on a federated server to achieve collaborative multi-system detection under privacy protection. The invention can accurately identify complex attack chains, balances detection accuracy with data privacy, and is suited to the real-time intrusion detection and security response requirements of large-scale network environments.
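The graph construction step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `LogEvent` schema, the `build_provenance_graph` helper and the entity naming (`proc:`, `file:`, `ip:` prefixes) are assumptions introduced for illustration. The sketch maps entities to nodes, maps each logged interaction to a directed edge, and counts interactions per event type as simple node features.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class LogEvent(NamedTuple):
    """One parsed log record (hypothetical schema, loosely following claim 1)."""
    timestamp: float
    event_type: str
    source: str  # source entity, e.g. a process
    target: str  # target entity, e.g. a file or an IP address


def build_provenance_graph(events):
    """Return (nodes, edges, node_features) for a list of LogEvent records.

    edges maps (source, target) -> Counter of event types on that directed edge.
    node_features maps node -> Counter of event types the node took part in.
    """
    nodes = set()
    edges = defaultdict(Counter)
    node_features = defaultdict(Counter)
    for ev in events:
        nodes.update((ev.source, ev.target))
        edges[(ev.source, ev.target)][ev.event_type] += 1
        node_features[ev.source][ev.event_type] += 1
        node_features[ev.target][ev.event_type] += 1
    return nodes, dict(edges), dict(node_features)


# Made-up events for illustration only.
events = [
    LogEvent(1.0, "exec", "proc:bash", "proc:curl"),
    LogEvent(2.0, "connect", "proc:curl", "ip:203.0.113.7"),
    LogEvent(3.0, "write", "proc:curl", "file:/tmp/payload"),
    LogEvent(4.0, "write", "proc:curl", "file:/tmp/payload"),
]
nodes, edges, node_features = build_provenance_graph(events)
```

In the method described here, the edges would additionally carry the low-dimensional log-text features; the sketch keeps only counts so that it stays dependency-free.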

Inventors

LIU XINQIAN
LIU HUIXUE
WANG SHUAI
XU BENDA
CHEN GUODONG

Assignees

Shandong University of Technology (山东理工大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-03-23

Claims (10)

1. An attack identification method for an operating system, comprising: collecting system log text from a target operating system, an application program and network equipment, wherein the system log text comprises timestamps, event types, source entity objects, target entity objects and textual context information; vectorizing the log text information into a high-dimensional semantic vector, and converting the high-dimensional semantic vector into a low-dimensional feature representation; mapping the processes, files, IP addresses and devices of the target operating system to nodes, counting the number of interactions among the nodes as node features, mapping the interaction relations among the nodes to directed edges, and constructing a system provenance graph from the nodes and the directed edges; inputting the system provenance graph of the target operating system, its nodes and the low-dimensional feature representation into an enhanced graph attention network (EGAT), which performs multi-layer propagation and aggregation and outputs a set of node representation vectors; and, based on the timestamps, serializing the set of node representation vectors by time window to obtain a node representation sequence, and inputting the node representation sequence into a classifier to obtain the specific attack type of the target operating system.
2. The attack identification method for an operating system according to claim 1, wherein counting the number of interactions among the nodes as node features specifically comprises: counting, for each node, the number of interactions of each event type in the system log text, and using these counts as the node features.
3. The attack identification method for an operating system according to claim 1, further comprising: each client completing local model training and uploading the parameters of its local model to a federated server; the federated server aggregating the received parameters or gradients of the enhanced graph attention network (EGAT) using the FedAvg algorithm to generate a global model, and sending the global model to each client; and each client using the global model to perform attack detection on the target system.
4. The attack identification method for an operating system according to claim 1, wherein the log text information is vectorized into a high-dimensional semantic vector using an unsupervised semantic encoding model, and the unsupervised semantic encoding model is an SBERT model.
5. The method of claim 1, wherein the high-dimensional semantic vector is converted into the low-dimensional feature representation using an autoencoder, the low-dimensional feature representation serves as the initial features of the directed edges, and the low-dimensional feature representation has 32 dimensions.
6. The method of claim 1, wherein the enhanced graph attention network (EGAT) introduces both edge features and a multi-head attention mechanism into the attention calculation, and the classifier comprises a bidirectional gated recurrent unit (BiGRU), a linear convolution layer, and a Softmax output layer.
7. The attack identification method for an operating system according to claim 1, further comprising: collecting system logs from the actual operation of the target operating system and updating the system provenance graph; performing real-time inference on newly generated nodes and directed edges in the updated system provenance graph, based on the trained enhanced graph attention network (EGAT) and classifier, and outputting classification results for abnormal nodes; and mapping suspicious behaviors back onto the system provenance graph according to the classification results for the abnormal nodes, to generate a visualized attack path and causal chain that provide clear evidence.
8. An attack identification system for an operating system, comprising: a log collection module, configured to collect system log text from a target operating system, an application program and network equipment, wherein the system log text comprises timestamps, event types, source entity objects, target entity objects and textual context information; a log text processing module, configured to vectorize the log text information into a high-dimensional semantic vector and convert the high-dimensional semantic vector into a low-dimensional feature representation; a multi-layer propagation and aggregation module, configured to construct a system provenance graph from nodes and directed edges, and to perform multi-layer propagation and aggregation on the system provenance graph, its nodes and the low-dimensional feature representation to output a set of node representation vectors; and an attack classification module, configured to serialize the set of node representation vectors by time window based on the timestamps to obtain a node representation sequence, and to input the node representation sequence into a classifier to obtain the specific attack type of the target operating system.
9. A computer device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to carry out the steps of the method according to any one of claims 1 to 7.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when loaded by a processor, is able to carry out the steps of the method according to any one of claims 1 to 7.
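The aggregation step that claim 3 assigns to the federated server can be sketched as a sample-weighted average of client parameters, which is the core of FedAvg. This is a minimal illustration in plain Python, not the claimed implementation: the `fed_avg` helper, the parameter names (`attn.weight`, `attn.bias`) and the use of local sample counts as weights are assumptions.

```python
def fed_avg(client_params, client_sizes):
    """Weighted average of client parameter vectors (FedAvg aggregation step).

    client_params: list of dicts mapping parameter name -> list of floats.
    client_sizes: number of local training samples per client (the weights).
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    global_params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Average each coordinate, weighting every client by its sample count.
        global_params[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return global_params


# Two hypothetical clients holding toy EGAT parameters.
clients = [
    {"attn.weight": [1.0, 2.0], "attn.bias": [0.0]},
    {"attn.weight": [3.0, 6.0], "attn.bias": [4.0]},
]
global_model = fed_avg(clients, client_sizes=[100, 300])
```

Only parameters leave each client in this scheme; the raw logs stay local, which is the basis of the privacy protection the abstract describes.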

Description

Attack identification method, system, computer equipment and medium of operating system

Technical Field

The invention belongs to the technical field of network security, and in particular relates to an attack identification method, an attack identification system, computer equipment and a medium for an operating system.

Background

With the deep integration of computer and communication technology, network technology has developed rapidly and has profoundly changed how people learn and live. The continued expansion of network scale brings convenience but also poses increasingly serious security challenges. Network attacks such as zero-day attacks, worm propagation and network virus propagation occur frequently; they seriously threaten national information security and public economic interests and can cause heavy economic losses. Ensuring network security has therefore become a critical problem that urgently needs to be addressed.

In a network defense system, intrusion detection plays an indispensable role as an important means of actively identifying and responding to potential threats. By analyzing the behavioral characteristics of network traffic, the technology determines whether abnormal or malicious activity is present, enabling early detection and early warning of attacks. Current mainstream intrusion detection methods fall into two types: misuse-based detection and anomaly-based detection.

Misuse-based intrusion detection relies on a pre-built library of attack signatures and uses pattern matching to identify the specific behavior patterns of known attacks in system operation sequences or network traffic. Its advantages are high accuracy in identifying existing attack types and a low false alarm rate.
However, its limitations are also obvious. It struggles with unknown attacks (such as zero-day attacks), its ability to detect novel or variant attacks is weak, and its missed-detection rate is high. Its response cycle is also long, usually hours or even days, because attack signatures must be analyzed and added to the rule base, so it cannot cope with worm attacks that propagate extremely fast (spreading widely within tens of seconds).

In contrast, anomaly-based intrusion detection establishes a baseline model of normal network behavior, continuously monitors actual traffic for deviations from that baseline, and flags a potential intrusion once a significant deviation is found. The method does not rely on prior knowledge of attacks, can discover novel attacks and polymorphic malicious code, is highly adaptable, and is particularly suited to complex and changeable modern network environments. Its main challenges are that network behavior is highly dynamic and uncertain and user activity patterns are diverse, so the baseline of normal behavior is difficult to model accurately; in addition, abnormal but non-malicious behavior may be misjudged as an attack, causing a high false alarm rate. More seriously, an attacker may gradually change its behavior pattern to induce the system to "learn" malicious activity as normal behavior, thereby evading detection.

The traditional misuse-based approach lags noticeably when facing novel attacks and struggles to meet current network security protection requirements, while anomaly-based detection is more forward-looking and adaptable but remains limited by problems such as a high false alarm rate.
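The contrast between the two detection families can be made concrete with a small Python sketch. Everything in it is an illustrative assumption rather than part of the invention: the two-entry signature library, the request strings, the requests-per-minute baseline and the z-score threshold of 3.

```python
import statistics

# Hypothetical signature library: substrings that mark known attacks.
SIGNATURES = {"sql_injection": "' OR 1=1", "path_traversal": "../../"}


def misuse_detect(request):
    """Return names of known attack signatures found in the request text."""
    return [name for name, pattern in SIGNATURES.items() if pattern in request]


def fit_baseline(normal_values):
    """Model normal behavior as a mean and a sample standard deviation."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)


def anomaly_detect(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mean, std = baseline
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold


# Made-up "normal" traffic in requests per minute.
baseline = fit_baseline([100, 104, 98, 101, 97, 103, 99, 102])

known = misuse_detect("GET /item?id=1' OR 1=1")                # matches a signature
novel = misuse_detect("GET /item?id=1 UNION SELECT password")  # not in the library
burst = anomaly_detect(400, baseline)          # large deviation from the baseline
benign_spike = anomaly_detect(112, baseline)   # harmless busy period, still flagged
```

Here `novel` comes back empty because the signature library has never seen that pattern, which is the missed-detection weakness of misuse-based detection, while `benign_spike` is flagged even though nothing malicious happened, which is the false alarm weakness of anomaly-based detection.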
For this reason, researchers in recent years have introduced intelligent analysis technologies such as data mining and machine learning into the field of intrusion detection, in order to improve the accuracy and robustness of anomaly detection, reduce the false alarm rate, and help build more intelligent and adaptive network security defense systems.

Data-mining-based methods extract condensed information from the original data and compare it with the test data; clustering and classification are the two most common techniques. Clustering is an unsupervised learning method that groups data according to a similarity measure. For example, Cheng Xiaoxu et al. proposed an improved K-means algorithm that achieves globally optimal clustering results and significantly reduces the time complexity of anomaly detection. Similarly, scholars such as Al-Yaseen W. L. used an improved K-means algorithm to reduce the amount of data and improve data quality, and built an intrusion detection model in combination with the C4.5 decision tree algorithm, so that the operation efficiency and the detection precision of the syste