CN-122027277-A - Classification method and system for detecting abnormal network traffic

Abstract

The invention provides a classification method and system for detecting abnormal network traffic, relating to the technical field of network security. The method comprises: acquiring and preprocessing original network traffic data packets to form an initial traffic data set; constructing and training a stacked sparse denoising autoencoder network that performs nonlinear dimensionality reduction and robust feature representation learning on the high-dimensional basic features to obtain latent space feature vectors; calculating a reconstruction error for each network flow; classifying abnormal traffic according to a combined criterion of a reconstruction error threshold and dynamic cluster analysis; and constructing and applying a dynamic abnormal traffic prediction model to predict the future probability distribution and generate early warning signals. The system comprises data acquisition and other modules. Through deep feature compression and a three-level progressive classification mechanism, the invention jointly optimizes for a high anomaly detection rate, a low false alarm rate and a low computational cost, making it suitable for a variety of network deployment environments.

Inventors

FAN JIASHU
WU BO

Assignees

领创安达(北京)科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260210

Claims (10)

1. A classification method for detecting abnormal network traffic, comprising: S1, acquiring and preprocessing original network traffic data packets in a target network environment to form an initial traffic data set to be analyzed; S2, constructing and training a stacked sparse denoising autoencoder network to perform nonlinear dimensionality reduction and robust feature representation learning on the basic features in the initial traffic data set; S3, encoding the preprocessed basic features of the network flows using the central bottleneck layer of the trained stacked sparse denoising autoencoder network to generate corresponding latent space feature vectors; S4, calculating a reconstruction error for each network flow based on the latent space feature vectors, classifying abnormal network traffic according to a combined criterion of a reconstruction error threshold and dynamic cluster analysis, and generating and outputting a classification result; and S5, constructing and applying a dynamic abnormal traffic prediction model based on the classification result, analyzing the time-series patterns of historical abnormal traffic, predicting the probability distribution of abnormal traffic in a future time window, and generating an early warning signal.
2. The method of claim 1, wherein the preprocessing in S1 comprises packet cleaning, five-tuple-based session flow reassembly, and extraction of 41 predefined basic features from the first packet and the first several packets of each network flow, the packet cleaning filtering out packets with checksum errors, packets with abnormal lengths, and invalid packets addressed to the acquisition probe itself.
3. The method of claim 1, wherein the stacked sparse denoising autoencoder network of S2 comprises an input layer, three encoder layers, a central bottleneck layer, three decoder layers, and an output layer.
4. The method of claim 1, wherein S4 comprises: calculating a preliminary reconstruction error value for the network flow; if the preliminary reconstruction error value is greater than a first static threshold, performing secondary confirmation using a neighborhood density verification method based on the local outlier factor; if the preliminary reconstruction error value is less than or equal to the first static threshold, making the determination using a probability distribution verification method based on a Gaussian mixture model; and wherein generating and outputting the classification result comprises generating a classification record for each network flow for which a classification decision is made.
5. The method of claim 4, wherein the neighborhood density verification method based on the local outlier factor comprises calculating a local outlier factor score for the anomaly candidate flow and determining the flow to be abnormal traffic if the score is greater than a second dynamic threshold, and wherein the probability distribution verification method based on the Gaussian mixture model comprises calculating the negative log-likelihood of the latent space feature vector of the network flow under the Gaussian mixture model and determining the flow to be abnormal traffic if the value is greater than a third probability threshold.
6. The method of claim 1, wherein S2 further comprises a traffic association analysis method based on a graph convolutional network, which abstracts the network flows into a graph structure and performs feature propagation and aggregation using the graph convolutional network to enhance detection of coordinated attacks.
7. The method of claim 1, wherein the classification criterion of S4 further comprises a distributed decision method based on multi-agent reinforcement learning, in which a dedicated detection agent is provided for each of five attack types, namely DDoS, port scanning, malware propagation, data leakage and insider threat, and the outputs of the agents are fused by a central decision agent to generate a final classification decision, and wherein each dedicated detection agent is trained by a deep Q-network algorithm and has a network structure of two hidden layers comprising 128 and 64 neurons.
8. The method of claim 1, further comprising, after step S4, a multi-flow aggregation analysis method at the flow behavior level, which takes a time window of W=10 consecutive network flows as the analysis unit, aggregates the anomaly scores within the window using the LogSumExp function, and inputs the result to a one-class support vector machine classifier with a radial basis function kernel to identify sustained attack behavior; and wherein step S5 comprises extracting historical abnormal traffic sequences from the classification results, constructing time-series feature vectors, applying a gated recurrent unit network model for prediction, outputting the probability of abnormal traffic in a future time window, generating an early warning signal by comparing the probability with a dynamic early warning threshold, and triggering an adaptive defense strategy.
9. A classification system for detecting abnormal network traffic, for implementing the method of any of claims 1 to 8, the system comprising: a data acquisition module for capturing full bidirectional network data packets in the target network environment; a data preprocessing module for cleaning the original network traffic data packets, reassembling session flows and parsing basic feature fields to form an initial traffic data set; a feature learning model construction module for constructing a stacked sparse denoising autoencoder network; a feature learning model training module for training the autoencoder network; a latent feature encoding module for generating latent space feature vectors using the trained autoencoder network; an abnormal traffic classification module for classifying abnormal traffic based on the latent space feature vectors and the combined criterion; and an abnormal traffic prediction and early warning module for constructing and applying a dynamic abnormal traffic prediction model based on the classification result of the abnormal traffic classification module, analyzing the time-series patterns of historical abnormal traffic, predicting the probability distribution of abnormal traffic in a future time window, and generating an early warning signal.
10. The system of claim 9, wherein the data preprocessing module comprises a packet cleaning unit, a session flow reassembly unit and a basic feature parsing unit, and the abnormal traffic classification module further comprises a reconstruction error calculation unit, a local outlier factor verification unit and a Gaussian mixture model verification unit.
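
As an illustration of the window aggregation recited in claim 8, the following is a minimal sketch in Python, not the claimed implementation. It assumes that per-flow anomaly scores are plain floats, that windows do not overlap, and that a trailing partial window is dropped; the claims do not state whether the window slides. The function names are hypothetical, and the subsequent one-class support vector machine step is omitted.

```python
import math
from typing import List, Sequence

WINDOW_SIZE = 10  # W = 10 consecutive network flows, as recited in claim 8


def log_sum_exp(scores: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(s))) over one window of scores."""
    peak = max(scores)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    return peak + math.log(sum(math.exp(s - peak) for s in scores))


def aggregate_windows(scores: Sequence[float],
                      window: int = WINDOW_SIZE) -> List[float]:
    """Aggregate per-flow anomaly scores over consecutive windows.

    Assumption: windows are non-overlapping and an incomplete final
    window is discarded.
    """
    return [
        log_sum_exp(scores[start:start + window])
        for start in range(0, len(scores) - window + 1, window)
    ]
```

LogSumExp behaves like a smooth maximum: a single very high score dominates the aggregate, while many moderately raised scores also lift it, which is consistent with the stated goal of catching sustained attack behavior across a window.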

Description

Classification method and system for detecting abnormal network traffic

Technical Field

The present invention relates to the field of network security technologies, and in particular to a classification method and system for detecting abnormal network traffic.

Background

In today's highly interconnected digital age, network infrastructure has become a core support for the functioning of society. However, with the spread of cloud computing, the Internet of Things and fifth-generation mobile communication technologies, the network attack surface keeps expanding and attack techniques are becoming increasingly complex and stealthy. Abnormal network traffic detection is a key line of defense for network security, and its technical evolution has always faced serious challenges. The prior art relies mainly on traditional machine learning or deep learning methods. The former performs stably in certain scenarios but struggles to adapt to rapidly evolving attack patterns because it depends heavily on manual feature engineering and high-quality labeled data. The latter can extract features automatically but suffers from heavy computational overhead, poor interpretability and high deployment cost. In real network operation and maintenance in particular, traffic data is often extremely imbalanced: normal traffic dominates, while abnormal traffic accounts for a very small proportion and takes many forms, so models are prone to detection bias.
In a smart home or industrial Internet of Things system, a large number of resource-constrained devices continuously generate small-scale, low-power traffic whose patterns change frequently with user behavior or the seasons. The prior art cannot effectively distinguish the traffic of legitimate device updates from that of malware propagation. For example, when novel malicious software disguises itself as normal data transmission inside encrypted traffic, traditional methods are highly prone to missed detections because they lack a deep understanding of the essential characteristics of the traffic. Conversely, when normal device fluctuations are misjudged as attacks, the resulting false alarms cause the system to alert frequently and end up masking the real threats. Moreover, because the computing capacity of Internet of Things devices is limited, complex models are difficult to run in real time, forcing operation and maintenance personnel into a hard trade-off between detection accuracy and resource consumption. A new classification method and system for detecting abnormal network traffic is therefore urgently needed.
Disclosure of Invention

The invention aims to provide a classification method and system for detecting abnormal network traffic that address the following problems in the prior art: insufficient feature extraction, weak model generalization, a high false alarm rate and poor adaptability, which arise from the high dimensionality, strong linear correlation and heavy noise interference of network traffic features; and reliance on a single, fixed detection threshold in the face of normal traffic fluctuations and novel, unknown attack patterns. The specific technical scheme is as follows:

The invention provides a classification method for detecting abnormal network traffic, comprising the following steps: S1, acquiring and preprocessing original network traffic data packets in a target network environment to form an initial traffic data set to be analyzed; S2, constructing and training a stacked sparse denoising autoencoder network to perform nonlinear dimensionality reduction and robust feature representation learning on the basic features in the initial traffic data set; S3, encoding the preprocessed basic features of the network flows using the central bottleneck layer of the trained stacked sparse denoising autoencoder network to generate corresponding latent space feature vectors; S4, calculating a reconstruction error for each network flow based on the latent space feature vectors, classifying abnormal network traffic according to a combined criterion of a reconstruction error threshold and dynamic cluster analysis, and generating and outputting a classification result; and S5, constructing and applying a dynamic abnormal traffic prediction model based on the classification result, analyzing the time-series patterns of historical abnormal traffic, predicting the probability distribution of abnormal traffic in a future time window, and generating an early warning signal.
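
The combined criterion of step S4, as elaborated in claims 4 and 5, can be sketched as a two-branch cascade. The Python below is a hedged illustration rather than the patented implementation. It assumes mean squared error as the reconstruction error metric (the text does not fix one), treats the local outlier factor and Gaussian mixture model scorers as caller-supplied functions, and assumes that a flow failing the second-stage test is labeled normal. All names (`Thresholds`, `classify_flow`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Scorer = Callable[[Sequence[float]], float]


def reconstruction_error(original: Sequence[float],
                         rebuilt: Sequence[float]) -> float:
    """Mean squared error between a flow's features and their reconstruction.

    Assumption: MSE is used; the source does not name the error metric.
    """
    if len(original) != len(rebuilt):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, rebuilt)) / len(original)


@dataclass
class Thresholds:
    static_error: float  # first (static) threshold on reconstruction error
    lof_score: float     # second (dynamic) threshold on the LOF score
    gmm_nll: float       # third threshold on GMM negative log-likelihood


def classify_flow(error: float, latent: Sequence[float],
                  lof_score_fn: Scorer, gmm_nll_fn: Scorer,
                  thresholds: Thresholds) -> str:
    """Route one flow through the cascade described in claims 4 and 5."""
    if error > thresholds.static_error:
        # High reconstruction error: confirm with neighborhood density (LOF).
        confirmed = lof_score_fn(latent) > thresholds.lof_score
    else:
        # Low reconstruction error: check likelihood under the GMM instead.
        confirmed = gmm_nll_fn(latent) > thresholds.gmm_nll
    # Assumption: anything not confirmed as abnormal is labeled normal.
    return "abnormal" if confirmed else "normal"
```

Passing the scorers in as functions keeps the sketch independent of any particular library; in practice they would wrap fitted density and mixture models, and the second threshold would be recomputed as traffic conditions change.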
Further, the preprocessing in S1 comprises packet cleaning, five-tuple-based session flow reassembly, and extraction of 41 predefined basic features from the first packet and the first several packets of each network flow, wherein the packet cleaning