
CN-122001610-A - Network anomaly detection method, system, equipment and medium based on double-flow mixing


Abstract

The invention belongs to the technical field of network security, and discloses a network anomaly detection method, system, equipment and medium based on double-flow mixing. The method comprises: obtaining local temporal features and global context features; obtaining a fused feature vector; constructing, based on the fused feature vector, a multi-task learning framework that jointly optimizes a traffic classification task, an attack type identification task, a traffic regression task and a sequence prediction task; obtaining an optimal network anomaly detection model; and inputting the network traffic to be detected into the optimal network anomaly detection model, which outputs an anomalous traffic detection result, an attack type classification result, a traffic prediction value and a feature prediction value for a future time step. Through a double-flow cooperative architecture, the invention deeply fuses the local temporal features of network traffic with its global behavioral dependencies, and addresses the semantic fragmentation caused by single-mode modeling in traditional methods.

Inventors

  • TIAN BOYAN
  • WANG WENTING
  • GAO PENG
  • ZHANG YUN
  • LIU XIN
  • GONG NINGYUAN
  • CONG ZHIPENG
  • GENG YUJIE
  • ZHANG JINHAO
  • LIU JING
  • PAN HAO
  • ZHAO BINCHAO

Assignees

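  • State Grid Shandong Electric Power Company Electric Power Research Institute (国网山东省电力公司电力科学研究院)
  • State Grid Shandong Electric Power Company Yantai Power Supply Company (国网山东省电力公司烟台供电公司)
  • Nanjing University of Science and Technology (南京理工大学)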

Dates

Publication Date
2026-05-08
Application Date
2025-12-04

Claims (16)

  1. A network anomaly detection method based on double-flow mixing, characterized by comprising the following steps: acquiring network traffic data, and preprocessing the network traffic data to obtain structured time-series traffic samples; constructing a double-flow hybrid extraction model based on a multi-layer bidirectional long short-term memory (LSTM) network and a multi-layer self-attention encoder, and performing feature extraction on the time-series traffic samples with the double-flow hybrid extraction model to obtain local temporal features and global context features; performing dynamic adaptive weighted fusion of the local temporal features and the global context features with a cross-attention fusion layer to obtain a fused feature vector; constructing, based on the fused feature vector, a multi-task learning framework that jointly optimizes a traffic classification task, an attack type identification task, a traffic regression task and a sequence prediction task; jointly optimizing, in an end-to-end training mode, the parameters of the double-flow hybrid extraction model, the cross-attention fusion layer and the multi-task learning framework through a back-propagation algorithm to obtain an optimal network anomaly detection model; and inputting the network traffic to be detected into the optimal network anomaly detection model, and outputting an anomalous traffic detection result, an attack type classification result, a traffic prediction value and a feature prediction value for a future time step.
  2. The network anomaly detection method based on double-flow mixing according to claim 1, wherein acquiring the network traffic data and preprocessing the network traffic data to obtain the structured time-series traffic samples comprises: loading network flow records from a dataset to obtain the network traffic data; performing feature decoupling on the network traffic data, decomposing Internet Protocol addresses into four-dimensional numerical vectors, mapping port numbers to port type categories, and converting Transmission Control Protocol flag bits into binary features; generating derived features based on the decoupled features, constructing a sequence, and calculating the number of data packets per second and the number of bytes per packet to obtain numerical features; standardizing the numerical features, including calculating a mean and a standard deviation; and segmenting the standardized numerical features with a fixed-length sliding window to obtain the structured time-series traffic samples.
  3. The network anomaly detection method based on double-flow mixing according to claim 1, wherein constructing the double-flow hybrid extraction model based on the multi-layer bidirectional long short-term memory (LSTM) network and the multi-layer self-attention encoder, and performing feature extraction on the time-series traffic samples with the double-flow hybrid extraction model to obtain the local temporal features and the global context features, comprises: constructing, based on the multi-layer bidirectional LSTM network and the multi-layer self-attention encoder, a double-flow hybrid extraction model containing a local temporal feature extraction flow and a global context feature extraction flow; performing forward and reverse bidirectional modeling on the time-series traffic samples with the local temporal feature extraction flow, and capturing local temporal dependencies and short-term traffic fluctuation features as the local temporal features; and processing the time-series traffic samples in parallel with the global context feature extraction flow, and modeling long-range dependencies across time steps through a self-attention mechanism as the global context features.
  4. The network anomaly detection method based on double-flow mixing according to claim 3, wherein performing forward and reverse bidirectional modeling on the time-series traffic samples with the local temporal feature extraction flow, and capturing local temporal dependencies and short-term traffic fluctuation features as the local temporal features, comprises: adopting a multi-layer bidirectional long short-term memory (LSTM) network to model the time-series traffic samples in both forward and reverse order, the sequence being processed from beginning to end by a forward LSTM network; capturing historical dependencies through the forward LSTM network, while a backward LSTM network processes the sequence from end to beginning to capture future context information; and concatenating the forward hidden state and the backward hidden state of each time step to form the final feature representation, which captures the local temporal dependencies and short-term traffic fluctuation features as the local temporal features.
  5. The network anomaly detection method based on double-flow mixing according to claim 3, wherein processing the time-series traffic samples in parallel with the global context feature extraction flow, and modeling long-range dependencies across time steps through the self-attention mechanism as the global context features, comprises: generating a unique positional encoding for each position in the sequence using sine and cosine functions; on the basis of the positional encoding, computing in parallel with a multi-head self-attention mechanism having a plurality of attention heads, so that each attention head independently learns dependencies in a different feature subspace; and adding a feed-forward network and layer normalization in each layer of the multi-layer self-attention encoder, introducing residual connections to improve training stability, and finally completing the modeling of long-range dependencies across time steps as the global context features.
  6. The network anomaly detection method based on double-flow mixing according to claim 1, wherein performing dynamic adaptive weighted fusion of the local temporal features and the global context features with the cross-attention fusion layer to obtain the fused feature vector comprises: generating, with the cross-attention fusion layer, a query matrix from the global context features, and a key matrix and a value matrix from the local temporal features; calculating attention scores as the dot product of the query matrix and the key matrix, and scaling the attention scores and applying softmax normalization to obtain attention weights; computing a weighted sum of the value matrix with the attention weights, and adding the weighted sum to the original global context features through a residual connection; and projecting the result of the residual addition through a linear layer, and applying dimension reduction and normalization to the fused features with a group normalization method to obtain the fused feature vector.
  7. The network anomaly detection method based on double-flow mixing according to claim 1, wherein constructing, based on the fused feature vector, the multi-task learning framework that jointly optimizes the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task comprises: establishing output layer structures for the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task based on the fused feature vector; configuring an output layer of the traffic classification task that outputs a binary classification probability through a Sigmoid activation function to determine whether the traffic is anomalous, completing network construction of the traffic classification task; configuring an output layer of the attack type identification task that outputs fifteen-class probabilities through a Softmax activation function to identify the attack type, completing network construction of the attack type identification task; configuring an output layer of the traffic regression task that predicts a bandwidth consumption value through a linear output layer, completing regression network construction of the traffic regression task; configuring an output layer of the sequence prediction task that generates the feature vector of the next time step through an output layer matching the feature dimension, completing prediction network construction of the sequence prediction task; configuring training parameters of the multi-task learning framework, adopting an adaptive moment estimation (Adam) optimizer, setting an initial learning rate, and adjusting the learning rate through an exponential-decay learning rate strategy; and presetting a batch size and a number of training epochs, applying dropout regularization with a set dropout rate, and defining a total loss function as the weighted sum of the losses of the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task for multi-task collaborative optimization, to obtain the multi-task learning framework.
  8. A network anomaly detection system based on double-flow mixing, characterized by comprising: a data preprocessing module, used for acquiring network traffic data and preprocessing the network traffic data to obtain structured time-series traffic samples; a double-flow feature extraction module, used for constructing a double-flow hybrid extraction model based on a multi-layer bidirectional long short-term memory (LSTM) network and a multi-layer self-attention encoder, and performing feature extraction on the time-series traffic samples with the double-flow hybrid extraction model to obtain local temporal features and global context features; a cross-attention fusion module, used for performing dynamic adaptive weighted fusion of the local temporal features and the global context features with a cross-attention fusion layer to obtain a fused feature vector; a multi-task prediction module, used for constructing, based on the fused feature vector, a multi-task learning framework that jointly optimizes a traffic classification task, an attack type identification task, a traffic regression task and a sequence prediction task; a model training module, used for jointly optimizing, in an end-to-end training mode, the parameters of the double-flow hybrid extraction model, the cross-attention fusion layer and the multi-task learning framework through a back-propagation algorithm to obtain an optimal network anomaly detection model; and a result output module, used for inputting the network traffic to be detected into the optimal network anomaly detection model for processing, and outputting an anomalous traffic detection result, an attack type classification result, a traffic prediction value and a feature prediction value for a future time step.
  9. The network anomaly detection system based on double-flow mixing according to claim 8, wherein the data preprocessing module comprises: a data acquisition unit, used for loading network flow records from a dataset to obtain the network traffic data; a feature decoupling unit, used for performing feature decoupling on the network traffic data, decomposing Internet Protocol addresses into four-dimensional numerical vectors, mapping port numbers to port type categories, and converting Transmission Control Protocol flag bits into binary features; a derived feature unit, used for generating derived features based on the decoupled features, constructing a sequence, and calculating the number of data packets per second and the number of bytes per packet to obtain numerical features; a standardization unit, used for standardizing the numerical features, including calculating a mean and a standard deviation; and a sequence construction unit, used for segmenting the standardized numerical features with a fixed-length sliding window to obtain the structured time-series traffic samples.
  10. The network anomaly detection system based on double-flow mixing according to claim 8, wherein the double-flow feature extraction module, when constructing the double-flow hybrid extraction model based on the multi-layer bidirectional long short-term memory (LSTM) network and the multi-layer self-attention encoder and performing feature extraction on the time-series traffic samples to obtain the local temporal features and the global context features, is configured to: construct, based on the multi-layer bidirectional LSTM network and the multi-layer self-attention encoder, a double-flow hybrid extraction model containing a local temporal feature extraction flow and a global context feature extraction flow; perform forward and reverse bidirectional modeling on the time-series traffic samples with the local temporal feature extraction flow, and capture local temporal dependencies and short-term traffic fluctuation features as the local temporal features; and process the time-series traffic samples in parallel with the global context feature extraction flow, and model long-range dependencies across time steps through a self-attention mechanism as the global context features.
  11. The network anomaly detection system based on double-flow mixing according to claim 10, wherein performing forward and reverse bidirectional modeling on the time-series traffic samples with the local temporal feature extraction flow, and capturing local temporal dependencies and short-term traffic fluctuation features as the local temporal features, comprises: adopting a multi-layer bidirectional long short-term memory (LSTM) network to model the time-series traffic samples in both forward and reverse order, the sequence being processed from beginning to end by a forward LSTM network; capturing historical dependencies through the forward LSTM network, while a backward LSTM network processes the sequence from end to beginning to capture future context information; and concatenating the forward hidden state and the backward hidden state of each time step to form the final feature representation, which captures the local temporal dependencies and short-term traffic fluctuation features as the local temporal features.
  12. The network anomaly detection system based on double-flow mixing according to claim 10, wherein processing the time-series traffic samples in parallel with the global context feature extraction flow, and modeling long-range dependencies across time steps through the self-attention mechanism as the global context features, comprises: generating a unique positional encoding for each position in the sequence using sine and cosine functions; on the basis of the positional encoding, computing in parallel with a multi-head self-attention mechanism having a plurality of attention heads, so that each attention head independently learns dependencies in a different feature subspace; and adding a feed-forward network and layer normalization in each layer of the multi-layer self-attention encoder, introducing residual connections to improve training stability, and finally completing the modeling of long-range dependencies across time steps as the global context features.
  13. The network anomaly detection system based on double-flow mixing according to claim 8, wherein the cross-attention fusion module comprises: a query, key and value matrix generation unit, used for generating, with the cross-attention fusion layer, a query matrix from the global context features, and a key matrix and a value matrix from the local temporal features; an attention weight calculation unit, used for calculating attention scores as the dot product of the query matrix and the key matrix, and scaling the attention scores and applying softmax normalization to obtain attention weights; a weighted fusion and residual connection unit, used for computing a weighted sum of the value matrix with the attention weights, and adding the weighted sum to the original global context features through a residual connection; and a feature projection unit, used for projecting the result of the residual addition through a linear layer, and applying dimension reduction and normalization to the fused features with a group normalization method to obtain the fused feature vector.
  14. The network anomaly detection system based on double-flow mixing according to claim 8, wherein the multi-task prediction module, when constructing, based on the fused feature vector, the multi-task learning framework that jointly optimizes the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task, is configured to: establish output layer structures for the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task based on the fused feature vector; configure an output layer of the traffic classification task that outputs a binary classification probability through a Sigmoid activation function to determine whether the traffic is anomalous, completing network construction of the traffic classification task; configure an output layer of the attack type identification task that outputs fifteen-class probabilities through a Softmax activation function to identify the attack type, completing network construction of the attack type identification task; configure an output layer of the traffic regression task that predicts a bandwidth consumption value through a linear output layer, completing regression network construction of the traffic regression task; configure an output layer of the sequence prediction task that generates the feature vector of the next time step through an output layer matching the feature dimension, completing prediction network construction of the sequence prediction task; configure training parameters of the multi-task learning framework, adopt an adaptive moment estimation (Adam) optimizer, set an initial learning rate, and adjust the learning rate through an exponential-decay learning rate strategy; and preset a batch size and a number of training epochs, apply dropout regularization with a set dropout rate, and define a total loss function as the weighted sum of the losses of the traffic classification task, the attack type identification task, the traffic regression task and the sequence prediction task for multi-task collaborative optimization, to obtain the multi-task learning framework.
  15. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
  16. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
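
As a rough illustration of the preprocessing recited in claims 2 and 9, the following Python sketch decouples the address, port and flag fields, derives packets per second and bytes per packet, standardizes the derived features, and cuts fixed-length windows. The record field names (`src_ip`, `dst_port`, `tcp_flags`, `packets`, `bytes`, `duration_s`), the three port categories, the scaling of address octets to [0, 1], and the choice to concatenate the decoupled features with the standardized ones are assumptions made for illustration; the claims do not specify them.

```python
import math
from typing import Dict, List, Sequence

TCP_FLAGS = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")

def ip_to_vector(ip: str) -> List[float]:
    """Decompose a dotted IPv4 address into four numeric components in [0, 1]."""
    return [int(octet) / 255.0 for octet in ip.split(".")]

def port_category(port: int) -> float:
    """Map a port number to a coarse category (assumed: IANA port ranges)."""
    if port < 1024:
        return 0.0  # well-known ports
    if port < 49152:
        return 1.0  # registered ports
    return 2.0      # dynamic/private ports

def flags_to_bits(flags: Sequence[str]) -> List[float]:
    """Convert the set TCP flags into one binary feature per flag."""
    present = set(flags)
    return [1.0 if name in present else 0.0 for name in TCP_FLAGS]

def derived_features(record: Dict) -> List[float]:
    """Return [packets per second, bytes per packet] for one flow record."""
    duration = max(record["duration_s"], 1e-6)  # guard against zero-length flows
    packets = max(record["packets"], 1)
    return [record["packets"] / duration, record["bytes"] / packets]

def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Z-score each column using its mean and (population) standard deviation."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(dims)]
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]

def sliding_windows(rows: List[List[float]], window: int, stride: int = 1):
    """Cut the row sequence into fixed-length, possibly overlapping windows."""
    return [rows[i:i + window] for i in range(0, len(rows) - window + 1, stride)]

def preprocess(records: List[Dict], window: int):
    decoupled = [ip_to_vector(r["src_ip"]) + [port_category(r["dst_port"])]
                 + flags_to_bits(r["tcp_flags"]) for r in records]
    numeric = standardize([derived_features(r) for r in records])
    rows = [d + z for d, z in zip(decoupled, numeric)]
    return sliding_windows(rows, window)

# Made-up flow records, for demonstration only.
records = [
    {"src_ip": "10.0.0.1", "dst_port": 443, "tcp_flags": ["SYN"],
     "packets": 10, "bytes": 5000, "duration_s": 2.0},
    {"src_ip": "10.0.0.2", "dst_port": 8080, "tcp_flags": ["SYN", "ACK"],
     "packets": 4, "bytes": 400, "duration_s": 1.0},
    {"src_ip": "10.0.0.1", "dst_port": 53000, "tcp_flags": ["ACK", "PSH"],
     "packets": 20, "bytes": 30000, "duration_s": 4.0},
    {"src_ip": "192.168.1.9", "dst_port": 22, "tcp_flags": ["FIN", "ACK"],
     "packets": 2, "bytes": 120, "duration_s": 0.5},
]
samples = preprocess(records, window=3)
print(len(samples), len(samples[0]), len(samples[0][0]))  # windows, steps, features
```

With four records and a window of three, the sketch yields two overlapping samples of three time steps each, with thirteen features per step.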
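
The cross-attention fusion of claims 6 and 13 can be sketched in plain Python as follows: queries come from the global context features, keys and values from the local temporal features, the scaled dot-product scores pass through softmax, and the weighted values are added back to the global features before a linear projection and group normalization. All dimensions, the random weight matrices, the number of groups and the function names are hypothetical; a trained model would learn the weights, and the claims do not fix any of these values.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]

def softmax(row: List[float]) -> List[float]:
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def group_norm(vec: List[float], groups: int, eps: float = 1e-5) -> List[float]:
    """Normalize each contiguous group of features to zero mean, unit variance."""
    size = len(vec) // groups
    out: List[float] = []
    for g in range(groups):
        chunk = vec[g * size:(g + 1) * size]
        mean = sum(chunk) / size
        var = sum((v - mean) ** 2 for v in chunk) / size
        out.extend((v - mean) / math.sqrt(var + eps) for v in chunk)
    return out

def cross_attention_fuse(global_feats: Matrix, local_feats: Matrix,
                         w_q: Matrix, w_k: Matrix, w_v: Matrix, w_out: Matrix,
                         groups: int = 2) -> Tuple[Matrix, Matrix]:
    q = matmul(global_feats, w_q)  # queries from the global context features
    k = matmul(local_feats, w_k)   # keys from the local temporal features
    v = matmul(local_feats, w_v)   # values from the local temporal features
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]
    attended = matmul(weights, v)
    # Residual connection with the original global context features.
    residual = [[a + g for a, g in zip(a_row, g_row)]
                for a_row, g_row in zip(attended, global_feats)]
    projected = matmul(residual, w_out)  # linear projection to a smaller dimension
    return [group_norm(row, groups) for row in projected], weights

def random_matrix(rng: random.Random, rows: int, cols: int) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: 3 time steps, 8-dimensional features, fused dimension 6.
rng = random.Random(0)
steps, dim, fused_dim = 3, 8, 6
global_feats = random_matrix(rng, steps, dim)
local_feats = random_matrix(rng, steps, dim)
fused, attn = cross_attention_fuse(
    global_feats, local_feats,
    random_matrix(rng, dim, dim), random_matrix(rng, dim, dim),
    random_matrix(rng, dim, dim), random_matrix(rng, dim, fused_dim))
print(len(fused), len(fused[0]))
```

Each row of `attn` sums to one, so every time step of the global stream receives a convex combination of the local stream's value vectors; this is one way to read the "dynamic adaptive weighted fusion" wording.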
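
Claims 7 and 14 define the total loss as a weighted sum over the four task losses and adjust the learning rate by exponential decay. The sketch below shows one possible reading. The per-task loss functions (binary cross-entropy, softmax cross-entropy, mean squared error), the task weights, the dictionary keys and the decay rate are assumptions, since the claims name only the output activations and the weighted-sum structure.

```python
import math
from typing import Dict, Sequence

def binary_cross_entropy(logit: float, label: float) -> float:
    """Numerically stable BCE on a raw logit (the Sigmoid is folded in)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def softmax_cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy of softmax(logits) against an integer class index."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]

def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def total_loss(outputs: Dict, targets: Dict,
               weights: Sequence[float] = (1.0, 1.0, 0.5, 0.5)) -> float:
    """Weighted sum of the four task losses (weights are hypothetical)."""
    losses = (
        binary_cross_entropy(outputs["anomaly_logit"], targets["is_anomaly"]),
        softmax_cross_entropy(outputs["attack_logits"], targets["attack_class"]),
        mse([outputs["bandwidth"]], [targets["bandwidth"]]),
        mse(outputs["next_features"], targets["next_features"]),
    )
    return sum(w * l for w, l in zip(weights, losses))

def exponential_decay(initial_lr: float, decay_rate: float, epoch: int) -> float:
    """Learning rate after `epoch` epochs of exponential decay."""
    return initial_lr * decay_rate ** epoch

# Made-up model outputs and labels for a single sample.
outputs = {
    "anomaly_logit": 0.0,             # pre-Sigmoid score of the binary head
    "attack_logits": [0.0] * 15,      # pre-Softmax scores of the 15-class head
    "bandwidth": 3.0,                 # linear regression head
    "next_features": [1.0, 2.0],      # predicted next-step feature vector
}
targets = {
    "is_anomaly": 1.0,
    "attack_class": 3,
    "bandwidth": 3.0,
    "next_features": [1.0, 4.0],
}
print(total_loss(outputs, targets))
print(exponential_decay(1e-3, 0.5, 2))
```

In a real training loop the weighted sum would be back-propagated through all four heads and the shared feature extractor at once, which is what makes the optimization joint rather than per-task.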

Description

Network anomaly detection method, system, equipment and medium based on double-flow mixing

Technical Field

The invention relates to the technical field of network security, and in particular to a network anomaly detection method, system, equipment and medium based on double-flow mixing.

Background

With the continuous evolution of network attack techniques and the high-dimensional complexity of network traffic data, traditional network anomaly detection methods face serious challenges. The prior art has the following problems:

1) Limitations of traditional machine learning methods: methods based on statistical analysis or shallow machine learning models (e.g., decision trees, support vector machines) have difficulty capturing high-dimensional nonlinear interactions and complex temporal dependencies. For example, when detecting DNS tunneling attacks, these methods typically rely on manually designed features (e.g., packet length, entropy, request frequency), but such features struggle to generalize, to represent subtle temporal correlations, or to accommodate highly variable traffic structures, resulting in high false positive rates.

2) Shortcomings of LSTM methods: although an LSTM is good at capturing temporal dependencies, it is inherently limited in handling long-range dependencies and understanding global semantics, often suffers from vanishing gradients, and is of limited effectiveness in modeling multi-stage attacks.

3) Although the Transformer architecture can effectively capture long-range dependencies through a self-attention mechanism, in network anomaly detection scenarios the original Transformer model lacks fine-grained modeling of local temporal patterns and has limited detection effectiveness for short-term burst anomalies (such as malicious port scanning).

4) Modality fragmentation: existing methods typically reduce network behavior to a single modality (a pure time series or a pure graph structure), ignoring the synergy between temporal and spatial features. For example, the temporal features of C2 communications in a botnet and the topological relationships among the controlled hosts are modeled separately, resulting in the loss of critical semantic information.

Therefore, how to provide a network anomaly detection method, system, equipment and medium based on double-flow mixing is a problem to be solved at present.

Disclosure of Invention

The embodiments of the invention provide a network anomaly detection method, system, equipment and medium based on double-flow mixing, to solve the above problems in the prior art.

According to a first aspect of the embodiments of the invention, a network anomaly detection method based on double-flow mixing is provided.

In one embodiment, the network anomaly detection method based on double-flow mixing includes: acquiring network traffic data, and preprocessing the network traffic data to obtain structured time-series traffic samples; constructing a double-flow hybrid extraction model based on a multi-layer bidirectional long short-term memory (LSTM) network and a multi-layer self-attention encoder, and performing feature extraction on the time-series traffic samples with the double-flow hybrid extraction model to obtain local temporal features and global context features; performing dynamic adaptive weighted fusion of the local temporal features and the global context features with a cross-attention fusion layer to obtain a fused feature vector; constructing, based on the fused feature vector, a multi-task learning framework that jointly optimizes a traffic classification task, an attack type identification task, a traffic regression task and a sequence prediction task; jointly optimizing, in an end-to-end training mode, the parameters of the double-flow hybrid extraction model, the cross-attention fusion layer and the multi-task learning framework through a back-propagation algorithm to obtain an optimal network anomaly detection model; and inputting the network traffic to be detected into the optimal network anomaly detection model, and outputting an anomalous traffic detection result, an attack type classification result, a traffic prediction value and a feature prediction value for a future time step.

In one embodiment, acquiring the network traffic data and preprocessing the network traffic data to obtain the structured time-series traffic samples includes: loading network flow records from a dataset to obtain the network traffic data; performing feature decoupling on the network traffic data, decomposing Internet Protocol addresses into four-dimensional numerical vectors, mapping port numbers to port type categories, and converting Transmission Control Protocol flag bits into binary features; generating derived features based on the decoupled features, constructing a sequence, and calculating the number of data packets per second and the number of bytes per packet to obt