CN-122027332-A - MobileViT-AESTF-based network intrusion detection method


Abstract

The invention discloses a network intrusion detection method based on MobileViT-AESTF, comprising steps S110 to S150: S110, converting network traffic data into images; S120, constructing an adaptive efficient spatiotemporal fusion module, AESTF (Adaptive Efficient Spatiotemporal Fusion); S130, designing an efficient spatiotemporal fusion unit, ESTF (Efficient Spatiotemporal Fusion); S140, designing an adaptive feature exciter, AFE (Adaptive Feature Exciter); and S150, training the constructed intrusion detection classification model and outputting classification results. The traffic-to-image conversion is based on the recurrence plot (RP): a phase space reconstruction technique converts the traffic sequence into a recurrence plot, and describing the recurrence relations among sequence states avoids the loss of temporal features caused by traditional linear mapping. The designed AESTF module strengthens interaction and dependency modeling among feature channels, improving detection accuracy and robustness while keeping the model lightweight.
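
The recurrence-plot conversion described in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the patented implementation: the function names `delay_embed` and `recurrence_plot`, the embedding dimension `dim`, the delay `delay`, and the example threshold are assumptions introduced here. The source only states that phase space reconstruction is used and that states closer than a threshold count as recurrent.

```python
import math


def delay_embed(series, dim=2, delay=1):
    """Phase space reconstruction by time-delay embedding (assumed form).

    State i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    """
    n_states = len(series) - (dim - 1) * delay
    if n_states <= 0:
        raise ValueError("series too short for the chosen dim and delay")
    return [[series[i + k * delay] for k in range(dim)] for i in range(n_states)]


def recurrence_plot(series, threshold, dim=2, delay=1):
    """Return a binary matrix: 1 where two states are closer than threshold."""
    states = delay_embed(series, dim, delay)
    n = len(states)
    plot = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Euclidean distance between reconstructed states i and j
            dist = math.dist(states[i], states[j])
            plot[i][j] = 1 if dist < threshold else 0
    return plot


# A toy alternating sequence yields a 4x4 checkerboard pattern.
rp = recurrence_plot([0.0, 1.0, 0.0, 1.0, 0.0], threshold=0.5)
for row in rp:
    print(row)
```

With `dim=1` the states reduce to the raw samples, so the matrix compares individual time points directly. The resulting binary matrix can be treated as a single-channel image for a downstream image classifier.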

Inventors

LIU SHIHAN

Assignees

North University of China (中北大学)

Dates

Publication Date: 20260512
Application Date: 20260320

Claims (6)

1. A network intrusion detection method based on MobileViT-AESTF, characterized by comprising steps S110 to S150: S110, preprocessing the original network traffic data and then mapping the one-dimensional traffic sequence into a two-dimensional image representation using the recurrence plot (RP) method; S120, constructing an adaptive efficient spatiotemporal fusion module, AESTF; S130, designing an efficient spatiotemporal fusion unit, ESTF (Efficient Spatiotemporal Fusion), which performs feature separation and fusion to fuse spatiotemporal features; S140, designing an adaptive feature exciter, AFE (Adaptive Feature Exciter), which is responsible for feature refinement and enhancement and adaptively strengthens key feature representations; and S150, training the constructed intrusion detection classification model and outputting classification results.
2. The network intrusion detection method according to claim 1, wherein, in the data processing, data cleaning and feature extraction are first performed on the network traffic data, and the one-dimensional traffic sequence is then mapped into a two-dimensional recurrence image representation using the recurrence plot (RP) method. For a time series $x = (x_1, x_2, \dots, x_N)$ of length $N$, the Euclidean distance between two time points $i$ and $j$ is calculated: $D_{i,j} = \lVert x_i - x_j \rVert$. The distance matrix is converted into a binary recurrence matrix $R$, wherein $R_{i,j} = 1$ indicates that the distance between time points $i$ and $j$ is less than a threshold $\varepsilon$ (i.e., similar) and $R_{i,j} = 0$ indicates that the distance is greater than or equal to the threshold (i.e., dissimilar): $R_{i,j} = 1$ if $D_{i,j} < \varepsilon$, and $R_{i,j} = 0$ otherwise.
3. The network intrusion detection method according to claim 1, wherein the AESTF module consists of the efficient spatiotemporal fusion unit ESTF (Efficient Spatiotemporal Fusion) and the adaptive feature exciter AFE (Adaptive Feature Exciter), embedded in a residual connection structure. One forward enhancement calculation is performed on the input feature map $X$ to generate an enhanced feature map: $X_{\mathrm{enh}} = \mathrm{AFE}(\mathrm{ESTF}(X))$. The enhanced feature map and the original input feature map are then accumulated element by element to form a fused feature map. To strengthen the model's ability to capture key features, the fused feature map is residually connected with the original input feature map once more, yielding the final output feature map: $Y = (X_{\mathrm{enh}} + X) + X$.
4. The AESTF module of claim 3, wherein, in the ESTF unit, the input feature map $X$ is split into two parallel processing paths, a spatial attention path (Branch X) and a channel attention path (Branch Y). Branch X is the spatial attention path: the input is downsampled by adaptive max pooling (Adaptive Max Pooling, AMPool), a lightweight depthwise separable convolution extracts features $F_{\mathrm{dw}}$ while the spatial variance $F_{\mathrm{var}}$ is computed, and the two are fused with weights given by two learnable scaling parameters $\alpha$ and $\beta$: $F_{m} = \alpha F_{\mathrm{dw}} + \beta F_{\mathrm{var}}$. The result then passes through a convolution and a GELU activation function, is upsampled to the original resolution by nearest-neighbor interpolation, and is multiplied element by element with the original feature map $X$: $F_{X} = X \odot \mathrm{Up}(\mathrm{GELU}(\mathrm{Conv}(F_{m})))$. Branch Y is the channel attention path: a depthwise separable convolution performs the convolution in the spatial dimension, and a point-wise convolution (Pointwise Conv, PW) completes the channel expansion. A GELU activation function then provides the nonlinear mapping, and a second point-wise convolution maps the features back to the original channel dimension: $F_{Y} = \mathrm{PW}_2(\mathrm{GELU}(\mathrm{PW}_1(\mathrm{DWConv}(X))))$. Finally, feature fusion is performed: the spatially enhanced features and the channel-enhanced features are added element by element, and a convolution restores the number of channels: $F_{\mathrm{ESTF}} = \mathrm{Conv}(F_{X} + F_{Y})$.
5. The AESTF module of claim 3, wherein the AFE (Adaptive Feature Exciter) module refines the features output by the ESTF unit. An energy function is calculated for each neuron of the input. For a feature map $X$, the energy function is calculated as follows: [formula not preserved in the source text], wherein $\mu$ is the spatial mean within the channel and $\sigma^2$ is the spatial variance within the channel. The refined features are obtained by multiplying the original features by a scaling derived from the energy function; the calculation formula is: [formula not preserved in the source text].
6. The network intrusion detection method according to claim 1, wherein, in the classification model training, end-to-end training is performed using a cross-entropy loss function, and the optimization objective is: $\min_{\theta} \mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\log p_{i,c}$, wherein $N$ is the number of training samples, $y_{i}$ is the true class label (one-hot encoded) of the $i$-th sample with components $y_{i,c}$, $p_{i,c}$ is the model's predicted probability that the $i$-th sample belongs to the $c$-th class, and $\theta$ denotes all learnable parameters of the model.
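
The cross-entropy objective of claim 6 can be illustrated with a small numeric sketch. It assumes the common form of the loss averaged over the training samples and assumes the predicted probabilities come from a softmax; the helper names `softmax` and `cross_entropy` and the clipping constant `eps` are illustrative and are not taken from the source.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot_labels, eps=1e-12):
    """Average cross-entropy over samples for one-hot encoded labels."""
    n_samples = len(probs)
    loss = 0.0
    for p_i, y_i in zip(probs, one_hot_labels):
        # Only the true class contributes, because y is one-hot.
        loss -= sum(y * math.log(max(p, eps)) for p, y in zip(p_i, y_i))
    return loss / n_samples


# Two samples, two classes (for example benign vs. attack).
probs = [softmax([2.0, 0.0]), softmax([0.0, 0.0])]
labels = [[1, 0], [0, 1]]
loss = cross_entropy(probs, labels)
print(round(loss, 4))
```

In practice a deep learning framework would compute this loss and its gradients with respect to the learnable parameters; the sketch only shows what quantity is being minimized.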

Description

MobileViT-AESTF-based network intrusion detection method

Technical Field

The invention relates to the fields of network security, deep learning, and lightweight models, and in particular to a network intrusion detection method based on multi-scale feature fusion.

Background

With the rapid development of internet technology, network security problems are becoming increasingly serious. Network traffic keeps growing in complexity, and attacks are becoming larger in scale and more covert. According to the latest forecast released by the Computer Crime Research Center (CCRC), losses caused by cybercrime are projected to reach 12 trillion US dollars in 2025. Traditional intrusion detection systems (IDS) based on manual rules and statistical features suffer from insufficient feature expression and real-time response bottlenecks when facing network attacks. In recent years, with the continuous development of artificial intelligence, intrusion detection methods based on machine learning and deep learning have been widely adopted, with image-based feature representation and lightweight model design becoming two core directions.

In image-based processing, researchers convert network traffic into two-dimensional images to improve the recognition of network attacks. Demmese et al. [1] converted malicious traffic into grayscale images and combined them with a CNN to accurately identify fileless malware. Liu Wenqi et al. [2] introduced the DeepInsight technique to optimize the traffic-to-image conversion and addressed small-sample generalization through transfer learning. Li et al. [3] fused a CNN and an RNN in industrial scenarios to capture long-period attack patterns from the imaged traffic.
Lightweight techniques aim to overcome the resource limits of device deployment and to speed up intrusion detection. Yao Jun et al. [4] proposed a MobileViT-based lightweight detection model that reduces computational complexity while preserving visual feature extraction capability. Alshehri et al. [5] simplified the convolution operations by using a self-attention mechanism and achieved 98% detection accuracy on industrial equipment. These advances have driven the development of lightweight models in the field of intrusion detection.

[1] DEMMESE F A, NEUPANE A, KHORSANDROO S, et al. Machine learning based fileless malware traffic classification using image visualization[J]. Cybersecurity, 2023, 6(1): 1-18.
[2] LIU Wenqi, HU Tao, YAN Jie, et al. Network intrusion detection technology based on DeepInsight and transfer learning[J]. Chinese Journal of Engineering, 2024, 46(12): 2238-2245.
[3] LI S, CHAI G, WANG Y, et al. CRSF: An Intrusion Detection Framework for Industrial Internet of Things Based on Pretrained CNN2D-RNN and SVM[J]. IEEE Access, 2023, 11: 92041-92054.
[4] YAO Jun, SUN FangChao. Research on lightweight intrusion detection model based on MobileViT[J]. Modern Electronics Technique, 2024, 47(19): 33-39.
[5] TIWARI R S, LAKSHMI D, DAS T K, et al. A lightweight optimized intrusion detection system using machine learning for edge-based IIoT security[J]. Telecommunication Systems, 2024, 87: 605-624.

Disclosure of Invention

In view of the above problems, the invention provides a network intrusion detection method based on MobileViT-AESTF, which addresses the high computational resource consumption and insufficient detection accuracy of existing network intrusion detection.
To solve these problems, the invention adopts the following technical scheme: a network intrusion detection method based on MobileViT-AESTF, characterized by comprising steps S110 to S150:

S110, preprocessing the original network traffic data and then mapping the one-dimensional traffic sequence into a two-dimensional image representation using the recurrence plot (RP) method;
S120, constructing an adaptive efficient spatiotemporal fusion module, AESTF;
S130, designing an efficient spatiotemporal fusion unit, ESTF (Efficient Spatiotemporal Fusion), which performs feature separation and fusion to fuse spatiotemporal features;
S140, designing an adaptive feature exciter, AFE (Adaptive Feature Exciter), which is responsible for feature refinement and enhancement and adaptively strengthens key feature representations;
S150, training the constructed intrusion detection classification model and outputting classification results.

In the data processing, data cleaning and feature extraction are first performed on the network traffic data, and the one-dimensional traffic sequence is then mapped into a two-dimensional recurrence image representation using the recurrence plot (RP) method. For a time series $x = (x_1, x_2, \dots, x_N)$ of length $N$, the Euclidean distance between two time points $i$ and $j$ is calculated: $D_{i,j} = \lVert x_i - x_j \rVert$. The distance matrix is converted into a binary recurrence matrix $R$, wherein $R_{i,j} = 1$ indicates that the distance between time points $i$ and $j$ is less than a threshold $\varepsilon$ (i.e., similar), $R_{i,j} = 0$ indicati