CN-122027369-A - Botnet detection method and system based on fusion characteristics


Abstract

The invention relates to a botnet detection method and system based on fusion features, belongs to the technical field of network security, and solves the problem that existing methods cannot maintain high recognition performance and stability when facing concealed botnet behavior or network noise interference. The method comprises: constructing a botnet detection data set; extracting traditional graph features and deep graph features; fusing the features with a multi-dimensional weight feature fusion module; constructing a botnet detector; training and testing the botnet detector with a training set and a testing set, respectively; and detecting the data to be detected with the tested botnet detector to output a botnet detection result. By fusing the traditional graph features with the deep graph features, the method constructs a more discriminative feature representation. The botnet detector, designed on the basis of the fusion features and deep learning, better captures the multi-dimensional structural information of the graph, which improves the recognition accuracy, stability and processing efficiency for abnormal botnet behavior.

Inventors

WANG BO
LIU ZIXING
ZHAO YANPING
YAO BAOHUA
FENG ZHIYUAN
ZHANG HAIRONG
WANG HAIYAN

Assignees

Jilin University (吉林大学)

Dates

Publication Date: 20260512
Application Date: 20260415

Claims (10)

1. A botnet detection method based on fusion features, characterized by comprising the following steps: Step 1, collecting enterprise network communication traffic data, modeling the enterprise network nodes and their communication relations as network traffic graphs, constructing a botnet detection data set comprising normal network traffic graphs and botnet traffic graphs, and dividing the botnet detection data set into a training set and a testing set; Step 2, for each network traffic graph, extracting traditional graph features and extracting deep graph features with a multi-branch multi-scale residual feature extractor, and fusing the traditional graph features and the deep graph features with a multi-dimensional weight feature fusion module to obtain a fusion feature matrix, wherein the traditional graph features comprise degree distribution features, mean clustering coefficient features, average path length features, graph diameter features and spectral features; Step 3, constructing a botnet detector that takes the fusion feature matrix as input, training the botnet detector with the training set, testing the trained botnet detector with the testing set, and detecting the data to be detected with the tested botnet detector to output a botnet detection result.
2. The botnet detection method based on fusion features as set forth in claim 1, wherein Step 1 comprises the following steps: Step 1.1, acquiring the labeled bidirectional NetFlow file from each scenario of the CTU-13 data set, retaining the core fields of the bidirectional NetFlow file, and preprocessing to obtain preprocessed basic traffic data; Step 1.2, constructing time-window traffic graphs from the preprocessed basic traffic data by a time-window division method; Step 1.3, labeling each time-window traffic graph according to the labels of the flows it contains to obtain normal network traffic graphs and botnet traffic graphs, which together form the botnet detection data set; Step 1.4, dividing the botnet detection data set into a training set and a testing set at a ratio of 8:2.
3. The botnet detection method based on fusion features as set forth in claim 1, wherein the process of extracting deep graph features using the multi-branch multi-scale residual feature extractor comprises the following steps: Step 2.2.1, calculating the residual matrix of the network traffic graph as $R = A - \bar{A}$, wherein $R$ denotes the residual matrix of the network traffic graph, $A$ denotes the actual adjacency matrix of the network traffic graph, and $\bar{A}$ denotes the mean of the adjacency matrices of all normal network traffic graphs; Step 2.2.2, constructing the multi-branch multi-scale residual feature extractor, and passing the residual matrix $R$ through the multi-branch multi-scale residual feature extractor to obtain the deep graph features.
4. The botnet detection method based on fusion features of claim 3, wherein the multi-branch multi-scale residual feature extractor comprises an input projection layer, a multi-scale parallel branch module, a branch fusion module, a residual connection module and a global representation module; the input projection layer comprises a convolution layer, a normalization layer and a nonlinear activation layer; the multi-scale parallel branch module comprises a small-scale convolution branch, a medium-scale convolution branch, a large-scale dilated convolution branch, a row/column convolution branch and a nonlinear activation layer; the branch fusion module comprises a concatenation layer, a channel attention mechanism layer, a convolution layer, a normalization layer and a dropout layer; the residual connection module directly adds the output of the input projection layer to the output of the branch fusion module and outputs the resulting features to the global representation module; the global representation module comprises a global average pooling layer, a linear layer, a normalization layer, a nonlinear activation layer and a dropout layer.
5. The botnet detection method based on fusion features as recited in claim 4, wherein the operation of the multi-branch multi-scale residual feature extractor on the input residual matrix $R$ comprises the following steps: adding a channel dimension to the input residual matrix $R$, so that the dimension of the residual matrix becomes $B \times 1 \times H \times W$, wherein $B$ is the batch size, and $H$ and $W$ denote the height and width of the residual matrix $R$, respectively; passing the residual matrix with the added channel dimension first through a convolution layer that increases the number of channels from 1 to $C$, and then through a batch normalization layer and a nonlinear activation layer, to obtain the initial features $F_0$, whose data dimension is $B \times C \times H \times W$; passing the initial features $F_0$ through four convolution layers of different sizes to extract convolution features of different scales, expressed as: $F_1 = \mathrm{Conv}_{\mathrm{small}}(F_0)$; $F_2 = \mathrm{Conv}_{\mathrm{medium}}(F_0)$; $F_3 = \mathrm{Conv}_{\mathrm{dilated}}(F_0)$; $F_4 = \mathrm{Conv}_{\mathrm{row/col}}(F_0)$; wherein $\mathrm{Conv}_{\mathrm{small}}$ denotes the small-scale two-dimensional convolution operation and $F_1$ denotes the features extracted by it; $\mathrm{Conv}_{\mathrm{medium}}$ denotes the medium-scale two-dimensional convolution operation and $F_2$ denotes the features extracted by it; $\mathrm{Conv}_{\mathrm{dilated}}$ denotes the large-scale two-dimensional dilated convolution operation and $F_3$ denotes the features extracted after dilated convolution; $\mathrm{Conv}_{\mathrm{row/col}}$ denotes the row and column two-dimensional convolution operations and $F_4$ denotes the features extracted after row/column convolution; passing the four convolution features $F_1$, $F_2$, $F_3$, $F_4$ through nonlinear activation layers to obtain the features $\hat{F}_1$, $\hat{F}_2$, $\hat{F}_3$, $\hat{F}_4$, respectively, the channel dimension of each branch output being kept consistent; after the nonlinear activation is completed, concatenating the four sets of features $\hat{F}_1$, $\hat{F}_2$, $\hat{F}_3$ and $\hat{F}_4$ along the channel dimension to form a fused multi-scale feature matrix $F_{\mathrm{ms}}$, whose channel dimension is four times that of a single branch; inputting the multi-scale feature matrix $F_{\mathrm{ms}}$ into the channel attention mechanism layer to obtain weighted features $F_{\mathrm{att}}$, compressing the channels of $F_{\mathrm{att}}$ back to $C$ through a convolution operation and a batch normalization layer, and randomly discarding part of the neurons through a dropout layer to obtain the features $F_{\mathrm{c}}$, whose dimension is $B \times C \times H \times W$; adding the features $F_{\mathrm{c}}$ to the initial features $F_0$ element by element to obtain the features after residual connection, $F_{\mathrm{res}} = F_{\mathrm{c}} + F_0$, whose dimension is $B \times C \times H \times W$; inputting the features $F_{\mathrm{res}}$ into the global average pooling layer, and then passing them in sequence through a linear layer and a dropout layer to obtain the deep graph feature matrix $F_d$.
6. The botnet detection method based on fusion features of claim 5, wherein the channel and feature dimension parameters of the multi-branch multi-scale residual feature extractor take the values 64, 32, 64 and 128, respectively.
7. The botnet detection method based on fusion features as claimed in any one of claims 1 to 6, wherein the process of fusing the traditional graph features and the deep graph features by using the multi-dimensional weight feature fusion module to obtain the fusion feature matrix comprises the following steps: concatenating the deep graph feature matrix $F_d$ and the traditional feature matrix $F_t$ along the channel dimension through a Cat operation to obtain an overall feature matrix $F_{\mathrm{all}}$; inputting the overall feature matrix $F_{\mathrm{all}}$ into the multi-dimensional weight feature fusion module, which completes the following fusion process in sequence: generating the global weight proportions $\alpha = (\alpha_d, \alpha_t)$ of the deep graph features and the traditional features by a multi-layer perceptron combined with a Softmax activation function; generating local dimension attention weights for the deep graph feature matrix $F_d$ and the traditional feature matrix $F_t$ through their respective weighting sub-networks, applying sigmoid activation to obtain the normalized weight matrices $W_d$ and $W_t$, and multiplying each feature matrix element by element with its normalized weight matrix to obtain the locally adjusted deep graph feature matrix $F_d' = F_d \odot W_d$ and the locally adjusted traditional feature matrix $F_t' = F_t \odot W_t$; expanding the part $\alpha_d$ of the global weight proportions that corresponds to the deep graph feature matrix, through a broadcasting mechanism, to match the dimensions of the locally adjusted deep graph feature matrix $F_d'$, the result being denoted $\tilde{\alpha}_d$, and multiplying $F_d'$ element by element with $\tilde{\alpha}_d$ to obtain the globally weighted deep graph feature matrix $F_d'' = F_d' \odot \tilde{\alpha}_d$; expanding the part $\alpha_t$ of the global weight proportions that corresponds to the traditional feature matrix, through a broadcasting mechanism, to match the dimensions of the locally adjusted traditional feature matrix $F_t'$, the result being denoted $\tilde{\alpha}_t$, and multiplying $F_t'$ element by element with $\tilde{\alpha}_t$ to obtain the globally weighted traditional feature matrix $F_t'' = F_t' \odot \tilde{\alpha}_t$; concatenating the globally weighted deep graph feature matrix $F_d''$ and the globally weighted traditional feature matrix $F_t''$ into a whole, denoted $F_{\mathrm{cat}}$, generating channel weights from the concatenated feature matrix $F_{\mathrm{cat}}$ through one-dimensional convolution, and multiplying the channel weights with the concatenated features channel by channel to obtain the fusion feature matrix $F_{\mathrm{fusion}}$.
8. The botnet detection method based on fusion features as set forth in claim 7, wherein the extracted degree distribution features, mean clustering coefficient features, average path length features, graph diameter features and spectral features are integrated into a traditional feature vector $f_t$, and the traditional feature vectors are batched to obtain the traditional feature matrix $F_t$.
9. The botnet detection method based on fusion features of any one of claims 1-6, wherein the botnet detector comprises two linear modules, two linear layers, one nonlinear activation layer and one Softmax layer, wherein each linear module comprises one linear layer, one normalization layer, one nonlinear activation layer and one dropout layer, and wherein the botnet detector is trained using a cross-entropy loss function and an Adam optimizer.
10. A botnet detection system based on fusion features, characterized by comprising: a data set construction module, configured to collect enterprise network communication traffic data, model the enterprise network nodes and their communication relations as network traffic graphs, construct a botnet detection data set comprising normal network traffic graphs and botnet traffic graphs, and divide the botnet detection data set into a training set and a testing set; a feature extraction and fusion module, configured to, for each network traffic graph, extract traditional graph features and extract deep graph features with a multi-branch multi-scale residual feature extractor, and fuse the traditional graph features and the deep graph features with a multi-dimensional weight feature fusion module to obtain a fusion feature matrix, wherein the traditional graph features comprise degree distribution features, mean clustering coefficient features, average path length features, graph diameter features and spectral features; and a detector module, configured to construct a botnet detector that takes the fusion feature matrix as input, train the botnet detector with the training set, test the trained botnet detector with the testing set, detect the data to be detected with the tested botnet detector, and output a botnet detection result.
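
The data set construction described in claim 2 (time-window division, labeling of each window graph, and the 8:2 split) can be sketched roughly as follows. This is a minimal illustration and not the claimed implementation: the record layout `(start_time, src_ip, dst_ip, label)`, the 60-second window length, the rule that a window counts as a botnet graph when any of its flows carries a label containing "botnet", and the function names `build_window_graphs` and `split_dataset` are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_window_graphs(flows, window_seconds=60.0):
    """Group flows into fixed time windows and build one labeled graph per window.

    flows: iterable of (start_time, src_ip, dst_ip, label) tuples (hypothetical layout).
    Returns a list of dicts with keys 'window', 'edges' and 'label'.
    """
    windows = defaultdict(lambda: {"edges": defaultdict(int), "bot": False})
    for start_time, src, dst, label in flows:
        idx = int(start_time // window_seconds)   # index of the time window
        win = windows[idx]
        win["edges"][(src, dst)] += 1             # edge weight = number of flows
        if "botnet" in label.lower():             # assumed labeling rule
            win["bot"] = True
    graphs = []
    for idx in sorted(windows):
        win = windows[idx]
        graphs.append({
            "window": idx,
            "edges": dict(win["edges"]),
            "label": 1 if win["bot"] else 0,      # 1 = botnet graph, 0 = normal graph
        })
    return graphs


def split_dataset(graphs, train_ratio=0.8, seed=0):
    """Shuffle the graphs reproducibly and split them into training and testing sets."""
    shuffled = list(graphs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Toy flow records; real input would come from the preprocessed NetFlow data.
flows = [
    (0.0, "10.0.0.1", "10.0.0.2", "flow=Normal"),
    (12.5, "10.0.0.1", "10.0.0.2", "flow=Normal"),
    (61.0, "10.0.0.3", "10.0.0.9", "flow=From-Botnet"),
    (75.0, "10.0.0.2", "10.0.0.1", "flow=Normal"),
    (130.0, "10.0.0.4", "10.0.0.2", "flow=Background"),
]
graphs = build_window_graphs(flows)
train_set, test_set = split_dataset(graphs)
```

Keeping the window index in each graph makes it possible to trace a detection result back to the time span it came from.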
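
The residual matrix of claim 3 can be illustrated with a small worked example. The sketch assumes that the residual is the element-wise difference between the observed adjacency matrix and the mean adjacency matrix of the normal graphs, and that all graphs share the same node ordering and size; the function names `mean_adjacency` and `residual_matrix` are hypothetical.

```python
def mean_adjacency(normal_adjs):
    """Element-wise mean of equally sized adjacency matrices (lists of lists)."""
    n = len(normal_adjs[0])
    count = len(normal_adjs)
    return [
        [sum(adj[i][j] for adj in normal_adjs) / count for j in range(n)]
        for i in range(n)
    ]


def residual_matrix(adj, mean_adj):
    """Observed adjacency minus the mean adjacency of the normal graphs."""
    return [
        [a - m for a, m in zip(row_a, row_m)]
        for row_a, row_m in zip(adj, mean_adj)
    ]


# Two toy normal graphs on three nodes and one observed graph.
normal_1 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
normal_2 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
observed = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

mean_adj = mean_adjacency([normal_1, normal_2])
residual = residual_matrix(observed, mean_adj)
```

Entries of the residual that are far from zero mark connections that deviate from the normal communication pattern, which is the signal the deep feature extractor then operates on.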
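
The traditional graph features named in claims 1 and 8 can be computed for a small undirected graph as sketched below, using only the Python standard library. Several choices here are assumptions made for the illustration and are not stated in the claims: the degree distribution is summarized by its mean and standard deviation, the spectral feature is taken to be the largest adjacency eigenvalue (estimated by power iteration on the adjacency matrix plus the identity), path statistics are computed over reachable node pairs only, and the graph is non-empty with every neighbor also present as a key. The name `traditional_features` is hypothetical.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def traditional_features(adj, iterations=200):
    """Return [mean degree, degree std, mean clustering, avg path length, diameter, top eigenvalue]."""
    nodes = sorted(adj)
    n = len(nodes)
    degrees = [len(adj[u]) for u in nodes]
    mean_deg = sum(degrees) / n
    std_deg = math.sqrt(sum((d - mean_deg) ** 2 for d in degrees) / n)

    # Mean clustering coefficient: fraction of connected neighbor pairs per node.
    clustering = []
    for u in nodes:
        nbrs = list(adj[u])
        k = len(nbrs)
        if k < 2:
            clustering.append(0.0)
            continue
        links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
        clustering.append(2.0 * links / (k * (k - 1)))
    mean_clustering = sum(clustering) / n

    # Average shortest path length and diameter over reachable pairs.
    total, pairs, diameter = 0, 0, 0
    for u in nodes:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    avg_path = total / pairs if pairs else 0.0

    # Spectral feature: power iteration on A + I, then undo the shift.
    index = {u: i for i, u in enumerate(nodes)}
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [x[index[u]] + sum(x[index[v]] for v in adj[u]) for u in nodes]
        lam = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x) - 1.0
        norm = math.sqrt(sum(val * val for val in y))
        x = [val / norm for val in y]
    return [mean_deg, std_deg, mean_clustering, avg_path, float(diameter), lam]


# Triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
features = traditional_features(adj)
# Batching: one feature vector per graph, stacked into a matrix.
feature_matrix = [traditional_features(g) for g in [adj, adj]]
```

Shifting by the identity keeps the power iteration from oscillating on bipartite graphs, whose adjacency spectrum is symmetric around zero.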
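
The weighting arithmetic of the multi-dimensional weight feature fusion in claim 7 can be traced on a single sample with the sketch below. It is not the claimed module: the learned components (the multi-layer perceptron, the two weighting sub-networks and the one-dimensional convolution kernel) are replaced by fixed scores passed in as arguments, the sigmoid applied to the convolution output is an assumption, and the names `fuse` and `conv1d_same` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_same(values, kernel):
    """1-D convolution with zero padding; output length equals input length (odd kernel assumed)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                acc += w * values[j]
        out.append(acc)
    return out


def fuse(deep, trad, global_scores, deep_scores, trad_scores, kernel):
    """Fuse one deep feature vector and one traditional feature vector."""
    # Global weight proportions of the two feature groups (sum to 1).
    alpha_deep, alpha_trad = softmax(global_scores)
    # Local per-dimension attention weights in (0, 1), applied element by element.
    deep_local = [f * sigmoid(s) for f, s in zip(deep, deep_scores)]
    trad_local = [f * sigmoid(s) for f, s in zip(trad, trad_scores)]
    # Broadcast each scalar global weight over its feature group.
    deep_global = [alpha_deep * f for f in deep_local]
    trad_global = [alpha_trad * f for f in trad_local]
    # Concatenate, derive channel weights by 1-D convolution, reweight channel by channel.
    spliced = deep_global + trad_global
    channel_weights = [sigmoid(c) for c in conv1d_same(spliced, kernel)]  # sigmoid is an assumption
    return [f * w for f, w in zip(spliced, channel_weights)]


# All-zero scores make every sigmoid 0.5 and the softmax 0.5/0.5,
# so the result is easy to check by hand.
fused = fuse(
    deep=[1.0, 2.0],
    trad=[4.0],
    global_scores=[0.0, 0.0],
    deep_scores=[0.0, 0.0],
    trad_scores=[0.0],
    kernel=[0.0, 0.0, 0.0],
)
```

In a trained model the scores would be produced from the features themselves, so the three levels of weighting (global, per-dimension and per-channel) would adapt to each input graph.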

Description

Botnet detection method and system based on fusion characteristics

Technical Field

The invention relates to the technical field of network security, in particular to a botnet detection method and system based on fusion features.

Background

A botnet is a distributed network consisting of a large number of maliciously controlled terminal devices (bot nodes). Botnets pose a serious threat to information systems and business operations in an enterprise network environment, where attackers can exploit infected terminals or servers inside the enterprise to launch distributed denial of service (DDoS) attacks, send spam, commit click fraud, or steal sensitive data. This not only occupies enterprise network bandwidth and computing resources, causing communication delays and service disruption in business systems, but may also lead to the leakage of business secrets and the compromise of customer privacy. Meanwhile, botnets often adopt strategies such as encrypted tunnels, protocol obfuscation and covert communication to evade traditional security monitoring, so that enterprise data is stolen or malicious instructions are injected covertly, damaging data integrity and service continuity. The distributed, self-propagating and hierarchical control characteristics of botnets further increase the difficulty of tracking, isolating and defending against them, and pose a serious challenge to enterprise communication security, data privacy and overall network stability.

At present, botnet detection methods mainly comprise anomaly detection based on traffic statistics and protocol analysis, and traffic behavior analysis based on machine learning. The former usually analyzes statistical characteristics such as packet size, access frequency and connection pattern, or parses communication protocol content, to find abnormal behavior; the latter models and detects network traffic features using machine learning algorithms such as the support vector machine (SVM) and Random Forest (RF). However, in a complex enterprise network environment, with its large node scale, dense communication relations and diverse service traffic patterns, existing methods still suffer from insufficient recognition accuracy and processing efficiency. These methods often have difficulty fully integrating the statistical information and topological structure features of network traffic, which affects the accurate identification of concealed botnet behavior. Therefore, a new technical solution is needed to improve the ability of computer systems to detect botnets, and their processing efficiency, in the enterprise network environment.

Disclosure of Invention

In order to solve the problem that existing methods cannot maintain high recognition performance and stability when facing concealed botnet behavior or network noise interference, the invention provides a botnet detection method and system based on fusion features.
The technical scheme adopted by the invention is as follows: a botnet detection method based on fusion features comprises the following steps: Step 1, collecting enterprise network communication traffic data, modeling the enterprise network nodes and their communication relations as network traffic graphs, constructing a botnet detection data set comprising normal network traffic graphs and botnet traffic graphs, and dividing the botnet detection data set into a training set and a testing set; Step 2, for each network traffic graph, extracting traditional graph features and extracting deep graph features with a multi-branch multi-scale residual feature extractor, and fusing the traditional graph features and the deep graph features with a multi-dimensional weight feature fusion module to obtain a fusion feature matrix, wherein the traditional graph features comprise degree distribution features, mean clustering coefficient features, average path length features, graph diameter features and spectral features; Step 3, constructing a botnet detector that takes the fusion feature matrix as input, training the botnet detector with the training set, testing the trained botnet detector with the testing set, and detecting the data to be detected with the tested botnet detector to output a botnet detection result. Correspondingly, the invention also provides a botnet detection system based on fusion features, which comprises: The data set construction module is used for collecting enterprise network communication traffic data, modeling the enterprise network nodes and their communication relations as network traffic graphs, constructing a botnet detection data set comprising normal network traffic graphs and botnet traffic graphs, and dividing the botnet detection data set into a training set and a testing set; The feature extraction and fusion module is used for extracting traditional graph