CN-121984723-A - Abnormal encrypted traffic detection method and storage medium based on adversarial domain adaptation


Abstract

The application discloses an abnormal encrypted traffic detection method and a storage medium based on adversarial domain adaptation, and relates to the field of network data monitoring. The method comprises: obtaining training data from target domain data and source domain data, and encoding the data to obtain time-series features; obtaining, through a feature extractor, encoded features comprising predicted values from the time-series features; obtaining respective loss values through a class predictor and a domain classifier; back-propagating gradients to the feature extractor according to the loss values to update its parameters; and forming an abnormal encrypted traffic prediction model from the feature extractor and the class predictor. The method replaces positional encoding with time encoding to improve prediction accuracy, and introduces adversarial domain learning to the problem of abnormal encrypted traffic detection, so that a large amount of unlabeled data from real scenarios can also be used as training input. This improves generalization capability and enables automatic detection of abnormal encrypted traffic data.

Inventors

RUAN JUNHAO
YAN CHANGLONG
DENG QINGLIN

Assignees

FiberHome Telecommunication Technologies Co., Ltd. (烽火通信科技股份有限公司)

Dates

Publication Date: 20260505
Application Date: 20260109

Claims (10)

1. An abnormal encrypted traffic detection method based on adversarial domain adaptation, characterized by comprising the following steps: obtaining training data from target domain data and source domain data, and encoding the training data to obtain time-series features, wherein the target domain data comprises unlabeled traffic-related data and the source domain data comprises labeled data whose traffic category has been determined; obtaining, through a feature extractor, encoded features comprising predicted values from the time-series features; obtaining, through a class predictor, a loss value for predicting the traffic category of the labeled data, and obtaining, through a domain classifier, a loss value for distinguishing whether data is labeled data; and back-propagating gradients to the feature extractor according to the loss values to update its parameters, after which the feature extractor and the class predictor form an abnormal encrypted traffic prediction model.
2. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to claim 1, wherein the baseline model of the feature extractor is an attention-based Transformer encoder, and encoding the training data comprises: replacing, in the Transformer encoder, the positional encoding of the training data with a time encoding, and then concatenating the time encoding with the corresponding data features to obtain the time-series features.
3. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to claim 2, wherein obtaining, through the feature extractor, the encoded features comprising predicted values from the time-series features comprises: inputting the time-series features into a self-attention layer for computation, encoding the result through a feed-forward layer, and outputting it through a linear layer to obtain the predicted values.
4. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to claim 3, wherein the formula for replacing the positional encoding of the training data with the time encoding is: [formula not reproduced], wherein T denotes the given input time sequence, F denotes a periodic activation function (the sine function is selected in this embodiment), k denotes the feature dimension of the time encoding, and two further symbols [not reproduced] denote the amplitude of the wave and the offset, respectively; after the time-series features are input into the self-attention layer for computation, the formula for the feed-forward layer encoding is: [formula not reproduced], wherein Z denotes the result computed from the time-series features, and a further symbol [not reproduced] denotes a linear transformation of the vector space obtained by weighting Z; and obtaining the predicted values after the feed-forward layer encoding and the linear layer output comprises: passing the labeled data [symbol not reproduced] through the linear layer to obtain a predicted value for predicting the traffic category of the labeled data, which is input to the class predictor; and passing the unlabeled data [symbol not reproduced] through the linear layer to obtain a predicted value for distinguishing whether data is labeled data, which is input to the domain classifier.
5. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to claim 1, wherein the loss values are obtained from the predicted values and the true values; the loss function of the class predictor is: min(Lx) = [formula not reproduced], wherein n denotes the number of predicted samples, and the remaining symbols [not reproduced] denote, in turn, a regularization term, the activation function, and the predicted value of the class predictor; the loss function of the domain classifier is: max(Ly) = [formula not reproduced], wherein the symbols [not reproduced] denote, in turn, the activation function, the predicted value of the domain classifier, and the true label of the binary classification.
6. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to claim 1, wherein back-propagating gradients to the feature extractor according to the loss values to update its parameters comprises: obtaining a total loss value from the average loss of the class predictor and the average loss of the domain classifier; the formula for the total loss value is: [formula not reproduced], wherein the symbols [not reproduced] denote, in turn, the average loss of the class predictor and its weight, and the average loss of the domain classifier and its weight; each weight is calculated as: [formula not reproduced], wherein t denotes the iteration index, T denotes a value that controls the task weights, and a further symbol [not reproduced] denotes the average loss value.
7. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to any one of claims 1 to 6, further comprising: inputting traffic to be detected into the abnormal encrypted traffic prediction model, which outputs the predicted traffic classification.
8. The abnormal encrypted traffic detection method based on adversarial domain adaptation according to any one of claims 1 to 6, wherein obtaining training data from the target domain data and the source domain data comprises: for the source domain data: performing data cleaning on the labeled data; when the amount of data for an abnormal traffic type is below a sample-size threshold, copying the data of that abnormal traffic type; converting the labeled data into values between 0 and 1; removing data irrelevant to abnormal traffic from the labeled data; and combining the data according to their correlation to obtain a plurality of labeled traffic features, each comprising a plurality of data features; and for the target domain data: converting the format of the unlabeled data to match the format of the labeled data; performing data cleaning on the unlabeled data; converting the unlabeled data into values between 0 and 1; and forming unlabeled traffic features corresponding to the data features in the labeled traffic features.
9. An abnormal encrypted traffic detection device based on adversarial domain adaptation, comprising a processor, a memory, and an abnormal encrypted traffic detection program based on adversarial domain adaptation that is stored in the memory and executable by the processor, wherein the program, when executed by the processor, implements the steps of the abnormal encrypted traffic detection method based on adversarial domain adaptation according to any one of claims 1 to 8.
10. A computer-readable storage medium storing an abnormal encrypted traffic detection program based on adversarial domain adaptation, wherein the program, when executed, implements the steps of the abnormal encrypted traffic detection method based on adversarial domain adaptation according to any one of claims 1 to 8.
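
The source-domain preprocessing recited in claim 8 (copying the samples of an under-represented abnormal traffic type and converting values to the range 0 to 1) can be illustrated with a minimal Python sketch. The function names `oversample_by_copy` and `min_max_scale`, the record layout, the use of min-max scaling, and the rule of repeating samples until a class reaches the threshold are all assumptions made for illustration; the claims do not specify them.

```python
from collections import defaultdict


def oversample_by_copy(records, threshold):
    """Copy samples of any class whose count is below `threshold`.

    `records` is a list of (features, label) pairs. How many copies to make
    is an assumption: samples are repeated in order until the class holds
    exactly `threshold` samples.
    """
    by_label = defaultdict(list)
    for features, label in records:
        by_label[label].append((features, label))

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Repeat existing samples of a small class until it reaches the threshold.
        for i in range(max(0, threshold - len(group))):
            balanced.append(group[i % len(group)])
    return balanced


def min_max_scale(rows):
    """Scale every feature column of `rows` into [0, 1] (assumed min-max scaling)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical labeled flows: (feature vector, traffic category).
records = [
    ([10.0, 200.0], "normal"),
    ([12.0, 180.0], "normal"),
    ([11.0, 220.0], "normal"),
    ([90.0, 40.0], "attack"),
]
balanced = oversample_by_copy(records, threshold=3)
features = min_max_scale([f for f, _ in balanced])
```

In this sketch the single "attack" sample is copied twice so that both categories hold three samples, and each feature column is then scaled independently.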

Description

Abnormal encrypted traffic detection method and storage medium based on adversarial domain adaptation

Technical Field

The application relates to the field of network data monitoring, and in particular to an abnormal encrypted traffic (data packet) detection method and a storage medium based on adversarial domain adaptation.

Background

With the continuous evolution of communication technology, the scale and complexity of equipment have increased significantly, network management systems have evolved from command-line management systems to Web-based network management systems, and information networks face increasingly prominent information security problems. To better protect users' personal privacy and data from theft and tampering, many network applications now use network encryption protocols to encrypt traffic. However, encryption is not limited to normal traffic: some abnormal traffic also hides its plaintext features through encryption, thereby reducing the risk that the abnormal behavior is exposed. Traffic security is therefore an unavoidable issue when communication devices are deployed in real networks.

In encrypted traffic scenarios, because an encryption protocol is used, the content of the message payload changes from plaintext to ciphertext, so the relevant fields cannot be extracted directly for traffic detection, and traditional methods based on deep packet inspection cannot effectively identify abnormal encrypted traffic. Further, although some machine learning methods can identify abnormal encrypted traffic by computing statistics over unencrypted features of the traffic (such as flow duration and the number of protocol fields), different statistical features must be designed manually for different identification scenarios. This depends heavily on expert experience, and the quality of the features directly affects the identification results.
Meanwhile, few public data sets related to abnormal encrypted traffic are available, and analysis of their distributions shows that the data are highly imbalanced. If such imbalanced data sets are used for classification learning, prediction accuracy for traffic classes with little data is very low when the model is applied to real service scenarios. In addition, because the data sets are few, the trained model generalizes poorly and produces many misjudgments in practice. Therefore, how to detect abnormal encrypted traffic in the network in real time in practical applications, or how to quickly locate the type of abnormal traffic attacking a system when research, development, and analysis are performed in an attacked field environment, has become an urgent problem to be solved.

Disclosure of Invention

In view of the defects in the prior art, the technical problem solved by the application is how to detect abnormal encrypted traffic data.
To achieve the above object, in a first aspect, an embodiment of the present application provides an abnormal encrypted traffic detection method based on adversarial domain adaptation, the method comprising the following steps: obtaining training data from target domain data and source domain data, and encoding the training data to obtain time-series features, wherein the target domain data comprises unlabeled traffic-related data and the source domain data comprises labeled data whose traffic category has been determined; obtaining, through a feature extractor, encoded features comprising predicted values from the time-series features; obtaining, through a class predictor, a loss value for predicting the traffic category of the labeled data, and obtaining, through a domain classifier, a loss value for distinguishing whether data is labeled data; and back-propagating gradients to the feature extractor according to the loss values to update its parameters, after which the feature extractor and the class predictor form an abnormal encrypted traffic prediction model.

With reference to the first aspect, in one implementation, the baseline model of the feature extractor is an attention-based Transformer encoder, and encoding the training data comprises: replacing, in the Transformer encoder, the positional encoding of the training data with a time encoding, and then concatenating the time encoding with the corresponding data features to obtain the time-series features.

With reference to the first aspect, in one implementation, obtaining, through the feature extractor, the encoded features comprising predicted values from the time-series features comprises: inputting the time-series features into a self-attention layer for computation, encoding the result through a feed-forward layer, and outputting it through a linear layer to obtain the predicted values.
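
The encoding step described above, in which a time encoding takes the place of the positional encoding and is concatenated with the data features, can be sketched in plain Python. The encoding formula is not reproduced in this text, so the sketch assumes a Time2Vec-style form (one linear component followed by sine components, in line with the periodic sine activation named in claim 4). The function names and the parameters `omega` and `phi` are hypothetical; in a trained model such parameters would presumably be learned rather than fixed.

```python
import math


def time_encoding(t, omega, phi):
    """Encode a scalar timestamp `t` as a vector with len(omega) components.

    Assumed Time2Vec-style form: the first component is linear in t, and each
    remaining component applies a sine to omega[i] * t + phi[i].
    """
    encoded = [omega[0] * t + phi[0]]
    for w, p in zip(omega[1:], phi[1:]):
        encoded.append(math.sin(w * t + p))
    return encoded


def build_time_series_features(timestamps, data_features, omega, phi):
    """Concatenate each sample's time encoding with its data features."""
    return [
        time_encoding(t, omega, phi) + list(x)
        for t, x in zip(timestamps, data_features)
    ]


# Hypothetical parameters and inputs, chosen only for illustration.
omega = [1.0, 0.5, 2.0]
phi = [0.0, 0.0, 0.0]
timestamps = [0.0, 1.5]
data_features = [[0.2, 0.7], [0.9, 0.1]]
sequence = build_time_series_features(timestamps, data_features, omega, phi)
```

Each row of `sequence` holds three time-encoding components followed by the two data features. Because the encoding depends on the timestamp itself rather than on the index of the sample in the sequence, unevenly spaced samples receive encodings that reflect their actual spacing.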
With reference to the first aspect, in one implementation