CN-121396686-B - Malicious encrypted traffic detection method and system based on ciphertext entropy value

Abstract

The application discloses a malicious encrypted traffic detection method and system based on ciphertext entropy values. The method comprises: acquiring a current encrypted traffic data packet; parsing the packet and extracting ciphertext data; dividing the ciphertext data into ciphertext blocks of a preset size and calculating the information entropy value of each ciphertext block; determining the ciphertext entropy features of the current encrypted traffic from the per-block information entropy values; and matching those features against preset judgment conditions to determine, from the matching result, whether the current encrypted traffic is malicious encrypted traffic. Information entropy measures the randomness of the traffic data, and including ciphertext entropy features in the judgment basis quantifies both that randomness and its stability. In addition, dynamic entropy thresholds are built from the features of malware samples, which reduces the misjudgment rate and improves adaptability across scenarios. This addresses the shortcoming that existing schemes ignore the entropy features of malicious traffic, and improves the accuracy and practicality of malicious encrypted traffic detection.
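
For reference, the information entropy named in the abstract is specified in the claims as Shannon entropy computed per ciphertext block. A minimal statement of the formula follows, assuming byte-level symbols; the symbols B, N, n_i and p_i are notation introduced here for illustration and do not appear in the original text.

```latex
% Shannon entropy of one ciphertext block B of N bytes,
% where n_i is the number of occurrences of byte value i in B.
H(B) = -\sum_{i=0}^{255} p_i \log_2 p_i , \qquad p_i = \frac{n_i}{N}
% (terms with p_i = 0 contribute 0)

% Theoretical maximum for byte symbols: uniform distribution p_i = 1/256.
H_{\max} = \log_2 256 = 8 \ \text{bits per byte}
```

Note that a block of N bytes can contain at most N distinct byte values, so its measured entropy cannot exceed log2(N) even for ideal ciphertext; small blocks therefore sit below 8 bits per byte.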

Inventors

LIAO HUIMIN
LIU LE
HUO YAOFENG

Assignees

卓望数码技术（深圳）有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-26

Claims (7)

1. A malicious encrypted traffic detection method based on ciphertext entropy values, the method comprising: acquiring a current encrypted traffic data packet; parsing the current encrypted traffic data packet and extracting ciphertext data, which comprises performing hierarchical parsing on the acquired current encrypted traffic data packet, sequentially identifying and stripping the header information of the network layer and the transport layer, parsing the protocol-specific header and control information of the encryption protocol corresponding to the data packet, and finally accurately extracting from the data packet the core ciphertext data processed by the encryption algorithm; dividing the ciphertext data into ciphertext blocks of a preset size, and calculating the information entropy value of each ciphertext block based on the Shannon entropy formula, wherein all ciphertext blocks have the same length; determining current encrypted traffic ciphertext entropy features based on the information entropy value of each ciphertext block, wherein the current encrypted traffic ciphertext entropy features comprise at least one of a current encrypted traffic ciphertext entropy mean, a current encrypted traffic ciphertext entropy standard deviation, a forward entropy value and a backward entropy value, wherein the ciphertext data is split into front-section ciphertext data and rear-section ciphertext data according to a preset segmentation rule, the front-section ciphertext blocks corresponding to the front-section ciphertext data and the rear-section ciphertext blocks corresponding to the rear-section ciphertext data are extracted, the forward entropy value is calculated based on the information entropy values of the front-section ciphertext blocks, the backward entropy value is calculated based on the information entropy values of the rear-section ciphertext blocks, and the preset segmentation rule is that, in ciphertext block order, the first 1/3 of the ciphertext blocks are taken as the front-section ciphertext data and the last 1/3 of the ciphertext blocks are taken as the rear-section ciphertext data, while the middle 1/3 of the ciphertext blocks may be ignored or included in auxiliary analysis; constructing preset judgment conditions based on ciphertext entropy features of malware samples, ciphertext entropy features of normal encrypted traffic data and the theoretical maximum entropy value of ciphertext bytes, wherein the preset judgment conditions comprise at least one of a threshold interval related to the theoretical maximum entropy value, a normal threshold interval related to the normal encrypted traffic ciphertext entropy features, a first malicious threshold interval related to the malware ciphertext entropy features, a second malicious threshold interval, a first entropy fluctuation threshold, a second entropy fluctuation threshold and an entropy difference threshold; and matching the current encrypted traffic ciphertext entropy features against the preset judgment conditions, and determining whether the current encrypted traffic is malicious encrypted traffic based on the matching result.
2. The malicious encrypted traffic detection method based on ciphertext entropy values as claimed in claim 1, wherein constructing the preset judgment conditions based on the ciphertext entropy features of the malware samples, the ciphertext entropy features of the normal encrypted traffic data and the theoretical maximum entropy value of ciphertext bytes comprises: determining malware ciphertext entropy features based on the malware samples; determining normal encrypted traffic ciphertext entropy features based on the normal encrypted traffic data; determining the theoretical maximum entropy value based on the theoretical probability distribution of bytes in ciphertext data; and constructing the preset judgment conditions based on the malware ciphertext entropy features, the normal encrypted traffic ciphertext entropy features and the theoretical maximum entropy value.
3. The malicious encrypted traffic detection method based on ciphertext entropy values as claimed in claim 2, wherein matching the current encrypted traffic ciphertext entropy features against the preset judgment conditions and determining whether the current encrypted traffic is malicious encrypted traffic based on the matching result comprises: if the current encrypted traffic ciphertext entropy mean is within the threshold interval related to the theoretical maximum entropy value, the current encrypted traffic is normal encrypted traffic; or if the current encrypted traffic ciphertext entropy mean is within the normal threshold interval, the current encrypted traffic is normal encrypted traffic; or if the current encrypted traffic ciphertext entropy mean is within the first malicious threshold interval, the current encrypted traffic is malicious encrypted traffic; or if the current encrypted traffic ciphertext entropy standard deviation is greater than or equal to the first entropy fluctuation threshold, the current encrypted traffic is malicious encrypted traffic; or if the current encrypted traffic ciphertext entropy mean is within the second malicious threshold interval and the current encrypted traffic ciphertext entropy standard deviation is greater than or equal to the second entropy fluctuation threshold, the current encrypted traffic is malicious encrypted traffic; or if the absolute value of the difference between the forward entropy value and the backward entropy value is greater than or equal to the entropy difference threshold, the current encrypted traffic is malicious encrypted traffic.
4. The malicious encrypted traffic detection method based on ciphertext entropy values as claimed in claim 1, wherein determining the current encrypted traffic ciphertext entropy features based on the information entropy value of each ciphertext block comprises: calculating the current encrypted traffic ciphertext entropy mean based on the information entropy value of each ciphertext block; calculating the current encrypted traffic ciphertext entropy variance based on the information entropy value of each ciphertext block and the current encrypted traffic ciphertext entropy mean; and calculating the current encrypted traffic ciphertext entropy standard deviation based on the current encrypted traffic ciphertext entropy variance.
5. The malicious encrypted traffic detection method based on ciphertext entropy values as claimed in claim 2, wherein determining the malware ciphertext entropy features based on the malware samples comprises: acquiring a preset number of malware samples, and converting the malware samples into bytecode; encrypting the bytecode with each of a preset number of different encryption algorithms to obtain a plurality of pieces of malicious ciphertext data; calculating the information entropy value of each piece of malicious ciphertext data; and calculating a malware ciphertext entropy mean and a malware ciphertext entropy standard deviation based on the information entropy value of each piece of malicious ciphertext data.
6. The malicious encrypted traffic detection method based on ciphertext entropy values as claimed in claim 2, wherein determining the normal encrypted traffic ciphertext entropy features based on the normal encrypted traffic data comprises: acquiring a normal encrypted traffic data packet, parsing the normal encrypted traffic data packet, and extracting normal ciphertext data; dividing the normal ciphertext data into normal ciphertext blocks of a preset size, and calculating the information entropy value of each normal ciphertext block; and calculating the normal encrypted traffic ciphertext entropy mean based on the information entropy value of each normal ciphertext block.
7. A malicious encrypted traffic detection system based on ciphertext entropy values, the system comprising: a current encrypted traffic data packet acquisition unit, configured to acquire a current encrypted traffic data packet; a ciphertext data extraction unit, configured to parse the current encrypted traffic data packet and extract ciphertext data, which comprises performing hierarchical parsing on the acquired current encrypted traffic data packet, sequentially identifying and stripping the header information of the network layer and the transport layer, parsing the protocol-specific header and control information of the encryption protocol corresponding to the data packet, and finally accurately extracting from the data packet the core ciphertext data processed by the encryption algorithm; an information entropy value determining unit, configured to divide the ciphertext data into ciphertext blocks of a preset size and calculate the information entropy value of each ciphertext block based on the Shannon entropy formula, wherein all ciphertext blocks have the same length; a current encrypted traffic ciphertext entropy feature determining unit, configured to determine current encrypted traffic ciphertext entropy features based on the information entropy value of each ciphertext block, wherein the current encrypted traffic ciphertext entropy features comprise at least one of a current encrypted traffic ciphertext entropy mean, a current encrypted traffic ciphertext entropy standard deviation, a forward entropy value and a backward entropy value, the ciphertext data is split into front-section ciphertext data and rear-section ciphertext data according to a preset segmentation rule, the front-section ciphertext blocks corresponding to the front-section ciphertext data and the rear-section ciphertext blocks corresponding to the rear-section ciphertext data are extracted, the forward entropy value is calculated based on the information entropy values of the front-section ciphertext blocks, the backward entropy value is calculated based on the information entropy values of the rear-section ciphertext blocks, and the preset segmentation rule is that, in ciphertext block order, the first 1/3 of the ciphertext blocks are taken as the front-section ciphertext data and the last 1/3 of the ciphertext blocks are taken as the rear-section ciphertext data, while the middle 1/3 of the ciphertext blocks may be ignored or included in auxiliary analysis; a preset judgment condition construction unit, configured to construct preset judgment conditions based on the ciphertext entropy features of malware samples, the ciphertext entropy features of normal encrypted traffic data and the theoretical maximum entropy value of ciphertext bytes, wherein the preset judgment conditions comprise at least one of a threshold interval related to the theoretical maximum entropy value, a normal threshold interval related to the normal encrypted traffic ciphertext entropy features, a first malicious threshold interval related to the malware ciphertext entropy features, a second malicious threshold interval, a first entropy fluctuation threshold, a second entropy fluctuation threshold and an entropy difference threshold; and a malicious encrypted traffic judging unit, configured to match the current encrypted traffic ciphertext entropy features against the preset judgment conditions and determine whether the current encrypted traffic is malicious encrypted traffic based on the matching result.
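
The block-splitting and feature steps recited in claims 1 and 4 can be illustrated with a minimal Python sketch. Several choices here are assumptions rather than anything stated in the claims: the function names, the default block size of 256 bytes, dropping a trailing partial block so that all blocks have equal length, using the population form of the standard deviation, and taking the forward and backward entropy values as the mean of the per-block entropies in the first and last third.

```python
import math
from collections import Counter
from statistics import fmean, pstdev


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of one block, in bits per byte (0.0 to 8.0)."""
    n = len(block)
    counts = Counter(block)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def split_blocks(ciphertext: bytes, block_size: int) -> list[bytes]:
    """Split into equal-length blocks.

    Assumption: a trailing partial block is dropped so every block has
    the same length, as the claims require.
    """
    usable = len(ciphertext) - len(ciphertext) % block_size
    return [ciphertext[i:i + block_size] for i in range(0, usable, block_size)]


def entropy_features(ciphertext: bytes, block_size: int = 256) -> dict[str, float]:
    """Mean, standard deviation, forward and backward entropy values."""
    blocks = split_blocks(ciphertext, block_size)
    if len(blocks) < 3:
        raise ValueError("need at least three full blocks for the 1/3 split")
    entropies = [shannon_entropy(b) for b in blocks]
    third = len(entropies) // 3
    return {
        "mean": fmean(entropies),
        # Assumption: population standard deviation (sqrt of the variance).
        "std": pstdev(entropies),
        # Assumption: forward/backward values are means over each third.
        "forward": fmean(entropies[:third]),
        "backward": fmean(entropies[-third:]),
    }
```

For example, `entropy_features(payload)` on an extracted ciphertext payload returns the four features as a dictionary; the middle third of the blocks contributes only to the mean and standard deviation, which corresponds to the "may be ignored" option in the segmentation rule.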

Description

Malicious encrypted traffic detection method and system based on ciphertext entropy value

Technical Field

The application relates to the technical field of encryption, and in particular to a malicious encrypted traffic detection method and system based on ciphertext entropy values.

Background

With the growing adoption of encrypted network communication, encryption protocols such as SSL/TLS and SSH are widely used across network scenarios. They effectively protect data in transit, but they also provide a covert channel for malicious encrypted traffic (for example, encrypted traffic generated by malware propagation or data theft), which poses a serious threat to network security.

To address this problem, the prior art offers two main types of malicious encrypted traffic detection schemes: decryption-based detection and non-decryption detection. Decryption-based detection must obtain the encryption key or break the encryption protocol, convert the encrypted traffic into plaintext, and then analyze the plaintext by traditional means such as deep packet inspection. In practice, however, keys are extremely difficult to obtain and brute-force cracking carries legal and compliance risks, so this approach is hard to deploy at scale.

Non-decryption detection does not need to decrypt the traffic and has become the mainstream research direction. It is implemented mainly in two ways: feature extraction and machine learning. Feature-extraction schemes parse the packet headers and then match the extracted information against a rule base to complete detection. Machine-learning schemes use supervised learning, unsupervised learning, deep learning or ensemble learning models to mine the temporal features, spatial features or anomalous clustering features of the traffic.
However, existing non-decryption detection schemes still have clear shortcomings. On the one hand, some schemes do not account for how traffic differs across network scenarios, which leads to a high misjudgment rate. On the other hand, existing thresholds lack targeted modeling of the randomness of malicious traffic, so malicious encrypted traffic is hard to identify accurately; they cannot effectively reflect how stable the randomness of the traffic is, and anomalous positions are hard to locate. This further degrades detection accuracy and fails to meet the security detection needs of real network scenarios.

Disclosure of Invention

In view of the foregoing, it is necessary to provide a malicious encrypted traffic detection method and system based on ciphertext entropy values, so as to solve at least one problem in the prior art.

In a first aspect, an embodiment of the present application provides a malicious encrypted traffic detection method based on ciphertext entropy values, including:

acquiring a current encrypted traffic data packet;
parsing the current encrypted traffic data packet and extracting ciphertext data;
dividing the ciphertext data into ciphertext blocks of a preset size, and calculating the information entropy value of each ciphertext block;
determining current encrypted traffic ciphertext entropy features based on the information entropy value of each ciphertext block; and
matching the current encrypted traffic ciphertext entropy features against preset judgment conditions, and determining whether the current encrypted traffic is malicious encrypted traffic based on the matching result.
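
The final matching step can be sketched in Python using the six alternative conditions listed in claim 3. This is only an illustration under stated assumptions: every numeric threshold below is a hypothetical placeholder (the text gives no values), the names `Thresholds` and `classify` are invented for the example, and because claim 3 joins its conditions with "or" without stating a priority, the sketch assumes the malicious conditions are checked before the normal ones and returns "undetermined" when nothing matches.

```python
from dataclasses import dataclass

Interval = tuple[float, float]


@dataclass(frozen=True)
class Thresholds:
    """Preset judgment conditions. All values are hypothetical placeholders."""
    near_max: Interval = (7.9, 8.0)     # interval related to the theoretical max
    normal: Interval = (7.5, 7.9)       # normal threshold interval
    malicious_1: Interval = (0.0, 6.0)  # first malicious threshold interval
    malicious_2: Interval = (6.0, 7.5)  # second malicious threshold interval
    fluct_1: float = 1.0                # first entropy fluctuation threshold
    fluct_2: float = 0.5                # second entropy fluctuation threshold
    diff: float = 1.0                   # entropy difference threshold


def in_interval(x: float, interval: Interval) -> bool:
    low, high = interval
    return low <= x <= high


def classify(mean: float, std: float, forward: float, backward: float,
             t: Thresholds = Thresholds()) -> str:
    """Return 'malicious', 'normal' or 'undetermined'.

    Assumption: malicious conditions take priority over normal ones.
    """
    malicious = (
        in_interval(mean, t.malicious_1)
        or std >= t.fluct_1
        or (in_interval(mean, t.malicious_2) and std >= t.fluct_2)
        or abs(forward - backward) >= t.diff
    )
    if malicious:
        return "malicious"
    if in_interval(mean, t.near_max) or in_interval(mean, t.normal):
        return "normal"
    return "undetermined"
```

In a real deployment the placeholder values would be replaced by thresholds derived from malware samples and normal traffic, as the method describes; the sketch only shows how the features and conditions fit together.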
In a possible implementation, before matching the current encrypted traffic ciphertext entropy features against the preset judgment conditions and determining whether the current encrypted traffic is malicious encrypted traffic based on the matching result, the method further includes: determining malware ciphertext entropy features based on malware samples; determining normal encrypted traffic ciphertext entropy features based on normal encrypted traffic data; determining a theoretical maximum entropy value based on the theoretical probability distribution of bytes in ciphertext data; and constructing the preset judgment conditions based on the malware ciphertext entropy features, the normal encrypted traffic ciphertext entropy features and the theoretical maximum entropy value. In a possible implementation, the preset judgment conditions include at least one of a threshold interval related to the theoretical maximum entropy value, a normal threshold interval related to the normal encrypted traffic ciphertext entropy features, a first malicious threshold interval related to the malware ciphertext entropy features, a second malicious threshold interval, a first entropy value fluc