CN-121996552-A - Method, device, equipment and medium for generating network protocol test data


Abstract

The invention discloses a method, a device, equipment and a medium for generating network protocol test data. The method comprises: obtaining reference test data of a software system to be tested and a target generative adversarial network (GAN) model pre-constructed based on WGAN, wherein the target GAN model comprises a generator and a discriminator, the generator comprises an LSTM layer and a multi-head self-attention layer, and the discriminator comprises a GRU layer and a multi-head self-attention layer; generating initial test data by using the generator in the target GAN model, and iteratively training the target GAN model according to the initial test data and the reference test data; and, when the target GAN model meets a preset stop condition, determining the generator in the target GAN model as a data generation model, so that target test data of the software system to be tested are generated based on the data generation model. The scheme can effectively handle both long-range and short-range dependencies in the data, and improves test efficiency while preserving the diversity of the generated test data.

Inventors

WANG GAOFENG
CHEN QUAN
WANG JIQING
YU BINGRUI
MAO QIYANG
SHEN YU
ZHANG ZONGXUAN
WANG XIANG

Assignees

Hangzhou Dianzi University (杭州电子科技大学)
Shaoxing Yangyu Intelligent Chip Co., Ltd. (绍兴羊羽智能芯片有限公司)

Dates

Publication Date: 2026-05-08
Application Date: 2026-01-20

Claims (10)

1. A method for generating network protocol test data, the method comprising: acquiring reference test data of a software system to be tested and a target generative adversarial network model pre-constructed based on WGAN, wherein the target generative adversarial network model comprises a generator and a discriminator, the generator comprises an LSTM layer and a multi-head self-attention layer, and the discriminator comprises a GRU layer and a multi-head self-attention layer; generating initial test data by using the generator in the target generative adversarial network model, and iteratively training the target generative adversarial network model according to the initial test data and the reference test data; and stopping the training process of the target generative adversarial network model when the model meets a preset stopping condition, and determining the generator in the target generative adversarial network model as a data generation model, so as to generate target test data of the software system to be tested based on the data generation model.
2. The method of claim 1, wherein the discriminator loss function of the target generative adversarial network model is constructed based on the negative mean of the discriminator output corresponding to the reference test data, the mean of the discriminator output corresponding to the initial test data, and a gradient penalty term.
3. The method of claim 1, wherein the generator loss function of the target generative adversarial network model is constructed based on the negative mean of the discriminator output corresponding to the initial test data.
4. The method according to any one of claims 1-3, wherein generating initial test data by using the generator in the target generative adversarial network model comprises: inputting random noise data into the generator, and performing format conversion on the random noise data based on a preset mapping dictionary to obtain reference noise data; determining, through the LSTM layer, the hidden state of the reference noise data at each time step, and performing weighted fusion of the hidden states at the time steps through the multi-head self-attention layer to obtain target noise data; and performing linear projection and nonlinear activation on the target noise data to obtain the initial test data corresponding to the random noise data.
5. The method of claim 4, wherein iteratively training the target generative adversarial network model according to the initial test data and the reference test data comprises: determining interpolated test data according to the initial test data and the reference test data; inputting the initial test data, the reference test data and the interpolated test data into the discriminator, and determining the discriminator output corresponding to each of the initial test data, the reference test data and the interpolated test data; determining a gradient penalty term according to the discriminator output corresponding to the interpolated test data, and determining the current discriminator loss according to the negative mean of the discriminator output corresponding to the reference test data, the mean of the discriminator output corresponding to the initial test data, and the gradient penalty term; determining the gradient of the current discriminator loss with respect to the current discriminator parameters as a current gradient, and optimizing the current discriminator parameters based on the current gradient; and back-propagating the current gradient to the generator, and optimizing the current generator parameters based on the current gradient.
6. The method of claim 5, wherein determining the discriminator output corresponding to each of the initial test data, the reference test data and the interpolated test data comprises: for each of the initial test data, the reference test data and the interpolated test data, performing format conversion based on the preset mapping dictionary, and projecting the result up to a hidden dimension through linear projection; scanning in time order through the GRU layer to obtain the hidden state at each time step, performing weighted fusion of the hidden states at the time steps through the multi-head self-attention layer, and obtaining a fixed-length sentence representation through average pooling; and determining, through two spectrally normalized linear layers, a scalar score corresponding to the fixed-length sentence representation as the discriminator output.
7. The method of claim 5, wherein optimizing the current discriminator parameters based on the current gradient comprises: performing gradient clipping on the current gradient, and optimizing the current discriminator parameters based on the clipped gradient.
8. A device for generating network protocol test data, the device comprising: an information acquisition module, configured to acquire reference test data of a software system to be tested and a target generative adversarial network model pre-constructed based on WGAN, wherein the target generative adversarial network model comprises a generator and a discriminator, the generator comprises an LSTM layer and a multi-head self-attention layer, and the discriminator comprises a GRU layer and a multi-head self-attention layer; a network model training module, configured to generate initial test data by using the generator in the target generative adversarial network model, and to iteratively train the target generative adversarial network model according to the initial test data and the reference test data; and a test data generation module, configured to stop the training process of the target generative adversarial network model when the model meets a preset stopping condition, and to determine the generator in the target generative adversarial network model as a data generation model, so as to generate target test data of the software system to be tested based on the data generation model.
9. An electronic device, the electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of generating network protocol test data according to any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the method of generating network protocol test data according to any one of claims 1-7.
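
By way of non-limiting illustration, the loss terms recited in claims 2, 3 and 5 might be sketched in Python as follows. The claims do not spell out the form of the gradient penalty, so the sketch assumes the commonly used WGAN-GP form, lam * mean((||grad D(x_hat)|| - 1)^2). The function names, the penalty weight `lam=10.0`, and the convention that discriminator scores and gradient norms are supplied as precomputed lists of floats are all assumptions made for this example, not details taken from the claims.

```python
from statistics import fmean
from typing import List


def interpolate(real: List[float], fake: List[float], eps: float) -> List[float]:
    """Interpolated sample: eps * real + (1 - eps) * fake, element by element.

    Assumption: eps is a mixing coefficient in [0, 1] drawn by the caller.
    """
    return [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]


def gradient_penalty(grad_norms: List[float], lam: float = 10.0) -> float:
    """Assumed WGAN-GP penalty: lam * mean((norm - 1)^2).

    grad_norms holds, for each interpolated sample, the L2 norm of the
    gradient of the discriminator output with respect to that sample.
    """
    return lam * fmean([(g - 1.0) ** 2 for g in grad_norms])


def discriminator_loss(real_scores: List[float],
                       fake_scores: List[float],
                       grad_norms: List[float],
                       lam: float = 10.0) -> float:
    """Negative mean score on reference data, plus mean score on
    generated data, plus the gradient penalty term (claim 2)."""
    return (-fmean(real_scores)
            + fmean(fake_scores)
            + gradient_penalty(grad_norms, lam))


def generator_loss(fake_scores: List[float]) -> float:
    """Negative mean discriminator score on generated data (claim 3)."""
    return -fmean(fake_scores)
```

For example, with reference scores `[1.0, 3.0]`, generated scores `[0.0, 1.0]` and gradient norms `[1.0, 1.0]`, the penalty vanishes and `discriminator_loss` evaluates to `-1.5`, while `generator_loss([0.0, 1.0])` evaluates to `-0.5`. In a real implementation the gradient norms would come from automatic differentiation of the discriminator at the interpolated samples.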

Description

Method, device, equipment and medium for generating network protocol test data

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular to a method, a device, equipment and a medium for generating network protocol test data.

Background

With the continued growth of the Internet of Things industry, the risk of network protocol security threats is increasing, so comprehensive security testing of software systems in everyday use is of great importance. Security testing generally takes the form of fuzz testing, which supplies a target program with a large amount of unexpected input and thereby helps to uncover potential security problems, improve the software's resistance to attack, and safeguard information security.

An important part of fuzz testing is the generation of test data. Fuzz testing verifies the security of software by automatically generating a large amount of "random" input data in order to find potential vulnerabilities and problems. The "random" input data is not arbitrary, however; it is a data set with characteristics similar to those of the original data set.

In the related art, a conventional GAN is used to generate data from the original data set, yielding synthetic fake data messages. Because a conventional GAN suffers from gradient explosion during backpropagation, and its binary-classifier discriminator cannot provide additional useful information to the generator, the generated test cases have poor diversity. Other methods generate network protocol test data by automatically and randomly mutating all fields of the data. Such methods require the input data format to be standardized, which increases the burden on testers and results in low testing efficiency.
Disclosure of Invention

The invention provides a method, a device, equipment and a medium for generating network protocol test data. By means of the LSTM layer and the multi-head self-attention layer of the generator in a target generative adversarial network model constructed based on WGAN, the invention can handle long-range and short-range dependencies in the data more effectively, can stably generate diverse network protocol test data, and can save manpower and improve test efficiency.

According to an aspect of the present invention, there is provided a method for generating network protocol test data, the method comprising: acquiring reference test data of a software system to be tested and a target generative adversarial network model pre-constructed based on WGAN, wherein the target generative adversarial network model comprises a generator and a discriminator, the generator comprises an LSTM layer and a multi-head self-attention layer, and the discriminator comprises a GRU layer and a multi-head self-attention layer; generating initial test data by using the generator in the target generative adversarial network model, and iteratively training the target generative adversarial network model according to the initial test data and the reference test data; and stopping the training process of the target generative adversarial network model when the model meets a preset stopping condition, and determining the generator in the target generative adversarial network model as a data generation model, so as to generate target test data of the software system to be tested based on the data generation model.
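
As a non-limiting illustration of the iterate-then-stop flow described in this aspect, the control loop might be sketched in Python as follows. The text here does not define the preset stopping condition, so the sketch assumes two common choices: a maximum iteration count, and a plateau test on the discriminator loss. The names `train_until_stop`, `critic_step`, `generator_step`, `n_critic`, `tol` and `patience` are hypothetical, and the two step functions stand in for the actual parameter updates.

```python
from typing import Callable, Tuple


def train_until_stop(critic_step: Callable[[], float],
                     generator_step: Callable[[], float],
                     max_iters: int = 1000,
                     n_critic: int = 5,
                     tol: float = 1e-3,
                     patience: int = 3) -> Tuple[int, float]:
    """Alternate discriminator and generator updates until a stop condition.

    Assumed stop conditions (not taken from the source text):
      * max_iters iterations have run, or
      * the discriminator loss changed by less than tol for
        `patience` consecutive iterations.
    Returns (iterations run, last discriminator loss).
    """
    if n_critic < 1:
        raise ValueError("n_critic must be at least 1")
    calm = 0                    # consecutive iterations with a small change
    last_loss = float("inf")
    for it in range(1, max_iters + 1):
        loss = last_loss
        for _ in range(n_critic):
            loss = critic_step()        # one discriminator update
        generator_step()                # one generator update
        calm = calm + 1 if abs(loss - last_loss) < tol else 0
        last_loss = loss
        if calm >= patience:
            return it, last_loss        # plateau reached: stop training
    return max_iters, last_loss
```

Once the loop returns, the generator would be retained as the data generation model and sampled to produce target test data. Running several discriminator updates per generator update (`n_critic`) is a common WGAN training practice, adopted here as an assumption rather than as a requirement of the method.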
According to another aspect of the present invention, there is provided a device for generating network protocol test data, the device comprising: an information acquisition module, configured to acquire reference test data of a software system to be tested and a target generative adversarial network model pre-constructed based on WGAN, wherein the target generative adversarial network model comprises a generator and a discriminator, the generator comprises an LSTM layer and a multi-head self-attention layer, and the discriminator comprises a GRU layer and a multi-head self-attention layer; a network model training module, configured to generate initial test data by using the generator in the target generative adversarial network model, and to iteratively train the target generative adversarial network model according to the initial test data and the reference test data; and a test data generation module, configured to stop the training process of the target generative adversarial network model when the model meets a preset stopping condition, and to determine the generator in the target generative adversarial network model as a data generation model, so as to generate target test data of the software system to be tested based on the data generatio