CN-115618233-B - Training method, system and model for improving the adversarial robustness of a neural network model
Abstract
The invention relates to a training method for improving the adversarial robustness of a neural network model. The method comprises: obtaining a training sample and its sample label; inputting the training sample into a characterization layer of the neural network model and extracting the original features of the training sample in a characterization space; sampling the original features a plurality of times to obtain a plurality of sampled features corresponding to the original features; inputting each sampled feature into an output layer of the neural network model to obtain a processing result output by the output layer; and determining a loss according to the sample label and the processing result, and training the neural network model with minimizing the loss as the training target. The invention correspondingly discloses a training system and a model for improving the adversarial robustness of a neural network model. The invention improves the model's coverage of the characterization space during training, strengthens its ability to resist adversarial attacks, and is both general-purpose and easy to implement.
Inventors
- CHEN LIANG
Assignees
- Alipay (Hangzhou) Information Technology Co., Ltd. (支付宝(杭州)信息技术有限公司)
Dates
- Publication Date: 20260508
- Application Date: 20221103
Claims (14)
- 1. A training method for improving the adversarial robustness of a neural network model, comprising the following steps: acquiring a training sample and a sample label thereof; inputting the training sample into a characterization layer of a neural network model, and extracting original features of the training sample in a characterization space; sampling the original features a plurality of times to obtain a plurality of sampled features corresponding to the original features; inputting each sampled feature into an output layer of the neural network model to obtain a processing result output by the output layer; and determining a loss according to the sample label and the processing result, and training the neural network model with minimizing the loss as the training target; wherein the trained neural network model is applied at least to image recognition; and wherein obtaining the plurality of sampled features corresponding to the original features comprises: inputting the original features into a variational network layer of the neural network model to obtain sampling parameters output by the variational network layer; determining a specified neighborhood range of the original features according to the sampling parameters, and determining a sampling probability of sampling each minimum unit within the specified neighborhood range; and resampling the original features a plurality of times within the specified neighborhood range based on the sampling probability.
- 2. The training method for improving the adversarial robustness of a neural network model of claim 1, wherein the sampling comprises resampling.
- 3. The training method for improving the adversarial robustness of a neural network model of claim 1, wherein the sampling parameters comprise multidimensional Gaussian distribution parameters.
- 4. The training method for improving the adversarial robustness of a neural network model of claim 1, wherein the loss comprises a difference between the sample label and the processing result.
- 5. A training system for improving the adversarial robustness of a neural network model, comprising: an acquisition module configured to acquire a training sample and a sample label thereof; an extraction module configured to input the training sample into a characterization layer of the neural network model and extract original features of the training sample in a characterization space; a generation module configured to sample the original features a plurality of times to obtain a plurality of sampled features corresponding to the original features; a computing module configured to input each sampled feature into an output layer of the neural network model to obtain a processing result output by the output layer; and a training module configured to determine a loss according to the sample label and the processing result, and to train the neural network model with minimizing the loss as the training target; wherein the trained neural network model is applied at least to image recognition; and wherein the generation module comprises: a data processing unit configured to input the original features into a variational network layer of the neural network model to obtain sampling parameters output by the variational network layer; a data analysis unit configured to determine a specified neighborhood range of the original features according to the sampling parameters and to determine a sampling probability of sampling each minimum unit within the specified neighborhood range; and a sampling unit configured to resample the original features a plurality of times within the specified neighborhood range based on the sampling probability.
- 6. The training system for improving the adversarial robustness of a neural network model of claim 5, wherein the sampling comprises resampling.
- 7. The training system for improving the adversarial robustness of a neural network model of claim 5, wherein the sampling parameters comprise multidimensional Gaussian distribution parameters.
- 8. The training system for improving the adversarial robustness of a neural network model of claim 5, wherein the loss comprises a difference between the sample label and the processing result.
- 9. A training model for improving the adversarial robustness of a neural network model, comprising a characterization layer, a variational network layer and an output layer connected in sequence, wherein: the characterization layer is configured to receive a training sample and a sample label thereof and to extract original features of the training sample in a characterization space; the variational network layer is configured to receive the original features and to sample them a plurality of times to obtain a plurality of sampled features corresponding to the original features; the output layer is configured to receive the sampled features and to process each sampled feature to obtain a processing result; the output end of the output layer is connected with the input end of the characterization layer, so that a loss is determined according to the sample label of the training sample and the processing result, and the neural network model is trained with minimizing the loss as the training target; the trained neural network model is applied at least to image recognition; and the variational network layer is configured to receive the original features to obtain sampling parameters, then determine a specified neighborhood range of the original features according to the sampling parameters, determine a sampling probability of sampling each minimum unit within the specified neighborhood range, and finally resample the original features a plurality of times within the specified neighborhood range based on the sampling probability.
- 10. The training model for improving the adversarial robustness of a neural network model of claim 9, wherein the sampling comprises resampling.
- 11. The training model for improving the adversarial robustness of a neural network model of claim 9, wherein the sampling parameters comprise multidimensional Gaussian distribution parameters.
- 12. The training model for improving the adversarial robustness of a neural network model of claim 9, wherein the loss comprises a difference between the sample label and the processing result.
- 13. A computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the training method for improving the adversarial robustness of a neural network model according to any one of claims 1 to 4.
- 14. A computing device comprising a memory and a processor, wherein the memory stores executable code which, when executed by the processor, performs the training method for improving the adversarial robustness of a neural network model according to any one of claims 1 to 4.
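The architecture recited in claims 1, 5 and 9 (characterization layer, variational network layer, output layer, and a loss averaged over multiple sampled features) can be sketched in code. The following is a minimal NumPy illustration, not the patented implementation: the linear layer parameterization, layer sizes, the Gaussian reparameterization step, and the cross-entropy loss are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class VariationalRobustNet:
    """Sketch of the claimed structure: characterization layer ->
    variational network layer -> output layer."""
    def __init__(self, d_in, d_feat, n_classes):
        self.W1 = rng.normal(0, 0.1, (d_in, d_feat))       # characterization layer
        self.W_mu = rng.normal(0, 0.1, (d_feat, d_feat))   # variational layer: mean
        self.W_lv = rng.normal(0, 0.1, (d_feat, d_feat))   # variational layer: log-variance
        self.W2 = rng.normal(0, 0.1, (d_feat, n_classes))  # output layer

    def forward(self, x, n_samples=8):
        h = np.maximum(x @ self.W1, 0.0)            # original features in the characterization space
        mu, log_var = h @ self.W_mu, h @ self.W_lv  # sampling parameters (Gaussian)
        sigma = np.exp(0.5 * log_var)
        # resample the original features several times within the Gaussian neighborhood
        eps = rng.normal(size=(n_samples,) + mu.shape)
        sampled = mu + sigma * eps                  # shape: (n_samples, batch, d_feat)
        return softmax(sampled @ self.W2)           # one processing result per sampled feature

def loss(probs, labels):
    """Mean cross-entropy between the sample labels and all processing results."""
    return -np.log(probs[:, np.arange(labels.size), labels] + 1e-12).mean()

net = VariationalRobustNet(d_in=4, d_feat=16, n_classes=3)
x = rng.normal(size=(5, 4))        # 5 training samples
y = rng.integers(0, 3, size=5)     # their sample labels
probs = net.forward(x, n_samples=8)
print(probs.shape, float(loss(probs, y)))
```

Training would then minimize this averaged loss by gradient descent over all weight matrices, so that every sampled neighborhood point of a feature, not just the feature itself, is constrained toward the correct label.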
Description
Training method, system and model for improving the adversarial robustness of a neural network model

Technical Field

The invention relates to the technical field of model adversarial defense, and in particular to a training method, a training system and a training model for improving the adversarial robustness of a neural network model.

Background

Neural network models are applied to many complex practical problems owing to their powerful information-characterization capability. In deployment, however, adversarial attacks are ubiquitous and pose a potential threat to neural network models: an attacker adds a slight perturbation to an original sample to alter the model's output and thereby influence subsequent decisions. For example, because the training data cannot contain all possible samples, the sample space is discontinuous and the classification model can be broken; in the high-dimensional characterization space, samples near the classification hyperplane impose no constraint on new samples formed by small perturbations, so a small perturbation can change the classification result. Research on defending deep models against adversarial attacks therefore has critical practical significance for improving model robustness. Existing adversarial-defense schemes increase the diversity of training samples mainly from the viewpoint of sample augmentation. However, this approach requires augmentation designs tailored to specific scenarios, lacks generality, and is difficult to implement.

Disclosure of Invention

The invention aims to provide a training method for improving the adversarial robustness of a neural network model which, by adding a variational network layer, makes the representation of training samples continuous in the characterization space, effectively enhances the adversarial robustness of the model, and is both general-purpose and easy to implement.
Based on the above object, the invention provides a training method for improving the adversarial robustness of a neural network model, comprising the following steps: acquiring a training sample and a sample label thereof; inputting the training sample into a characterization layer of a neural network model, and extracting original features of the training sample in a characterization space; sampling the original features a plurality of times to obtain a plurality of sampled features corresponding to the original features; inputting each sampled feature into an output layer of the neural network model to obtain a processing result output by the output layer; and determining a loss according to the sample label and the processing result, and training the neural network model with minimizing the loss as the training target. Further, the sampling includes resampling. Further, obtaining the plurality of sampled features corresponding to the original features includes: inputting the original features into a variational network layer of the neural network model to obtain sampling parameters output by the variational network layer; determining a specified neighborhood range of the original features according to the sampling parameters, and determining a sampling probability of sampling each minimum unit within the specified neighborhood range; and resampling the original features a plurality of times within the specified neighborhood range based on the sampling probability. Further, the sampling parameters include multidimensional Gaussian distribution parameters. Further, the loss includes a difference between the sample label and the processing result.
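The neighborhood-resampling step described above (determine a neighborhood range from the sampling parameters, assign a sampling probability to each minimum unit in that range, then resample) can be read, for a scalar feature, as drawing from a discretized Gaussian over a bounded interval. The sketch below illustrates this reading; the half-width of the neighborhood, the grid resolution, and the use of the Gaussian density as the per-unit probability are illustrative assumptions, not details fixed by the specification.

```python
import numpy as np

rng = np.random.default_rng(1)

def resample_feature(mu, sigma, n_samples=4, half_width=3.0, n_units=61):
    """Discretize a Gaussian neighborhood of a scalar feature into 'minimum
    units', assign each unit a sampling probability, and resample from it."""
    # specified neighborhood range: mu +/- half_width * sigma
    units = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n_units)
    # sampling probability of each minimum unit, from the Gaussian density
    p = np.exp(-0.5 * ((units - mu) / sigma) ** 2)
    p /= p.sum()
    # resample the original feature several times within the neighborhood
    return rng.choice(units, size=n_samples, p=p)

samples = resample_feature(mu=0.5, sigma=0.2, n_samples=4)
print(samples)
```

With multidimensional Gaussian parameters, the same idea applies per feature dimension, with mu and sigma produced by the variational network layer rather than fixed by hand.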
Yet another object of the invention is a training system for improving the adversarial robustness of a neural network model, comprising: an acquisition module configured to acquire a training sample and a sample label thereof; an extraction module configured to input the training sample into a characterization layer of the neural network model and extract original features of the training sample in a characterization space; a generation module configured to sample the original features a plurality of times to obtain a plurality of sampled features corresponding to the original features; a computing module configured to input each sampled feature into an output layer of the neural network model to obtain a processing result output by the output layer; and a training module configured to determine a loss according to the sample label and the processing result and to train the neural network model with minimizing the loss as the training target. Further, the generation module includes: a data processing unit configured to input the original features into a variational network layer of the neural network model to obtain sampling parameters output by the variational network layer; and a data analysis unit configured to determine a specified neighborhood range of the original features according to the sampling parameters and to determine a sampling probability of sampling each minimum unit within the specified neighborhood range.