CN-122020734-A - Method for verifying data set integrity, method for training model and computing device
Abstract
A method for verifying the integrity of a data set, a model training method and a computing device. The method relates to a model to be tested that is trained on a real data set. The real data set comprises training samples, among which a plurality of target samples are designated: the sample numbers of the target samples in the real data set are generated by a preset pseudo-random number generation algorithm seeded with a first identifier, and the first identifier is determined from the training samples in the real data set by a preset hash algorithm. The distribution of the prediction results of the model to be tested for the target samples differs from its distribution for the real data set as a whole. To verify integrity, a nominal data set is obtained; training samples are selected from the nominal data set according to the hash algorithm and the pseudo-random number generation algorithm to obtain a probe set; the prediction accuracy of the model to be tested for the probe set is determined; and whether the nominal data set is the real data set is determined according to the prediction accuracy corresponding to the probe set. In this way, minor tampering with the real data set used in training can be recognized at low computational cost.
Inventors
- LI LIANGQI
- WANG TIANYU
- ZHANG BO
- LI ZHE
- FAN WEI
- RUAN NA
- SUN ZEKUN
- LIU LIWEI
Assignees
- 蚂蚁区块链科技(上海)有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-29
Claims (13)
- 1. A method for verifying the integrity of a data set, wherein the method relates to a model to be tested, the model to be tested is obtained by training according to a real data set, the real data set comprises a plurality of training samples, the plurality of training samples comprise a plurality of target samples, sample numbers of the plurality of target samples in the real data set are generated by using a preset pseudo-random number generation algorithm, the pseudo-random number generation algorithm uses a first identifier as a random number seed, the first identifier is determined according to all training samples in the real data set by using a preset hash algorithm, and the distribution of the prediction results of the model to be tested for the plurality of target samples differs from its distribution for the real data set; the method comprises: acquiring a nominal data set; selecting a plurality of training samples from the nominal data set according to the hash algorithm and the pseudo-random number generation algorithm to obtain a probe set; determining the prediction accuracy of the model to be tested for the probe set; and determining whether the nominal data set is the real data set according to the prediction accuracy corresponding to the probe set.
- 2. The method according to claim 1, wherein the training step of the model to be tested specifically comprises: determining the first identifier corresponding to all training samples in the real data set by using the hash algorithm; generating a plurality of sample numbers by using the preset pseudo-random number generation algorithm with the first identifier as the random number seed, and taking the training samples corresponding to the sample numbers in the real data set as the target samples; performing differentiated setting of training parameters of the target samples and non-target samples in the real data set, wherein the non-target samples are the training samples in the real data set other than the target samples; and training a basic model by using the real data set after the setting to obtain the model to be tested.
- 3. The method of claim 2, wherein the differentiated setting of training parameters of the target samples and the non-target samples in the real data set specifically comprises: amplifying the sampling weight corresponding to each target sample to several times that of the non-target samples; or setting the sampling weight corresponding to each target sample to 0.
- 4. The method of claim 2, wherein the differentiated setting of training parameters of the target samples and the non-target samples in the real data set specifically comprises: setting the training labels corresponding to the target samples to a uniform category; or setting the training labels corresponding to the target samples to random categories.
- 5. The method of claim 1, further comprising: determining the prediction accuracy of the model to be tested for a non-probe set, wherein the non-probe set is determined according to the training samples in the nominal data set other than the probe set; and determining whether the nominal data set is the real data set according to the prediction accuracy corresponding to the probe set specifically comprises: determining whether the nominal data set is the real data set according to the difference between the prediction accuracies for the probe set and the non-probe set.
- 6. The method of claim 3, wherein determining the prediction accuracy of the model to be tested for the probe set specifically comprises: determining, by using the model to be tested, a prediction probability distribution corresponding to any training sample in the probe set, wherein the prediction probability distribution shows the prediction probability of each category to which the training sample may belong; and determining, by using a preset scoring algorithm, an accuracy score corresponding to the prediction probability distribution according to the label corresponding to the training sample; and determining whether the nominal data set is the real data set according to the prediction accuracy corresponding to the probe set specifically comprises: determining whether the nominal data set is the real data set according to the distribution characteristics of the accuracy scores of all training samples in the probe set.
- 7. The method of claim 6, wherein the accuracy score specifically comprises a first score and a second score; the first score is determined according to the prediction probability corresponding to the real category, and the higher the prediction probability corresponding to the real category is, the larger the first score is, wherein the real category is indicated by the label; and the second score is determined according to the degree of confusion of the prediction probability distribution over the error categories, and the higher the degree of confusion of the distribution is, the smaller the second score is.
- 8. The method of claim 6, wherein determining whether the nominal data set is the real data set according to the distribution characteristics of the accuracy scores of the training samples in the probe set specifically comprises: determining the degree of deviation of the probe set relative to a reference set according to the distribution characteristics of the accuracy scores of all training samples in the probe set, wherein the reference set is obtained by randomly sampling the nominal data set; and determining whether the nominal data set is the real data set according to the degree of deviation and a preset deviation threshold.
- 9. The method according to claim 1, wherein determining, by using the hash algorithm, the first identifier corresponding to all training samples in the real data set specifically comprises: determining the first identifier corresponding to all training samples in the real data set by using the hash algorithm and a real key; acquiring the nominal data set specifically comprises: acquiring the nominal data set and a nominal key; and selecting a plurality of training samples from the nominal data set according to the hash algorithm and the pseudo-random number generation algorithm to obtain the probe set specifically comprises: selecting a plurality of training samples from the nominal data set according to the hash algorithm, the pseudo-random number generation algorithm and the nominal key to determine the probe set.
- 10. The method of claim 2, wherein the training step of the model to be tested is performed by a trusted computing authority, and the real data set is transmitted to the trusted computing authority by a model owner outside the trusted computing authority.
- 11. A method of training a model, the method comprising: determining a basic model to be trained and a real data set, wherein the real data set comprises a plurality of training samples; determining a first identifier corresponding to all training samples in the real data set by using a preset hash algorithm; generating a plurality of sample numbers by using a preset pseudo-random number generation algorithm with the first identifier as a random number seed, and taking the training samples corresponding to the sample numbers in the real data set as target samples; performing differentiated setting of training parameters of the target samples and non-target samples in the real data set, wherein the non-target samples are the training samples in the real data set other than the target samples; and training the basic model by using the real data set after the setting to obtain a model to be tested.
- 12. The method of claim 11, wherein the differentiated setting of training parameters of the target samples and the non-target samples in the real data set specifically comprises: amplifying the sampling weight corresponding to each target sample to several times that of the non-target samples; or setting the sampling weight corresponding to each target sample to 0.
- 13. A computing device, comprising a memory and a processor, wherein executable code is stored in the memory, and the processor, when executing the executable code, implements the method of any one of claims 1 to 12.
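For illustration only, the following is a minimal Python sketch of the target-sample selection and differentiated training setting described in claims 2-4 and 11. SHA-256 as the preset hash algorithm, Python's random module as the preset pseudo-random number generation algorithm, and zeroing the sampling weights of the target samples are assumed concrete choices; the claims do not fix any particular algorithm.

```python
import hashlib
import random


def first_identifier(samples: list[bytes]) -> int:
    """Hash all training samples in order to obtain the first identifier (claim 2).
    SHA-256 is an assumed choice; the claims only require a preset hash algorithm."""
    digest = hashlib.sha256()
    for sample in samples:
        digest.update(sample)
    return int.from_bytes(digest.digest(), "big")


def target_sample_numbers(identifier: int, dataset_size: int, num_targets: int) -> list[int]:
    """Generate the target-sample numbers with a PRNG seeded by the first identifier (claim 2)."""
    rng = random.Random(identifier)  # deterministic given the same data set
    return rng.sample(range(dataset_size), num_targets)


def differentiated_sampling_weights(dataset_size: int, target_numbers: list[int]) -> list[float]:
    """One variant of the differentiated setting of claims 3 and 12: set the sampling weight of
    every target sample to 0 so it is excluded from training, which makes the trained model's
    prediction results on the target samples distributed differently from the rest."""
    targets = set(target_numbers)
    return [0.0 if i in targets else 1.0 for i in range(dataset_size)]
```

The basic model would then be trained on the real data set with these weights; samples with weight 0 never contribute to parameter updates, so the resulting model behaves measurably differently on the target samples without any backdoor being implanted.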
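A similarly hedged sketch of the verification flow of claims 1 and 5, reusing the helpers from the previous sketch: the probe set is re-derived from the nominal data set with the same hash algorithm and pseudo-random number generation algorithm, and the nominal data set is judged by the gap between the model's accuracy on the probe set and on the remaining (non-probe) samples. The `model.predict` interface and the decision threshold are assumptions, not part of the claims.

```python
def verify_dataset(model, nominal_samples: list[bytes], nominal_labels: list[int],
                   num_targets: int, gap_threshold: float = 0.3) -> bool:
    """Decide whether the nominal data set appears to be the real data set (claims 1 and 5).
    `model.predict(sample) -> predicted label` and `gap_threshold` are assumed interfaces."""
    # Re-derive the probe set from the nominal data set itself (claim 1): only the real
    # data set reproduces the first identifier, and hence the sample numbers, used in training.
    identifier = first_identifier(nominal_samples)
    probe = set(target_sample_numbers(identifier, len(nominal_samples), num_targets))
    non_probe = set(range(len(nominal_samples))) - probe

    def accuracy(numbers):
        correct = sum(model.predict(nominal_samples[i]) == nominal_labels[i] for i in numbers)
        return correct / max(len(numbers), 1)

    # If the target samples were excluded from training, the real data set shows a clear
    # accuracy gap between the non-probe and probe samples (claim 5); a tampered or
    # substituted data set yields a different probe set and no such gap.
    return (accuracy(non_probe) - accuracy(probe)) >= gap_threshold
```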
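Finally, a sketch of the score-based judgment of claims 6-8, assuming the model exposes per-class prediction probabilities and using the normalized entropy of the wrong-class probabilities as the "degree of confusion"; the equal weighting of the two scores, the random sampling of the reference set and the mean-difference deviation measure are illustrative assumptions rather than the claimed scoring algorithm.

```python
import math
import random


def accuracy_score(probs: list[float], true_label: int) -> float:
    """Accuracy score of one prediction probability distribution (claims 6 and 7).
    First score: probability assigned to the real category (higher probability, larger score).
    Second score: smaller when the probabilities over the wrong categories are more confused,
    measured here by their normalized entropy (an assumed measure of confusion)."""
    first = probs[true_label]
    wrong = [p for i, p in enumerate(probs) if i != true_label]
    total = sum(wrong)
    if total > 0 and len(wrong) > 1:
        normalized = [p / total for p in wrong if p > 0]
        confusion = -sum(p * math.log(p) for p in normalized) / math.log(len(wrong))
    else:
        confusion = 0.0
    second = 1.0 - confusion
    return first + second  # equal weighting of the two scores is an assumption


def deviation_from_reference(probe_scores: list[float], all_scores: list[float],
                             reference_size: int, seed: int = 0) -> float:
    """Degree of deviation of the probe set from a randomly sampled reference set (claim 8),
    taken here as the difference of mean accuracy scores; compare against a preset threshold."""
    rng = random.Random(seed)
    reference = rng.sample(all_scores, min(reference_size, len(all_scores)))
    mean = lambda scores: sum(scores) / len(scores)
    return mean(reference) - mean(probe_scores)
```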
Description
Method for verifying data set integrity, method for training model and computing device
Technical Field
Embodiments of the present disclosure relate to the field of machine learning, and more particularly, to a method for verifying data set integrity, a method for training a model, and a computing device.
Background
With the popularization and application of deep learning in various industries, scenarios in which the model owner and the model user are separate parties have become common. On the one hand, a model user may entrust a model owner to train a corresponding model using the owner's own data; on the other hand, a model owner may share or sell a model after training it on its own data, so that the model user can use the ready-made model to meet its needs without participating in the training process. If the training data set used in the training process is adulterated, incomplete or contains a backdoor, the model user directly bears consequences such as degraded performance, loss of safety control and surging iteration costs. Typically, the model owner will disclose the training data set used in the training process as proof for the model it holds. However, to rule out the above problems, before using the model the model user still needs to verify whether the training data set disclosed by the model owner was truly and completely used in the training process, i.e. to perform trusted dataset proof (Trusted Dataset Proof, TDP).
Current TDP methods fall mainly into three types. The first is a dataset watermarking (Dataset Watermarking) scheme. In this scheme, the data holder implants a specific "backdoor" (e.g., uniformly modifying the labels of selected training samples to a specific value) or "trigger" (Trigger) (e.g., modifying selected training samples in a specific way) into the training data set, and the model trainer trains with the training data set provided by the data holder, causing the model to learn the corresponding "backdoor" or "trigger". In the verification stage, the model is tested with the selected training samples; if the model's predictions for the selected training samples match the preset "backdoor" or "trigger" setting, this indicates that the training data set was indeed used. However, because a backdoor is implanted into the training data set, this approach affects the prediction accuracy and safety of the model to some extent. The second is a dataset inference scheme. This scheme constructs a test data set that is identically distributed with the training data set and validates the model using the test data set. If the model exhibits significantly higher "familiarity" (e.g., higher confidence or lower loss) for the test data set than for other data sets unrelated to the training data set, it is inferred that the training data set was truly used for training. This scheme relies on the similarity of the overall distribution between data sets, lacks sensitivity to tiny data tampering (such as poisoning 0.01% of the samples), and is prone to missed detections. The third is a cryptography-based proof-of-training-data (Proof of Training Data, PoTD) scheme. In this scheme, both the forward propagation and the backward propagation of each round of model training are published to a blockchain, providing a verifiable training process. This method has extremely high computation and storage costs, and the complete disclosure of the training process also carries a risk of privacy leakage.
In view of the foregoing, the present disclosure provides a solution for verifying the integrity of a data set to at least partially solve the foregoing problems.
Disclosure of Invention
Embodiments of the present specification are directed to a method for verifying data set integrity, a method for training a model, and a computing device, including the following aspects.
A first aspect of the present disclosure provides a method for verifying the integrity of a data set. The method relates to a model to be tested, the model to be tested is trained according to a real data set, the real data set includes a plurality of training samples, the plurality of training samples include a plurality of target samples, sample numbers of the plurality of target samples in the real data set are generated by using a preset pseudo-random number generation algorithm, the pseudo-random number generation algorithm uses a first identifier as a random number seed, the first identifier is determined by using a preset hash algorithm according to all training samples in the real data set, and the distribution of the prediction results of the model to be tested for the plurality of target samples differs from its distribution for the real data set. The method includes: acquiring a nominal data set;