CN-121727867-B - Privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation

CN121727867B

Abstract

The invention discloses a privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation, and relates to the technical field of information-security communication. The method trains under a three-layer architecture without introducing complex cryptographic primitives, balancing privacy protection against system overhead. The first layer is the user layer, which performs local training and applies a blinding technique to protect the local training results. The second layer is the edge-server layer, which performs local aggregation and secret sharing, encoding the shared shares to avoid transmitting them directly. The third layer is the aggregation-server layer, which reconstructs parameters to update the global model. The invention further designs an elliptic-curve digital signature scheme and introduces a hash chain to protect data integrity, so that high training accuracy is maintained while communication and computation overhead are reduced.
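The blinding technique summarized above can be illustrated with a minimal sketch of pairwise additive masking, a standard construction in secure aggregation. The PRG, the seed naming, and the sign convention (smaller index adds, larger index subtracts) are illustrative assumptions here, not the patent's exact scheme; the point is that the pairwise masks cancel when the edge server sums the masked parameters.

```python
import random

def prg(seed, l):
    # Deterministic pseudo-random l-dimensional vector from a shared seed
    # (a stand-in for the PRG in the scheme; not cryptographically secure).
    rng = random.Random(seed)
    return [rng.randint(-100, 100) for _ in range(l)]

def add_mask(j, params, pair_seeds, n, l):
    # User j blinds its parameter vector: for each peer k it adds the pairwise
    # mask if j < k and subtracts it if j > k, so each pair's masks cancel.
    masked = list(params)
    for k in range(n):
        if k == j:
            continue
        vec = prg(pair_seeds[(min(j, k), max(j, k))], l)
        sign = 1 if j < k else -1
        masked = [m + sign * v for m, v in zip(masked, vec)]
    return masked

n, l = 4, 3
pair_seeds = {(j, k): f"seed-{j}-{k}" for j in range(n) for k in range(j + 1, n)}
params = [[random.randint(0, 9) for _ in range(l)] for _ in range(n)]
masked = [add_mask(j, params[j], pair_seeds, n, l) for j in range(n)]

# The server sees only masked vectors, yet their sum equals the true sum.
agg = [sum(col) for col in zip(*masked)]
true = [sum(col) for col in zip(*params)]
assert agg == true
```

Because every mask appears once with a plus sign and once with a minus sign across the group, the aggregate is exact while no individual upload reveals its unmasked parameters.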

Inventors

  • Zhang Xiaojun
  • Yang Chan
  • Zhao Jie
  • Jiang Xincen
  • Zhang Li
  • Tang Junli
  • Chen Juncai
  • Chen Yansong

Assignees

  • Southwest Petroleum University (西南石油大学)

Dates

Publication Date
2026-05-12
Application Date
2026-02-25

Claims (5)

  1. A privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation, characterized in that it is applied to a model training system, wherein the model training system comprises an aggregation server P, edge servers S_i, and users U_i,j, where M is the number of user groups (equal to the total number of edge servers) and n is the number of users in each group, each user group being connected to one edge server; the federated learning method comprises the following four stages: an initialization stage, in which the aggregation server sets public parameters and generates initialization parameters for each entity, the public parameters comprising an elliptic curve, a generator, a collision-resistant hash function, and a hash chain for guaranteeing the integrity of the global model; a local model training and signature generation stage, in which each user first verifies, using a digital signature algorithm, the global model sent by the aggregation server, trains the model on local data to obtain local model parameters, and applies a blinding technique to add masks to the local model parameters; a local model aggregation and secret sharing stage, in which the edge server verifies the parameter signatures uploaded by the users, aggregates the models after verification passes to obtain a local aggregation result, generates a digital signature over the local aggregation result using its own private key, secret-shares the local aggregation result to obtain shares, retains one share, and transmits the other shares to the other edge servers; and a global model updating stage, in which the aggregation server decodes the received encoded results to obtain shares, reconstructs the local aggregation results according to the secret-sharing reconstruction algorithm, verifies whether each result is consistent with the received signature, discards any result that is inconsistent with its signature, performs global aggregation over the consistent local aggregation results to obtain updated global model parameters, and adds the new round of global model parameters to the hash chain for model parameter integrity checking.
  2. The privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation of claim 1, wherein the initialization stage comprises: the aggregation server selects a generator G of order q on a non-singular elliptic curve over the finite field Z_p; it initializes a hash chain, setting the hash value of the initial state to h_0 = 0 and assigning h_old = h_0, where h_old denotes the latest node value of the hash chain after the previous round of training (in the first round it denotes the hash node value of the initial state); it computes the hash value of the initial model parameters W_0 and adds it to the hash chain, the nodes of the hash chain being computed as follows, where W_{t+1} denotes the (t+1)-th round of global model parameters, h_global denotes the hash value of the global model after the round of training is complete, and h_new denotes the node value to be appended to the hash chain after the round completes, linking the round's training result to the previous hash node; the aggregation server issues to all users U_i,j and edge servers S_i the initial model parameters W_0, the learning rate, and the newly appended hash-chain node value h_new; the aggregation server P generates a unique secure random number as a master seed for each user group, and each user U_i,j derives its exclusive seed from the master seed and its own identity, where H is the hash function, || denotes concatenation, and id_i,j is the user's unique identity identifier; the aggregation server generates a public-private key pair for the j-th user U_i,j in the i-th group and a public-private key pair for the i-th edge server S_i; and the aggregation server distributes the keys and unique identifiers of the users and edge servers through a secure channel.
  3. The privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation of claim 2, wherein the local model training and signature generation stage comprises: user U_i,j computes the hash function value of the global model parameters W_t, combines it with the locally stored hash function value, and compares the result with the hash function value h_new issued by the aggregation server; if they do not match, training ends; if they match, local training proceeds, with user U_i,j using its local data D_i,j, the t-th round global model parameters W_t, and the learning rate to obtain the t-th round local model parameters corresponding to user U_i,j, the loss function being evaluated after the user trains on its local data; user U_i,j then blinds the local training result, which comprises generating random numbers from its exclusive seed Seed_i,j, where id_i,k is the unique identity of the k-th user of the i-th group and PRG is a pseudo-random generator; U_i,j expands the set of random numbers shared among the users of the i-th group in sequence, and generates a random mask vector for the t-th round of training relative to each user U_i,k, with the state value determined by whether j > k, j = k, or j < k; opposite mask vectors are set so that the pairwise masks cancel; the model parameters with the mask added are formed by expanding the local model parameters into an l-dimensional vector and the random mask vector into an l-dimensional vector; using the user private key sk_i,j, the user computes a hash function value, computes two coordinate points on the elliptic curve, generates a hash function value, and produces a digital signature, where K_i,j denotes the identification value of the j-th user of the i-th group; the user sends the masked model parameters and the digital signature to the group's edge server S_i.
  4. The privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation of claim 3, wherein the local model aggregation and secret sharing stage comprises: the i-th edge server S_i receives the local training results sent by the users in its group and verifies the received signatures by computing a hash function value and two further coordinate points on the elliptic curve; if the computed hash function value and elliptic-curve coordinate points are inconsistent with the digital signature uploaded by the user, the result is discarded and training ends; if they are consistent, S_i performs local aggregation using the federated averaging algorithm to obtain the local aggregation result of the i-th edge server S_i in the t-th round of training, and signs it using its own private key sk_i; the local aggregation result is then secret-shared using a secret sharing algorithm: it is expanded into an l-dimensional vector and written into a polynomial; edge server S_i computes the shares of its local aggregation model to be distributed to the other edge servers S_k, where x_k is the unique identification value of edge server S_k, and sends each share to the corresponding edge server S_k; the share list to be sent by edge server S_i and the share list received from the other edge servers are arranged accordingly; the v-th edge server S_v encodes all of its shares using a super-increasing sequence to obtain an encoded value, and sends the encoded result and its signature to the aggregation server.
  5. The privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation of claim 4, wherein the global model updating stage comprises: the aggregation server receives the encoded results and decodes them stepwise, where the relevant quantity represents the proportion of the local aggregation model distributed by edge server S_m to edge server S_v and a floor (round-down) function is applied, yielding the share of each edge server; the shares are sorted and reconstructed from the share list using the Lagrange interpolation formula, the polynomial coefficients are computed, and the local aggregation result is extracted; each local aggregation result is verified against the public key, signature, and reconstruction result of the corresponding edge server; if verification fails, training stops; if it passes, global aggregation is performed using the federated averaging algorithm to obtain the (t+1)-th round of global model parameters; the global model parameter update W_{t+1} is written into the hash chain by computing the hash value of the global model update and chaining it to the previous hash node, which facilitates subsequent integrity verification of the global model parameters; the new round of the global model is issued to each user, each user performs an integrity check upon receiving it, proceeds to the next round of training if the check passes, and terminates training if it does not.
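The secret sharing and Lagrange reconstruction described in claims 4 and 5 follow the standard Shamir construction. Below is a minimal sketch over a prime field; the modulus, threshold, and share x-coordinates are illustrative assumptions, not values from the patent. A degree-(t-1) polynomial with the secret as its constant term is evaluated at each server's identification value x_k, and any t shares recover the secret by Lagrange interpolation at x = 0.

```python
import random

P = 2**61 - 1  # a Mersenne prime used as the field modulus (illustrative)

def make_shares(secret, threshold, xs):
    # Random polynomial of degree threshold-1 with f(0) = secret;
    # each server with identifier x receives the share (x, f(x)).
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return {x: f(x) for x in xs}

def reconstruct(shares):
    # Lagrange interpolation at x = 0 over the prime field
    secret = 0
    for xj, yj in shares.items():
        num = den = 1
        for xk in shares:
            if xk != xj:
                num = num * (-xk) % P
                den = den * (xj - xk) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(123456, threshold=3, xs=[1, 2, 3, 4, 5])
subset = {x: shares[x] for x in (2, 4, 5)}  # any 3 of 5 shares suffice
assert reconstruct(subset) == 123456
```

The threshold property is what gives the scheme its fault tolerance: an edge server may drop out without preventing the aggregation server from reconstructing the local aggregation result, while fewer than t shares reveal nothing about it.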

Description

Federated learning method with verifiable secure aggregation based on privacy-preserving fault-tolerant layering

Technical Field

The invention relates to the technical field of information-security communication, and in particular to a privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation.

Background

With the rapid development of artificial intelligence, data privacy and security have become increasingly important. Traditional distributed machine learning architectures require clients to upload local data to cloud servers, which perform the related computation; this clearly increases the risk of data leakage for the user. Because of data-privacy constraints, multiple data sources cannot exchange data directly, producing the phenomenon of data islands. Federated learning was developed to solve this problem. Federated learning is a distributed learning architecture that, unlike traditional distributed learning, does not require clients to upload raw data: users train on their data locally and upload only the trained results, which offers a degree of privacy protection. A risk of privacy disclosure nevertheless remains, since an attacker may infer characteristics of the user data from the model parameters the user uploads, and may even tamper with the training results the user uploads. A safer federated learning method is therefore needed, one that does not affect training accuracy while reducing computation and communication costs.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation, which achieves high training accuracy and low computation and communication overhead while guaranteeing security.
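The federated averaging algorithm referenced throughout (used for both local aggregation at the edge servers and global aggregation) can be sketched as a size-weighted mean of client parameter vectors. The function name and the use of local dataset sizes as weights follow the standard FedAvg convention and are assumptions for illustration, not details taken from the patent text.

```python
def fed_avg(client_params, client_sizes):
    # FedAvg: average client parameter vectors, weighted by local dataset size
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [sum(w[i] * s for w, s in zip(client_params, client_sizes)) / total
            for i in range(dim)]

# Two clients holding 10 and 30 samples respectively
agg = fed_avg([[1.0, 2.0], [3.0, 4.0]], [10, 30])
print(agg)  # [2.5, 3.5]
```

Weighting by dataset size means a client with more data pulls the global model further toward its local optimum, which is the usual choice when group sizes are unequal.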
The aim of the invention is realized by the following technical scheme. The privacy-preserving, fault-tolerant, hierarchical federated learning method with verifiable secure aggregation is applied to a model training system, wherein the model training system comprises an aggregation server P, edge servers S_i, and users U_i,j, where M is the number of user groups (equal to the total number of edge servers) and n is the number of users in each group, each user group being connected to one edge server; the federated learning method comprises the following four stages: an initialization stage, in which the aggregation server sets public parameters and generates initialization parameters for each entity, the public parameters comprising an elliptic curve, a generator, a collision-resistant hash function, and a hash chain for guaranteeing the integrity of the global model; a local model training and signature generation stage, in which each user first verifies, using a digital signature algorithm, the global model sent by the aggregation server, trains the model on local data to obtain local model parameters, and applies a blinding technique to add masks to the local model parameters; a local model aggregation and secret sharing stage, in which the edge server verifies the parameter signatures uploaded by the users, aggregates the models after verification passes to obtain a local aggregation result, generates a digital signature over the local aggregation result using its own private key, secret-shares the local aggregation result to obtain shares, retains one share, and transmits the other shares to the other edge servers; and a global model updating stage, in which the aggregation server decodes the received encoded results to obtain shares, reconstructs the local aggregation results according to the secret-sharing reconstruction algorithm, verifies whether each result is consistent with the received signature, discards any result that is inconsistent with its signature, performs global aggregation over the consistent local aggregation results to obtain updated global model parameters, and adds the new round of global model parameters to the hash chain for model parameter integrity checking. Further, the initialization stage specifically includes: the aggregation server selects a generator G of order q on a non-singular elliptic curve over the finite field Z_p; it initializes a hash chain, setting the hash value of the initial state to h_0 = 0 and assigning h_old = h_0, where h_old denotes the latest node value of the hash chain after the previous round of training (in the first round it denotes the hash node value of the initial state); it computes the hash value of the initial model parameters W_0 and adds it to the hash chain, the nodes of the hash chain being computed as follows, where W_{t+1} denotes the (t+1)-th round of global model parameters and h_globa