CN-122027112-A - Method and system for decentralized federated learning
Abstract
The application discloses a method and a system for decentralized federated learning, which use a blockchain to store the initial global model and its hash value, ensuring that the model is tamper-resistant and traceable and enhancing the credibility of the system. Each client generates a homomorphic encryption key pair and a verifiable random function (VRF) key pair, and encrypts its update result under the CKKS public key to produce a ciphertext gradient, so that data remain private during transmission and aggregation and the risk of privacy disclosure is avoided. Aggregators are elected by verifiable random function, avoiding centralized control and improving the system's degree of decentralization. The aggregator performs normalization detection and weighted aggregation on the ciphertext gradients and re-evaluates the clients' reputation values, ensuring robust aggregation and preventing malicious clients from interfering with the aggregation process. Finally, the clients cooperatively submit decryption shares to reconstruct the global model, further ensuring the security and accuracy of the model update.
Inventors
- LUO HONGJUN
- HUANG YICHENG
- ZHOU ZHOU
Assignees
- Guizhou Industry Polytechnic College (贵州工业职业技术学院)
- Guizhou University (贵州大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260116
Claims (9)
- 1. A method of decentralized federated learning, comprising: step S101, initializing the system, which comprises storing an initial global model and its hash value on a blockchain, each client generating a homomorphic encryption key pair on a ring and a verifiable random function key pair, and setting initial reputation values; step S102, electing the aggregator for each round among the clients based on a verifiable random function; step S103, each client acquiring the previous round's global model stored on the blockchain, performing local training, and encrypting the update result with the CKKS public key to generate a ciphertext gradient; step S104, the aggregator client collecting the ciphertext gradients broadcast by the non-aggregator clients and performing normalization detection on them; step S105, the aggregator client performing weighted aggregation after normalization detection is complete to obtain an updated global model, and re-evaluating the clients' current reputation values based on smart contracts to form an updated reputation value set; step S106, the aggregator client uploading the updated global model, the election proof and the updated reputation value set to the blockchain; step S107, each client broadcasting its decryption share to the other clients and the blockchain, and reconstructing the updated global model.
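By way of illustration only (not part of the claims), one round of the claimed method can be sketched as a plaintext simulation; all function and variable names here are illustrative assumptions, and real encryption, VRF evaluation and blockchain storage are replaced by plaintext stand-ins:

```python
import hashlib

def run_round(models, reputations, seed, threshold=0.5):
    """Simulate one round of the protocol in plaintext.

    models: dict client_id -> local model-update gradient (list of floats)
    reputations: dict client_id -> reputation value in [0, 1]
    """
    # S102: elect the aggregator via a hash-based stand-in for a VRF
    outputs = {c: int(hashlib.sha256(f"{seed}:{c}".encode()).hexdigest(), 16) / 2**256
               for c in models}
    candidates = [c for c, y in outputs.items() if y < threshold] or list(models)
    aggregator = min(candidates, key=lambda c: outputs[c])

    # S103: each client normalizes its update (encryption omitted in this sketch)
    def norm(v):
        n = sum(x * x for x in v) ** 0.5
        return [x / n for x in v] if n else list(v)

    grads = {c: norm(g) for c, g in models.items()}

    # S105: reputation-weighted aggregation by the elected aggregator
    total = sum(reputations[c] for c in grads)
    dim = len(next(iter(grads.values())))
    agg = [sum(reputations[c] * grads[c][i] for c in grads) / total
           for i in range(dim)]
    return aggregator, agg
```

The sketch keeps only the control flow of steps S102, S103 and S105; the on-chain storage and threshold decryption of the remaining steps are omitted.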
- 2. The method of decentralized federated learning according to claim 1, wherein step S101 specifically comprises: the client set is C = {c_1, c_2, …, c_n}; each client c_i generates a homomorphic encryption key pair (pk_i, sk_i) on the ring R_q = Z_q[X]/(X^N + 1), where q is the polynomial modulus, N is the polynomial degree, and Δ is the scaling factor; each client generates a verifiable random function (VRF) key pair (vk_i, sk_i^VRF); the identity of each client is registered through the smart contract KEYREGISTRY, recording each client's public keys for public verification; the blockchain stores the initial global model W_0, its hash value h(W_0), and the initial reputation value R_i^0 of each client.
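As a toy illustration of the role of the scaling factor Δ in CKKS (not the claimed key generation itself): real CKKS packs a complex vector into a polynomial of the ring R_q, which is omitted here; the scalar version below, with assumed parameter values, only shows how Δ maps reals into the modular domain while preserving addition:

```python
# Toy CKKS-style fixed-point encoding. Real CKKS encodes vectors into
# polynomials in Z_q[X]/(X^N + 1); this sketch keeps only the scaling step.
DELTA = 2 ** 40          # scaling factor (an assumed, commonly used choice)
Q = 2 ** 60              # stand-in for the ciphertext modulus q

def encode(x: float) -> int:
    """Scale a real number by DELTA and round into Z_Q."""
    return round(x * DELTA) % Q

def decode(m: int) -> float:
    """Centered lift out of Z_Q, then divide by DELTA."""
    if m > Q // 2:
        m -= Q
    return m / DELTA

# Adding encodings corresponds to adding the underlying plaintexts.
a, b = encode(0.25), encode(-0.125)
s = decode((a + b) % Q)   # recovers 0.25 + (-0.125) = 0.125
```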
- 3. The method of decentralized federated learning according to claim 2, wherein step S102 specifically comprises: the blockchain generates a random seed seed_r = H(B_{r-1}), where H(B_{r-1}) is the hash of the previous round's block; client c_i performs a VRF operation on the random seed using its private key sk_i^VRF, obtaining a random output y_i and a corresponding proof π_i; if and only if y_i < τ, client c_i is selected as a candidate aggregator, where τ is a preset threshold; if multiple candidate aggregators exist, the one with the smallest y_i is selected as the actual aggregator.
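For illustration, the election logic can be sketched in Python; a real VRF (e.g. ECVRF per RFC 9381) requires elliptic-curve cryptography and a publicly verifiable proof, so the keyed hash below is only an assumed, non-verifiable stand-in:

```python
import hashlib
import hmac

def vrf_eval(sk: bytes, seed: bytes):
    """Stand-in for a VRF evaluation: keyed hash output plus a 'proof'.
    (A real VRF lets anyone check y against the public key via the proof.)"""
    y = hmac.new(sk, seed, hashlib.sha256).digest()
    proof = y  # placeholder; real proofs are elliptic-curve points
    return y, proof

def elect(clients: dict, seed: bytes, tau: float):
    """clients: id -> secret key. Returns (aggregator_id, outputs)."""
    outputs = {}
    for cid, sk in clients.items():
        y, _ = vrf_eval(sk, seed)
        outputs[cid] = int.from_bytes(y, "big") / 2**256  # map into [0, 1)
    candidates = [c for c, v in outputs.items() if v < tau]
    if not candidates:          # degenerate case: nobody under threshold
        candidates = list(outputs)
    # tie-break as in the claim: smallest output wins
    return min(candidates, key=outputs.get), outputs
```

Because the seed is derived from the previous block's hash, the election is deterministic given the chain state yet unpredictable in advance.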
- 4. The method of decentralized federated learning according to claim 3, wherein step S103 specifically comprises: the aggregator client c_a extracts the previous round's global model W_{r-1} in ciphertext form from the state variables on the blockchain and checks whether the global model's hash value h(W_{r-1}) is consistent with the record on the blockchain; if this verification passes, the aggregator client c_a checks whether the version hashes match; if that verification passes, the aggregator client c_a distributes the previous round's global model to the remaining clients; each client c_i trains the global model on its local private dataset D_i and computes the model update gradient g_i; each client c_i normalizes the model update gradient as ĝ_i = g_i / ‖g_i‖_2, obtaining the normalized gradient ĝ_i; each client c_i encrypts the normalized gradient with the CKKS public key, obtaining the ciphertext gradient [[ĝ_i]].
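The two client-side checks of this step, hash verification and gradient normalization, can be sketched as follows (the JSON serialization and function names are illustrative assumptions; any canonical serialization would do):

```python
import hashlib
import json

def model_hash(weights) -> str:
    """Hash a serialized model so a client can compare it against the
    value recorded on chain before training on it."""
    return hashlib.sha256(json.dumps(weights).encode()).hexdigest()

def l2_normalize(grad):
    """Normalize a model-update gradient to unit Euclidean norm, so that
    no single client can dominate aggregation through sheer magnitude."""
    norm = sum(g * g for g in grad) ** 0.5
    if norm == 0.0:
        return list(grad)
    return [g / norm for g in grad]
```

In the claimed method the normalized gradient would then be CKKS-encrypted before broadcast; here it is left in plaintext.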
- 5. The method of decentralized federated learning according to claim 4, wherein step S104 specifically comprises: the aggregator client c_a acquires the ciphertext gradients [[ĝ_i]] broadcast by the non-aggregator clients together with their current reputation values R_i; the aggregator client c_a computes the Euclidean norm of each ciphertext gradient by recursive folding in the ciphertext domain and performs normalization detection on the ciphertext gradient.
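The recursive-folding norm computation can be illustrated in plaintext; in the ciphertext domain the same pattern is realized with CKKS slot rotations, so a vector of n slots needs about log2(n) rotate-and-add steps rather than n-1 additions (the power-of-two assumption below reflects CKKS slot counts):

```python
def folded_sq_norm(vec):
    """Squared Euclidean norm by recursive folding: square element-wise,
    then repeatedly add the top half of the vector onto the bottom half
    until a single accumulator remains."""
    acc = [x * x for x in vec]
    n = len(acc)
    assert n & (n - 1) == 0, "folding sketch assumes power-of-two length"
    while n > 1:
        half = n // 2
        for i in range(half):
            acc[i] += acc[i + half]  # in CKKS: one rotation plus one addition
        n = half
    return acc[0]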
- 6. The method of decentralized federated learning according to claim 5, wherein step S105 specifically comprises: the aggregator client c_a performs preliminary screening of malicious clients according to the non-aggregator clients' current reputation values R_i, removing clients whose reputation value is lower than a preset reputation value R_min; the aggregator client c_a computes each remaining non-aggregator client's directional consistency cos_i through homomorphic cosine similarity against the encrypted mean update vector [[ḡ]]; the aggregator client c_a assigns each non-aggregator client a weight w_i according to its current reputation value R_i and its directional consistency; the aggregator client c_a performs weighted aggregation in the ciphertext domain to obtain the updated global model in ciphertext form [[W_r]]; having obtained the updated global model in ciphertext form, the aggregator client c_a re-evaluates each client's current reputation value based on the updated global model, forming the updated reputation value set {R_i^r}.
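A plaintext sketch of this aggregation step follows; the weight rule (reputation times clipped cosine similarity) and the threshold default are illustrative assumptions, since the claim fixes only that weights depend on reputation and directional consistency, and the real computation runs homomorphically on ciphertexts:

```python
def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def weighted_aggregate(grads, reputations, rep_min=0.2):
    """Drop low-reputation clients, weight the rest by reputation times
    directional consistency against the mean update, then average."""
    kept = {c: g for c, g in grads.items() if reputations[c] >= rep_min}
    dim = len(next(iter(kept.values())))
    mean = [sum(g[i] for g in kept.values()) / len(kept) for i in range(dim)]
    weights = {c: reputations[c] * max(cosine(g, mean), 0.0)
               for c, g in kept.items()}
    total = sum(weights.values()) or 1.0
    return [sum(weights[c] * kept[c][i] for c in kept) / total
            for i in range(dim)]
```

Clipping negative cosines to zero means updates pointing against the consensus direction contribute nothing, which is one simple way to realize the claimed robustness against malicious gradients.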
- 7. The method of decentralized federated learning according to claim 6, wherein step S106 specifically comprises: each client verifies the identity of the round's aggregator client c_a through the VRF public key and proof; after the verification passes, the aggregator client c_a constructs a block B_r containing the election proof π_a and auxiliary data comprising the round number, the timestamp and the actual aggregation client set; the aggregator client c_a broadcasts the block B_r to the local area network; each client verifies the correctness and consistency of the data inside block B_r; after verification passes, the block B_r is stored on chain through the smart contract.
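The block construction and client-side verification can be sketched as below; the field names and JSON serialization are illustrative assumptions, and signature checks of the election proof are omitted:

```python
import hashlib
import json
import time

def build_block(prev_hash, model_hash, proof, reputations, round_no):
    """Build the step-S106 block: election proof plus auxiliary data
    (round number, timestamp, reputation set), committed by a hash."""
    body = {
        "prev": prev_hash,          # links the block to the previous round
        "model": model_hash,        # hash of the updated global model
        "proof": proof,             # election proof pi_a
        "reputations": reputations, # updated reputation value set
        "round": round_no,
        "ts": int(time.time()),
    }
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {"body": body, "hash": digest}

def verify_block(block) -> bool:
    """Each client recomputes the hash before accepting the block."""
    body = json.dumps(block["body"], sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest() == block["hash"]
```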
- 8. The method of decentralized federated learning according to claim 7, wherein step S107 specifically comprises: the selected t clients each compute a partial decryption result of the updated global model in ciphertext form [[W_r]]; the t clients each transmit their partial decryption result to the aggregator client c_a or to a preset synthesis node, which combines them into the updated global model W_r in plaintext form by a threshold decryption algorithm.
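Threshold decryption of a CKKS ciphertext combines partial decryptions under a shared secret key; as a self-contained stand-in for that combining step, the sketch below shows t-of-n Shamir secret sharing with Lagrange reconstruction over a prime field (the prime and function names are assumptions, not the claimed algorithm itself):

```python
import random

P = 2**61 - 1  # a Mersenne prime used as the field modulus

def make_shares(secret, t, n):
    """Split `secret` into n Shamir shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation of the polynomial
            acc = (acc * x + c) % P
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret
```

Fewer than t shares reveal nothing about the secret, which mirrors why no single client can decrypt the aggregated model alone.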
- 9. A decentralized federated learning system, characterized by comprising a plurality of clients and a blockchain, wherein: the system is initialized by storing an initial global model and its hash value on the blockchain, each client generating a homomorphic encryption key pair on a ring and a verifiable random function key pair, and setting an initial reputation value; an aggregator is elected for each round among the clients based on the verifiable random function; each client acquires the previous round's global model stored on the blockchain, performs local training, and encrypts the update result with the CKKS public key to generate a ciphertext gradient; the aggregator client collects the ciphertext gradients broadcast by the non-aggregator clients and performs normalization detection on them; the aggregator client performs weighted aggregation after normalization detection is complete to obtain an updated global model, and re-evaluates the clients' current reputation values based on smart contracts to form an updated reputation value set; the aggregator client uploads the updated global model, the election proof and the updated reputation value set to the blockchain, and broadcasts the block to the local area network; and each client broadcasts its decryption share to the other clients and the blockchain, and the updated global model is reconstructed.
Description
Method and system for decentralized federated learning

Technical Field

The application relates to the technical field of federated learning, in particular to a decentralized federated learning method and system.

Background

With the rapid development of artificial intelligence and the Internet of Things, massive data are continuously generated and accumulated by large-scale distributed terminal devices. Federated learning was developed to enable model training without centralizing the data: each terminal device trains model parameters locally using its own data and uploads model updates to a server for aggregation, thereby optimizing the global model while protecting data privacy. Current federated learning research aims at collaborative modeling without revealing participants' local data, so as to meet the requirements of safe, trusted and high-quality model training. Privacy protection and robustness guarantees are the two key directions toward this goal, around which the prior art has largely been studied.

For privacy protection, existing schemes generally adopt homomorphic encryption (HE), which allows a server or aggregating party to operate directly on ciphertexts, realizing encrypted aggregation and markedly reducing the risk of privacy disclosure. Secure multiparty computation (MPC) avoids the attack surface created by concentrating information, by splitting sensitive information into multiple secret shares held by different parties. Differential privacy (DP) adds controlled noise to the uploaded gradients or model parameters and in theory provides a strict upper bound on privacy leakage, making it a common protection means in industry.
For robustness and security, the Krum algorithm measures the Euclidean distance between each client's gradient and the others' and selects the gradient closest to the majority of normal updates as the aggregation basis, minimizing the deviation caused by malicious clients. The Bulyan algorithm further performs coordinate-wise statistical clipping on Krum's candidate set, removing outliers in every dimension of the gradient and improving resistance to strong malicious attacks.

Despite this progress in privacy protection and robust security, significant shortcomings remain in large-scale and adversarial environments. Among the privacy techniques, HE carries a large performance cost, offers limited functionality and has difficulty supporting nonlinear operations directly, causing latency and implementation complexity; MPC scales poorly, rests on implicit trust assumptions, and its multiple rounds of interaction increase the communication load and the risk of leaking sensitive information; DP noise degrades model quality and is hard to tune, and a general regulation mechanism suitable for different tasks is lacking. Among the robust security means, Krum is sensitive to Non-IID data, depends on data homogeneity and a majority-honest assumption, and is prone to misjudgment, model deviation and reduced training efficiency; Bulyan has high computational complexity, adapts poorly to high-dimensional models, and its multi-stage filtering markedly increases resource consumption on large models while lacking robust adaptability.

Disclosure of Invention

The application provides a decentralized federated learning method and system, which aim to solve the above technical problems.
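The Krum selection rule described above can be sketched compactly (a plain implementation of the published algorithm, with n updates and at most f Byzantine clients; the function name is our own):

```python
def krum(grads, f):
    """Krum: for each update, sum the squared distances to its n - f - 2
    nearest neighbours; the update with the smallest score is selected."""
    n = len(grads)
    assert n - f - 2 >= 1, "Krum requires n > f + 2"
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    scores = []
    for i, gi in enumerate(grads):
        dists = sorted(sqdist(gi, gj) for j, gj in enumerate(grads) if j != i)
        scores.append(sum(dists[: n - f - 2]))   # closest n - f - 2 neighbours
    return min(range(n), key=lambda i: scores[i])
```

An obvious outlier update accumulates large distances to every honest update and is never selected, which is exactly the property the background discussion relies on.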
In view of this, a first aspect of the present application provides a method of decentralized federated learning, the method comprising: step S101, initializing the system, which comprises storing an initial global model and its hash value on a blockchain, each client generating a homomorphic encryption key pair on a ring and a verifiable random function key pair, and setting initial reputation values; step S102, electing the aggregator for each round among the clients based on a verifiable random function; step S103, each client acquiring the previous round's global model stored on the blockchain, performing local training, and encrypting the update result with the CKKS public key to generate a ciphertext gradient; step S104, the aggregator client collecting the ciphertext gradients broadcast by the non-aggregator clients and performing normalization detection on them; step S105, the aggregator client performing weighted aggregation after normalization detection is complete to obtain an updated global model, and re-evaluating the clients' current reputation values based on smart contracts to form an updated reputation value set; step S106, the aggregator client uploading the updated global model, the election proof and the updated reputation value set to the blockchain