
CN-115983408-B - Federated learning system and method based on a verifiable trusted server

CN115983408B

Abstract

The application discloses a federated learning system based on a verifiable trusted server, comprising the trusted server and clients participating in training. The trusted server comprises an initialization module, a security aggregation module, a differential privacy module, and a proof generation module; each client comprises a local training module and a verification module. The application further provides a method for the federated learning system based on the verifiable trusted server, comprising the following steps: the trusted server obtains an initialized model and model parameters and sends the initialized model parameters to the clients; each client trains and optimizes the received parameters locally to obtain optimized model parameters and sends them, together with a local generator parameter, to the trusted server; the trusted server performs secure aggregation and differential privacy on the optimized model parameters to obtain an updated model and model parameters, which are sent to the clients together with a verifiable proof; and each client verifies the verifiable proof sent by the server. The application improves security.

Inventors

  • YU JINGYUE

Assignees

  • 启明星辰信息技术集团股份有限公司
  • 北京启明星辰信息安全技术有限公司

Dates

Publication Date
2026-05-12
Application Date
2021-10-14

Claims (4)

1. A method of a federated learning system based on a verifiable trusted server, the federated learning system comprising a trusted server and clients participating in training. The trusted server comprises an initialization module, a security aggregation module, a differential privacy module, and a proof generation module. The initialization module is used for obtaining an initial model and model parameters and sending them to the clients participating in training; the security aggregation module is used for performing secure aggregation on the local model parameters sent by the clients participating in training to the trusted server, obtaining aggregated model parameters; the differential privacy module is used for adding noise to the aggregated model parameters according to a preset privacy budget; the proof generation module is used for generating a verifiable proof. Each client comprises a local training module and a verification module: the local training module is used for locally training and optimizing the initial or updated model and model parameters received by the client and sending the resulting optimized model parameters to the trusted server; the verification module is used for verifying the verifiable proof sent by the server; when verification succeeds, the local training module trains, and when verification fails, training stops. The method comprises the following steps: S10, the trusted server obtains an initialized model and model parameters and sends them to the clients participating in training; S20, each client participating in training trains the received model and model parameters locally to generate local model parameters, and optimizes the local model parameters to obtain optimized model parameters; the client sends the optimized model parameters, together with a local generator parameter used for verification, to the trusted server, wherein the local generator parameter is a generator of the public finite cyclic group G; S30, the trusted server performs secure aggregation on the optimized model parameters to obtain aggregated model parameters, adds corresponding noise to the aggregated model parameters according to a preset privacy budget to obtain model parameters after differential privacy, and updates the model; S40, the trusted server computes and generates a verifiable proof from the local generator parameters sent by the clients participating in training, via the zk-SNARK protocol; S50, the updated model and model parameters are sent, together with the verifiable proof, to the clients participating in training in the next round; S60, each client participating in the next round of training verifies the verifiable proof sent by the server; if verification passes, proceed to step S20, and if verification fails, exit training.
2. The method of claim 1, wherein in step S20, pruning is used to optimize the local model parameters w_i, obtaining optimized model parameters through the pruning operation Clip(w_i, C) = w_i / max(1, ||w_i||_2 / C), wherein Clip(·) is the pruning function, C is the pruning threshold, and ||w_i||_2 / C is the regularization term of the function.
3. The method of the verifiable trusted-server-based federated learning system of claim 2, wherein in step S30, the optimized model parameters w̃_i are aggregated to obtain the aggregated model parameters w̄ = (1/I) · Σ_{i=1}^{I} w̃_i, where I is the number of clients participating in the training.
4. The method of the verifiable trusted-server-based federated learning system of claim 3, wherein in step S30, the model parameters after differential privacy are ŵ = w̄ + N(0, σ²), wherein σ = z·γ, z is the noise coefficient used for balancing privacy and practicability, γ is the sensitivity of the Gaussian mechanism on the function f, N(0, σ²) is a normal distribution with mean 0 and standard deviation σ, and C is the pruning threshold.
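The arithmetic of claims 2 to 4 can be sketched numerically. Because the original formula images did not survive extraction, the clipping rule Clip(w, C) = w / max(1, ||w||_2 / C) and the noise scale sigma = z * C / I are assumed standard readings of the symbols the claims name (C pruning threshold, I number of clients, z noise coefficient), not the patent's exact formulas:

```python
import numpy as np

def clip_update(w, C):
    # Claim 2: prune a local update so its L2 norm is at most C.
    return w / max(1.0, np.linalg.norm(w) / C)

def aggregate(updates):
    # Claim 3: average the clipped updates of the I participating clients.
    return np.mean(updates, axis=0)

def add_dp_noise(w_agg, z, C, I, rng):
    # Claim 4: Gaussian mechanism. With per-client clipping at C and
    # averaging over I clients, the assumed sensitivity is C / I.
    sigma = z * C / I
    return w_agg + rng.normal(0.0, sigma, size=w_agg.shape)

rng = np.random.default_rng(0)
updates = [clip_update(rng.normal(size=4), C=1.0) for _ in range(5)]
noisy = add_dp_noise(aggregate(updates), z=1.1, C=1.0, I=5, rng=rng)
```

Clipping first bounds each client's contribution, which is what makes the sensitivity of the averaged result finite and lets the preset privacy budget determine the noise scale.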

Description

Federated learning system and method based on a verifiable trusted server

Technical Field

The application relates to the field of data security protection, and in particular to a federated learning system and method based on a verifiable trusted server.

Background

In the prior art, federated learning is a distributed machine learning framework whose main application is to let multiple clients cooperatively solve a machine learning problem under the coordination of a trusted server, which is usually operated by a well-known organization. To protect client privacy, the clients' user data never leaves local storage; after each client trains locally, it sends model parameters or gradients to the trusted server, helping multiple participants jointly build a machine learning model. With the further development of big data, privacy-leakage events involving user information have become frequent, and data privacy and security have become a worldwide concern. Research has shown that although federated learning ensures that a user's original data never leaves the local device, an attacker can intercept the gradients or model parameters uploaded by the user, causing the original training data to be leaked and other security problems. Current federated learning systems typically address this by adding noise at the local clients and at the server simultaneously and applying differential privacy synchronously, so that the noise in the secure aggregation result from the clients ultimately matches the noise added at the central server.
However, in (horizontal) federated learning it is generally not known in advance how many clients will participate in training; users who do participate may drop out mid-training because of poor network conditions or a low battery; and careless clients may forget to apply differential privacy locally. Any of these can cause the differential privacy actually achieved to differ from the level designed in advance (even when a trusted server exists), sacrificing either model accuracy or user data privacy to some extent. There is therefore an urgent need for more reliable technical protection than conventional products provide.

Disclosure of the Invention

To solve this problem, the application provides a federated learning system based on a verifiable trusted server, comprising the trusted server and clients participating in training. The trusted server comprises an initialization module, a security aggregation module, a differential privacy module, and a proof generation module. The initialization module is used for obtaining an initial model and model parameters and sending them to the clients participating in training; the security aggregation module is used for performing secure aggregation on the local model parameters sent by the clients participating in training to the trusted server, obtaining aggregated model parameters; the differential privacy module is used for adding noise to the aggregated model parameters according to a preset privacy budget; the proof generation module is used for generating a verifiable proof. Each client comprises a local training module and a verification module: the local training module is used for locally training and optimizing the initial or updated model and model parameters received by the client and sending the resulting optimized model parameters to the trusted server; the verification module is used for verifying the verifiable proof sent by the server; when verification succeeds, the local training module trains, and when verification fails, training stops.

The present application also provides a method using the verifiable trusted-server federated learning system described above, the steps comprising: S10, the trusted server obtains an initialized model and model parameters and sends them to the clients participating in training; S20, each client participating in training trains the received model and model parameters locally to generate local model parameters, and optimizes the local model parameters to obtain optimized model parameters; the client sends the optimized model parameters, together with a local generator parameter used for verification, to the trusted server, wherein the local generator parameter is a generator of the public finite cyclic group G; S30, the trusted server performs secure aggregation on the optimized model parameters in a black-box manner to obtain aggregated model parameters, adds corresponding noise to the aggregated model parameters according to a preset privacy budget to obtain model parameters after differential privacy, and updates the model
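The round described in steps S10 to S60 can be sketched as a single loop iteration. The verify_proof stub below stands in for the zk-SNARK verification, which the patent performs against a generator of the public cyclic group G; every name and number here is illustrative rather than taken from the patent:

```python
import numpy as np

class Client:
    """Toy client for step S20: 'training' just perturbs the received parameters."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)

    def train(self, params):
        return params + 0.1 * self.rng.normal(size=params.shape)

def verify_proof(proof, generator):
    # Stand-in for the zk-SNARK check of step S60; a real system would
    # verify the proof against the public group generator.
    return proof == "ok"

def run_round(params, clients, proof, generator="g", z=1.1, C=1.0):
    if not verify_proof(proof, generator):          # S60: verify or exit
        raise RuntimeError("proof verification failed; exiting training")
    local = [c.train(params) for c in clients]      # S20: local training
    clipped = [w / max(1.0, np.linalg.norm(w) / C) for w in local]
    aggregated = np.mean(clipped, axis=0)           # S30: secure aggregation
    rng = np.random.default_rng(1)                  # S30: differential privacy
    return aggregated + rng.normal(0.0, z * C / len(clients), aggregated.shape)

clients = [Client(i) for i in range(3)]
updated = run_round(np.zeros(4), clients, proof="ok")
```

Because verification happens before any local training, a client that rejects the server's proof drops out of the round without ever exposing a fresh update, which is the safety property the claims aim at.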