CN-115983340-B - Plug-and-play heterogeneous model-based federated learning method

CN115983340B

Abstract

The application provides a federated learning method based on a plug-and-play heterogeneous model, which comprises: step 1, training a generative model; step 2, initializing the model; step 3, training a client model M_i in each client; step 4, server aggregation, namely aggregating at the server the sampled data d̃_i, with (x̃, ỹ) ∈ d̃, from each client i, together with parameters including the amount of data participating in training; step 5, server-side data distribution, namely redistributing the sampled data set d̃ into sub-data sets d_i according to the data-category distribution weights of each client's private data and distributing each d_i to its client; step 6, model testing; and step 7, the server judging whether to continue with the next communication. The application improves the effectiveness and efficiency of model training.
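To make the class-weighted redistribution of step 5 concrete, the sketch below splits a pooled set of generated samples among clients in proportion to each client's private class distribution. This is a minimal illustration under stated assumptions, not the patented implementation; the function name redistribute and the per-client label-count dictionaries are assumptions.

```python
import numpy as np

def redistribute(sampled_labels, client_class_weights, rng=None):
    """Split indices of a pooled sample set among clients, drawing each
    class in proportion to that client's private class distribution."""
    rng = rng or np.random.default_rng(0)
    sampled_labels = np.asarray(sampled_labels)
    n_clients = len(client_class_weights)
    assignments = {i: [] for i in range(n_clients)}
    for c in np.unique(sampled_labels):
        idx = rng.permutation(np.where(sampled_labels == c)[0])
        # weight of class c at each client, normalized over clients
        w = np.array([cw.get(int(c), 0.0) for cw in client_class_weights], dtype=float)
        w = w / w.sum() if w.sum() > 0 else np.full(n_clients, 1.0 / n_clients)
        counts = np.floor(w * len(idx)).astype(int)
        counts[-1] = len(idx) - counts[:-1].sum()  # hand the remainder to the last client
        start = 0
        for i, n in enumerate(counts):
            assignments[i].extend(idx[start:start + n].tolist())
            start += n
    return assignments

# Example: client 0 holds mostly class 0, client 1 mostly class 1.
labels = [0] * 10 + [1] * 10
parts = redistribute(labels, [{0: 9, 1: 1}, {0: 1, 1: 9}])
```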

Inventors

  • Du Xiaoyu
  • Zhao Congjian
  • Tang Jinhui

Assignees

  • Nanjing University of Science and Technology (南京理工大学)

Dates

Publication Date
2026-05-05
Application Date
2022-12-21

Claims (5)

  1. A federated learning method based on a plug-and-play heterogeneous model, the method comprising: step 1, training a generative model: initializing a generative model G and a discriminative model C at the server side, and training G and C using the training method of a generative adversarial network, wherein G is used to generate fake pictures of the data set D*, and C is used to discriminate the real images x from the public data set D*; step 2, initializing the model; step 3, training a client model M_i in each client; step 4, server aggregation: aggregating at the server side the sampled data d̃_i from each client i, together with parameters including the amount of data participating in training; step 5, server-side data distribution: redistributing the sampled data set d̃ into sub-data sets d_i according to the data-category distribution weights of each client's private data, and distributing them to the clients; step 6, model testing: in each round of communication, testing the accuracy on the test set using the classification models M_i whose parameters were updated by the clients in that round, and weighting to obtain the average test accuracy of the model; step 7, the server side judging whether to continue with the next communication: if yes, returning to step 3; if not, ending the communication and saving the global network model parameters; wherein training the client model M_i in each client comprises: step 3-1, the client trains the client model M_i using its private data D_i to obtain the loss function of the client model; step 3-2, from the second round of communication onward, the pictures d_i received from the server side are used as an additional data set for training the model M_i; step 3-3, starting from a randomly generated hidden code h, a Markov chain Monte Carlo sampler is defined by a Metropolis-adjusted Langevin algorithm; the sampler uses the client model M_i as the condition network and G as the generation network, and updates the hidden code h by conditional image generation that maximizes the activation of a specified category y_c, obtaining pictures x̃ of category y_c; the pictures (x̃, ỹ) are taken as the data d̃_i to be delivered to the server side, where ỹ is the class label of the data in the data set d̃_i; step 3-4, the sampled data d̃_i is sent to the server side; the sampling rule in step 3-3 is: h_{t+1} = h_t + ε_1 ∂log p(h_t)/∂h_t + ε_2 ∂log p(y_c | h_t)/∂h_t + N(0, ε_3²), wherein p(·) represents the probability of occurrence of the corresponding distribution, N(0, ε_3²) represents random noise conforming to a normal distribution, and ε_2, ε_1 and ε_3 represent the posterior loss, the prior loss and the noise parameter of the image, respectively.
  2. The plug-and-play heterogeneous model based federated learning method of claim 1, wherein the generative model G comprises a fully connected layer, a batch normalization layer, an activation function layer and a deconvolution layer; the input of the generative model G is the hidden code h, and the target output is an image from the distribution of the public data set D*; the discriminative model C comprises a fully connected layer, an activation function layer and a deconvolution layer; both the generative model G and the discriminative model C use Adam as the optimizer.
  3. The plug-and-play heterogeneous model based federated learning method of claim 1, wherein initializing the model comprises: dividing the data set D into private data sets D_i according to a Dirichlet distribution as the private data of each client, so that different clients hold different amounts of data with different category distributions; initializing a target classification model M and propagating it to each client as its local classification model M_i; sharing the generative model G with all clients; and starting communication between the clients and the server.
  4. The federated learning method based on a plug-and-play heterogeneous model according to claim 1, wherein the model structures of the classification models include LeNet, CNN and MobileNet-v3, and wherein the output dimensions of the hidden layer and of the last layer of the classification model M_i initialize, respectively, the length of the hidden code h and the number of categories of data in the target data set D.
  5. The federated learning method based on a plug-and-play heterogeneous model according to claim 4, wherein the sampler has two modes of operation: for each client, the sampler draws the same number of generated samples for all the different classes of the private data set D_i; or, for each client, the sampler draws generated samples for the different target categories by weighted sampling according to the number of training samples of each category in the current client's private data set D_i.
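The sampling rule of claim 1 matches the style of conditional sampler used by plug-and-play generative networks, with ε_1, ε_2 and ε_3 weighting the prior gradient, the posterior (class-conditional) gradient and the Gaussian noise. The PyTorch sketch below is one plausible rendering under that assumption; sample_class_images and all hyperparameter values are illustrative, not from the patent.

```python
import torch
import torch.nn.functional as F

def sample_class_images(G, M, target_class, h_dim, steps=100,
                        eps1=1e-4, eps2=1.0, eps3=1e-3):
    """MALA-style sampler: drift the hidden code h toward codes whose
    generated image maximizes M's activation for target_class (posterior
    term), keep a weak pull toward the Gaussian prior, add noise each step."""
    G.eval(); M.eval()                 # fixed batch-norm statistics; batch size is 1
    h = torch.randn(1, h_dim, requires_grad=True)
    for _ in range(steps):
        x = G(h)                       # conditional image generation from the code
        posterior = F.log_softmax(M(x), dim=1)[0, target_class]
        prior = -0.5 * (h ** 2).sum()  # log density of a standard normal prior
        g_post, = torch.autograd.grad(posterior, h, retain_graph=True)
        g_prior, = torch.autograd.grad(prior, h)
        with torch.no_grad():          # h_{t+1} = h_t + eps1*prior' + eps2*post' + N(0, eps3^2)
            h += eps1 * g_prior + eps2 * g_post + eps3 * torch.randn_like(h)
    return G(h).detach(), target_class  # picture of the requested category
```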
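Claim 4 reads the hidden-code length and the number of target categories off the classifier itself. A small sketch, assuming the classifier ends in fully connected layers; probe_dims is a hypothetical helper name:

```python
import torch.nn as nn

def probe_dims(model: nn.Module):
    """Read the hidden-layer width and output width off the classifier:
    the former sets len(h), the latter the number of data categories."""
    linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
    hidden_dim = linears[-1].in_features   # width feeding the last layer
    n_classes = linears[-1].out_features   # one logit per target category
    return hidden_dim, n_classes

# e.g. a LeNet-style classifier head:
head = nn.Sequential(nn.Linear(120, 84), nn.ReLU(), nn.Linear(84, 10))
print(probe_dims(head))  # (84, 10)
```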
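Claim 5's two sampler modes reduce to two per-class sample budgets, uniform or weighted by the client's private label counts. A minimal sketch; samples_per_class and the budget parameter are assumptions:

```python
def samples_per_class(class_counts, budget, mode="uniform"):
    """Claim 5's two modes: 'uniform' draws the same number of generated
    samples per class; 'weighted' follows the client's private label counts."""
    classes = sorted(class_counts)
    if mode == "uniform":
        return {c: budget // len(classes) for c in classes}
    total = sum(class_counts.values())
    return {c: round(budget * class_counts[c] / total) for c in classes}

print(samples_per_class({0: 90, 1: 10}, budget=100, mode="uniform"))   # {0: 50, 1: 50}
print(samples_per_class({0: 90, 1: 10}, budget=100, mode="weighted"))  # {0: 90, 1: 10}
```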

Description

Plug-and-play heterogeneous model-based federated learning method

Technical Field

The application relates to the technical field of federated learning, in particular to a federated learning method based on a plug-and-play heterogeneous model.

Background

Federated learning cooperatively trains a global model on the premise that a group of clients do not upload their local data sets; each participant can only access its own data, thereby protecting the privacy of the users participating in training. Owing to this privacy-preserving advantage, federated learning has wide application prospects in industries such as medicine, finance and artificial intelligence, and has been a research hotspot in recent years. However, federated learning focuses on obtaining a high-quality global model by learning the local data of all participating clients, and because the data of each client in a real scenario are heterogeneous, a single global model applicable to all clients cannot be trained when the problem of data heterogeneity arises; meanwhile, privacy protection, communication-efficiency constraints and personalized-model requirements have become the main directions of federated learning research. In recent federated learning research, most methods address data heterogeneity and model heterogeneity by means of transfer learning, meta learning and reinforcement learning, while combining federated learning with methods such as differential privacy and homomorphic encryption to protect user privacy is also a mainstream direction. However, most of the above methods cannot optimize for model heterogeneity and data heterogeneity simultaneously, have shortcomings in privacy protection, or fail to address the communication overhead of federated learning.

Disclosure of Invention

The application provides a federated learning method based on a plug-and-play heterogeneous model, which addresses the technical problem that the prior art cannot optimize for model heterogeneity and data heterogeneity simultaneously. The method comprises the following steps:

Step 1, training a generative model: initializing a generative model G and a discriminative model C at the server side, and training G and C using the training method of a generative adversarial network, wherein G is used to generate fake pictures of the data set D*, and C is used to discriminate the real images x from the public data set D*.

Step 2, initializing the model.

Step 3, training a client model M_i in each client.

Step 4, server aggregation: aggregating at the server side the sampled data d̃_i from each client i, together with parameters including the amount of data participating in training.

Step 5, server-side data distribution: redistributing the sampled data set d̃ into sub-data sets d_i according to the data-category distribution weights of each client's private data.

Step 6, model testing: in each round of communication, testing the accuracy using the classification models M_i whose parameters were updated by the clients in that round and the test set, and weighting to obtain the average test accuracy of the model.

Step 7, the server side judges whether to continue with the next communication; if yes, return to step 3; if not, end the communication and save the global network model parameters.
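As a reading aid for steps 3 through 7, one communication round can be summarized as the schematic loop below. The client and server hooks (client.train, client.sample, server.redistribute and so on) are assumed interfaces, not names from the patent.

```python
def run_round(server, clients):
    """One communication round of the method (steps 3-7, schematically)."""
    pooled, counts = [], []
    for client in clients:
        client.train()                       # step 3: local update on private data
        pooled.extend(client.sample())       # step 3-4: generated samples to server
        counts.append(client.num_examples)   # aggregated alongside the data (step 4)
    shards = server.redistribute(pooled, clients)  # step 5: class-weighted split
    for client, shard in zip(clients, shards):
        client.receive(shard)                # extra data for the next round (3-2)
    # step 6: weighted average test accuracy over the updated client models
    return sum(c.test_accuracy() * n for c, n in zip(clients, counts)) / sum(counts)
```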
Optionally, the generative model G comprises a fully connected layer, a batch normalization layer, an activation function layer and a deconvolution layer; the input of the generative model G is the hidden code h, and the target output is an image from the distribution of the public data set D*. The discriminative model C comprises a fully connected layer, an activation function layer and a deconvolution layer. Both the generative model G and the discriminative model C use Adam as the optimizer.

Optionally, initializing the model comprises: dividing the data set D into private data sets D_i according to a Dirichlet distribution as the private data of each client; initializing a target classification model M, transmitting it to each client as its local classification model M_i, sharing the generative model G with all clients, and starting communication between the clients and the server.

Optionally, training the client model M_i in each client comprises: Step 3-1, the client trains the client model M_i using its private data D_i to obtain the loss function of the client model. Step 3-2, from the second round of communication onward, the pictures d_i received from the server side are used as an additional data set for training the model M_i. Step 3-3, starting from a randomly generated hidden code h, a Markov chain Monte Carlo sampler is defined by a Metropolis-adjusted Langevin algorithm; the sampler uses the client model M_i as the condition network and G as the generation network.
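A minimal sketch of the generator and discriminator described above, assuming 32x32 single-channel images and a 100-dimensional hidden code (shapes the patent does not specify). The discriminator uses ordinary convolutions so that it actually downsamples; the claim's wording lists a deconvolution layer for C as well, which is likely a translation artifact.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Fully connected + batch norm + activation + deconvolution, per claim 2."""
    def __init__(self, h_dim=100):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(h_dim, 128 * 8 * 8),
                                nn.BatchNorm1d(128 * 8 * 8), nn.ReLU())
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Tanh())
    def forward(self, h):
        return self.deconv(self.fc(h).view(-1, 128, 8, 8))

class Discriminator(nn.Module):
    """Downsampling convolutions, activations and a fully connected head."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.fc = nn.Linear(128 * 8 * 8, 1)
    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

# Both models use Adam as the optimizer, per claim 2 (learning rates assumed).
G, C = Generator(), Discriminator()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_C = torch.optim.Adam(C.parameters(), lr=2e-4, betas=(0.5, 0.999))
```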
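The Dirichlet-based split used to initialize the private data can be sketched as follows. This is a common construction for non-IID federated benchmarks; the alpha value and the index-shard representation are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.5, seed=0):
    """Split sample indices into non-IID client shards: for every class,
    draw per-client proportions from Dirichlet(alpha) and slice accordingly.
    Smaller alpha yields more skewed class distributions per client."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    shards = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        props = rng.dirichlet([alpha] * n_clients)
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for shard, part in zip(shards, np.split(idx, cuts)):
            shard.extend(part.tolist())
    return shards
```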
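Steps 3-1 and 3-2 amount to local supervised training on the private set, augmented from the second round by the server-distributed pictures. A minimal sketch, assuming standard cross-entropy and SGD; the patent does not fix the loss or optimizer beyond calling it the client model's loss function.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import ConcatDataset, DataLoader

def local_train(model, private_ds, server_ds=None, epochs=1, lr=1e-3):
    """Steps 3-1/3-2: train M_i on private data D_i, and from the second
    round also on the pictures d_i received from the server (extra dataset)."""
    data = ConcatDataset([private_ds, server_ds]) if server_ds is not None else private_ds
    loader = DataLoader(data, batch_size=32, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)  # the client model's loss function
            loss.backward()
            opt.step()
    return model
```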