CN-116862022-B - Communication-efficient privacy-preserving personalized federated learning method
Abstract
The invention discloses a communication-efficient, privacy-preserving personalized federated learning method. The invention investigates personalized federated learning based on feature-fusion mutual learning, and achieves communication-efficient personalized learning by interactively training a shared model, a private model, and a fusion model on each client. Specifically, only the shared model is exchanged with the global model, which reduces the communication cost; the private model can be personalized in design; and the fusion model adaptively fuses local knowledge and global knowledge at different stages of training. Secondly, to further reduce the communication cost and strengthen gradient privacy, the invention designs a privacy-protection method based on gradient compression. By constructing a chaotically encrypted circulant measurement matrix, the method achieves both privacy protection and lightweight compression. In addition, the invention provides a sparsity-based adaptive iterative hard thresholding algorithm to improve flexibility and reconstruction performance.
Inventors
- CHEN SIGUANG
- WANG QIAN
- ZENG WENJUN
- WU MENG
Assignees
- Nanjing University of Posts and Telecommunications (南京邮电大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20230628
Claims (4)
- 1. A communication-efficient privacy-preserving personalized federated learning method, characterized by comprising the following steps: (1) based on the differing privacy-protection requirements of different clients, designing a privacy-preserving personalized federated learning network model consisting of privacy clients, public clients, and a central server; (2) all clients learn knowledge through their respective neural networks and realize personalized learning by combining the knowledge learned from their local data sets with global knowledge, wherein, after the clients complete training of their local neural networks on their local data sets, each privacy client groups and aggregates its own neural-network gradient with the neural-network gradients of the public clients in its group and then compresses the aggregated gradient, thereby reducing communication cost and protecting the gradient information, and uploads the compressed gradient to the central server for global aggregation; (3) the central server decompresses and reconstructs the compressed gradients uploaded by the clients, performs global aggregation on the reconstructed gradients, updates the global neural network, and finally distributes the model parameters of the global neural network to each client.
The specific method for compressing the aggregated gradient is as follows: in privacy client i, let the layer-j gradient g_j be of dimension N_j; the gradient is compressed using a measurement matrix Φ_j of dimension M_j × N_j (M_j < N_j), and the generation method of Φ_j is as follows:
1) generating a chaotic sequence x = {x_1, x_2, …} based on the Chebyshev map, where the (n+1)-th element x_{n+1} is generated as x_{n+1} = cos(q · arccos(x_n)), q being the Chebyshev order, x_n ∈ [−1, 1] being the n-th element of the chaotic sequence, and x_0 being the initial value; selecting a sampling interval d and performing interval sampling on the chaotic sequence to obtain a length-N_j sequence u, which is set as the first row of the j-th layer measurement matrix Φ_j, with σ_u denoting the standard deviation of u;
2) performing M_j − 1 successive cyclic left shifts on the u generated in step 1) to generate the matrix Φ_j of dimension M_j × N_j, namely Φ_j = c · [u; S(u); S²(u); …; S^(M_j−1)(u)], where S(·) denotes one cyclic left shift and c is a constant; Φ_j is then normalized, namely Φ_j ← Φ_j / (σ_u √M_j), where 1/(σ_u √M_j) is the normalization coefficient;
3) repeating the above two steps to generate a measurement matrix Φ_j for each layer of the gradient, finally forming the overall block-diagonal measurement matrix Φ_i = diag(Φ_1, Φ_2, …), where Φ_i satisfies the restricted isometry property; privacy client i performs gradient compression based on Φ_i, and the compressed gradient y_i is expressed as y_i = Φ_i g_i = Φ_i Ψ s = A s, where A = Φ_i Ψ is the sensing matrix, Ψ is the discrete-cosine-transform sparse basis matrix, and s is the sparse coefficient vector.
The specific method of step (3) is as follows: after receiving all the compressed gradients, the central server reconstructs them, globally aggregates the reconstructed gradients, and updates the global neural network. The reconstruction target is, given a compressed gradient y and the sensing matrix A, to solve for the sparse coefficient vector s; reconstruction is achieved by optimizing the following L2-norm-based minimization problem: min_s ‖y − A s‖₂² subject to ‖s‖₀ ≤ K, where the constraint indicates that the number of non-zero values in the reconstructed gradient does not exceed K; K is estimated adaptively from the number of significant elements of the received measurement, which is closely related to the sparsity, and the iteration number T of the whole solving process is determined from K accordingly, the symbol ⌊·⌋ denoting rounding down. During optimization, s is updated as s^(t+1) = H_K(s^t + Aᵀ(y − A s^t)), where s^t denotes the estimate at the t-th iteration and H_K(·) is a nonlinear operator that sets all elements except the K elements of largest absolute value to zero, defined as [H_K(z)]_n = z_n if |z_n| ≥ λ_K and 0 otherwise, where λ_K denotes the K-th largest value obtained by sorting the absolute values of the elements of z. After T iterations the optimal sparse coefficient vector ŝ is obtained, and the reconstructed gradient is then ĝ = Ψ ŝ. Global aggregation of the gradients is then carried out and the global neural network is updated, the updating process being: w ← w − η · Σ_i (D_i / D) ĝ_i, where w denotes the parameters of the global neural network, η represents the learning rate of the global model, D is the total sample size, and D_i is the number of samples of client i.
- 2. The method of claim 1, wherein step (1) comprises: setting the central server to serve, within its range, N_p privacy clients and a pool of public clients; for each privacy client, randomly selecting m public clients from the public pool to form a new group with that privacy client, so that all clients are divided into N_p groups; the privacy clients and public clients train their neural networks on their own local data sets and perform federated learning by sharing neural-network gradients with the assistance of the central server, thereby realizing cooperation between the clients.
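The grouping step of claim 2 can be sketched as a random partition of the public clients among the privacy clients. The function name, the assumption of an even split, and the seed parameter are illustrative; the claim itself only requires that each privacy client be assigned randomly selected public clients.

```python
import random

def form_groups(privacy_ids, public_ids, seed=0):
    """Randomly partition the public clients among the privacy clients,
    yielding one group per privacy client (an even split is assumed)."""
    rng = random.Random(seed)
    shuffled = list(public_ids)
    rng.shuffle(shuffled)
    m = len(shuffled) // len(privacy_ids)   # public clients per group
    return {pid: shuffled[i * m:(i + 1) * m]
            for i, pid in enumerate(privacy_ids)}

groups = form_groups(["p1", "p2"], ["c1", "c2", "c3", "c4"])
```

Each group then trains independently and aggregates its shared-model gradients as described in claim 3.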
- 3. The communication-efficient privacy-preserving personalized federated learning method of claim 2, wherein step (2) comprises:
Step 2.1) client i trains a neural network on its local data set D_i; the neural network consists of three sub-models, namely a private model, a shared model, and a fusion model; the private model extracts features f_p from the local data set and the shared model extracts features f_s; f_p and f_s are passed through a bridging layer, spliced, and input into the fusion model to complete the feature fusion and obtain the fused features f_m. The three sub-models learn from one another based on knowledge distillation to complete local training, where the loss of each sub-model consists of two parts: the cross-entropy loss L_CE between the predicted hard target and the ground truth, and the Kullback-Leibler divergence L_KL of the soft targets between sub-models. A time-dependent balance weight α(t) is designed to scale L_KL; the weight rises slowly from 0 to 1 following a Gaussian ramp-up, namely α(t) = exp(−5(1 − t/T_r)²), where t and T_r are, respectively, the current iteration number and the iteration number at which the ramp-up stops; once the whole training process reaches the preset stability, α(t) is fixed to the value 1. The training losses L_p, L_s, and L_m of the private model, shared model, and fusion model are defined as: L_p = L_CE(ŷ_p, y) + α(t) L_KL(p_m ‖ p_p); L_s = L_CE(ŷ_s, y) + α(t) L_KL(p_m ‖ p_s); L_m = L_CE(ŷ_m, y) + α(t) L_KL(p_w ‖ p_m); where ŷ_p, ŷ_s, and ŷ_m respectively denote the hard targets of the private model, shared model, and fusion model, and p_p, p_s, and p_m are their corresponding soft targets, computed with the softmax function σ(·) at distillation temperature τ, which controls the softness of the distillation; when τ = 1, the soft target reduces to the ordinary softmax output. In the formula, p_w is the weighted soft target of the private and shared models, defined as p_w = β p_p + (1 − β) p_s, where β is a set weighting factor that is dynamically scaled over time from 0.5 to 1 using the ramp-up method, and the weighted integrated feature of f_p and f_s is formed analogously. From the above analysis, the total loss of local client i is defined as L_i = L_p + L_s + L_m.
The gradient of the shared model is uploaded to the central server for global aggregation, where the gradient g_(s,i) of the shared model is: g_(s,i) = (1/|D_i|) Σ_b ∇_(w_(s,i)) L_s(w_(s,i); b), where w_(s,i) represents the model parameters of the shared model in client i, |D_i| is the number of samples of D_i, and L_s(w_(s,i); b) represents the training loss of the shared model in client i with respect to sample b in the local data set. The private model and the fusion model are updated only locally, the specific process being: w_(p,i) ← w_(p,i) − η_l (1/|D_i|) Σ_b ∇_(w_(p,i)) L_p(w_(p,i); b); w_(m,i) ← w_(m,i) − η_l (1/|D_i|) Σ_b ∇_(w_(m,i)) L_m(w_(m,i); b); where η_l is the learning rate of the private model and the fusion model, w_(p,i) and w_(m,i) are respectively the parameters of the private model and the fusion model in client i, and the gradients are taken with respect to sample b in the local data set.
Step 2.2) when all clients have completed local training, the privacy clients perform group aggregation of the gradients; the gradient of privacy client i is then updated as ḡ_i = (1/(|G_i| + 1)) (g_(s,i) + Σ_(k∈G_i) g_(s,k)), where g_(s,k) is the gradient of public client k in the same group G_i as privacy client i.
Step 2.3) the aggregated gradient ḡ_i is then compressed.
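The mutual-learning losses of claim 3 can be sketched as follows. This is a sketch under stated assumptions: the direction of each KL term, the temperature value, and the fixed β are illustrative choices, and the patent's ramp-up of β from 0.5 to 1 is not reproduced.

```python
import numpy as np

def rampup_weight(t, T_r):
    """Gaussian ramp-up alpha(t) = exp(-5 (1 - t/T_r)^2), fixed to 1 for t >= T_r."""
    return 1.0 if t >= T_r else float(np.exp(-5.0 * (1.0 - t / T_r) ** 2))

def softmax(z, tau=1.0):
    e = np.exp(z / tau - np.max(z / tau))   # temperature-scaled, numerically stable
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) for probability vectors."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def total_loss(z_p, z_s, z_m, label, t, T_r, tau=3.0, beta=0.5):
    """Sum of the private (p), shared (s), and fusion (m) losses: each is a
    cross-entropy to the ground truth plus a ramped KL to a soft target;
    the fusion head distills from a beta-weighted mix of p and s."""
    a = rampup_weight(t, T_r)
    ce = lambda z: -float(np.log(softmax(z)[label] + 1e-12))
    p_p, p_s, p_m = softmax(z_p, tau), softmax(z_s, tau), softmax(z_m, tau)
    p_w = beta * p_p + (1.0 - beta) * p_s    # weighted soft target
    L_p = ce(z_p) + a * kl(p_m, p_p)
    L_s = ce(z_s) + a * kl(p_m, p_s)
    L_m = ce(z_m) + a * kl(p_w, p_m)
    return L_p + L_s + L_m
```

The ramp-up keeps the distillation terms near zero early in training, so each sub-model first fits its own data before aligning with the others.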
- 4. A communication-efficient privacy-preserving personalized federated learning method according to claim 3, wherein the local data set employs the MNIST or CIFAR classification data set.
Description
Communication-efficient privacy-preserving personalized federated learning method
Technical Field
The invention belongs to the fields of privacy protection and federated learning, and particularly relates to a personalized federated learning method with efficient communication and privacy protection.
Background
Today, as the computing power of devices increases, many capable models have been developed to extract latent patterns from the vast amounts of data generated routinely, which has motivated the rapid development of deep learning (DL). However, conventional centralized deep learning can cause communication congestion due to massive data transmission, and quality of service cannot be effectively guaranteed. Moreover, privacy leakage during data collection is a major threat. Federated learning (FL) is a promising paradigm in distributed deep learning that has brought substantial advances in both privacy protection and communication overhead. Although FL offers more advantages than centralized DL, the model-update process still introduces significant communication overhead, especially when clients hold large-scale models. Moreover, many communication-efficient FL schemes with privacy protection rely on redundant encryption algorithms, which incur additional computational cost. In view of the above problems, current research mainly falls into the following three classes of schemes. The first class improves communication efficiency by reducing the number of communication rounds, selecting a subset of participating clients, or minimizing communication time. Although these schemes help improve communication efficiency, they do not fundamentally reduce the data scale of the model parameters: massive parameters are still transmitted in each round, especially when the client model is large.
The second class reduces the data volume of the uploaded or downloaded parameters through various compression methods. This approach does save communication cost by reducing the number of transmitted bits, but it fails to achieve efficient training performance when the data are non-IID. At the same time, it assumes that all client models are homogeneous, which limits client personalization. To overcome the difficulties brought by the heterogeneity of data sets and models, a third class of communication-efficient personalized schemes has been proposed, which can improve model performance under heterogeneous data sets and models. These personalized schemes adapt well to non-IID data distributions and heterogeneous models, and offer much inspiration for the ingenious use of knowledge distillation (KD). However, they do not reasonably integrate local and global knowledge, and clients holding non-IID data sets find it difficult to achieve stable model training. From these three classes of schemes, particularly the most representative and advanced studies in the third class, it can be seen that KD is beneficial for reducing communication cost while also providing FL with good tolerance of heterogeneity. However, a small model does not necessarily mean few parameters. Some schemes transmit only the last-layer logit outputs to further reduce traffic, but this can induce a knowledge-transfer deficiency. It is equally important to let clients absorb global knowledge without conflicting with local knowledge. Furthermore, while communication-efficient FL with privacy-preserving methods has been developed in the first two classes, it typically introduces additional encryption techniques that lead to higher computational cost and hardware demands. Therefore, there is a need to design a more efficient, more personalized FL method with privacy protection.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to design a communication-efficient privacy-preserving personalized federated learning method that enables clients to complete personalized learning at low communication cost, overcomes the performance loss caused by model heterogeneity and data heterogeneity, realizes privacy protection and lightweight compression, and improves reconstruction flexibility. To achieve the above purpose, the invention provides a personalized federated learning method with efficient communication and privacy protection, comprising the following steps: (1) based on the differing privacy-protection requirements of different clients, designing a privacy-preserving personalized federated learning network model consisting of privacy clients, public clients, and a central server; (2) all clients learn knowledge through their respective neural networks and realize personalized learning by combining the knowledge learned from their local data sets with global knowledge, wherein, after the clients complete training on the local neural networks by utilizing