
CN-115422562-B - Coding domain aggregation method and system for federated learning multi-party gradients


Abstract

The coding domain aggregation method and system for federated learning multi-party gradients comprise the following steps: a plurality of federated participant terminals respectively convert their local data sets into quantized gradients and input the quantized gradients into a pre-trained encoder to obtain encoded gradients; a server obtains the encoded gradient corresponding to each federated participant terminal, aggregates all the encoded gradients to obtain an aggregation result, and inputs the aggregation result into a pre-trained decoder to obtain a decoded gradient; the server then updates the federated learning model according to the decoded gradient and sends the updated model to each federated participant terminal. Because each federated participant terminal computes its encoded gradient locally with the pre-trained encoder, only the encoded gradients need to be communicated; the server simply sums the encoded gradients and recovers the aggregate with the pre-trained decoder, thereby reducing the computational cost and the communication burden.
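
As a minimal sketch of the round the abstract describes, the snippet below uses a random linear map as a stand-in for the pre-trained encoder/decoder pair (a linear map is additively homomorphic by construction, which makes the sum-then-decode flow easy to verify). The names, dimensions, and quantization scale are illustrative assumptions, not the patent's models.

```python
import numpy as np

# Hypothetical stand-ins for the patent's pre-trained encoder/decoder pair.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8))   # encoder matrix: 8-dim gradient -> 16-dim code
W_pinv = np.linalg.pinv(W)         # decoder: left inverse of the encoder

def quantize(grad, scale=1024):
    return np.round(grad * scale).astype(np.int64)  # map to the integer domain

def encode(quantized_grad):
    return W @ quantized_grad

def decode(aggregated_code):
    return W_pinv @ aggregated_code

# Each participant quantizes and encodes its local gradient...
local_grads = [rng.standard_normal(8) for _ in range(4)]
codes = [encode(quantize(g)) for g in local_grads]

# ...the server only sees the codes: it sums them and decodes once.
decoded_sum = decode(np.sum(codes, axis=0))
true_sum = np.sum([quantize(g) for g in local_grads], axis=0)
print(np.allclose(decoded_sum, true_sum))  # True: decode(sum of codes) == sum of gradients
```

In the invention the encoder/decoder pair is a trained neural network rather than a fixed linear map, but the server-side flow is the same: sum the codes, decode once.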

Inventors

  • XU ZENGLIN
  • ZENG DUN
  • LIU SHIYU
  • WANG HUI
  • LI GE

Assignees

  • Peng Cheng Laboratory (鹏城实验室)

Dates

Publication Date
2026-05-12
Application Date
2022-08-19

Claims (6)

  1. A coding domain aggregation method for federated learning multi-party gradients, comprising: a plurality of federated participant terminals respectively convert their local data sets into quantized gradients and input the quantized gradients into a pre-trained encoder to obtain encoded gradients; a server obtains the encoded gradient corresponding to each federated participant terminal, aggregates all the encoded gradients to obtain an aggregation result, and inputs the aggregation result into a pre-trained decoder to obtain a decoded gradient; the server updates the federated learning model according to the decoded gradient and sends the updated federated learning model to each federated participant terminal; the pre-training step of the pre-trained encoder and the pre-trained decoder comprises: the server initializes the encoder, the decoder and the quantization parameters, and randomly generates a training data set according to the quantization parameters; vectors are sampled from the training data set, and the encoder and the decoder are trained and updated based on a preset optimization objective; a pre-trained encoder and a pre-trained decoder are output when the encoder and the decoder reach the preset optimization objective and the coding noise follows a standard Gaussian distribution; the preset optimization objective is min L( D_c( Σ_{i=1}^{m} E_c(l_i) ), Σ_{i=1}^{m} l_i ), wherein D_c is the decoder function, E_c is the encoder function, m is the number of federated participant terminals in a federated round, l_i is a data sample in the training data set, and L is the loss function; the coding noise is D_c( Σ_{i=1}^{m} E_c(l_i) ) − Σ_{i=1}^{m} l_i; before the plurality of federated participant terminals convert their local data sets into quantized gradients and input them into the pre-trained encoder to obtain the encoded gradients, the method further comprises: the server sends the pre-trained encoder to the plurality of federated participant terminals; the step in which the server initializes the encoder, the decoder and the quantization parameters and randomly generates the training data set according to the quantization parameters comprises: the server defines the encoder, the decoder, a quantization space size and an input vector size; a training data set is randomly generated according to the quantization space size and the input vector size, each data sample in the training data set being an integer vector; and the pre-trained encoders and pre-trained decoders are unified in pairs at the server, wherein the plurality of federated participant terminals encode with the unified pre-trained encoder, and the server directly sums the encoded gradients and decodes the sum with the pre-trained decoder.
  2. The coding domain aggregation method for federated learning multi-party gradients of claim 1, wherein the plurality of federated participant terminals respectively converting their local data sets into quantized gradients and inputting the quantized gradients into the pre-trained encoder to obtain the encoded gradients comprises: the plurality of federated participant terminals calculate original gradients on a neural network using their respective local data sets; the original gradients are quantized to obtain quantized gradients, which are mapped to the integer domain; and the quantized gradients are input into the pre-trained encoder to obtain the encoded gradients (a sketch of this quantization step follows the claims).
  3. The coding domain aggregation method for federated learning multi-party gradients of claim 1, wherein sampling vectors from the training data set and training and updating the encoder and the decoder based on the preset optimization objective comprises: selecting m data samples from the training data set and training the encoder and the decoder against the preset optimization objective; and calculating the loss function from the training result, calculating gradients by back-propagating the loss function, and updating the encoder and the decoder according to the gradients.
  4. The coding domain aggregation method for federated learning multi-party gradients of claim 1, further comprising, after sampling vectors from the training data set and training and updating the encoder and the decoder based on the preset optimization objective: when the encoder and the decoder have not reached the preset optimization objective and/or the coding noise does not follow a standard Gaussian distribution, continuing to select m data samples from the training data set and to train and update the encoder and the decoder based on the preset optimization objective.
  5. A coding domain aggregation system for federated learning multi-party gradients, characterized by comprising a plurality of federated participant terminals and a server, the federated participant terminals being connected to the server through a network; each federated participant terminal is used to convert its local data set into a quantized gradient and input the quantized gradient into a pre-trained encoder to obtain an encoded gradient; the server is used to obtain the encoded gradient corresponding to each federated participant terminal, aggregate all the encoded gradients to obtain an aggregation result, input the aggregation result into a pre-trained decoder to obtain a decoded gradient, update the federated learning model according to the decoded gradient, and send the updated federated learning model to each federated participant terminal; the pre-training step of the pre-trained encoder and the pre-trained decoder comprises: the server initializes the encoder, the decoder and the quantization parameters, and randomly generates a training data set according to the quantization parameters; vectors are sampled from the training data set, and the encoder and the decoder are trained and updated based on a preset optimization objective; a pre-trained encoder and a pre-trained decoder are output when the encoder and the decoder reach the preset optimization objective and the coding noise follows a standard Gaussian distribution; the preset optimization objective is min L( D_c( Σ_{i=1}^{m} E_c(l_i) ), Σ_{i=1}^{m} l_i ), wherein D_c is the decoder function, E_c is the encoder function, m is the number of federated participant terminals in a federated round, l_i is a data sample in the training data set, and L is the loss function; the coding noise is D_c( Σ_{i=1}^{m} E_c(l_i) ) − Σ_{i=1}^{m} l_i; the server sends the pre-trained encoder to the plurality of federated participant terminals; the step in which the server initializes the encoder, the decoder and the quantization parameters and randomly generates the training data set according to the quantization parameters comprises: the server defines the encoder, the decoder, a quantization space size and an input vector size; a training data set is randomly generated according to the quantization space size and the input vector size, each data sample in the training data set being an integer vector; and the pre-trained encoders and pre-trained decoders are unified in pairs at the server, wherein the plurality of federated participant terminals encode with the unified pre-trained encoder, and the server directly sums the encoded gradients and decodes the sum with the pre-trained decoder.
  6. A computer-readable storage medium storing a computer program executable to implement the steps of the coding domain aggregation method for federated learning multi-party gradients according to any one of claims 1-4.
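
A minimal sketch of the quantization step in claim 2 (quantize the raw gradient and map it to the integer domain), referenced above. The uniform quantizer, bit width, and clipping rule are illustrative assumptions; the patent does not fix a concrete quantizer.

```python
import numpy as np

def quantize_gradient(grad, n_bits=8):
    """Uniform quantizer: clip to the max magnitude, scale to the signed
    integer range, round. Bit width and clipping rule are assumptions."""
    clip = float(np.max(np.abs(grad))) or 1.0
    levels = 2 ** (n_bits - 1) - 1
    q = np.round(grad / clip * levels).astype(np.int32)  # integer-domain gradient
    return q, clip / levels                              # keep the step for dequantization

def dequantize_gradient(q, step):
    return q.astype(np.float64) * step

grad = np.array([0.03, -0.51, 0.27, 0.99])
q, step = quantize_gradient(grad)
print(q)                               # [  4 -65  35 127]
print(dequantize_gradient(q, step))    # approximate reconstruction of grad
```

Keeping the step size alongside the integer vector lets the aggregate be mapped back to the real-valued domain after decoding.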

Description

Coding domain aggregation method and system for federated learning multi-party gradients

Technical Field

The invention relates to the technical field of deep learning, and in particular to a coding domain aggregation method and system for federated learning multi-party gradients.

Background

Federated learning is a distributed deep learning paradigm that joins data holders across a wide area network to co-train a deep learning model. The private data of each data holder never leaves its local device, which protects the private data of every participant. However, previous studies have shown that directly transferring model parameters or gradient information can still leak private data, so security techniques are still needed to protect gradient information during federated learning. Existing cryptographic protection schemes, such as secure multi-party computation (MPC) and homomorphic encryption (HE), place a significant computational and communication burden on federated learning systems. Homomorphic encryption allows specific operations to be executed in ciphertext space while guaranteeing that the decrypted result equals the result of operating directly on the original data. Taking additive homomorphism as an example, this can be written as E(x+y) = E(x) + E(y), and x + y = D[E(x) + E(y)]. In practice, this procedure introduces additional computational cost and communication burden: prior empirical studies show that homomorphic encryption can increase the iteration time of a federated learning system by a factor of 135 and inflate the traffic from 85.89 MB to 13.1 GB. Accordingly, the prior art has drawbacks that need improvement.

Disclosure of the Invention

The technical problem to be solved by the invention, in view of the above defects in the prior art, is to provide a coding domain aggregation method and system for federated learning multi-party gradients, aiming to reduce the high computational cost and communication burden introduced into the federated learning process by the prior art. The technical scheme adopted to solve this problem is as follows. A coding domain aggregation method for federated learning multi-party gradients comprises: a plurality of federated participant terminals respectively convert their local data sets into quantized gradients and input the quantized gradients into a pre-trained encoder to obtain encoded gradients; a server obtains the encoded gradient corresponding to each federated participant terminal, aggregates all the encoded gradients to obtain an aggregation result, and inputs the aggregation result into a pre-trained decoder to obtain a decoded gradient; and the server updates the federated learning model according to the decoded gradient and sends the updated federated learning model to each federated participant terminal.
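
To make the background's cost argument concrete, the additive-homomorphic property above can be exercised with the third-party python-paillier package (`phe`); using this particular library is an assumption of the sketch, not something the patent specifies.

```python
# pip install phe  -- python-paillier, a third-party Paillier HE library
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

x, y = 3.5, 1.5
ex, ey = public_key.encrypt(x), public_key.encrypt(y)

# Additive homomorphism: decrypting E(x) + E(y) recovers x + y.
assert private_key.decrypt(ex + ey) == x + y

# The ciphertext expansion hints at the communication blow-up the patent
# cites: each encrypted number carries an integer of roughly 4096 bits.
print(ex.ciphertext().bit_length())
```

The invention replaces this costly cryptographic pair with a learned encoder/decoder whose codes can likewise be summed, as the claims describe.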
In one implementation, the plurality of federated participant terminals respectively converting their local data sets into quantized gradients and inputting the quantized gradients into a pre-trained encoder to obtain encoded gradients includes: the plurality of federated participant terminals calculate original gradients on a neural network using their respective local data sets; the original gradients are quantized to obtain quantized gradients, which are mapped to the integer domain; and the quantized gradients are input into the pre-trained encoder to obtain encoded gradients.

In one implementation, the pre-training step of the pre-trained encoder and pre-trained decoder comprises: the server initializes the encoder, the decoder and the quantization parameters, and randomly generates a training data set according to the quantization parameters; vectors are sampled from the training data set, and the encoder and the decoder are trained and updated based on a preset optimization objective; and a pre-trained encoder and a pre-trained decoder are output when the encoder and the decoder reach the preset optimization objective and the coding noise follows a standard Gaussian distribution.

In one implementation, the preset optimization objective is min L( D_c( Σ_{i=1}^{m} E_c(l_i) ), Σ_{i=1}^{m} l_i ), wherein D_c is the decoder function, E_c is the encoder function, m is the number of federated participant terminals in a federated round, l_i is a data sample in the training data set, and L is the loss function.

In one implementation, the coding noise is D_c( Σ_{i=1}^{m} E_c(l_i) ) − Σ_{i=1}^{m} l_i, wherein D_c is the decoder function, E_c is the encoder function, m is the number of federated participant terminals in a federated round, and l_i is a data sample in the training data set.

In one implementation, before the plurality of federated participant terminals convert their local data sets into quantized gradients and input them into the pre-trained encoder to obtain the encoded gradients, the method further comprises: the server sends the pre-trained encoder to the plurality of federated participant terminals.
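
Below is a minimal sketch of the pre-training loop just described, assuming MLP encoder/decoder networks and an MSE loss; the dimensions, optimizer, and iteration count are illustrative assumptions, and the standard-Gaussian acceptance check is reduced to a comment.

```python
import torch
import torch.nn as nn

# Illustrative sizes: quantization space {0..q-1}, input vectors of length d,
# m participants per federated round. All of these values are assumptions.
q, d, m, code_dim = 256, 32, 4, 64

encoder = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, code_dim))
decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(), nn.Linear(128, d))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
mse = nn.MSELoss()

for step in range(5000):
    # Randomly generated training set: each sample is an integer vector,
    # mirroring "each data sample in the training data set is an integer vector".
    batch = torch.randint(0, q, (m, d)).float()

    # Preset optimization objective: decode the SUM of the m codes and match
    # the sum of the raw samples, i.e. L( D_c(sum_i E_c(l_i)), sum_i l_i ).
    decoded = decoder(encoder(batch).sum(dim=0))
    loss = mse(decoded, batch.sum(dim=0))

    opt.zero_grad()
    loss.backward()
    opt.step()

# The coding noise D_c(sum_i E_c(l_i)) - sum_i l_i should follow a standard
# Gaussian before the pair is accepted; a normality test would go here.
```

Once the pair converges, the server distributes the frozen encoder to the participant terminals and keeps the matching decoder, so every round reduces to sum-then-decode on the server side.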