
CN-122021805-A - Split federated learning method and system


Abstract

The invention discloses a split federated learning method and system, belonging to the technical field of data security. In the method, a task publisher splits a global model into a client sub-model and a server sub-model; target smashed data is obtained through local training by the client group; the server group performs local training based on the target smashed data to obtain second gradients and a third gradient, and updates the server sub-model based on the third gradient; each client updates the client sub-model based on its second gradient to obtain a candidate sub-model; the leader server aggregates the candidate sub-models and initiates consensus so that all servers execute a synchronous Byzantine fault-tolerant consensus protocol to reach agreement; the agreed aggregation result is sent to the client group so that the client group obtains the updated client sub-model from the leader server; and the relevant training steps are executed repeatedly until split federated learning is completed. The method can improve the privacy and stability of model training.

Inventors

  • LIU YIZHONG
  • JIANG ZIXU
  • WANG GANG
  • JIA ZIXIAO
  • LI DAWEI
  • GUAN ZHENYU
  • LIU YING
  • WU QIANHONG
  • DU HAOHUA

Assignees

  • Beihang University (北京航空航天大学)
  • Guangxi Power Grid Co., Ltd. (广西电网有限责任公司)

Dates

Publication Date
2026-05-12
Application Date
2026-01-27

Claims (9)

  1. A split federated learning method, comprising: Step S101, a task publisher splits a global model into a client sub-model and a server sub-model and respectively transmits them to a client group and a server group, wherein the server group comprises a leader server and a plurality of common servers; Step S102, each client forward-propagates the client sub-model using its local data set to obtain target smashed data, and transmits the target smashed data to the server group; Step S103, all servers train the server sub-model based on the respective target smashed data to obtain first gradients, the leader server aggregates all first gradients corresponding to the same client to obtain second gradients, transmits each second gradient to the corresponding client, and aggregates all second gradients to obtain a third gradient, and all servers update the server sub-model based on the third gradient; Step S104, each client updates the client sub-model based on its corresponding second gradient to obtain a candidate sub-model, and transmits the candidate sub-model to the server group; Step S105, the leader server aggregates all candidate sub-models and initiates consensus based on the aggregation result, so that all servers execute a synchronous Byzantine fault-tolerant consensus protocol to agree on the aggregation result, and the leader server sends the agreed aggregation result to the client group, so that the client group takes the agreed aggregation result as the updated client sub-model; Step S106, repeatedly executing Step S102 to Step S105 until split federated learning is completed.
  2. The method of claim 1, wherein the task publisher splitting the global model into a client sub-model and a server sub-model and respectively issuing them to the client group and the server group comprises: the task publisher determines the position of a split layer of the global model, takes the split layer and the layers below it as the client sub-model, and takes the layers above the split layer as the server sub-model; and the task publisher issues the client sub-model to the client group and the server sub-model to the server group.
  3. The method of claim 1, wherein each client forward-propagating the client sub-model using its local data set to obtain target smashed data and transmitting the target smashed data to the server group comprises: for each client, forward-propagating the client sub-model with the local data set to obtain raw smashed data, performing local differential privacy processing on the raw smashed data to obtain the target smashed data, and sending the target smashed data and the local label data to the server group.
  4. The method of claim 3, wherein all servers training the server sub-model based on the respective target smashed data to obtain first gradients comprises: for each server, performing forward propagation on the server sub-model based on each piece of target smashed data to obtain predicted label data corresponding to each client, respectively calculating, through a loss function, the loss value between each piece of predicted label data and the corresponding local label data, and respectively performing back propagation on the server sub-model based on each loss value to obtain the first gradient corresponding to each client.
  5. The method of claim 1, wherein each client updating the client sub-model based on its corresponding second gradient to obtain a candidate sub-model comprises: each client performs back propagation based on the corresponding second gradient to obtain a corresponding fourth gradient; for each client, a clipping boundary, a privacy budget, and a probability relaxation function are obtained; a clipped gradient is determined using a gradient clipping algorithm based on the corresponding fourth gradient and the clipping boundary; a random noise vector meeting the differential privacy requirement is determined using a local noise generation algorithm based on the privacy budget, the probability relaxation function, and the clipping boundary; the clipped gradient is perturbed using a local perturbation algorithm based on the clipped gradient and the random noise vector to obtain a perturbed gradient; and the client sub-model is updated based on the perturbed gradient to obtain the candidate sub-model.
  6. The method of any one of claims 1 to 5, wherein the leader server aggregating all candidate sub-models and initiating consensus based on the aggregation result, so that all servers execute a synchronous Byzantine fault-tolerant consensus protocol to agree on the aggregation result, comprises: Step S201, the leader server aggregates all candidate sub-models to obtain an aggregation result; Step S202, the leader server constructs a Propose message based on the aggregation result and broadcasts it in the server group; Step S203, each common server receives the Propose message and, if the Propose message is legal, verifies whether the leader server is malicious; if the leader server is verified not to be malicious, generating and broadcasting a Vote message, and when the Vote messages broadcast by the other servers that are received meet a preset requirement, generating and broadcasting a confirmation message; if the leader server is verified to be malicious, or if the leader server is verified not to be malicious but the received Vote messages broadcast by the other servers do not meet the preset requirement, triggering a fast view-change mechanism, determining a new leader server and a corresponding plurality of common servers, and returning to execute Steps S201 to S202, until the number of confirmation messages received by the current leader server is greater than or equal to a first preset threshold, at which point all servers are considered to have agreed on the aggregation result of the current leader server.
  7. The method of claim 6, wherein triggering the fast view-change mechanism and determining a new leader server and a corresponding plurality of common servers comprises: triggering the fast view-change mechanism and determining a new view according to a preset view order; and taking the leader server in the new view as the new leader server and the common servers in the new view as the new common servers, thereby determining the new leader server and the corresponding plurality of common servers.
  8. The method of claim 7, wherein the preset view order is determined by: determining the computing capacity of all servers, and numbering the servers in turn according to the strength of their computing capacity; determining view numbers based on the server numbers, and determining corresponding views based on the view numbers, wherein one view number corresponds to one view, one view comprises a leader server and a plurality of common servers, the leader server is the server whose number matches the view number, and the common servers are the remaining servers in the server group other than the leader server; and taking the view numbers as the view order.
  9. A split federated learning system, comprising: a task publisher configured to split a global model into a client sub-model and a server sub-model and respectively send them to a client group and a server group, wherein the server group comprises a leader server and a plurality of common servers; the client group configured such that each client forward-propagates the client sub-model using its local data set to obtain target smashed data, and transmits the target smashed data to the server group; the server group configured such that all servers train the server sub-model based on the respective target smashed data to obtain first gradients, the leader server aggregates all first gradients corresponding to the same client to obtain second gradients, transmits each second gradient to the corresponding client, and aggregates all second gradients to obtain a third gradient, and all servers update the server sub-model based on the third gradient; the client group further configured such that each client updates the client sub-model based on its corresponding second gradient to obtain a candidate sub-model, and the candidate sub-models are transmitted to the server group; the server group further configured such that the leader server aggregates all candidate sub-models and initiates consensus based on the aggregation result, so that all servers execute a synchronous Byzantine fault-tolerant consensus protocol to agree on the aggregation result, and the leader server sends the agreed aggregation result to the client group, so that the client group takes the agreed aggregation result as the updated client sub-model; and an ending module configured to repeatedly execute the steps corresponding to the client group and the server group until split federated learning is completed.
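
As an illustration of the splitting operation in claim 2, the cut could be realized as in the following minimal sketch, which assumes the global model is a PyTorch nn.Sequential; the function name and cut-layer indexing are illustrative, not taken from the patent.

```python
# Hypothetical split of a sequential global model (claim 2): the cut layer and
# all layers below it become the client sub-model, the layers above it become
# the server sub-model. nn.Sequential and split_global_model are assumptions.
import torch.nn as nn

def split_global_model(global_model: nn.Sequential, cut_layer: int):
    """Return (client_submodel, server_submodel) split after `cut_layer`."""
    layers = list(global_model.children())
    client_submodel = nn.Sequential(*layers[:cut_layer + 1])  # cut layer and below
    server_submodel = nn.Sequential(*layers[cut_layer + 1:])  # layers above the cut
    return client_submodel, server_submodel
```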
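Claims 3 and 4 describe the client-side forward pass that yields the target smashed data and the server-side training that yields the first gradient. The sketch below assumes PyTorch, uses additive Laplace noise as a stand-in for the unspecified local differential privacy processing, and assumes a cross-entropy loss; the function names and parameters (epsilon, sensitivity) are hypothetical.

```python
# Sketch of one forward/backward exchange (claims 3-4). Laplace noise stands in
# for the unspecified local differential privacy processing; epsilon,
# sensitivity, and the cross-entropy loss are assumptions.
import torch
import torch.nn.functional as F

def client_forward(client_submodel, x, epsilon=1.0, sensitivity=1.0):
    """Claim 3: forward-propagate to raw smashed data, then add DP noise."""
    smashed = client_submodel(x)                        # raw smashed data
    noise = torch.distributions.Laplace(0.0, sensitivity / epsilon).sample(smashed.shape)
    return smashed + noise.to(smashed.device)           # target smashed data

def server_step(server_submodel, smashed, labels):
    """Claim 4: forward on the server sub-model, compute the loss against the
    client's local labels, and back-propagate to obtain the first gradient."""
    smashed = smashed.detach().requires_grad_(True)     # cut-layer activations
    logits = server_submodel(smashed)                   # predicted label data
    loss = F.cross_entropy(logits, labels)              # loss vs. local labels
    loss.backward()
    first_gradient = smashed.grad                       # returned to the client
    server_gradients = [p.grad for p in server_submodel.parameters()]
    return first_gradient, server_gradients
```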
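For the client update in claim 5, one common way to combine a clipping boundary, a privacy budget epsilon, and a relaxation parameter delta is Gaussian-mechanism noise; the noise scale and learning rate below follow that standard heuristic and are not specified in the patent.

```python
# Sketch of the perturbed client update in claim 5: clip the back-propagated
# fourth gradient, add noise calibrated to the clipping boundary and privacy
# budget, and apply the perturbed gradient. The Gaussian-mechanism noise scale
# and the learning rate are assumptions.
import torch

def perturbed_client_update(params, fourth_gradients, clip_bound, epsilon, delta, lr=0.01):
    sigma = clip_bound * (2 * torch.log(torch.tensor(1.25 / delta))).sqrt() / epsilon
    for p, g in zip(params, fourth_gradients):
        clipped = g * min(1.0, clip_bound / (g.norm().item() + 1e-12))  # gradient clipping
        noisy = clipped + torch.randn_like(clipped) * sigma             # local perturbation
        p.data.add_(noisy, alpha=-lr)                                   # model update
```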
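Claims 6 to 8 rely on a commit threshold and a fast view change. The following sketch assumes the usual synchronous-BFT sizing of n = 3f + 1 servers with a 2f + 1 quorum as the "first preset threshold", and models the preset view order as a list of server numbers precomputed by computing capacity; these figures and names are assumptions, not values from the claims.

```python
# Sketch of the commit rule (claim 6) and fast view change (claims 7-8),
# assuming n = 3f + 1 servers and a 2f + 1 quorum; all names are illustrative.
def consensus_reached(confirmation_messages, n_servers):
    f = (n_servers - 1) // 3                 # tolerated faulty servers (assumption)
    return len(confirmation_messages) >= 2 * f + 1

def next_view(current_leader, view_order):
    """Rotate to the next view in the preset order; that view's server becomes
    the new leader and the remaining servers stay common servers."""
    idx = view_order.index(current_leader)
    return view_order[(idx + 1) % len(view_order)]

# Example: servers numbered 0..3 by computing capacity; leader 0 misbehaves.
view_order = [0, 1, 2, 3]
new_leader = next_view(0, view_order)        # -> 1
```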
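Finally, the round structure of claim 1 could be composed from the helpers sketched above. This sketch collapses the server group to a single server (so the first gradient doubles as the second gradient and the consensus step is only indicated by a comment), uses plain SGD for the server update, and fixes the number of rounds; none of these simplifications come from the claims.

```python
# Hypothetical composition of one training round (claim 1, steps S101-S106),
# reusing split_global_model, client_forward, server_step, and
# perturbed_client_update from the sketches above.
import torch

def run_rounds(global_model, cut_layer, client_batches, rounds=10,
               clip_bound=1.0, epsilon=1.0, delta=1e-5, server_lr=0.01):
    client_sub, server_sub = split_global_model(global_model, cut_layer)    # S101
    for _ in range(rounds):                                                 # S106
        for x, y in client_batches:                                         # one batch per client
            smashed = client_forward(client_sub, x)                         # S102
            second_grad, server_grads = server_step(server_sub, smashed, y) # S103
            with torch.no_grad():                                           # S103: server update
                for p, g in zip(server_sub.parameters(), server_grads):
                    p -= server_lr * g
            smashed.backward(second_grad)                                   # S104: fourth gradient
            fourth = [p.grad.clone() for p in client_sub.parameters()]
            perturbed_client_update(list(client_sub.parameters()), fourth,
                                    clip_bound, epsilon, delta)             # S104: candidate model
            client_sub.zero_grad()
            server_sub.zero_grad()
        # S105 would aggregate the clients' candidate sub-models and run the
        # Byzantine fault-tolerant consensus before the next round.
    return client_sub, server_sub
```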

Description

Split federated learning method and system

Technical Field

The invention belongs to the technical field of data security, and particularly relates to a split federated learning method and system.

Background

With the rapid development of Internet of Things technology, massive numbers of terminal devices are widely interconnected, forming a distributed network of huge scale and high heterogeneity. These devices continuously generate and collect large-scale, decentralized data, including environmental sensing, user behavior, and device status, providing an important basis for data-driven intelligent applications and services. Effectively utilizing such widely distributed, heterogeneous data resources has clear application potential, particularly for distributed collaborative machine learning. Such methods allow multiple terminal devices to cooperatively complete model training without directly sharing local data, so that data privacy is protected, scattered data resources are integrated, and a high-quality artificial intelligence model is built jointly. By enabling trusted collaboration among distributed nodes, a collaborative machine learning system can improve overall decision-making capability, optimize resource utilization, and enhance the safety and reliability of the system.

Current distributed collaborative machine learning schemes mainly fall into two types: federated learning and split learning. Federated learning lets multiple participants cooperatively train a model on their local data, aggregating model parameters or gradients from the clients through a central server, so that joint modeling is achieved without exposing the raw data. Split learning splits a machine learning model into several parts, with different devices each responsible for training part of the structure, and usually adopts a serial collaboration mechanism to update the whole model. Although both approaches are promising in distributed environments, they share a key problem: the centralized aggregation mechanism is a potential security hazard. Federated learning and split learning usually depend on a central server to perform model aggregation; once the central node fails, the training process is interrupted and progress is lost, and if the server is maliciously controlled it may perform aggregation unreliably or distribute inconsistent models to clients, interfering with the training process and making it more likely that participants' private data is leaked. As a result, the privacy and stability of model training are limited. Thus, there is a strong need for a split federated learning method and system that can improve the privacy and stability of model training.

It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the application, and thus may include information that does not form prior art already known to those of ordinary skill in the art.

Disclosure of Invention

The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, and is intended neither to identify key/critical elements nor to delineate the scope of such embodiments, but is intended as a prelude to the more detailed description that follows.
The embodiment of the disclosure provides a split federated learning method and system, so that the privacy and stability of model training can be improved. In some embodiments, a split federated learning method includes: Step S101, a task publisher splits a global model into a client sub-model and a server sub-model and respectively transmits them to a client group and a server group, wherein the server group comprises a leader server and a plurality of common servers; Step S102, each client forward-propagates the client sub-model using its local data set to obtain target smashed data, and transmits the target smashed data to the server group; Step S103, all servers train the server sub-model based on the respective target smashed data to obtain first gradients, the leader server aggregates all first gradients corresponding to the same client to obtain second gradients, transmits each second gradient to the corresponding client, and aggregates all second gradients to obtain a third gradient, and all servers update the server sub-model based on the third gradient; Step S104, each client updates the client sub-model based on its corresponding second gradient to obtain a candidate sub-model, and transmits the candidate sub-model to the server group; Step S105, the leader server aggregates all candidate sub-models and initiates consensus based on the aggregation result, so that all servers execute a synchronous Byzantine fault-tolerant consensus protocol