EP-4736076-A1 - METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR UPDATING MACHINE LEARNING MODEL


Abstract

Embodiments of the present disclosure relate to a solution for updating a machine learning model. In a solution in accordance with the embodiments of the present disclosure, a certain node of a plurality of nodes locally determines a plurality of slices of a model parameter corresponding to the certain node, and uses a slice of the model parameter corresponding to the certain node and at least one slice of the model parameter corresponding to other nodes of the plurality of nodes to determine a target model parameter for the certain node. In this way, the solution can protect the data privacy and security of each single participant, whose data is neither exposed to nor derived by other participants during model training for the machine learning system.

Inventors

ZHANG, JINSHENG
GE, XIN
LIU, YUEHUA

Assignees

Koninklijke Philips N.V.

Dates

Publication Date: 20260506
Application Date: 20240619

Claims (1)

2023PF00230 WHAT IS CLAIMED IS:

1. A method comprising: determining a set of model parameters corresponding to a first node, the first node being a medical node of a plurality of nodes; determining a plurality of slices of a model parameter in the set of model parameters corresponding to the first node, the plurality of slices of the model parameter including at least a first slice and a second slice of the model parameter corresponding to the first node; and generating a first target model parameter corresponding to the first node based on the first slice of the model parameter corresponding to the first node and a second slice of a model parameter corresponding to a second node of the plurality of nodes.

2. The method according to claim 1, wherein generating the first target model parameter corresponding to the first node comprises: determining a random Gaussian noise corresponding to the first node based on a Gaussian distribution; and generating the first target model parameter corresponding to the first node based on the determined random Gaussian noise corresponding to the first node, the first slice of the model parameter corresponding to the first node being reserved at the first node, and the second slice of the model parameter corresponding to the second node being received from the second node.

3. The method according to claim 1, wherein determining the model parameter corresponding to the first node comprises: receiving a model for an iteration; updating the received model with local sample data at the first node; and determining the model parameter corresponding to the first node based on a difference between parameters of the received model and corresponding parameters of the updated model.

4. The method according to claim 1, wherein determining the plurality of slices of the model parameter corresponding to the first node comprises: splitting, at the first node, the model parameter corresponding to the first node into the plurality of slices of the model parameter corresponding to the first node, the number of the plurality of slices of the model parameter corresponding to the first node being equal to the number of the plurality of nodes; reserving, at the first node, the first slice of the model parameter corresponding to the first node; transmitting, to the second node, the second slice of the model parameter corresponding to the first node; and receiving, from the second node, the second slice of the model parameter corresponding to the second node.

5. The method according to claim 1, wherein determining the model parameter corresponding to the first node comprises: determining a first model parameter corresponding to the first node based on a norm value of a gradient corresponding to the first node; and determining a second model parameter corresponding to the first node based on a product of multiplying the gradient corresponding to the first node and the reverse of the first model parameter corresponding to the first node.

6. The method according to claim 5, wherein determining the plurality of slices of the model parameter corresponding to the first node comprises: splitting the first model parameter corresponding to the first node into a plurality of slices of the first model parameter corresponding to the first node, the plurality of slices of the first model parameter including at least a first slice and a second slice of the first model parameter, and the number of the plurality of slices of the first model parameter corresponding to the first node being equal to the number of the plurality of nodes; and splitting the second model parameter corresponding to the first node into a plurality of slices of the second model parameter corresponding to the first node, the plurality of slices of the second model parameter including at least a first slice and a second slice of the second model parameter, and the number of the plurality of slices of the second model parameter corresponding to the first node being equal to the number of the plurality of nodes.

7. The method according to claim 6, further comprising: reserving, at the first node, the first slice of the first model parameter corresponding to the first node and the first slice of the second model parameter corresponding to the first node; transmitting, to the second node, the second slice of the first model parameter corresponding to the first node and the second slice of the second model parameter corresponding to the first node; and receiving, from the second node, a second slice of a first model parameter corresponding to the second node and a second slice of a second model parameter corresponding to the second node.

8. The method according to claim 7, further comprising: determining a third model parameter based on a statistic of a plurality of slices of the first model parameters corresponding to the plurality of nodes, the plurality of slices of the first model parameters corresponding to the plurality of nodes including at least the plurality of slices of the first model parameter corresponding to the first node and a plurality of slices of a first model parameter corresponding to the second node; and determining a plurality of slices of the third model parameter based on the third model parameter, the plurality of slices of the third model parameter including at least a first slice of the third model parameter corresponding to the first node and a second slice of the third model parameter corresponding to the second node, wherein the number of the plurality of slices of the third model parameter corresponds to the number of the plurality of nodes, and the third model parameter is based on at least one of a maximum value, a median value, an average value, or quantile value.

9. The method according to claim 8, wherein generating the first target model parameter corresponding to the first node comprises: determining a set of slices of the second model parameter at the first node, the set of slices of the second model parameter including at least the first slice of the second model parameter reserved at the first node and the second slice of the second model parameter received from the second node; determining, for each slice in the set of slices of the second model parameter at the first node, a product of multiplying the first slice of the third model parameter corresponding to the first node and the slice of the second model parameter at the first node; and determining a product sum for the first node based on the product of multiplying the first slice of the third model parameter corresponding to the first node and each slice in the set of slices of the second model parameter at the first node.

10. The method according to claim 9, wherein generating the first target model parameter corresponding to the first node further comprises: determining a random Gaussian noise corresponding to the first node locally based on a Gaussian distribution for the plurality of nodes, the additivity of the determined random Gaussian noises of the plurality of nodes matching the Gaussian distribution; and generating the first target model parameter corresponding to the first node by adding the determined random Gaussian noise corresponding to the first node and the determined product sum for the first node.

11. The method according to claim 1, further comprising at least one of the following: determining a second target model parameter based on a threshold number of multiple first target model parameters corresponding to the plurality of nodes, the multiple first target model parameters corresponding to the plurality of nodes including at least the first target model parameter corresponding to the first node and a first target model parameter corresponding to the second node; transmitting the first target model parameter corresponding to the first node to the second node, such that the second node determines the second target model parameter based on the threshold number of the multiple first target model parameters corresponding to the plurality of nodes; or transmitting the first target parameter corresponding to the first node to a server, such that the server determines the second target model parameter based on the threshold number of the multiple first target model parameters corresponding to the plurality of nodes, wherein the threshold number is equal to or less than the number of the multiple first target model parameters corresponding to the plurality of nodes.

12. The method according to claim 1, further comprising: determining a second target model parameter based on the Gaussian distribution; updating a model for a next iteration based on the determined second target model parameter; and transmitting the updated model for the next iteration to the first node.

13. An apparatus comprising: means for determining a set of model parameters corresponding to the apparatus, the apparatus being a medical node of a plurality of nodes; means for determining a plurality of slices of the model parameter corresponding to the apparatus, the plurality of slices of the model parameter including at least a first slice and a second slice of the model parameter corresponding to the apparatus; and means for generating a first target model parameter corresponding to the apparatus based on the first slice of the model parameter corresponding to the apparatus and a second slice of a model parameter corresponding to a second node of the plurality of nodes.

14. A device comprising: at least one processor; and a memory coupled to the at least one processor and having instructions stored thereon, the instructions, when executed by the at least one processor, causing the device to perform the method according to any of claims 1 to 12.

15. A computer-readable storage medium storing computer-readable instructions thereon which, when executed, cause a computer to perform the method according to any of claims 1 to 12.

16. A computer program product being tangibly stored on a non-stationary computer readable medium and comprising computer-executable instructions which, when executed, cause a computer to perform the method according to any of claims 1 to 12.
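
For illustration only, the quantities recited in claims 3, 5 and 8 may be sketched in Python as follows. The sketch rests on several assumptions that the claims do not state: the norm is taken to be the L2 norm, "the reverse of the first model parameter" is read as its reciprocal, the parameter difference of claim 3 is treated as the gradient of claim 5 with an "updated minus received" sign convention, and the statistic of claim 8 is computed directly on plain norm values rather than on slices. The function names are hypothetical.

```python
import math
import statistics


def local_update(received, updated):
    """Claim 3: per-weight difference between the received and updated models.

    The sign convention (updated minus received) is an assumption.
    """
    return [u - r for r, u in zip(received, updated)]


def norm_and_direction(gradient):
    """Claim 5: first parameter = norm, second = gradient times 1/norm.

    Assumes the L2 norm and reads "reverse" as "reciprocal".
    """
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        # A zero gradient has no direction; avoid dividing by zero.
        return 0.0, [0.0 for _ in gradient]
    return norm, [g / norm for g in gradient]


def shared_bound(norms, statistic="median"):
    """Claim 8: third parameter as a statistic of the nodes' norm values."""
    if statistic == "max":
        return max(norms)
    if statistic == "median":
        return statistics.median(norms)
    if statistic == "average":
        return statistics.mean(norms)
    raise ValueError(f"unsupported statistic: {statistic}")


# Hypothetical two-weight model at one node.
received = [1.0, 1.0]
updated = [4.0, 5.0]
gradient = local_update(received, updated)      # [3.0, 4.0]
norm, direction = norm_and_direction(gradient)  # 5.0 and [0.6, 0.8]

# Norm values of three hypothetical nodes, including the one above.
bound = shared_bound([norm, 1.0, 2.0])          # median, i.e. 2.0
```

Under these assumptions each node contributes a unit-length direction and a scalar norm, and a single statistic over the norms gives a common scale that does not depend on any one node's raw gradient magnitude.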

Description

METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM FOR UPDATING MACHINE LEARNING MODEL

FIELD

[0001] Embodiments of the present disclosure generally relate to the field of computer technology, and in particular to a method, an apparatus, a device and a computer-readable storage medium for updating a machine learning model.

BACKGROUND

[0002] Machine learning is a branch of artificial intelligence (AI) in the field of computer science, which focuses on the use of data and algorithms to train machine learning models. The machine learning models can be used to imitate the way that humans learn, in order to conduct computation and prediction. Obtaining a high-quality model requires multiple iterations of training the machine learning models using training data.

[0003] Federated Learning (FL) is a machine learning setting where the goal is to train a high-quality centralized model with training data distributed over a large number of participants. Federated Learning enables the participants to collaboratively train the machine learning models while keeping the raw training data on each user's device, decoupling the ability to conduct the machine learning from the need to store the data in a centralized storage.

SUMMARY

[0004] In general, embodiments of the present disclosure provide a solution for updating a machine learning model.

[0005] In a first aspect, the present disclosure provides a method.
The method comprises: determining a set of model parameters corresponding to a first node, the first node being a medical node of a plurality of nodes; determining a plurality of slices of a model parameter in the set of model parameters corresponding to the first node, the plurality of slices of the model parameter including at least a first slice and a second slice of the model parameter corresponding to the first node; and generating a first target model parameter corresponding to the first node based on the first slice of the model parameter corresponding to the first node and a second slice of a model parameter corresponding to a second node of the plurality of nodes. In this way, the present disclosure can protect the data privacy and security of each single participant, whose data is neither exposed to nor derived by other participants during model training for the machine learning system.

[0006] In some embodiments of the first aspect, generating the first target model parameter corresponding to the first node comprises: determining a random Gaussian noise corresponding to the first node based on a Gaussian distribution; and generating the first target model parameter corresponding to the first node based on the determined random Gaussian noise corresponding to the first node, the first slice of the model parameter corresponding to the first node being reserved at the first node, and the second slice of the model parameter corresponding to the second node being received from the second node. In this way, the present disclosure can not only guarantee the privacy and security of the data from the multiple participants by using the random Gaussian noise, but also avoid affecting the accuracy of the output of the machine learning system.
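
A minimal sketch of the slicing and noise steps of paragraphs [0005] and [0006] is given below for illustration. It assumes, beyond what the text states, that the slices are additive shares that sum to the original parameter, that each model parameter is a single scalar, and that each of the n nodes draws zero-mean Gaussian noise with variance sigma^2 / n, so that the n independent noises together follow a Gaussian distribution with variance sigma^2. The names `split_into_slices` and `local_targets` are hypothetical.

```python
import random


def split_into_slices(value, n_nodes, rng):
    """Split a scalar into n_nodes random slices that sum to the value."""
    slices = [rng.uniform(-1.0, 1.0) for _ in range(n_nodes - 1)]
    slices.append(value - sum(slices))
    return slices


def local_targets(params, sigma, rng):
    """Return each node's target parameter and the noise it added.

    params[i] is the scalar model parameter held by node i.
    """
    n = len(params)
    # slices[i][j] is the slice node i hands to node j;
    # slices[i][i] is the slice node i reserves for itself.
    slices = [split_into_slices(p, n, rng) for p in params]
    # Per-node standard deviation sigma / sqrt(n): the sum of the
    # n independent noises then has standard deviation sigma.
    noises = [rng.gauss(0.0, sigma / n ** 0.5) for _ in range(n)]
    # Node j combines its reserved slice, the slices it received,
    # and its own locally drawn noise.
    targets = [
        sum(slices[i][j] for i in range(n)) + noises[j]
        for j in range(n)
    ]
    return targets, noises


rng = random.Random(0)
params = [0.5, -1.25, 2.0]  # hypothetical parameters of three nodes
targets, noises = local_targets(params, sigma=0.1, rng=rng)

# Summing the targets recovers the sum of the original parameters
# plus the combined noise, without any node revealing its parameter.
aggregate = sum(targets)
```

In this sketch no node ever sees another node's full parameter, only one random-looking slice of it, while the aggregate differs from the true sum only by the combined Gaussian noise.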
[0007] In some embodiments of the first aspect, determining the model parameter corresponding to the first node comprises: receiving a model for an iteration; updating the received model with local sample data at the first node; and determining the model parameter corresponding to the first node based on a difference between parameters of the received model and corresponding parameters of the updated model. In this way, the present disclosure is suited to accurately updating machine learning models involving multiple participants in each iteration.

[0008] In some embodiments of the first aspect, determining the plurality of slices of the model parameter corresponding to the first node comprises: splitting, at the first node, the model parameter corresponding to the first node into the plurality of slices of the model parameter corresponding to the first node, the number of the plurality of slices of the model parameter corresponding to the first node being equal to the number of the plurality of nodes; reserving, at the first node, the first slice of the model parameter corresponding to the first node; transmitting, to the second node, the second slice of the model parameter corresponding to the first node; and receiving, from the second node, the second slice of the model parameter corresponding to the second node. In this way, the present disclosure can enable multiple participants to share model parameters with each other, while preventing local data from being exposed to or derived by either the other participants or the central server.

[0009] In some embodiments of the first aspect, determining the model parameter corresponding to the first node comprises: determining a first model parameter corresponding to the first node based on a norm value of