
KR-20260067724-A - APPARATUS, METHOD AND PROGRAM FOR LEARNING LOCAL MODEL


Abstract

The local model training apparatus includes: a dynamic gradient adjustment unit that adjusts the magnitude of a gradient by setting a magnitude threshold for the training data and model parameters used to train a local model; a dynamic noise scale calculation unit that calculates a noise magnitude based on the variance of the gradient and characteristic parameters; a gradient update unit that adds to the gradient a noise vector drawn from a Gaussian distribution calculated based on the noise magnitude; and a gradient transmission unit that transmits the noise-added gradient to a server that trains a global model.
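The pipeline in the abstract resembles differentially private gradient perturbation: clip the gradient to a magnitude threshold, then add calibrated Gaussian noise before transmission. A minimal sketch under that reading; the function and parameter names are illustrative assumptions, since the publication states no formulas:

```python
import math
import random

def perturb_gradient(grad, clip_threshold, noise_scale, rng):
    """Clip the gradient's magnitude to a threshold, then add a Gaussian
    noise vector whose standard deviation is the computed noise scale."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Dynamic gradient adjustment: scale down so the norm never exceeds the threshold.
    factor = min(1.0, clip_threshold / max(norm, 1e-12))
    clipped = [g * factor for g in grad]
    # Gradient update: perturb each coordinate; the result is what the
    # transmission unit would send to the global-model server.
    return [c + rng.gauss(0.0, noise_scale) for c in clipped]

rng = random.Random(0)
noisy = perturb_gradient([3.0, 4.0], clip_threshold=1.0, noise_scale=0.1, rng=rng)
# Before noise is added, the clipped gradient [0.6, 0.8] has norm exactly 1.0.
```

With `noise_scale` set to zero the function reduces to plain norm clipping, which makes the clipping step easy to verify in isolation.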

Inventors

  • 강승민

Assignees

  • 주식회사 케이티

Dates

Publication Date
2026-05-13
Application Date
2024-11-06

Claims (15)

  1. A local model training apparatus for federated learning, comprising: a dynamic gradient adjustment unit that adjusts the magnitude of a gradient by setting a magnitude threshold for the training data and model parameters used to train a local model; a dynamic noise scale calculation unit that calculates a noise magnitude based on the variance of the gradient and characteristic parameters; a gradient update unit that adds to the gradient a noise vector drawn from a Gaussian distribution calculated based on the noise magnitude; and a gradient transmission unit that transmits the noise-added gradient to a server that trains a global model.
  2. The apparatus of claim 1, wherein the dynamic gradient adjustment unit updates the magnitude threshold based on a magnitude distribution of the gradient representing the change in gradient magnitude across training rounds of the local model.
  3. The apparatus of claim 2, wherein the dynamic gradient adjustment unit updates the magnitude threshold by adjusting a weight assigned to the current magnitude threshold and a weight assigned to the magnitude distribution of the gradient.
  4. The apparatus of claim 3, wherein, for each training round of the local model, the dynamic gradient adjustment unit decreases the weight assigned to the current magnitude threshold and increases the weight assigned to the magnitude distribution of the gradient.
  5. The apparatus of claim 1, wherein the dynamic noise scale calculation unit calculates the characteristic parameters based on the number of samples in the training data, the importance of the local model, and the privacy requirement level of the local model.
  6. The apparatus of claim 5, wherein the dynamic noise scale calculation unit calculates the characteristic parameters based on relative weights assigned respectively to the number of samples in the training data, the importance of the training data, and the privacy requirement level of the local model.
  7. The apparatus of claim 1, wherein the global model is updated by aggregating gradients to which the noise vector has been added.
  8. A local model training method performed by a local model training apparatus for federated learning, comprising: adjusting the magnitude of a gradient by setting a magnitude threshold for the training data and model parameters used to train a local model; calculating a noise magnitude based on the variance of the gradient and characteristic parameters; adding to the gradient a noise vector drawn from a Gaussian distribution calculated based on the noise magnitude; and transmitting the noise-added gradient to a server that trains a global model.
  9. The method of claim 8, wherein the adjusting of the gradient magnitude updates the magnitude threshold based on a magnitude distribution of the gradient representing the change in gradient magnitude across training rounds of the local model.
  10. The method of claim 9, wherein the adjusting of the gradient magnitude updates the magnitude threshold by adjusting a weight assigned to the current magnitude threshold and a weight assigned to the magnitude distribution of the gradient.
  11. The method of claim 10, wherein, for each training round of the local model, the weight assigned to the current magnitude threshold is decreased and the weight assigned to the magnitude distribution of the gradient is increased.
  12. The method of claim 8, wherein the calculating of the noise magnitude calculates the characteristic parameters based on the number of samples in the training data, the importance of the local model, and the privacy requirement level of the local model.
  13. The method of claim 12, wherein the calculating of the noise magnitude calculates the characteristic parameters based on relative weights assigned respectively to the number of samples in the training data, the importance of the training data, and the privacy requirement level of the local model.
  14. The method of claim 8, wherein the global model is updated by aggregating the noise-added gradient.
  15. A computer program stored on a computer-readable recording medium, comprising a sequence of instructions that, when executed, provide a local model training method by: adjusting the magnitude of a gradient by setting a magnitude threshold for the training data and model parameters used to train a local model; calculating a noise magnitude based on the variance of the gradient and characteristic parameters; adding to the gradient a noise vector drawn from a Gaussian distribution calculated based on the noise magnitude; and transmitting the noise-added gradient to a server that trains a global model.
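Claims 2 through 4 describe a threshold that blends its current value with the observed gradient magnitude distribution, shifting weight toward the distribution each round, and claims 5 and 6 describe a characteristic parameter combined from weighted client attributes. One way these could be realized; the linear schedule, the linear combination, and all weight values are assumptions, not stated in the claims:

```python
def update_threshold(current_threshold, observed_magnitude, round_idx, total_rounds):
    """Per claims 3 and 4: blend the current threshold with the observed
    gradient magnitude, decreasing the weight on the current threshold and
    increasing the weight on the observed distribution each round."""
    w_dist = (round_idx + 1) / total_rounds   # assumed linear schedule, grows toward 1
    w_curr = 1.0 - w_dist                     # shrinks toward 0
    return w_curr * current_threshold + w_dist * observed_magnitude

def characteristic_parameter(num_samples, importance, privacy_level, weights=(0.3, 0.3, 0.4)):
    """Per claim 6: combine the sample count, data importance, and privacy
    requirement level with relative weights (values here are illustrative)."""
    w_n, w_i, w_p = weights
    return w_n * num_samples + w_i * importance + w_p * privacy_level

# Round 0 of 4: the current threshold (2.0) still dominates the observed magnitude (1.0).
t1 = update_threshold(2.0, 1.0, round_idx=0, total_rounds=4)   # 0.75*2.0 + 0.25*1.0 = 1.75
```

By the final round the blend weight on the observed distribution reaches 1, so the threshold tracks the measured gradient magnitudes entirely, which matches the direction of adjustment recited in claim 4.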

Description

Apparatus, Method and Program for Learning Local Model

The present invention relates to a local model training device, method, and program for federated learning.

Data privacy has become so important in the AI field that it is cited as a key element of Responsible AI. Federated learning emerged to protect data privacy in this field: individuals or institutions holding various data can train models by exchanging only training parameters, without sharing their raw data. However, even in federated learning, training parameters such as gradients must be shared with the server hosting the global AI model. During this process, it is possible to infer whether specific data is included in the model's training dataset. Furthermore, sensitive information belonging to individuals or institutions can be exposed by reconstructing input data from the outputs and gradients of the global AI model.

To address these limitations of federated learning, techniques that add uniform noise to gradients have been applied to further protect data privacy. However, adding uniform noise can significantly impair training accuracy because of gradient heterogeneity. Gradient heterogeneity arises when the characteristics and distribution of the data held by each institution differ; for example, if the hospitals participating in federated learning hold different types of patient data, their gradients may vary widely. Because the magnitude and direction of gradients differ for each institution, adding uniform noise when integrating them into a single global model affects each institution's gradient differently, leading to imbalances during training.
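The imbalance can be made concrete with a small calculation: a fixed noise magnitude is a large fraction of a small gradient but a negligible fraction of a large one. The norms below are illustrative values, not taken from the publication:

```python
def relative_noise_impact(grad_norm, noise_norm):
    """Ratio of noise magnitude to gradient magnitude; the larger the ratio,
    the more the uniform noise distorts that institution's contribution."""
    return noise_norm / grad_norm

small_inst = relative_noise_impact(grad_norm=0.5, noise_norm=0.1)   # 0.2  (20%)
large_inst = relative_noise_impact(grad_norm=10.0, noise_norm=0.1)  # 0.01 (1%)
# The institution with the small gradient suffers 20x the relative distortion.
```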
More specifically, because the existing uniform-noise method does not consider the magnitude or direction of gradients, institutions with small gradients experience a relatively large impact from the noise, while those with large gradients experience a relatively small one. The uniform-noise method can therefore distort each institution's contribution during training and degrade the accuracy of the overall model.

FIG. 1 is a configuration diagram illustrating a local model learning system according to one embodiment of the present invention. FIG. 2 is a configuration diagram illustrating a local model learning device according to one embodiment of the present invention. FIG. 3 is a configuration diagram illustrating a global model learning server according to one embodiment of the present invention. FIG. 4 is a flowchart illustrating a local model learning method performed by a global model learning server and a local model learning device for federated learning according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the attached drawings so that those skilled in the art can easily implement the invention. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description have been omitted for clarity, and similar parts are denoted by similar reference numerals throughout the specification. Throughout the specification, when a part is described as being "connected" to another part, this includes not only cases where they are "directly connected" but also cases where they are "electrically connected" with other elements interposed between them.
Furthermore, when a part is described as "including" a component, this means that, unless specifically stated otherwise, other components are not excluded and may additionally be included; it does not preclude the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. In this specification, the term "part" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. One part may be realized using two or more pieces of hardware, and two or more parts may be realized by a single piece of hardware. Some of the operations or functions described in this specification as being performed by a terminal or device may instead be performed by a server connected to that terminal or device; likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

An embodiment of the present invention will be described in detail below with reference to the attached drawings. FIG. 1 is a configuration diagram for explaining a local model learning system according to an embodiment of the present invention. Referring to FIG. 1, t