CN-122021801-A - Personalized federated learning method and system based on parameter-adaptive decoupling
Abstract
The invention provides a personalized federated learning method and system based on parameter-adaptive decoupling. The method computes the Fisher information of the local model parameters and normalizes it layer by layer; it then analyzes the bimodal distribution of parameter sensitivity using the Kittler minimum error thresholding method, adaptively computes an optimal segmentation threshold by minimizing the Bayesian classification error, and decouples the parameters into a shared class and a personalized class. Local training adopts a two-stage strategy: the first stage applies a global constraint to the shared parameters to anchor general knowledge, and the second stage applies maximum entropy regularization to the personalized parameters to improve local generalization. The invention resolves the evaluation bias caused by parameter-scale differences across the layers of a multi-level network: the layer-wise normalization mechanism performs the sensitivity normalization independently within each layer, eliminating the influence of inter-layer scale differences, ensuring the fairness of mask generation, and preserving the integrity of the network structure and the layered nature of feature extraction.
Inventors
- OU YUYI
- ZHENG MINGYU
Assignees
- 广东工业大学 (Guangdong University of Technology)
Dates
- Publication Date: 20260512
- Application Date: 20260112
Claims (10)
- 1. A personalized federated learning method based on parameter-adaptive decoupling, characterized by comprising the following steps: S1) initializing global model parameters through a server; S2) the server selects the clients participating in training and sends the global model parameters to them, and each participating client takes the global model parameters as the initial parameters of its local model; S3) the client computes the Fisher information values of the local model parameters using its local data and performs layer-wise normalization; S4) the client applies a logarithmic transformation to the normalized Fisher information values, computes an optimal logarithmic partition threshold via the Kittler minimum error thresholding method, maps it back to obtain the original-domain threshold, and generates a classification mask from that threshold; S5) the client executes a two-stage optimization strategy based on the classification mask: the shared parameters are updated under a global constraint, and the personalized parameters are updated with maximum entropy regularization; S6) the client uploads the updated local model parameters and the current number of training samples to the server, and the server updates the global model by weighted aggregation of the clients' local model parameters; S7) repeating steps S2)-S6) until a preset number of global training rounds is reached, obtaining the final classification model.
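The round structure of steps S1)-S7) can be sketched as a minimal single-machine simulation. This is an illustrative assumption, not the patent's implementation: a linear least-squares model stands in for the classification model, and the per-client Fisher/mask logic of steps S3)-S5) is collapsed into plain local SGD; only the broadcast, local update, and sample-count-weighted aggregation are shown. The function name `run_federated` is hypothetical.

```python
import numpy as np

def run_federated(num_clients=4, rounds=3, dim=8, lr=0.1, seed=0):
    """Sketch of steps S1)-S7) on synthetic least-squares data."""
    rng = np.random.default_rng(seed)
    w_global = np.zeros(dim)                       # S1) server initializes
    truth = rng.normal(size=dim)
    # each client holds a private dataset of a different size
    data = []
    for _ in range(num_clients):
        n = int(rng.integers(20, 60))
        X = rng.normal(size=(n, dim))
        data.append((X, X @ truth))
    for _ in range(rounds):                        # S7) global rounds
        updates, sizes = [], []
        for X, y in data:                          # S2) broadcast to clients
            w = w_global.copy()
            for _ in range(5):                     # stand-in for S3)-S5)
                grad = 2 * X.T @ (X @ w - y) / len(y)
                w -= lr * grad
            updates.append(w)                      # S6) upload params + count
            sizes.append(len(y))
        # S6) server: sample-count-weighted aggregation
        w_global = np.average(updates, axis=0, weights=np.asarray(sizes, float))
    return w_global, truth
```

On this synthetic task the aggregated global model converges toward the data-generating weights within a few rounds.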
- 2. The personalized federated learning method based on parameter-adaptive decoupling according to claim 1, wherein in step S3), before training the client draws a mini-batch of samples from its local training data set and approximates the Fisher information values by the squared gradient of the cross-entropy loss (equivalently, of the log-likelihood) with respect to the local model parameters, namely: $F = \mathrm{diag}(\hat{F}) = \mathrm{diag}\big(\mathbb{E}_{(x,y)\in\mathcal{B}}\big[\nabla_{w}\log p(y\mid x; w)\,\nabla_{w}\log p(y\mid x; w)^{\top}\big]\big)$; where $F$ denotes the Fisher information value; $\mathrm{diag}(\cdot)$ the diagonal extraction operator; $\hat{F}$ the Fisher information matrix; $\mathbb{E}$ the expectation; $\mathcal{B}$ the mini-batch of samples; $w$ the weight parameters of the client's local model; $(x, y)$ the input sample and its label; and $\log p(y\mid x; w)$ the log-probability output by the local model, measuring the confidence with which the data $(x, y)$ appears under the local model weights $w$.
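The diagonal Fisher approximation of claim 2 can be sketched for a plain softmax classifier; the matrix shapes and the function name `diagonal_fisher` are illustrative assumptions, not the patent's network.

```python
import numpy as np

def diagonal_fisher(W, X, y):
    """Diagonal Fisher information of a softmax classifier's weights
    W (d x c) on a mini-batch (X: n x d, y: n integer labels), using
    the per-sample squared gradient of the log-likelihood (claim 2)."""
    n = X.shape[0]
    fisher = np.zeros_like(W)
    for i in range(n):
        logits = X[i] @ W
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # grad of the cross-entropy wrt W is outer(x, p - onehot(y));
        # the log-likelihood gradient is its negative, so the square is equal
        err = p.copy()
        err[y[i]] -= 1.0
        g = np.outer(X[i], err)
        fisher += g * g
    return fisher / n  # elementwise E[(d/dw log p)^2]
```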
- 3. The personalized federated learning method based on parameter-adaptive decoupling according to claim 2, wherein in step S3), the client's local model is divided into multiple layers by identifying its structure, and for each layer $l$ the Fisher information values $F^{(l)}$ are min-max normalized independently, mapping them into the interval $[0, 1]$ and obtaining the normalized Fisher information values $\tilde{F}^{(l)}$.
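The layer-wise normalization of claim 3 amounts to an independent min-max rescaling per layer; the dictionary layout and the small `eps` guard against constant layers are illustrative assumptions.

```python
import numpy as np

def layerwise_minmax(fisher_by_layer, eps=1e-12):
    """Normalize Fisher values independently within each layer (claim 3),
    mapping every layer's values into [0, 1] so inter-layer scale
    differences do not bias the later threshold/mask step."""
    out = {}
    for name, f in fisher_by_layer.items():
        lo, hi = f.min(), f.max()
        out[name] = (f - lo) / (hi - lo + eps)
    return out
```

Because each layer is rescaled against its own minimum and maximum, a convolutional layer with tiny Fisher magnitudes and a classifier head with large ones contribute to the mask on equal footing.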
- 4. The personalized federated learning method based on parameter-adaptive decoupling according to claim 3, wherein in step S4), each client applies a logarithmic transformation to the normalized Fisher information values to obtain a logarithmic Fisher vector, namely: $v = \log(\tilde{F} + \epsilon)$; where $v$ is the logarithmic Fisher vector and $\epsilon$ is a very small constant (preventing the logarithm of zero).
- 5. The personalized federated learning method based on parameter-adaptive decoupling according to claim 4, wherein in step S4), candidate thresholds over the logarithmic Fisher vector are traversed using the Kittler minimum error thresholding algorithm to find the threshold that minimizes the probability of classification error, which separates the personalized parameters (high-sensitivity Fisher values) from the shared parameters (low-sensitivity Fisher values).
- 6. The personalized federated learning method based on parameter-adaptive decoupling according to claim 5, wherein in step S4), the optimal logarithmic partition threshold $T^{*}$ computed by the Kittler algorithm is expressed as: $T^{*} = \arg\min_{T} J(T)$; $J(T) = 1 + 2\big[P_{1}(T)\ln \sigma_{1}(T) + P_{2}(T)\ln \sigma_{2}(T)\big] - 2\big[P_{1}(T)\ln P_{1}(T) + P_{2}(T)\ln P_{2}(T)\big]$; where $T^{*}$ denotes the optimal logarithmic partition threshold; $J(T)$ the minimum error criterion function after partitioning the logarithmic Fisher vector by the threshold $T$; $P_{1}(T)$ and $P_{2}(T)$ the prior probabilities of the two classes of parameter elements in the logarithmic Fisher vector distribution under the threshold $T$; $\sigma_{1}^{2}(T)$ and $\sigma_{2}^{2}(T)$ the variances of the logarithmic Fisher values of the two classes; and $\arg\min$ the argument minimizing the function $J(T)$.
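The criterion of claims 5-6 can be sketched directly over a 1-D array of log-Fisher values rather than a histogram; modeling each side of a candidate split as a Gaussian and minimizing $J(T)$ is the standard Kittler-Illingworth construction, but the exhaustive sorted-split search below is an illustrative simplification.

```python
import numpy as np

def kittler_threshold(v):
    """Kittler minimum error thresholding (claims 5-6): for each candidate
    split of the sorted values, fit two Gaussians and evaluate
    J(T) = 1 + 2[P1 ln s1 + P2 ln s2] - 2[P1 ln P1 + P2 ln P2];
    return the threshold minimizing J."""
    v = np.sort(np.asarray(v, dtype=float))
    best_T, best_J = None, np.inf
    for k in range(1, len(v)):            # split after the k-th smallest value
        lo, hi = v[:k], v[k:]
        P1, P2 = k / len(v), 1 - k / len(v)
        s1, s2 = lo.std(), hi.std()
        if s1 <= 0 or s2 <= 0:            # degenerate split: skip
            continue
        J = (1 + 2 * (P1 * np.log(s1) + P2 * np.log(s2))
               - 2 * (P1 * np.log(P1) + P2 * np.log(P2)))
        if J < best_J:
            best_J, best_T = J, (v[k - 1] + v[k]) / 2
    return best_T
```

On a clearly bimodal input (the situation the abstract assumes for parameter sensitivity), the returned threshold falls in the valley between the two modes.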
- 7. The personalized federated learning method based on parameter-adaptive decoupling according to claim 6, wherein in step S4), the optimal logarithmic partition threshold is mapped back to the original-domain threshold $\tau = e^{T^{*}} - \epsilon$; the normalized Fisher information values of the local model are compared with $\tau$ one by one to generate a binary mask $m$: for each parameter $i$ of the client's local model, if its Fisher information value $\tilde{F}_{i} > \tau$ it is marked as a personalized parameter ($m_{i} = 1$), otherwise as a shared parameter ($m_{i} = 0$).
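The back-mapping and mask generation of claim 7 is a two-liner; the `eps` value must match the one used in the forward log transform of claim 4 (here an assumed default).

```python
import numpy as np

def classification_mask(fisher_norm, log_threshold, eps=1e-8):
    """Invert v = log(F + eps) to get the original-domain threshold
    tau = exp(T*) - eps, then mark each parameter (claim 7):
    1 = personalized (high sensitivity), 0 = shared."""
    tau = np.exp(log_threshold) - eps
    return (fisher_norm > tau).astype(np.uint8)
```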
- 8. The personalized federated learning method based on parameter-adaptive decoupling according to claim 1, wherein in step S5), the two stages of the optimization strategy are respectively: a first stage that freezes the personalized parameters and optimizes only the shared parameters with SGD, whose loss function comprises the cross-entropy loss and an L2 regularization term; and a second stage that freezes the shared parameters and optimizes only the personalized parameters, whose loss function comprises the cross-entropy loss and a maximum entropy regularization term.
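One round of the two-stage strategy of claim 8 can be sketched with mask-gated gradient updates on a softmax classifier. The concrete loss forms are assumptions consistent with the claim's description: stage 1 uses cross-entropy plus an L2 proximal term toward the global weights, stage 2 uses cross-entropy minus a scaled prediction entropy (so minimizing the loss maximizes entropy); the hyperparameters `mu` and `lam` are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def two_stage_step(W, W_global, mask, X, y, lr=0.1, mu=0.01, lam=0.01):
    """Two-stage update (claim 8). mask: 1 = personalized, 0 = shared.
    Stage 1 updates only shared entries (mask == 0) with CE + L2-to-global;
    stage 2 updates only personalized entries with CE - lam * H(p)."""
    n = X.shape[0]
    onehot = np.eye(W.shape[1])[y]
    # stage 1: personalized frozen, shared optimized with proximal term
    p = softmax(X @ W)
    g = X.T @ (p - onehot) / n + mu * (W - W_global)
    W = W - lr * g * (1 - mask)
    # stage 2: shared frozen, personalized optimized with max-entropy reg.
    # d(-H)/dz_k = p_k (log p_k + H), so:
    p = softmax(X @ W)
    H = -(p * np.log(p + 1e-12)).sum(axis=1, keepdims=True)
    g_ent = X.T @ (p * (np.log(p + 1e-12) + H)) / n
    g = X.T @ (p - onehot) / n + lam * g_ent
    W = W - lr * g * mask
    return W
```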
- 9. The personalized federated learning method based on parameter-adaptive decoupling according to claim 1, wherein in step S6), the server collects the updates of the clients participating in the round and performs a weighted average according to each client's number of local samples $n_{k}$ to update the global model, namely: $w^{t+1} = \sum_{k=1}^{K} \frac{n_{k}}{n}\, w_{k}^{t+1}$; where $w^{t+1}$ is the updated global model; $K$ the number of clients participating in training; $n$ the total number of samples across all participating clients; $w_{k}^{t+1}$ the local model parameters updated by the $k$-th client; $n_{k}$ the number of local data set samples of the $k$-th client; and $t$ the training round.
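The weighted aggregation of claim 9 is the familiar sample-count-weighted average; the flat-vector representation of each client's parameters is an illustrative assumption.

```python
import numpy as np

def aggregate(client_params, client_sizes):
    """Server-side aggregation (claim 9):
    w_global = sum_k (n_k / n) * w_k, with n = sum_k n_k."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_params)
    return (stacked * (sizes / sizes.sum())[:, None]).sum(axis=0)
```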
- 10. A personalized federated learning system based on parameter-adaptive decoupling, comprising: a server, configured to initialize the global model, send the global model parameters to the clients participating in training, and receive the updated local model parameters and current training sample counts uploaded by the clients for weighted aggregation to update the global model; and a plurality of clients, each configured to perform local model training from the initialized global model by: computing the Fisher information values of the local model parameters from local data and performing layer-wise normalization; applying a logarithmic transformation to the normalized Fisher information values, computing an optimal logarithmic partition threshold via the Kittler minimum error thresholding method, mapping it back to obtain the original-domain threshold, and generating a classification mask from that threshold; executing the two-stage optimization strategy based on the classification mask, with global-constraint updates for the shared parameters and maximum-entropy-regularized updates for the personalized parameters; and uploading the updated local model parameters and the current training sample count to the server.
Description
Personalized federated learning method and system based on parameter-adaptive decoupling
Technical Field
The invention relates to the technical field of artificial intelligence and distributed machine learning, and in particular to a personalized federated learning method and system based on parameter-adaptive decoupling.
Background
With the popularity of Internet of Things devices and the maturing of privacy regulations, federated learning has attracted wide attention as a distributed collaborative training paradigm that does not share raw data; a traditional federated learning framework updates a global model by aggregating client parameters in pursuit of a general-purpose global model. In practical scenarios, however, client data is highly non-independent and non-identically distributed (non-IID). This data heterogeneity causes client drift: local models fit local features and deviate from the global optimum, while forced aggregation suppresses individuality, resulting in poor local performance. Existing personalized federated learning methods address these challenges mainly through regularization or model decoupling. Among them, mainstream model-decoupling methods usually perform coarse-grained partitioning along the network hierarchy (e.g., a shared feature extractor and a private classification head). This rigid structural segmentation ignores the significant differences in how parameters within the same layer contribute to a particular task, and can hardly adapt to the dynamically changing roles of parameters in a deep neural network. To overcome the limitations of coarse-grained partitioning, some studies have attempted finer-grained, parameter-by-parameter partitioning based on parameter importance. However, such methods face a core challenge in practice: the threshold is difficult to determine.
Setting the threshold too high leaves too few personalized parameters, so the model can hardly fit the unique features of the local data; setting it too low leaves insufficient shared parameters, so the model easily loses global generalization capability and may even suffer catastrophic forgetting. Therefore, how to adaptively determine the optimal segmentation threshold from the parameter distribution characteristics, without manual intervention, is the key technical bottleneck to be solved for efficient fine-grained parameter decoupling.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a personalized federated learning method and system based on parameter-adaptive decoupling, which address the difficulty of balancing model stability and personalized adaptation capability in data-heterogeneous environments, and achieve the goals of releasing personalization while anchoring generality by adaptively decoupling the model parameters into a personalized set and a shared set and optimizing them in stages.
To achieve the above purpose, the present invention adopts the following technical scheme. In a first aspect, the present invention provides a personalized federated learning method based on parameter-adaptive decoupling, comprising the steps of: S1) initializing global model parameters through a server; S2) the server selects the clients participating in training and sends the global model parameters to them, and each participating client takes the global model parameters as the initial parameters of its local model; S3) the client computes the Fisher information values of the local model parameters using its local data and performs layer-wise normalization; S4) the client applies a logarithmic transformation to the normalized Fisher information values, computes an optimal logarithmic partition threshold via the Kittler minimum error thresholding method, maps it back to obtain the original-domain threshold, and generates a classification mask from that threshold; S5) the client executes a two-stage optimization strategy based on the classification mask: the shared parameters are updated under a global constraint, and the personalized parameters are updated with maximum entropy regularization; S6) the client uploads the updated local model parameters and the current number of training samples to the server, and the server updates the global model by weighted aggregation of the clients' local model parameters; S7) repeating steps S2)-S6) until a preset number of global training rounds is reached, obtaining the final classification model. Preferably, in step S1), the server simulates a non-IID scenario according to a Dirichlet distribution, divides the training data set among a plurality of clients, and eac