CN-121998038-A - Federal learning global model training method based on fuzzy weighting and dynamic clustering

CN121998038ACN 121998038 ACN121998038 ACN 121998038ACN-121998038-A

Abstract

The invention belongs to the technical field of model training, in particular to a federal learning global model training method based on fuzzy weighting and dynamic clustering, the method comprises the steps of client local training and information reporting, frequency domain feature extraction, feature normalization and joint vector construction, dynamic cluster quantity determination and division, intra-group fuzzy weight aggregation, inter-group fuzzy weight aggregation, weight lower limit protection, learning rate scheduling and global model evaluation and termination. According to the method, the multi-mode joint characterization based on training loss, model parameter frequency domain characteristics and class entropy is introduced in federal learning, and a dynamic clustering and double-layer fuzzy weighted aggregation mechanism is constructed, so that the global model has remarkably higher stability, precision and generalization capability under a non-independent co-distributed environment. By using the loss, frequency domain statistics and class entropy in combination, heterogeneity between clients in data distribution, training process and model update structure can be more accurately characterized.

Inventors

GAO MINGLIANG
LI QILEI
YUE WENTAO

Assignees

山东理工大学

Dates

Publication Date: 20260508
Application Date: 20260128

Claims (10)

1. The federal learning global model training method based on fuzzy weighting and dynamic clustering is characterized by comprising the following steps of: S1, a server transmits current global model parameters to a selected client, the client executes random gradient descent update with momentum, calculates local average training loss and class entropy, and reports the local average training loss and class entropy to the server together with the updated model parameters; s2, flattening the received updated model parameters by the server to obtain one-dimensional vectors, and performing fast Fourier transform to obtain frequency domain statistical characteristics; S3, the server executes maximum and minimum normalization processing on the local average training loss and the frequency domain statistical characteristics of all the clients, and constructs a joint feature vector reflecting the heterogeneity of the clients by combining class entropy; S4, calculating an overall heterogeneity index according to the normalized local average training loss and the frequency domain statistical characteristics, dynamically calculating the clustering number K according to the overall heterogeneity index, and dividing all clients into K groups based on the joint feature vector; S5, calculating the intra-group weight of each client in the group, and carrying out weighted aggregation on the client models in the same cluster to obtain an intra-group aggregation model; s6, calculating the inter-group weight of each group, and carrying out secondary weighted aggregation on the aggregation models of all groups to obtain an updated global model; S7, in the training process, threshold detection is carried out on the weights in the groups and the weights between the groups, and the weights lower than a preset lower limit are reset to be lower-limit thresholds and are re-normalized; S8, the server adjusts the learning rate of the local training of the next round of client according to the current training round, and a three-phase strategy comprising a preheating period, a stabilizing period and an attenuation period is adopted during adjustment; And S9, evaluating the performance of the global model on the public test set, and judging whether the termination condition is met according to the precision or loss index.
2. The global model training method for federal learning based on fuzzy weighting and dynamic clustering according to claim 1, wherein in S1, the class entropy is entropy calculated based on client tag distribution or prediction distribution, and is used for describing the equalization degree of local data distribution; The client reports the local average training loss, the class entropy and the updated model parameters to the server together, and the client also reports statistics used for describing the local data distribution characteristics of the client, including local gradient norms and local loss descending speeds.
3. The global model training method for federal learning based on fuzzy weighting and dynamic clustering according to claim 1, wherein in S2, frequency domain energy, amplitude peak value, frequency domain mean value and frequency domain standard deviation are extracted through fast fourier transform, and frequency domain statistical characteristics are obtained: ; in the formula, The frequency domain statistical characteristics of the client k are obtained; Frequency domain energy for client k; amplitude peak for client k; the frequency domain standard deviation of the client k; The frequency domain mean value of the client k; 、、、 Respectively is 、、、 Weight coefficient of (c) in the above-mentioned formula (c).
4. The global model training method for federal learning based on fuzzy weighting and dynamic clustering of claim 1, wherein in S3, the local average training loss for client k Sum frequency domain statistics Performing maximum and minimum normalization processing to obtain And , For after normalization treatment , For after normalization treatment Will be And Combining into two-dimensional joint characterization vector ; At the position of Class entropy of incoming client k Obtaining joint feature vectors The superscript T denotes a transpose.
5. The global model training method for federal learning based on fuzzy weighting and dynamic clustering of claim 4, wherein in S4, the method is based on And Calculating the heterogeneity of the loss and the frequency domain: ; ; in the formula, Indicating loss heterogeneity; representing frequency domain characteristic heterogeneity; Is a standard deviation function; Comprehensively obtain the overall heterogeneity index : ; Calculating the cluster number K by adopting a linear-truncated mixing strategy: ; ; ; in the formula, Represent rounding; representing an initial cluster number; representing the number of intermediate clusters; the heterogeneity magnification coefficient in dynamic clustering; 、 the number of the clusters is respectively a preset maximum and minimum cluster number; b is a basic bias term; To the combined feature vector K-Means clustering is performed to divide all clients into K groups.
6. The global model training method for federal learning based on fuzzy weighting and dynamic clustering according to claim 4, wherein in S5, calculating the intra-group weights of the clients in the group specifically includes first calculating an initial weight, where the initial weight is any one of an inversely proportional intra-group weight or a Softmax intra-group weight; The inverse intra-group weights are expressed as: ; in the formula, Inverse type intra-group weights representing client k in the g-th group; 、 Respectively is And Weight coefficient of (2); the Softmax type intra-group weights are expressed as: ; ; ; in the formula, T1 represents the temperature coefficient in the Softmax function, and is used for controlling the smoothness of the weight distribution; Softmax type intra-group weights representing client k in the g-th group; Representing the sum of unnormalized Softmax weights for all clients within the g-th group; a set of clients representing a g-th group, j representing one of the clients; 、 Respectively representing normalized local average training loss and normalized frequency domain statistical characteristics of the client j in the g group; 、 Respectively representing the weight coefficient of the normalized local average training loss and the normalized frequency domain statistical characteristic; Selection of 、 Any of which serves as the initial weight for client k in the g-th group ; Subsequently, correction using class entropy : ; In the formula, Intra-group weights representing client k in the g-th group; the entropy adjusting factor is used for adjusting the aggregation weight of the client according to the category entropy; Finally, an intra-group polymerization model is obtained: ; in the formula, A group-aggregation-model parameter representing the g-th group; Representing updated model parameters for client k.
7. The global model training method for federal learning based on fuzzy weighting and dynamic clustering of claim 6, wherein in S6, calculating the inter-group weights of each group is performed by first calculating group-level loss and frequency domain statistics: ; ; in the formula, Representing average training loss within group g; representing average frequency domain statistics within the g-th group; The inter-group weight adopts any one of inverse type inter-group weight or Softmax type inter-group weight; The inverse ratio type group weight is expressed as: ; in the formula, Inverse type inter-group weights representing the g-th group; 、 Respectively is 、 A weight coefficient in the inverse type inter-group weights; The Softmax type inter-group weights are expressed as: ; ; in the formula, The Softmax type inter-group weights representing group g; 、 Respectively is 、 Weight coefficients in Softmax type inter-group weights; an unnormalized contribution index representing group g; An unnormalized contribution index representing the h th group; Selection of 、 Any one of them as the inter-group weight of the g-th group ; Finally, an updated global model is obtained: ; in the formula, Representing the final parameters of the global model for a training round t+1.
8. The global model training method for federal learning based on fuzzy weighting and dynamic clustering of claim 7, wherein in S7, a lower threshold is set For the following And : ; ; In the formula, The representation is updated as; And re-normalized to ensure that the sum of weights is 1: ; ; in the formula, An inter-group weight representing the h-th group; Representing intra-group weights for client j in the g-th group.
9. The global model training method for federal learning based on fuzzy weighting and dynamic clustering according to claim 1, wherein in S8, the three-phase strategy including the warm-up period, the stationary period and the decay period is expressed as: ; in the formula, The learning rate when the training round is t is shown; representing an initial learning rate; representing a learning rate decay coefficient; representing the decay period constant; indicating the preset preheating period end turn; Indicating the beginning turn of the preset decay period.
10. The global model training method for federal learning based on fuzzy weighting and dynamic clustering according to claim 1, wherein in S9, the evaluation index is calculated after each training round, and the training is terminated when the termination condition is satisfied, the termination condition being that Or (b) , wherein, The accuracy of the test set at training round t is expressed, Representing a preset optimal accuracy rate, t representing training rounds, Is the preset maximum training round.

Description

Federal learning global model training method based on fuzzy weighting and dynamic clustering Technical Field The invention belongs to the technical field of model training, and particularly relates to a federal learning global model training method based on fuzzy weighting and dynamic clustering. Background Federal learning (FEDERATED LEARNING, FL) is a distributed learning framework that allows multiple clients to co-train machine learning models without sharing raw data. In practical applications, the data of the participants usually show significant Non-independent co-distribution (Non-IID) characteristics, such as unbalanced classification, different feature distribution, uneven sample number distribution, and the like, so that the conventional FedAvg-class method is difficult to converge to a global model with good performance. However, the existing method is used for relieving the problem of partial drift by introducing a near-end term or adopting fixed clustering based on model parameters, but has the defects that the model parameters are only relied on for clustering, loss characteristics and training dynamic information cannot be captured, the clustering number is fixed, the clustering cannot be dynamically adjusted according to the training process, so that excessive or insufficient clustering is caused, the aggregation between groups adopts a fixed form and lacks a flexible weight distribution mechanism, and the structural characteristics of the model in a frequency domain are not considered, so that the high-order statistical property of the model parameters cannot be reflected. Therefore, there is a need for a federal learning method that can improve stability and convergence in a strongly heterogeneous environment. Disclosure of Invention According to the defects in the prior art, the invention aims to provide a federal learning global model training method based on fuzzy weighting and dynamic clustering, which can improve the stability, robustness and precision of model training in a non-IID environment. In order to achieve the above purpose, the invention provides a federal learning global model training method based on fuzzy weighting and dynamic clustering, which comprises the following steps: S1, a server transmits current global model parameters to a selected client, the client executes random gradient descent update with momentum, calculates local average training loss and class entropy, and reports the local average training loss and class entropy to the server together with the updated model parameters; s2, flattening the received updated model parameters by the server to obtain one-dimensional vectors, and performing fast Fourier transform to obtain frequency domain statistical characteristics; S3, the server executes maximum and minimum normalization processing on the local average training loss and the frequency domain statistical characteristics of all the clients, and constructs a joint feature vector reflecting the heterogeneity of the clients by combining class entropy; S4, calculating an overall heterogeneity index according to the normalized local average training loss and the frequency domain statistical characteristics, dynamically calculating the clustering number K according to the overall heterogeneity index, and dividing all clients into K groups based on the joint feature vector; S5, calculating the intra-group weight of each client in the group, and carrying out weighted aggregation on the client models in the same cluster to obtain an intra-group aggregation model; s6, calculating the inter-group weight of each group, and carrying out secondary weighted aggregation on the aggregation models of all groups to obtain an updated global model; S7, in the training process, threshold detection is carried out on the weights in the groups and the weights between the groups, and the weights lower than a preset lower limit are reset to be lower-limit thresholds and are re-normalized; S8, the server adjusts the learning rate of the local training of the next round of client according to the current training round, and a three-phase strategy comprising a preheating period, a stabilizing period and an attenuation period is adopted during adjustment; And S9, evaluating the performance of the global model on the public test set, and judging whether the termination condition is met according to the precision or loss index. In the S1, the class entropy is entropy calculated based on the label distribution or the prediction distribution of the client, and is used for describing the equilibrium degree of the local data distribution; The client reports the local average training loss, the class entropy and the updated model parameters to the server together, and the client also reports statistics used for describing the local data distribution characteristics of the client, including local gradient norms and local loss descending speeds. In the S2, the frequency domain energy, the ampli