CN-121981203-A - Client-clustering prototype-alignment federated learning method and system for non-independent and identically distributed (Non-IID) industrial data

Abstract

The invention discloses a client-clustering prototype-alignment federated learning method and system for non-independent and identically distributed (Non-IID) industrial data, and belongs to the technical field of industrial intelligence. The method comprises the following steps: a server constructs and initializes a hierarchical global model and sends the global model parameters and a global class prototype set to each client; each client performs local training of its local model on its local private training data and uploads its local prototype set and weight update vector to the server; the server performs incremental dynamic clustering and cluster-number optimization on the clients and, during clustering and cluster-number optimization, generates cluster models through intra-cluster hierarchical aggregation with privacy enhancement and sends each cluster model to the clients; and each client performs model selection and personalized fine-tuning to obtain a personalized model adapted to its local data distribution. The invention effectively mitigates the client drift problem caused by Non-IID data and improves the convergence speed and personalization performance of the model.

Inventors

CHEN XUEJIAO
WANG PAN
JIANG MINMIN

Assignees

南京信息职业技术学院

Dates

Publication Date: 20260505
Application Date: 20251215

Claims (10)

1. A client-clustering prototype-alignment federated learning method for non-independent and identically distributed (Non-IID) industrial data, characterized by comprising the following steps: step S1, a server constructs and initializes a hierarchical global model, and issues global model parameters and a global class prototype set to each client; step S2, client i receives the global model parameters and the global class prototype set issued by the server, performs local training of its local model on its local private training data according to the global model parameters and the global class prototype set, and uploads its local prototype set and weight update vector to the server; step S3, the server performs incremental dynamic clustering and cluster-number optimization on the clients according to the local prototype vectors and the weight update vectors; step S4, during clustering and cluster-number optimization, the server generates cluster models through intra-cluster hierarchical aggregation with privacy enhancement, and sends each cluster model to the clients; and step S5, each client performs model selection and personalized fine-tuning according to the received cluster models to obtain a personalized model adapted to its local data distribution.
2. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 1, wherein the hierarchical global model comprises a feature extraction sub-network and a predictor sub-network; the feature extraction sub-network is formed by connecting a ResNet convolutional neural network and a projection network in series; the projection network comprises an input layer, a regularization layer, an activation layer and a fully connected layer connected in sequence; and the predictor sub-network comprises an input layer, a regularization layer, an activation layer and an output layer connected in sequence.
3. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 2, characterized in that the server initialization comprises presetting the number of cluster centers K, the total number of training rounds T, a temperature coefficient τ and a temperature difference coefficient τ′, wherein: K ranges from 5 to 100 and is set according to the number of clients in the industrial scenario (20 to 1000): when the number of clients is not more than 50, K is 5 to 15 so as to control intra-cluster heterogeneity; when the number of clients is between 50 and 200, K is 15 to 30; and when the number of clients is more than 200, K is increased to 30 to 100; τ ranges from 0.1 to 1.0 and is adjusted dynamically according to the degree to which the data are Non-IID: when the Non-IID degree is high and the class distribution deviation is more than 60%, τ is 0.1 to 0.3, and when the Non-IID degree is low and the class distribution deviation is not more than 30%, τ is 0.7 to 1.0; and τ′ is used for judging the intra-cluster features of the clients, by testing whether the temperature difference is less than the threshold τ′.
4. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 3, wherein the local model comprises a feature extractor and a classifier, and client i performs local training on its local private training data, the training process comprising: inputting the local private training data into the feature extractor to obtain feature embedding vectors, and inputting those vectors into the classifier to obtain predicted values; calculating a cross-entropy loss from the predicted values and their true labels, and a prototype alignment loss from the feature embedding vectors and the global prototypes; constructing a target loss function that combines the cross-entropy loss and the prototype alignment loss; and optimizing the local model parameters by gradient descent until convergence to obtain trained personalized model parameters; after training is completed, the client calculates a local prototype vector for each class and records the weight update vector of the feature extractor.
5. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 4, wherein the target loss function is expressed as: $$\mathcal{L}_i(w_i) = \frac{1}{n_i} \sum_{k=1}^{n_i} \left[ \ell_{\mathrm{CE}}(x_k, y_k; w_i) + \lambda \, \lVert f(x_k; \theta_i) - P^{(y_k)} \rVert_2^2 \right]$$ wherein $\mathcal{L}_i$ represents the overall objective of local model training on client i; $n_i$ represents the total number of local private training data samples of the i-th client; $\ell_{\mathrm{CE}}$ is the cross-entropy loss; $y_k$ is the class label of sample $x_k$; $w_i$ represents the local model parameters of the client; $f(x_k; \theta_i)$ is the feature embedding vector output by the feature extractor for the input sample $x_k$; $\theta_i$ denotes the feature extractor parameters of the i-th client; $P^{(y_k)}$ is the global class prototype of the class $y_k$ to which the sample belongs; $\lVert \cdot \rVert_2^2$ represents the squared Euclidean distance between two feature embedding vectors; and $\lambda$ is a hyperparameter controlling the weight of the prototype alignment term in the total loss.
6. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 5, wherein the client computes a local prototype vector for each class j: $$p_i^{(j)} = \frac{1}{|D_{i,j}|} \sum_{(x, y) \in D_{i,j}} f(x; \theta_i)$$ wherein $p_i^{(j)}$ represents the prototype vector of the j-th class on client i, that is, the mean of the feature embedding vectors of all samples in class j; $D_{i,j}$ is the subset of the local private training data set $D_i$ of the i-th client that consists of the local private training data belonging to class j, the data set $D_i$ being sampled from the distribution $\mathcal{P}_i$; $f(x; \theta_i)$ represents the feature embedding vector output by the feature extractor of the i-th client; $\theta_i$ represents the feature extractor parameters of the i-th client; and $x$ represents the local private training data input to the feature extractor.
7. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 6, wherein the server collects the local prototype vectors and weight update vectors of each client and performs incremental dynamic clustering, comprising the following steps: in the first round, clustering uses the K-Medoids algorithm, and the initial grouping is based on the cosine similarity between the local prototype vectors of the clients, wherein the similarity distance between client i and client j is computed from the ternary cosine similarity matrix of client i and that of client j; the cluster centers are initialized by the max-min distance method; the clients are partitioned into clusters by the similarity distance; a cost function is used to measure clustering quality; and K subgroups are finally obtained; the value of K is adjusted dynamically during clustering, starting from its initialized value, and the silhouette coefficient is calculated after each round of clustering to evaluate clustering quality: $$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad S(K) = \frac{1}{N} \sum_{i=1}^{N} s(i)$$ wherein $a(i)$ is the average distance from client i to the other clients of the same group, $b(i)$ is the minimum average distance from client i to the nearest other group, and $N$ is the number of clients; if $S(K) < 0.3$, the clustering effect is poor, the value of K is adjusted, and clustering is performed again until $S(K) \geq 0.5$ or the maximum number of iterations is reached; in subsequent rounds, the server dynamically adjusts the cluster structure according to the L2 norm $\lVert \Delta \rVert_2$ of each client's weight update vector $\Delta$ and the prototype moving distance in the evolution trend of the prototype vectors: if the similarity of the client prototypes within a cluster is lower than a threshold, the cluster is split; if the prototype similarity of different clusters is higher than a threshold, the clusters are merged; if the difference in the number of training rounds participated in by the clients within a cluster is too large, the cluster is pruned; and the cluster number K is adaptively adjusted according to the silhouette coefficient or the average intra-cluster similarity, and updated after each round of training.
8. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 7, wherein the server calculates a cluster prototype vector for each cluster and constructs a prototype alignment regularization term to guide the aggregation process, comprising the following steps: hierarchical aggregation, namely aggregating the shared feature extractor parameters of the clients in a cluster with weights determined by their data volume, training participation and model performance; the cluster prototype vector $P_c$ is calculated as the weighted average of the local prototypes $p_i$ of the clients in the cluster: $$P_c = \sum_{i \in C_c} \alpha_i \, p_i$$ wherein $P_c$ represents the cluster prototype vector generated for the c-th cluster; the sum runs over all clients belonging to the c-th cluster; $C_c$ is the set of clients belonging to the c-th cluster; $\alpha_i$ is the aggregation weight assigned to the i-th client, which is larger the higher the accuracy of the client's local model and the more rounds the client has participated in training; and $p_i$ is the local prototype vector uploaded by the i-th client; the weight $\alpha_i$ is computed comprehensively from the client's data size $|D_i|$, its number of training participations $t_i$ and its local model accuracy $\mathrm{acc}_i$, combined using weight coefficients $\beta_1$, $\beta_2$, $\beta_3$, which are hyperparameters satisfying $\beta_1 + \beta_2 + \beta_3 = 1$; a prototype alignment regularization term is introduced during aggregation to encourage the features of the clients in a cluster to align with the cluster prototype, the regularization term taking the form: $$R_{\mathrm{align}} = \sum_{i \in C_c} \lVert p_i - P_c \rVert_2^2$$ wherein $R_{\mathrm{align}}$ represents the alignment regularization term, or alignment penalty; the sum runs over all clients belonging to the c-th cluster; $C_c$ is the set containing the indices of all clients assigned to cluster c; $p_i$ is the local prototype vector of the i-th client; and $P_c$ is the cluster prototype vector of the c-th cluster, representing the average or central feature distribution of the entire cluster; privacy enhancement, namely introducing gradient sparsification and a differential-privacy noise-adding mechanism in the aggregation process to strengthen privacy protection; and finally, an aggregated cluster model is generated for each cluster.
9. The client-clustering prototype-alignment federated learning method for Non-IID industrial data according to claim 8, characterized in that obtaining the personalized model adapted to the local data distribution comprises: each client calculates the similarity between its local prototype and each cluster prototype and selects the best-matching cluster model for the next round of training; after training is finished, the client fine-tunes its local private classifier to obtain the personalized model adapted to its local data distribution; and the process is iterated until the model converges or a preset number of rounds is reached.
10. A client-clustering prototype-alignment federated learning system for Non-IID industrial data, characterized by comprising a server and clients, wherein: the server constructs and initializes a hierarchical global model, and issues global model parameters and a global class prototype set to each client; the server performs incremental dynamic clustering and cluster-number optimization on the clients according to the local prototype vectors and the weight update vectors, generates cluster models through intra-cluster hierarchical aggregation with privacy enhancement during the cluster-number optimization, and sends each cluster model to the clients; and client i receives the global model parameters and the global class prototype set issued by the server, performs local training of its local model on its local private training data according to them, uploads its local prototype set and weight update vector to the server, and performs model selection and personalized fine-tuning according to the received cluster models to obtain a personalized model adapted to its local data distribution.
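
The client-side computations described in claims 4 to 6 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names `class_prototypes` and `local_objective`, the averaging of the loss over samples, and the toy data are assumptions made for illustration, and the classifier outputs are stand-in probabilities rather than the output of a trained network.

```python
import math

def class_prototypes(embeddings, labels):
    """Local prototype per class: the mean embedding of that class's samples."""
    sums, counts = {}, {}
    for vec, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(vec))
        for d, v in enumerate(vec):
            acc[d] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def local_objective(probs, embeddings, labels, global_protos, lam):
    """Cross entropy plus lam * squared Euclidean distance between each
    embedding and the global prototype of its class.
    Averaging over samples is an assumption of this sketch."""
    total = 0.0
    for p, vec, y in zip(probs, embeddings, labels):
        ce = -math.log(p[y])  # cross entropy for the true class
        align = sum((a - b) ** 2 for a, b in zip(vec, global_protos[y]))
        total += ce + lam * align
    return total / len(labels)

# Toy data (hypothetical): three 2-D embeddings from two classes.
embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = [0, 0, 1]
local_protos = class_prototypes(embeddings, labels)

# Stand-in classifier outputs; the local prototypes double as the
# "global" prototypes here only to keep the example self-contained.
probs = [[0.5, 0.5]] * 3
loss = local_objective(probs, embeddings, labels, local_protos, lam=0.5)
print(local_protos, round(loss, 4))
```

In a real client the embeddings and probabilities would come from the feature extractor and classifier, and the loss would be minimized by gradient descent; the sketch only shows how the two loss terms and the per-class prototypes are formed.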

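The server-side steps in claims 7 to 9 (evaluating a clustering with the silhouette coefficient, forming a weighted cluster prototype, and letting a client pick the closest cluster) can be sketched in pure Python as follows. This is a hedged illustration, not the patented procedure: it uses plain cosine distance between one prototype vector per client, normalizes the aggregation weights so they sum to 1, and assigns a score of 0 to singleton clusters; all function names and the toy data are hypothetical.

```python
import math

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def silhouette(protos, assign):
    """Mean silhouette coefficient; assign[i] is the cluster id of client i."""
    clusters = {}
    for i, c in enumerate(assign):
        clusters.setdefault(c, []).append(i)
    scores = []
    for i, c in enumerate(assign):
        same = [j for j in clusters[c] if j != i]
        if not same or len(clusters) < 2:
            scores.append(0.0)  # assumed convention for degenerate cases
            continue
        # a: mean distance to the other clients of the same group
        a = sum(cosine_distance(protos[i], protos[j]) for j in same) / len(same)
        # b: smallest mean distance to any other group
        b = min(
            sum(cosine_distance(protos[i], protos[j]) for j in members) / len(members)
            for k, members in clusters.items() if k != c
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def cluster_prototype(protos, weights):
    """Weighted average of client prototypes (weights normalized here)."""
    total = sum(weights)
    dim = len(protos[0])
    return [sum(w * p[d] for w, p in zip(weights, protos)) / total
            for d in range(dim)]

def select_cluster(local_proto, cluster_protos):
    """Index of the cluster prototype closest to the client's prototype."""
    return min(range(len(cluster_protos)),
               key=lambda c: cosine_distance(local_proto, cluster_protos[c]))

# Toy data (hypothetical): four clients, two obvious groups.
protos = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
good = silhouette(protos, [0, 0, 1, 1])
bad = silhouette(protos, [0, 1, 0, 1])
print(good >= 0.5, bad < 0.3)  # the thresholds named in claim 7
```

A server loop could compare the returned score with the 0.3 and 0.5 thresholds of claim 7 to decide whether to change K and re-cluster; how K is changed is not specified in the text above and is left out of the sketch.
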
Description

Client-clustering prototype-alignment federated learning method and system for non-independent and identically distributed (Non-IID) industrial data

Technical Field

The invention relates to the intersection of federated learning and industrial intelligence, and in particular to a personalized federated learning optimization method and system for Non-IID industrial data, suitable for improving model performance and adapting weak clients in multi-factory collaborative modeling scenarios.

Background

In the industrial Internet field, federated learning has become a key technical means for multiple industrial sites to build high-value intelligent models collaboratively, thanks to the privacy protection advantage that "data does not leave its domain". However, because production equipment, process parameters and environmental conditions differ significantly between factories, the local data of each client is highly Non-IID, which manifests as unbalanced class distributions, shifts in the feature-space distribution and large differences in sample size, and which seriously affects the training effect and practicality of the collaborative model.
At present, mainstream federated learning frameworks generally adopt a single-global-model aggregation strategy based on FedAvg, which has two obvious drawbacks. First, in a Non-IID data environment, the local update direction of each client deviates from the global optimization target, so model training is unstable and may even diverge, which significantly degrades the convergence accuracy and generalization capability of the final model. Second, a unified global model is difficult to adapt to the data distribution specific to each factory; it fits "weak clients" with small data scale or unusual distributions especially poorly, and cannot meet the reliability requirements of high-precision industrial tasks such as fault diagnosis and quality inspection. Although some improved schemes attempt to introduce client clustering based on data statistics, important limitations remain: clustering accuracy is limited because the structural similarity of the deep feature space is insufficiently exploited; an effective intra-cluster consistency constraint mechanism is lacking, so the model offset (client drift) produced by local client training cannot be suppressed; and existing methods lack dynamic adjustment capability in the model distribution and adaptation stage, which makes efficient matching and updating of client personalized models difficult.

Disclosure of Invention

The invention aims to overcome the shortcomings of existing federated learning algorithms in Non-IID industrial data scenarios, such as model divergence, poor adaptability and unbalanced performance on weak clients, and provides a client-clustering prototype-alignment federated learning method for Non-IID industrial data.
The method achieves personalized modeling and efficient aggregation through a dynamic clustering mechanism, prototype-constrained alignment and an intelligent model selection strategy, and significantly improves the convergence speed of the global model and the inference performance of each client's local model. The technical scheme adopted by the invention is as follows: a client-clustering prototype-alignment federated learning method for Non-IID industrial data comprises the following steps: Step S1, the server constructs and initializes a hierarchical global model, and issues global model parameters and a global class prototype set to each client. Step S2, client i receives the global model parameters and the global class prototype set issued by the server, performs local training of its local model on its local private training data according to the global model parameters and the global class prototype set, and uploads its local prototype set and weight update vector to the server. Step S3, the server performs incremental dynamic clustering and cluster-number optimization on the clients according to the local prototype vectors and the weight update vectors. Step S4, during clustering and cluster-number optimization, the server generates cluster models through intra-cluster hierarchical aggregation with privacy enhancement, and sends each cluster model to the clients. Step S5, each client performs model selection and personalized fine-tuning according to the received cluster models to obtain a personalized model adapted to its local data distribution. Preferably, the hierarchical global model comprises a feature extraction sub-network and a predictor sub-network which are connected in series, the feature extraction sub-n