CN-121998131-A - Knowledge distillation-based adaptive clustering federated learning method and system
Abstract
The invention discloses a knowledge distillation-based adaptive clustering federated learning method and system. In the method, each client trains a model on its local data, extracts a low-dimensional feature vector from the model, collects its position and speed information at the current moment, and uploads this information to a cloud server. The cloud server predicts macroscopic traffic flow density, analyzes the microscopic mobility of the clients, constructs a space-time-feature joint similarity matrix by combining the low-dimensional feature vectors, and dynamically clusters the clients. The edge server corresponding to each cluster obtained from the dynamic clustering calculates aggregation weights from the clients' microscopic mobility, generates a cluster teacher model by weighted aggregation, and guides the clients in the cluster through local training by dynamically adjusting the knowledge distillation loss weight. The method uses spatio-temporal prediction to optimize the distillation process, reduces communication overhead, and markedly improves model robustness and personalization accuracy in highly dynamic scenarios.
Inventors
- LIANG WEI
- XIE RUI
- DIAO ZULONG
- CHEN YUXIANG
- MENG XIANGWEI
- HE DACHENG
- LI GUANJING
Assignees
- 湖南科技大学 (Hunan University of Science and Technology)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-04-08
Claims (10)
- 1. An adaptive clustering federated learning method based on knowledge distillation, characterized by comprising the following steps: a cloud server initializes a global model, randomly selects clients participating in the current training round from a vehicle client pool, and issues the global model; based on the global model, each client trains the model using local data, extracts a low-dimensional feature vector from the model, collects position information and speed information at the current moment, and uploads them to the cloud server; the cloud server predicts macroscopic traffic flow density using historical traffic data, analyzes the microscopic mobility of the clients using a graph neural network, constructs a space-time-feature joint similarity matrix by combining the low-dimensional feature vectors, and performs dynamic clustering of the clients using an adaptive clustering algorithm; the edge server corresponding to each cluster obtained from the dynamic clustering calculates aggregation weights according to the microscopic mobility of the clients, generates a cluster teacher model by weighted aggregation, and guides the clients in the cluster through local training by dynamically adjusting the knowledge distillation loss weight; the cloud server aggregates the cluster teacher models of all edge servers, updates the global model using a benchmark reference aggregation mechanism, and sends the updated global model to the clients as a regularization constraint; and iterative training is performed until the global model converges or a preset number of training rounds is reached.
- 2. The method according to claim 1, wherein the cloud server initializing the global model, randomly selecting the clients participating in the training round from the vehicle client pool, and issuing the global model specifically comprises: the cloud server sets a maximum number of training rounds and a convergence threshold for the global model to determine the training period; the cloud server initializes the network structure and parameters of the global model according to the task requirements; clients participating in the current training round are randomly selected from the vehicle client pool according to their communication state and activity state; and the initialized network structure and parameters of the global model are packaged and issued to the selected clients.
- 3. The method of claim 1, wherein, based on the global model, each client training the model using local data, extracting the low-dimensional feature vector from the model, collecting position information and speed information at the current moment, and uploading them to the cloud server specifically comprises: each client performs multiple rounds of stochastic gradient descent training on the received global model using its local private data set and updates the parameters of its local model; the weight parameters of the last fully connected layer are extracted from the trained local model and flattened into a low-dimensional feature vector representing the local data distribution, while vehicle state information including longitude and latitude coordinates, instantaneous speed, and acceleration is acquired in real time through on-board sensors; and the low-dimensional feature vector and the vehicle state information are encrypted, packaged, and uploaded to the cloud server.
- 4. The method of claim 2, wherein the cloud server predicting macroscopic traffic flow density using historical traffic data, analyzing the microscopic mobility of the clients using the graph neural network, constructing the space-time-feature joint similarity matrix in combination with the low-dimensional feature vectors, and dynamically clustering the clients using the adaptive clustering algorithm specifically comprises: the cloud server inputs historical traffic flow data into a long short-term memory (LSTM) network model and predicts the macroscopic congestion density coefficient of the area where each client is located within a future time window; the cloud server constructs a dynamic vehicle topology graph based on the vehicle state information uploaded by the clients, inputs it into a graph neural network model for analysis, predicts the link connection stability of each client, and computes a microscopic mobility score for each client; the Mahalanobis distance between any two clients is calculated from their low-dimensional feature vectors; a spatio-temporal correction coefficient is calculated from the Mahalanobis distance, the macroscopic congestion density coefficient, and the microscopic mobility score, the Mahalanobis distance is weighted by the spatio-temporal correction coefficient, and the space-time-feature joint similarity matrix is constructed; and the cloud server processes the space-time-feature joint similarity matrix with an affinity propagation algorithm, automatically determines the cluster centers and the number of clusters, and completes the dynamic clustering of the clients (see the clustering sketch after the claims).
- 5. The method of claim 4, wherein the cloud server inputting the historical traffic data into the long short-term memory network model and predicting the macroscopic congestion density coefficient of the area where each client is located within the future time window specifically comprises: the cloud server obtains historical traffic flow time series data for the target area over a preset past duration from a traffic flow database; the historical traffic flow time series data are normalized and aligned to generate preprocessed time series data; the preprocessed time series data are input into the long short-term memory network model, forward inference is performed, and the macroscopic congestion density coefficient of the area where each client is located within the specified future time window is output; the macroscopic congestion density coefficient characterizes the stability of the traffic flow in the corresponding area, and the higher the coefficient, the more congested the traffic and the longer the vehicles remain stationary.
- 6. The method of claim 4, wherein the cloud server constructing the dynamic vehicle topology graph based on the vehicle state information uploaded by the clients, inputting it into the graph neural network model for analysis, predicting the link connection stability of each client, and computing the microscopic mobility score of each client specifically comprises: based on the vehicle state information uploaded by each client, the cloud server constructs a dynamic vehicle topology graph with vehicles as nodes and the relative position relationships between vehicles as edges; the dynamic vehicle topology graph is input into the graph neural network model, and the link connection duration between each client node and its neighbor nodes is predicted through graph convolution and aggregation operations; the microscopic mobility score is calculated from the predicted link connection duration combined with the instantaneous speed of the client; the microscopic mobility score characterizes the stability of the client's motion state, and the higher the score, the stronger the client's mobility and the less stable its connections in the dynamic vehicle topology graph.
- 7. The method of claim 4, wherein the edge server corresponding to each cluster obtained from the dynamic clustering calculating the aggregation weights according to the microscopic mobility of the clients, generating the cluster teacher model by weighted aggregation, and guiding the clients in the cluster through local training by dynamically adjusting the knowledge distillation loss weight specifically comprises: each edge server receives the client clustering result issued by the cloud server together with the corresponding microscopic mobility scores; for each client in the cluster managed by the edge server, an aggregation weight is calculated according to its microscopic mobility score; the local model parameters of all clients in the cluster are combined by weighted median aggregation using the aggregation weights to generate the cluster teacher model; a knowledge distillation weight coefficient is calculated for each client according to its macroscopic congestion density coefficient and microscopic mobility score; each edge server transmits the cluster teacher model and the corresponding knowledge distillation weight coefficients to the clients in the cluster it manages; and during local training, each client uses the cluster teacher model as a supervision target and updates its model with a composite loss function that includes a knowledge distillation loss term weighted by the knowledge distillation weight coefficient (see the aggregation and composite-loss sketches after the claims).
- 8. The method of claim 7, wherein calculating the aggregation weight according to the microscopic mobility score for each client in the cluster managed by each edge server comprises: for each client, the edge server calculates a basic weight value as the reciprocal of that client's microscopic mobility score, and sums the basic weight values of all clients in the cluster to obtain a normalization factor; the basic weight value of each client is divided by the normalization factor to obtain the client's aggregation weight; the aggregation weight is inversely related to the microscopic mobility score, so the lower a client's mobility score, the higher its aggregation weight.
- 9. The method of claim 8, wherein the cloud server aggregating the cluster teacher models of the edge servers, updating the global model with the benchmark reference aggregation mechanism, and issuing the updated global model to the clients as the regularization constraint specifically comprises: the cloud server collects the cluster teacher models generated by all edge servers; the global model obtained at the end of the previous training round is used as the reference model for the current round of aggregation, the knowledge increment of each cluster teacher model relative to the reference model is calculated, and aggregation weights are assigned to the cluster teacher models according to the magnitude of their knowledge increments, with larger knowledge-increment contributions receiving larger weights; and the cloud server performs a weighted average over all cluster teacher models using the assigned aggregation weights, generates the updated global model, and transmits it to each client as the regularization constraint target for its subsequent local training (see the benchmark reference aggregation sketch after the claims).
- 10. An adaptive clustering federated learning system based on knowledge distillation, the system comprising: an initialization module, configured to enable the cloud server to initialize the global model, randomly select clients participating in the current training round from the vehicle client pool, and issue the global model; a local training module, configured to enable each client, based on the global model, to train the model using local data, extract a low-dimensional feature vector from the model, collect position information and speed information at the current moment, and upload them to the cloud server; a spatio-temporal prediction module, configured to enable the cloud server to predict macroscopic traffic flow density using historical traffic data, analyze the microscopic mobility of the clients using the graph neural network, construct the space-time-feature joint similarity matrix by combining the low-dimensional feature vectors, and perform dynamic clustering of the clients using the adaptive clustering algorithm; a joint clustering module, configured to enable the edge server corresponding to each cluster obtained from the dynamic clustering to calculate aggregation weights according to the microscopic mobility of the clients, generate a cluster teacher model by weighted aggregation, and guide the clients in the cluster through local training by dynamically adjusting the knowledge distillation loss weight; a model updating module, configured to enable the cloud server to aggregate the cluster teacher models of all edge servers, update the global model using the benchmark reference aggregation mechanism, and send the updated global model to the clients as a regularization constraint; and an iterative training module, configured to perform iterative training until the global model converges or a preset number of training rounds is reached.
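Illustrative sketch (not part of the claims): the following Python code shows one way the space-time-feature joint similarity matrix of claims 4 to 6 could be built from the clients' flattened feature vectors, macroscopic congestion density coefficients, and microscopic mobility scores, and then clustered with affinity propagation. The exact form of the spatio-temporal correction coefficient is not fixed by the claims, so the multiplicative correction used here is an assumption.

```python
# Assumed inputs: per-client flattened last-FC-layer weights, congestion density
# coefficients (claim 5) and mobility scores (claim 6); the correction formula is illustrative.
import numpy as np
from sklearn.cluster import AffinityPropagation


def joint_similarity_matrix(features, congestion, mobility, alpha=0.5, beta=0.5):
    """features: (n, d) feature vectors; congestion, mobility: (n,) per-client scores."""
    n, d = features.shape
    # Mahalanobis distance uses the (regularised) inverse covariance of the features.
    inv_cov = np.linalg.inv(np.cov(features, rowvar=False) + 1e-6 * np.eye(d))
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diff = features[i] - features[j]
            d_mahal = np.sqrt(diff @ inv_cov @ diff)
            # Assumed spatio-temporal correction: pairs with similar mobility in
            # congested (stable) regions are pulled closer, volatile pairs pushed apart.
            correction = 1.0 + alpha * abs(mobility[i] - mobility[j]) \
                             - beta * min(congestion[i], congestion[j])
            sim[i, j] = -d_mahal * max(correction, 0.1)  # negative distance as similarity
    return sim


def cluster_clients(features, congestion, mobility):
    """Affinity propagation selects the exemplars (cluster centres) and the number
    of clusters automatically, as required by claim 4."""
    S = joint_similarity_matrix(features, congestion, mobility)
    ap = AffinityPropagation(affinity="precomputed", random_state=0)
    return ap.fit_predict(S)  # one cluster label per client
```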
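A minimal sketch of the cluster-level aggregation of claims 7 and 8: each client's basic weight is the reciprocal of its microscopic mobility score, the weights are normalised within the cluster, and the cluster teacher model is formed from the clients' flattened parameters. Reading the claimed "weighted median aggregation" as a coordinate-wise weighted median is an assumption.

```python
# Assumed inputs: per-client mobility scores and flattened local model parameters.
import numpy as np


def aggregation_weights(mobility_scores):
    """Claim 8: basic weight = reciprocal of the mobility score, normalised over
    the cluster, so stable (low-mobility) clients receive higher weights."""
    base = 1.0 / np.asarray(mobility_scores, dtype=float)
    return base / base.sum()


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]


def cluster_teacher(client_params, mobility_scores):
    """Claim 7: coordinate-wise weighted-median aggregation into a cluster teacher.
    client_params: (n_clients, n_params) flattened local model parameters."""
    w = aggregation_weights(mobility_scores)
    params = np.asarray(client_params, dtype=float)
    return np.array([weighted_median(params[:, k], w) for k in range(params.shape[1])])
```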
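A hedged sketch of the composite local loss of claim 7, written with PyTorch: hard-label cross-entropy plus a temperature-scaled knowledge distillation term whose weight is derived from the macroscopic congestion density coefficient and the microscopic mobility score. The specific `distill_weight` formula is an assumption, consistent with the Background's observation that congested, stable regions suit high-intensity distillation.

```python
# Assumed formula for the dynamic distillation weight; the claims only state that
# it is computed from the congestion density coefficient and the mobility score.
import torch.nn.functional as F


def distill_weight(congestion, mobility, base=0.5):
    """Grows with congestion density, shrinks as the client's mobility score rises."""
    return base * congestion / (1.0 + mobility)


def composite_loss(student_logits, teacher_logits, labels, congestion, mobility, T=2.0):
    """Hard-label cross-entropy plus a temperature-scaled KD term supervised by
    the cluster teacher model, weighted by the dynamic coefficient."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return ce + distill_weight(congestion, mobility) * kd
```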
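A sketch of the benchmark reference aggregation of claim 9: each cluster teacher's knowledge increment relative to the previous round's global model (the reference) is measured, aggregation weights are assigned in proportion to the increment, and the teachers are combined by weighted averaging. Measuring the increment as the parameter-delta norm is an illustrative assumption; the claim only requires that larger knowledge increments receive larger weights.

```python
# Assumed increment measure: Euclidean norm of the parameter delta to the reference model.
import numpy as np


def benchmark_reference_aggregate(teacher_models, reference_model):
    """teacher_models : (n_clusters, n_params) cluster teacher parameters
    reference_model   : (n_params,) previous round's global model (the benchmark)."""
    teachers = np.asarray(teacher_models, dtype=float)
    ref = np.asarray(reference_model, dtype=float)
    increments = np.linalg.norm(teachers - ref, axis=1)  # knowledge increment per teacher
    weights = increments / (increments.sum() + 1e-12)    # larger increment, larger weight
    return weights @ teachers                             # updated global model parameters
```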
Description
Knowledge distillation-based adaptive clustering federated learning method and system
Technical Field
The invention relates to the technical field of Internet of Vehicles communication, and in particular to a knowledge distillation-based adaptive clustering federated learning method and system.
Background
With the rapid development of intelligent traffic systems, the amount of data generated by the Internet of Vehicles (IoV) has grown explosively. Federated learning (FL) has become a mainstream solution for training models on data dispersed across vehicle terminals while protecting user privacy. Traditional federated learning algorithms (e.g., FedAvg) attempt to train a single global model to serve all clients. However, directly applying traditional federated learning in a real IoV scenario faces serious challenges:
1. Lack of awareness of physical mobility. The prior art (including methods based on dual-feedback knowledge distillation) mostly clusters solely on the similarity of model parameters or gradients. In the Internet of Vehicles, two vehicles may have similar data distributions (e.g., both identify traffic signs) yet be physically moving apart at high speed. Neglecting this microscopic mobility leads to an extremely unstable cluster structure in which intra-cluster communication links are frequently broken.
2. Neglect of macroscopic environmental influence. Existing methods do not take into account the macroscopic traffic flow environment in which the vehicles are located. For example, the vehicle topology of a congested road segment is stable and suited to high-intensity knowledge distillation, whereas the interaction time between vehicles on a sparse expressway segment is short. Applying a uniform distillation strategy without distinction cannot achieve the optimal convergence effect.
3. Insufficient innovation in existing schemes. Existing schemes often merely adjust an aggregation formula and lack a mechanism for introducing external auxiliary information (such as traffic prediction) to guide the internal learning process.
4. Privacy and communication bottlenecks. Existing clustering methods generally rely on exchanging high-dimensional model parameters or gradients to calculate similarity between clients, which not only consumes valuable IoV bandwidth but also poses serious privacy disclosure risks (e.g., reconstructing user data through inference attacks).
5. Information islands. Traditional clustering methods tend to cut different clusters off from one another and lack a cross-cluster knowledge sharing mechanism, so global knowledge such as general traffic rules cannot be effectively propagated across the whole network.
Therefore, there is a need for a new federated learning method that can combine macroscopic traffic flow prediction data with microscopic vehicle movement observations to optimize the knowledge distillation process.
Disclosure of Invention
The invention aims to provide a knowledge distillation-based adaptive clustering federated learning method and system. By introducing the dual spatio-temporal constraints of macroscopic traffic flow and microscopic vehicle topology, it effectively solves the model failure caused by traditional methods that rely only on static feature clustering, and by using spatio-temporal prediction to optimize the distillation process it reduces communication overhead while markedly improving model robustness and personalization accuracy in highly dynamic scenarios, thereby solving at least one of the problems in the prior art.
In a first aspect, the present invention provides an adaptive clustering federated learning method based on knowledge distillation, the method specifically comprising: a cloud server initializes a global model, randomly selects clients participating in the current training round from a vehicle client pool, and issues the global model; based on the global model, each client trains the model using local data, extracts a low-dimensional feature vector from the model, collects position information and speed information at the current moment, and uploads them to the cloud server; the cloud server predicts macroscopic traffic flow density using historical traffic data, analyzes the microscopic mobility of the clients using a graph neural network, constructs a space-time-feature joint similarity matrix by combining the low-dimensional feature vectors, and performs dynamic clustering of the clients using an adaptive clustering algorithm; the edge server corresponding to each cluster obtained from the dynamic clustering calculates aggregation weights according to the microscopic mobility of the clients, generates a cluster teacher model by weighted aggregation, and guides the clients in the cluster through local training by dynamically adjusting the knowledge distillation loss weight; the cloud server aggregates the cluster teacher models of all edge servers, updates the global model using a benchmark reference aggregation mechanism, and sends the updated global model to the clients as a regularization constraint; and iterative training is performed until the global model converges or a preset number of training rounds is reached.