CN-121707014-B - Federated learning method and system for heterogeneous data of the Internet of Vehicles


Abstract

The application discloses a federated learning method and system for heterogeneous data of the Internet of Vehicles, and relates to the technical field of intelligent transportation. In the method, a roadside unit sends the current global model and a global control variable to the vehicles participating in training. Each vehicle performs local training on its local data using a differentially private stochastic gradient descent algorithm combined with a client drift correction mechanism, and generates a local model update. The vehicle then calculates the noise multiplier for the next round of training through a personalized adaptive differential privacy strategy according to the loss value of the current round, and judges whether the model update of the current round is effective based on a selective uploading rule. If the update is effective, the vehicle uploads the model increment and the local control variable increment to the roadside unit, which aggregates them and updates the global model and the global control variable. The application innovates systematically at three levels (privacy protection, model optimization and communication strategy) and provides a solution for building efficient, secure and reliable federated learning in Internet of Vehicles scenarios.
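
The aggregation step at the roadside unit can be sketched as follows. This is a minimal illustration only: the `Upload` record, the `aggregate` function and the flat-list model representation are hypothetical names, and the scaling choices (model increments averaged over the vehicles that uploaded, control variable increments divided by the total number of participating vehicles) are assumptions borrowed from common control-variate schemes, since the abstract does not state the aggregation formulas.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]


@dataclass
class Upload:
    """Payload a vehicle sends when it judges its update effective (hypothetical layout)."""
    vehicle_id: int
    model_delta: Vector     # local model minus the global model it started from
    control_delta: Vector   # new local control variable minus the old one


def aggregate(global_model: Vector,
              global_control: Vector,
              uploads: List[Upload],
              num_vehicles: int) -> Tuple[Vector, Vector]:
    """Update the global model and the global control variable from uploaded increments.

    Assumption: model increments are averaged over the uploading vehicles,
    and control increments are summed and divided by the total vehicle count.
    """
    if not uploads:
        # No vehicle uploaded this round: keep the global state unchanged.
        return list(global_model), list(global_control)

    count = len(uploads)
    sum_dw = [sum(col) for col in zip(*(u.model_delta for u in uploads))]
    sum_dc = [sum(col) for col in zip(*(u.control_delta for u in uploads))]

    new_model = [w + s / count for w, s in zip(global_model, sum_dw)]
    new_control = [c + s / num_vehicles for c, s in zip(global_control, sum_dc)]
    return new_model, new_control
```

For example, with four participating vehicles of which two upload, `aggregate` moves the global model by the mean of the two model increments and moves the global control variable by a quarter of the summed control increments.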

Inventors

WANG TENG
LIU JIN
QU JIANI
HU DE
YANG TENGFEI
FENG JINGYU
LIU SHUANGGEN

Assignees

Xi'an University of Posts and Telecommunications (西安邮电大学)

Dates

Publication Date: 20260512
Application Date: 20260213

Claims (6)

1. A federated learning method for heterogeneous data of the Internet of Vehicles, characterized by comprising the following steps: S1, a roadside unit sends a current global model and a global control variable to vehicles participating in training; S2, based on the current global model and the global control variable, the vehicle performs local training on its local data using a differentially private stochastic gradient descent algorithm combined with a client drift correction mechanism, to generate a local model update; S3, the vehicle calculates a noise multiplier for the next round of training through a personalized adaptive differential privacy strategy according to the loss value of the current round of training; S4, the vehicle judges whether the model update of the current round is effective based on a selective uploading rule; S5, if the model update of the current round is effective, the vehicle uploads a model increment and a local control variable increment to the roadside unit, and the roadside unit aggregates the model increments and the local control variable increments to update the global model and the global control variable; wherein calculating the noise multiplier for the next round of training through the personalized adaptive differential privacy strategy comprises: during training, calculating the loss value of the local model update on a training batch; and obtaining the noise multiplier used in the next round of training from the loss value through an adaptive noise multiplier calculation formula; the adaptive noise multiplier calculation formula is as follows: [formula not reproduced] wherein t+1 is the training round of the local model update and t+2 is the next training round, and the remaining terms denote, in order, the vehicle index, the noise multiplier for the next round of training, the noise multiplier of the local model update, the loss value of the local model update, a hyperparameter, the maximum loss value level, and the minimum threshold of the noise multiplier.
2. The federated learning method for heterogeneous data of the Internet of Vehicles according to claim 1, wherein generating the local model update comprises: randomly sampling the local data set to obtain a training batch; based on the current global model, running the differentially private stochastic gradient descent algorithm on the training batch to obtain gradient parameters with differential privacy protection; and correcting the gradient with the client drift correction mechanism, based on the gradient parameters, the global control variable and the locally stored local control variable, to obtain the local model update.
3. The federated learning method for heterogeneous data of the Internet of Vehicles according to claim 1, wherein the local model update is calculated by the following formula: [formula not reproduced] wherein t is the current round and t+1 is the updated round, and the remaining terms denote, in order, the vehicle index, the local model update, the current global model, a hyperparameter, the gradient after noise is added, the current local control variable, and the current global control variable.
4. The federated learning method for heterogeneous data of the Internet of Vehicles according to claim 1, wherein the selective uploading rule comprises: performing a second random sampling on the local data set to obtain a verification batch; calculating the accuracy of the current global model and of the local model update on the verification batch, and calculating the loss value difference between the current global model and the local model update; defining an update formula for the local control variable: [formula not reproduced] wherein t is the current round and t+1 is the updated round, and the remaining terms denote, in order, the vehicle index, the updated local control variable, the current local control variable, the current global control variable, a hyperparameter, the total number of vehicles participating in training, the local model update, and the current global model; if the loss value difference is smaller than a preset threshold or the accuracy of the local model update is higher than that of the current global model, judging that the update of the current round is effective, updating the local control variable according to the update formula, updating the noise multiplier used in the next round of training, calculating the model increment and the local control variable increment, and uploading them to the roadside unit; if the loss value difference is greater than or equal to the preset threshold and the accuracy of the local model update is lower than or equal to that of the current global model, updating the local control variable according to only part of the terms of the update formula, keeping the noise multiplier unchanged, and retaining the update locally.
5. The federated learning method for heterogeneous data of the Internet of Vehicles according to claim 1, wherein updating the global model and the global control variable comprises: the global model update formula: [formula not reproduced] wherein t is the current round and t+1 is the updated round, and the remaining terms denote, in order, the vehicle index, the current global model, the updated global model, the set of vehicles that uploaded effective updates, the k-th vehicle, and the model increment; and the global control variable update formula: [formula not reproduced] wherein the terms denote, in order, the updated global control variable, the current global control variable, and the local control variable increment.
6. A federated learning system for heterogeneous data of the Internet of Vehicles, characterized in that it applies the method according to any one of claims 1-5, the system comprising a roadside unit and vehicles; the roadside unit is used for sending the current global model and the global control variable to the vehicles participating in training, aggregating the model increments and the local control variable increments, and updating the global model and the global control variable; the vehicle is used for performing local training on its local data, based on the current global model and the global control variable, using a differentially private stochastic gradient descent algorithm combined with a client drift correction mechanism to obtain a local model update; calculating a noise multiplier for the next round of training through a personalized adaptive differential privacy strategy according to the loss value of the current round of training; judging whether the model update of the current round is effective based on a selective uploading rule; and, if the model update of the current round is effective, uploading the model increment and the local control variable increment to the roadside unit; wherein calculating the noise multiplier for the next round of training through the personalized adaptive differential privacy strategy comprises: during training, calculating the loss value of the local model update on a training batch; and obtaining the noise multiplier used in the next round of training from the loss value through an adaptive noise multiplier calculation formula; the adaptive noise multiplier calculation formula is as follows: [formula not reproduced] wherein t+1 is the training round of the local model update and t+2 is the next training round, and the remaining terms denote, in order, the vehicle index, the noise multiplier for the next round of training, the noise multiplier of the local model update, the loss value of the local model update, a hyperparameter, the maximum loss value level, and the minimum threshold of the noise multiplier.
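
The vehicle-side steps of claims 2 to 4 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed formulas, which are not reproduced in this text. The function names are hypothetical; the per-sample clipping bound `max_norm` is a standard DP-SGD ingredient that the claims do not mention; the correction term assumes a SCAFFOLD-style rule (noisy gradient minus local control variable plus global control variable); and the loss difference is assumed to be the local loss minus the global loss.

```python
import math
import random
from typing import List

Vector = List[float]


def clip(grad: Vector, max_norm: float) -> Vector:
    """Scale a gradient so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_gradient(per_sample_grads: List[Vector], max_norm: float,
                   noise_multiplier: float, rng: random.Random) -> Vector:
    """DP-SGD step: clip each per-sample gradient, sum, add Gaussian noise, average."""
    batch_size = len(per_sample_grads)
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    summed = [sum(col) for col in zip(*clipped)]
    std = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    return [(s + rng.gauss(0.0, std)) / batch_size for s in summed]


def corrected_step(global_model: Vector, noisy_grad: Vector,
                   local_control: Vector, global_control: Vector,
                   lr: float) -> Vector:
    """One drift-corrected update starting from the global model.

    Assumed form: w_local = w_global - lr * (g_noisy - c_local + c_global).
    """
    return [w - lr * (g - ck + c)
            for w, g, ck, c in zip(global_model, noisy_grad,
                                   local_control, global_control)]


def is_effective(loss_global: float, loss_local: float,
                 acc_global: float, acc_local: float,
                 threshold: float) -> bool:
    """Selective uploading rule of claim 4, evaluated on the verification batch.

    Assumed sign convention: loss difference = local loss - global loss.
    """
    loss_diff = loss_local - loss_global
    return loss_diff < threshold or acc_local > acc_global
```

A vehicle would call `noisy_gradient` on a sampled training batch, apply `corrected_step`, evaluate both models on a second sampled batch, and upload its increments only when `is_effective` returns `True`.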

Description

Federated learning method and system for heterogeneous data of the Internet of Vehicles

Technical Field

The application relates to the technical field of intelligent transportation, and in particular to a federated learning method and system for heterogeneous data of the Internet of Vehicles.

Background

With the development of the Internet of Vehicles and intelligent transportation technology, intelligent transportation systems based on vehicle-road-cloud cooperation are gradually taking shape. The multi-source heterogeneous data generated while vehicles are running is key to improving traffic perception and decision intelligence. However, centralized data processing faces the dual challenges of heavy communication overhead and leakage of user privacy. Federated learning, a distributed machine learning paradigm, allows vehicles to train models locally and share only model parameters, which offers a new way to address this problem.

However, applying federated learning directly to the Internet of Vehicles still faces serious challenges, and the prior art struggles to balance privacy protection, model utility and communication efficiency:

First, there is a conflict between rigid privacy protection mechanisms and model accuracy. To defend against privacy inference attacks on model updates, existing schemes typically introduce differential privacy, which reduces the risk of privacy leakage by adding random noise to the model updates. However, mainstream methods use a fixed or centrally scheduled noise strategy, so the strength of privacy protection cannot be adjusted dynamically according to the data characteristics of an individual vehicle or the stage of training. As a result, excessive noise severely slows model training in the early stage of convergence or for ordinary vehicles, while protection may be insufficient in later stages or for sensitive vehicles, which makes a fine-grained trade-off between privacy and utility difficult to achieve.

Second, there is the client drift problem caused by highly heterogeneous data. Vehicle data is non-independent and identically distributed (non-IID), which causes the direction of local model updates to deviate seriously from the global optimum; this is client drift. Existing mitigation methods either have limited correction capability or introduce complex parameter tuning, and they are not robust in the dynamic environment of the Internet of Vehicles, so the accuracy of the global model drops and convergence becomes unstable.

Third, communication resources are consumed by invalid or harmful updates. Frequent model uploading and aggregation over many rounds of federated training impose a heavy communication burden. Some vehicles have poor data quality, excessive privacy noise or severe local drift, and their model updates may contribute negatively to global aggregation and reduce the operating efficiency of the system.

It can be seen that existing schemes often face the dilemma of sacrificing one of privacy, utility and communication to preserve the others. Applying fixed strong noise to strengthen privacy protection and meet strict privacy requirements significantly slows model convergence and harms final accuracy, whereas simple client screening or compression to improve communication efficiency can amplify the client drift caused by data heterogeneity and make the global model deviate from the optimum. These coupled challenges make it difficult for the prior art to provide a systematic and efficient solution in the dynamic, heterogeneous, resource-constrained setting of the Internet of Vehicles.

Disclosure of Invention

The embodiments of the application provide a federated learning method and system for heterogeneous data of the Internet of Vehicles, which are used to solve the above problems in the prior art.

In one aspect, an embodiment of the application provides a federated learning method for heterogeneous data of the Internet of Vehicles, including:
S1, a roadside unit sends a current global model and a global control variable to vehicles participating in training;
S2, based on the current global model and the global control variable, the vehicle performs local training on its local data using a differentially private stochastic gradient descent algorithm combined with a client drift correction mechanism, to generate a local model update;
S3, the vehicle calculates a noise multiplier for the next round of training through a personalized adaptive differential privacy strategy (Personalized and Adaptive Differential Privacy, PADP) according to the loss value of the current round of training;
S4, the vehicle judges whether the model update of the current round is effective based on the selective uploading rule;
and S5, if the update of the model of the