CN-117173882-B - Traffic jam prediction method, system, equipment and medium based on swarm learning


Abstract

The invention discloses a traffic jam prediction method, system, equipment and medium based on swarm learning, and relates to the technical field of swarm learning. The method comprises: determining a robustness value of the local model of each participant node; in the current aggregation round, obtaining a local model training evaluation weight for each participant node according to that node's local data credibility, robustness value and machine learning evaluation indexes from the previous aggregation round; adding noise to the local model of the current aggregation round using a localized differential privacy technique to obtain a noise-added local model for the current aggregation round; obtaining the local model for the next aggregation round according to the local model of the current aggregation round, the local model training evaluation weights and the noise-added local model; and determining the local model for the next aggregation round as the traffic prediction model. The invention can improve privacy and security during data processing and traffic prediction model training.
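Localized differential privacy perturbs model parameters on the participant node before they are shared. The source does not name a noise distribution, so the following is only a minimal sketch under assumptions: it uses the Laplace mechanism, clips each parameter to `[-clip, clip]` so that its sensitivity is bounded by `2 * clip`, and applies the budget `epsilon` per parameter. The function name `add_laplace_noise` and all of its parameters are hypothetical.

```python
import random


def add_laplace_noise(params, epsilon, clip=1.0, rng=None):
    """Return a clipped, noise-added copy of `params` (a list of floats).

    Assumptions (not stated in the source): Laplace mechanism, per-parameter
    privacy budget `epsilon`, and clipping to [-clip, clip].
    """
    if epsilon <= 0 or clip <= 0:
        raise ValueError("epsilon and clip must be positive")
    if rng is None:
        rng = random.Random()
    # A value clipped to [-clip, clip] can change by at most 2 * clip.
    scale = 2.0 * clip / epsilon
    noised = []
    for p in params:
        clipped = max(-clip, min(clip, p))
        # The difference of two i.i.d. exponential draws with mean `scale`
        # follows a Laplace distribution with location 0 and that scale.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noised.append(clipped + noise)
    return noised
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of model accuracy; passing a seeded `random.Random` makes the perturbation reproducible for testing.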

Inventors

KANG HAIYAN
WANG JIAKANG
WU SIYUAN
JIANG HONGLING
JI SHANSHAN

Assignees

Beijing Information Science and Technology University (北京信息科技大学)

Dates

Publication Date: 20260508
Application Date: 20230821

Claims (10)

1. A traffic jam prediction method based on swarm learning, characterized by comprising the following steps: taking vehicles as participant nodes in a swarm network, and determining the robustness value of the local model of each participant node according to the size of the traffic data set of each participant node in the swarm network, wherein the size of a traffic data set is the total amount of data in the traffic data set; in the current aggregation round, obtaining the local data credibility of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round and the robustness value of the local model of each participant node; obtaining the credibility of the local model of each participant node in the current aggregation round according to the machine learning evaluation indexes in the current aggregation round, wherein the machine learning evaluation indexes comprise true positives, false positives and false negatives; obtaining the local model training evaluation weight of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round, the credibility of the local model of each participant node in the previous aggregation round, and the loss function of each participant node on its corresponding traffic data set in the previous aggregation round; adding noise to the model parameters of the local model of the current aggregation round using a localized differential privacy technique to obtain a noise-added local model for the current aggregation round; obtaining the local model for the next aggregation round according to the local model of the current aggregation round, the local model training evaluation weight of each participant node in the current aggregation round, and the noise-added local model of the current aggregation round; judging whether the current aggregation round number has reached the set number of aggregation rounds, to obtain a first judgment result; if the first judgment result is yes, determining the local model for the next aggregation round as the traffic prediction model, wherein the traffic prediction model is used for predicting traffic jam conditions; and if the first judgment result is no, updating the aggregation round number and entering the next aggregation round.
2. The traffic jam prediction method based on swarm learning according to claim 1, wherein determining the robustness value of the local model of each participant node according to the size of the traffic data set of each participant node in the swarm network specifically comprises: for any participant node $N_k$, calculating the robustness value of the local model of the participant node $N_k$ according to the formula $C(N_k) = \log_m(S_{N_k})$, where $C(N_k)$ represents the robustness value of the local model of the participant node $N_k$, $m$ represents the size of the largest traffic data set among all the participant nodes, and $S_{N_k}$ represents the size of the traffic data set of the participant node $N_k$.
3. The traffic jam prediction method based on swarm learning according to claim 1, wherein obtaining the local data credibility of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round and the robustness value of the local model of each participant node specifically comprises: for any participant node $N_k$, calculating the local data credibility of the participant node $N_k$ in the $i$-th aggregation round according to the formula [formula not reproduced in the source text], wherein $C_i(N_k)$ represents the local data credibility of the participant node $N_k$ in the $i$-th aggregation round, $C(N_k)$ represents the robustness value of the local model of the participant node $N_k$, $C_{i-1}(N_k)$ represents the local data credibility of the participant node $N_k$ in the $(i-1)$-th aggregation round, $N$ represents the total number of participant nodes in the swarm network, and $C(N_j)$ represents the robustness value of the local model of the participant node $N_j$.
4. The traffic jam prediction method based on swarm learning according to claim 1, wherein obtaining the credibility of the local model of each participant node in the current aggregation round according to the machine learning evaluation indexes in the current aggregation round specifically comprises: for any participant node $N_k$, calculating the credibility of the local model of the participant node $N_k$ in the $i$-th aggregation round according to the formula [formula not reproduced in the source text], wherein $A_i(N_k)$ represents the credibility of the local model of the participant node $N_k$ in the $i$-th aggregation round, $\beta$ represents a first adjustment parameter, $TP_i$ represents the true positives in the $i$-th aggregation round, $FP_i$ represents the false positives in the $i$-th aggregation round, and $FN_i$ represents the false negatives in the $i$-th aggregation round.
5. The traffic jam prediction method based on swarm learning according to claim 1, wherein obtaining the local model training evaluation weight of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round, the credibility of the local model of each participant node in the previous aggregation round, and the loss function of each participant node on its corresponding traffic data set in the previous aggregation round specifically comprises: for any participant node $N_k$, calculating the local model training evaluation weight of the participant node $N_k$ in the $i$-th aggregation round according to the formula [formula not reproduced in the source text], wherein $W_i(N_k)$ represents the local model training evaluation weight of the participant node $N_k$ in the $i$-th aggregation round, $C_{i-1}(N_k)$ represents the local data credibility of the participant node $N_k$ in the $(i-1)$-th aggregation round, $C_{i-1}(N_j)$ represents the local data credibility of the participant node $N_j$ in the $(i-1)$-th aggregation round, $A_{i-1}(N_k)$ represents the credibility of the local model of the participant node $N_k$ in the $(i-1)$-th aggregation round, $A_{i-1}(N_j)$ represents the credibility of the local model of the participant node $N_j$ in the $(i-1)$-th aggregation round, $N$ represents the total number of participant nodes in the swarm network, $\alpha$ represents a second adjustment parameter, $\exp()$ represents the exponential function with base $e$, $L_{i-1}(N_k)$ represents the loss function of the participant node $N_k$ on its corresponding traffic data set in the $(i-1)$-th aggregation round, and $W_{i-1}(N_k)$ represents the local model training evaluation weight of the participant node $N_k$ in the $(i-1)$-th aggregation round.
6. The traffic jam prediction method based on swarm learning according to claim 1, wherein obtaining the local model for the next aggregation round according to the local model of the current aggregation round, the local model training evaluation weight of each participant node in the current aggregation round, and the noise-added local model of the current aggregation round specifically comprises: calculating the local model for the next aggregation round according to the formula [formula not reproduced in the source text], wherein $M(i+1)$ represents the local model in the $(i+1)$-th aggregation round, $M(i)$ represents the local model in the $i$-th aggregation round, $N$ represents the total number of participant nodes in the swarm network, $W_i(N_k)$ represents the local model training evaluation weight of the participant node $N_k$ in the $i$-th aggregation round, and the two remaining symbols of the formula [symbols not reproduced in the source text] represent the noise-added local model in the $i$-th aggregation round and the multiplication operation, respectively.
7. A traffic jam prediction system based on swarm learning, comprising: a robustness value calculation module, configured to take vehicles as participant nodes in a swarm network and determine the robustness value of the local model of each participant node according to the size of the traffic data set of each participant node in the swarm network, wherein the size of a traffic data set is the total amount of data in the traffic data set; a local data credibility calculation module, configured to obtain, in the current aggregation round, the local data credibility of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round and the robustness value of the local model of each participant node; a local model credibility calculation module, configured to obtain the credibility of the local model of each participant node in the current aggregation round according to the machine learning evaluation indexes in the current aggregation round, wherein the machine learning evaluation indexes comprise true positives, false positives and false negatives; a local model training evaluation weight calculation module, configured to obtain the local model training evaluation weight of each participant node in the current aggregation round according to the local data credibility of each participant node in the previous aggregation round, the credibility of the local model of each participant node in the previous aggregation round, and the loss function of each participant node on its corresponding traffic data set in the previous aggregation round; a localized differential privacy module, configured to add noise to the model parameters of the local model of the current aggregation round using a localized differential privacy technique to obtain a noise-added local model for the current aggregation round; a local model updating module, configured to obtain the local model for the next aggregation round according to the local model of the current aggregation round, the local model training evaluation weight of each participant node in the current aggregation round, and the noise-added local model of the current aggregation round; a judging module, configured to judge whether the current aggregation round number has reached the set number of aggregation rounds, to obtain a first judgment result; a traffic prediction model determining module, configured to determine the local model for the next aggregation round as the traffic prediction model if the first judgment result is yes, wherein the traffic prediction model is used for predicting traffic jam conditions; and an iteration module, configured to update the aggregation round number and enter the next aggregation round if the first judgment result is no.
8. The traffic jam prediction system based on swarm learning according to claim 7, wherein the robustness value calculation module specifically comprises: a robustness value calculation unit, configured to calculate, for any participant node $N_k$, the robustness value of the local model of the participant node $N_k$ according to the formula $C(N_k) = \log_m(S_{N_k})$, where $C(N_k)$ represents the robustness value of the local model of the participant node $N_k$, $m$ represents the size of the largest traffic data set among all the participant nodes, and $S_{N_k}$ represents the size of the traffic data set of the participant node $N_k$.
9. An electronic device, comprising a memory and a processor, wherein the memory is used for storing a computer program, and the processor runs the computer program to cause the electronic device to execute the traffic jam prediction method based on swarm learning according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the traffic jam prediction method based on swarm learning according to any one of claims 1 to 6.
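The robustness value of claims 2 and 8, C(N_k) = log_m(S_Nk), can be illustrated with a short sketch. It assumes that data set sizes are positive integers and that the largest data set holds more than one record, so that the logarithm base m is valid; the function name `robustness_values` and the node identifiers are hypothetical.

```python
import math


def robustness_values(dataset_sizes):
    """Map each participant node to C(N_k) = log_m(S_Nk).

    `dataset_sizes` maps a node identifier to the size of its traffic data
    set; m is the size of the largest data set among all nodes.
    """
    if not dataset_sizes:
        raise ValueError("at least one participant node is required")
    if any(size < 1 for size in dataset_sizes.values()):
        raise ValueError("data set sizes must be at least 1")
    m = max(dataset_sizes.values())
    if m <= 1:
        # Assumption: a logarithm with base 1 is undefined, so reject it.
        raise ValueError("the largest data set must contain more than 1 record")
    return {node: math.log(size, m) for node, size in dataset_sizes.items()}


# Hypothetical example: three vehicles with different amounts of local data.
values = robustness_values({"car_a": 1000, "car_b": 10, "car_c": 1})
```

Under these assumptions the node with the largest data set receives a robustness value of 1, and nodes with less data receive values between 0 and 1.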

Description

Traffic jam prediction method, system, equipment and medium based on swarm learning

Technical Field

The invention relates to the technical field of swarm learning, and in particular to a traffic jam prediction method, system, equipment and medium based on swarm learning.

Background

Existing vehicles are equipped with on-board sensors that collect traffic data such as traffic flow, speed and road conditions in real time, and these data can be used to train a traffic prediction model for subsequent traffic jam prediction.

However, conventional centralized machine learning methods face challenges when processing large-scale data. First, centralized machine learning requires all raw data to be collected on a central server for training and model consolidation, which involves large data transmission and storage requirements and raises privacy and security issues. Furthermore, some data owners may be reluctant to share their sensitive data, which limits the feasibility of the centralized approach. Second, centralized methods cannot accommodate data changes and heterogeneity in a distributed environment. In such an environment, the data distribution, quantity and quality of different participant nodes may differ, so a centralized model has difficulty adapting to different data characteristics and variations.

Swarm learning was developed to overcome these problems. Swarm learning is a decentralized machine learning framework that uses edge computing and peer-to-peer network technology to achieve secure model merging and parameter sharing while protecting private data. It ensures the trustworthiness of the network through blockchain technology and provides a higher level of data security and protection, enabling participants to collaborate securely on machine learning tasks: models can be trained and updated on data scattered across different sites while data privacy is protected. In swarm learning, participant nodes (e.g., devices, sensors or edge nodes) train models locally on their local data. The participant nodes then share model parameters through the peer-to-peer network and edge computing, and the models are merged and aggregated.

Compared with the centralized approach, swarm learning has the following advantages:

1. Data privacy protection: swarm learning avoids centralized sharing of raw data through localized learning and parameter sharing. The participant nodes only need to share part of the model parameters and do not need to share raw data, which protects the privacy and security of the data.
2. Efficient communication and computation: swarm learning reduces the cost of data transmission and computation through the peer-to-peer network and edge computing. The participant nodes train models locally and share only model parameters, which reduces communication traffic and computational load.
3. Adaptive model aggregation: swarm learning allows the participant nodes to aggregate models adaptively according to their weights.

However, the data measured by the sensors are directly related to the privacy of drivers and vehicles and are therefore highly sensitive. Data privacy must be ensured when the traffic model is trained, so that drivers and vehicles are protected while better traffic management and driving experience are achieved. Swarm learning nevertheless has the following drawbacks:

1. Scalability: swarm learning requires intensive communication and collaboration between the participant nodes, and as the number of participant nodes increases, the complexity and management difficulty of the system increase. This challenges the scalability of the system, particularly in dynamic environments where participant nodes joining and leaving can make the network unstable.
2. Data heterogeneity: in swarm learning, the data distribution, scale and quality of different participant nodes may differ, which may lead to a biased or inaccurate model.
3. Security and privacy risks: swarm learning involves model parameter sharing and communication between participant nodes, which may introduce security and privacy risks. Shared parameters that are not properly protected may be attacked by malicious parties, resulting in information disclosure or model tampering.

These drawbacks threaten privacy and security when existing swarm learning is used for data processing and traffic prediction model training.

Disclosure of Invention

The invention aims to provide a traffic jam prediction method, a system, equipment and a medium based on swarm learning, which can improve the privacy and safety in the process of data processing and traf