CN-121981204-A - Heterogeneous circuit federal consistency diagnosis method for defending composite security attack


Abstract

The application discloses a heterogeneous circuit federal consistency diagnosis method for defending composite security attack, and relates to the field of federated learning. The method comprises: obtaining an optimal feature subset matrix from each local data set and encrypting it; receiving, by the central server, the encrypted optimal feature subset matrices, denoising them to obtain a global optimal feature subset, and obtaining the features screened by each user according to the global optimal feature subset; determining malicious users according to the local model parameters of all users and processing the local model parameters of the malicious users; determining global model parameters based on the processed local model parameters of the malicious users and an anti-interference regularization term; and performing train bogie fault diagnosis with the trained global model on train bogie vibration signal data acquired in real time on each operating line. The application addresses the problem of composite security attacks in federated diagnosis across heterogeneous lines.

Inventors

QIN NA
DU JIAHAO
HUANG DEQING
CHENG YANZI

Assignees

Southwest Jiaotong University (西南交通大学)

Dates

Publication Date: 20260505
Application Date: 20260106

Claims (10)

1. A heterogeneous circuit federal consistency diagnostic method for defending against composite security attacks, the method comprising: obtaining local data sets of a plurality of users from vibration signal data of train bogies on different running lines under multiple working conditions; obtaining a feature matrix of each user from that user's local data set, screening the feature matrix of each user to obtain an optimal feature subset matrix of each user, encrypting the optimal feature subset matrix of each user by a differential privacy method, and uploading the encrypted matrix to a central server; receiving, by the central server, the encrypted optimal feature subset matrices of all users, denoising them, obtaining a global optimal feature subset with a spatial clustering algorithm, and issuing the feature indexes corresponding to the global optimal feature subset to the corresponding users; for the m-th round of federated training, receiving, by the central server, the local model parameters of all users, normalizing them, determining malicious users and honest users according to cosine similarity, applying a regularization-based parameter correction strategy to correct the local model parameters of the malicious users, and applying a median dynamic pruning strategy to the corrected local model parameters; if a user is detected to be disconnected from the central server, designating a proxy user from among the online users based on the number of features screened by each user, the proxy user exchanging local model parameters and performing gradient correction with the disconnected user under homomorphic encryption so as to keep the disconnected user consistent with the federated training; and performing, on each running line, train bogie fault diagnosis with the trained global model based on train bogie vibration signal data acquired in real time.
2. The method for diagnosing federal consistency of heterogeneous circuits for defending against composite security attack according to claim 1, wherein screening the feature matrix of each user to obtain an optimal feature subset matrix of each user specifically comprises: for any user, calculating the symmetric uncertainty of the feature matrix of the user to obtain a feature relevance matrix of the user; calculating the feature-category mutual information of the user to obtain a feature-category relation matrix of the user; clustering the feature relevance matrix and the feature-category relation matrix of the user with a spatial clustering algorithm to obtain a feature relevance clustering result and a feature-category relation clustering result; calculating the average value of each cluster in the feature relevance clustering result and the average value of each cluster in the feature-category relation clustering result, and screening the clusters in both clustering results against a preset average value; and determining the intersection of the screened clusters in the feature relevance clustering result and the screened clusters in the feature-category relation clustering result, and determining the optimal feature subset matrix of the user from the intersection.
3. The method for diagnosing federal consistency of heterogeneous circuits for protecting against composite security attack according to claim 1, wherein the feature matrix of each user comprises a plurality of feature vectors, and the symmetric uncertainty between the i-th feature vector and the j-th feature vector in the feature matrix of the k-th user is determined by a formula [formula not reproduced in this text] whose quantities are: the symmetric uncertainty between the i-th feature vector and the j-th feature vector of the k-th user; the i-th feature vector of the k-th user; the j-th feature vector of the k-th user; the joint probability distribution function of the i-th and j-th feature vectors; the marginal probability distribution function of the i-th feature vector; the marginal probability distribution function of the j-th feature vector; the entropy of the i-th feature vector; and the entropy of the j-th feature vector.
4. The method for diagnosing federal consistency of heterogeneous circuits for protecting against composite security attack according to claim 3, wherein each working condition corresponds to one category, and the feature-category mutual information between the i-th feature vector of the k-th user and a given category vector is determined by a formula [formula not reproduced in this text] whose quantities are: the feature-category mutual information between the i-th feature vector of the k-th user and the category vector; the category vector; the marginal probability distribution function of the category vector; and the joint probability distribution function of the i-th feature vector and the category vector.
5. The method for diagnosing federal consistency of heterogeneous circuits for defending against composite security attack according to claim 1, wherein the optimal feature subset matrix comprises a plurality of feature vectors, and encrypting the optimal feature subset matrix of each user by the differential privacy method specifically comprises: for the a-th feature vector in the optimal feature subset matrix of the k-th user, determining the sensitivity of the a-th feature vector by a formula [formula not reproduced in this text] in terms of the dimension of the a-th feature vector, the minimum value in the a-th feature vector, and the maximum value in the a-th feature vector; determining the standard deviation of the noise from the sensitivity of the a-th feature vector by a formula [formula not reproduced in this text] in terms of a randomly selected first privacy parameter, which controls the strictness of the privacy guarantee, and a randomly selected second privacy parameter, which represents the probability that the amount of information actually leaked exceeds the theoretical upper limit; and generating Gaussian-distributed noise based on the standard deviation of the noise and adding it to the optimal feature subset matrix of the k-th user to obtain the encrypted optimal feature subset matrix of the k-th user.
6. The method for diagnosing federal consistency of heterogeneous circuits for defending against composite security attack according to claim 1, wherein the central server receiving the local model parameters of all users, normalizing them, and determining malicious users and honest users according to cosine similarity comprises: after receiving the local model parameters of all users, normalizing, by the central server, the local model parameters of all users to obtain the local model parameter vectors of all users; determining the cosine similarity between any two users based on the local model parameter vectors of all users; obtaining a similarity score for each user based on the cosine similarity between any two users; and determining malicious users and honest users based on a preset similarity score and the similarity score of each user.
7. The method for diagnosing federal consistency of heterogeneous circuits against composite security attacks according to claim 1, wherein the local model parameters of malicious users are corrected using three formulas [formulas not reproduced in this text] whose quantities are: a given local model parameter of the malicious user in the m-th federated training; the corresponding global model parameter at the end of the (m-1)-th federated training; the corrected first moment estimate; the corrected second moment estimate; a stability factor for determining whether a quantity (symbol not reproduced) is stable; an adjustment factor for correcting the deviation of the adaptive learning rate; a positive number for avoiding a zero denominator during numerical calculation; the regularization coefficient; the activation function; the gradient of the t-th update of the local model parameter; the local model parameter of the malicious user at the (t-1)-th update in the m-th federated training; the learning rate; and the local model parameter of the malicious user at the t-th update in the m-th federated training.
8. The method for diagnosing federal consistency of a heterogeneous circuit for protecting against composite security attack according to claim 1, wherein the global model parameters for the m-th federated training are determined using two formulas [formulas not reproduced in this text] whose quantities are: a given local model parameter of user k in the m-th federated training; the aggregation result of the corresponding global model parameter in the m-th federated training; the aggregation result of the corresponding global model parameter in the (m-1)-th federated training; a first adjustment factor; a second adjustment factor; a third adjustment factor; n, the number of users; an anti-interference regularization term characterizing the honest users relative to the malicious users in the m-th federated training; the corresponding local model parameter of user k in the honest user group in the m-th federated training; the corresponding local model parameter of user k in the malicious user group in the m-th federated training; q, the number of honest users; p, the number of malicious users; and an activation function for avoiding interference caused by large-magnitude parameters of the malicious users.
9. The method for diagnosing federal consistency of heterogeneous circuits for defending against composite security attacks according to claim 1, wherein exchanging local model parameters and performing gradient correction with the disconnected user under homomorphic encryption specifically comprises: randomly generating, by the central server, a public key and private key pair based on Paillier homomorphic encryption and sending the public key and the private key to the proxy user; receiving, by the proxy user, the public key and the private key and sending the public key to the disconnected user; receiving, by the disconnected user, the public key, encrypting its local model parameters with the public key, and sending the encrypted local model parameters to the proxy user; receiving, by the proxy user, the local model parameters encrypted by the disconnected user and, in combination with its own encrypted local model parameters, measuring the local model parameter difference between the proxy user and the disconnected user in the encrypted state; decrypting, by the proxy user with the private key, the local model parameter difference between the proxy user and the disconnected user, calculating the sign of the decrypted local model parameter difference to obtain the local model parameter distribution deviation between the proxy user and the disconnected user, and sending the local model parameter distribution deviation to the disconnected user; and fine-tuning, by the disconnected user, the gradient of its local model parameter update according to the local model parameter distribution deviation between the proxy user and the disconnected user.
10. The method for diagnosing federal consistency of a heterogeneous circuit for protecting against composite security attack according to claim 9, wherein the gradient of the local model parameter update of the disconnected user is determined using two formulas [formulas not reproduced in this text] whose quantities are: the gradient of the t-th update of a given local model parameter; the corrected first moment estimate; the corrected second moment estimate; an adjustment factor for correcting the deviation of the adaptive learning rate; the distribution deviation of the corresponding local model parameter between the proxy user and the disconnected user; the stability factor; and the regularization coefficient of the disconnected user.
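
Claims 2 to 4 rely on symmetric uncertainty between feature vectors and on mutual information between a feature vector and a category vector, but the formulas themselves are not reproduced in this text. The following is a minimal sketch that assumes the standard textbook definitions, I(X;Y) = sum p(x,y) log2(p(x,y) / (p(x) p(y))) and SU(X,Y) = 2 I(X;Y) / (H(X) + H(Y)), and assumes the vibration features have already been discretized into bins. The function names are illustrative and do not come from the application.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """Mutual information (bits) between two equally long discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def symmetric_uncertainty(x, y):
    """Assumed standard form: SU = 2 * I(X;Y) / (H(X) + H(Y)), in [0, 1]."""
    denom = entropy(x) + entropy(y)
    return 0.0 if denom == 0 else 2.0 * mutual_information(x, y) / denom


def relevance_matrix(features):
    """Pairwise symmetric uncertainty between all feature vectors of one user."""
    return [[symmetric_uncertainty(fi, fj) for fj in features] for fi in features]


def feature_category_relation(features, labels):
    """Mutual information between each feature vector and the category labels."""
    return [mutual_information(f, labels) for f in features]
```

In this sketch, a pair of identical binned features gets a symmetric uncertainty of 1 and a pair of statistically independent features gets 0. The clustering and cluster screening steps of claim 2 would then operate on the two returned structures.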
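
Claim 5 perturbs each feature vector with Gaussian noise whose standard deviation depends on a sensitivity and two privacy parameters. Because the claim's formulas are not reproduced in this text, the sketch below makes two labeled assumptions: the sensitivity is taken as the value range (maximum minus minimum) of the vector, which ignores the dimension term the claim mentions, and the standard deviation follows the classical Gaussian mechanism, sigma = sensitivity * sqrt(2 ln(1.25 / delta)) / epsilon. The names `gaussian_sigma` and `privatize` are hypothetical.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise standard deviation of the classical Gaussian mechanism (assumed form)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(vector, epsilon, delta, rng=None):
    """Add zero-mean Gaussian noise to one feature vector.

    Assumption: sensitivity = max - min of the vector; the claimed formula
    also involves the vector dimension and may differ.
    """
    rng = rng or random.Random()
    sensitivity = max(vector) - min(vector)
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    noisy = [v + rng.gauss(0.0, sigma) for v in vector]
    return noisy, sigma
```

A smaller `epsilon` or `delta` yields a larger `sigma`, that is, stronger privacy at the cost of noisier features; the server-side denoising step in claim 1 would have to cope with that noise.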
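
Claim 6 separates malicious from honest users by normalizing the uploaded parameter vectors, computing pairwise cosine similarity, and comparing a per-user similarity score with a preset value. The claim does not say how the score is derived from the pairwise similarities, so the sketch below assumes it is the mean similarity to all other users; `split_users` and the threshold value in the usage example are illustrative only.

```python
import math


def normalize(v):
    """Scale a parameter vector to unit L2 norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def similarity_scores(params):
    """Mean cosine similarity of each user to every other user (assumed scoring rule).

    Requires at least two users.
    """
    unit = [normalize(v) for v in params]
    n = len(unit)
    scores = []
    for i in range(n):
        sims = [
            sum(a * b for a, b in zip(unit[i], unit[j]))
            for j in range(n)
            if j != i
        ]
        scores.append(sum(sims) / len(sims))
    return scores


def split_users(params, threshold):
    """Return (honest, malicious) user indices for a preset similarity score."""
    scores = similarity_scores(params)
    honest = [i for i, s in enumerate(scores) if s >= threshold]
    malicious = [i for i, s in enumerate(scores) if s < threshold]
    return honest, malicious


# Usage: three users with similar updates and one pointing the opposite way.
updates = [[1.0, 1.0], [1.0, 0.9], [0.9, 1.0], [-1.0, -1.0]]
honest, malicious = split_users(updates, threshold=0.0)
```

With a mean-based score, a single outlier also lowers the scores of the honest users, so the preset value has to leave a margin for that effect.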
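
The exchange in claim 9 can be illustrated with a toy Paillier scheme: the disconnected user encrypts its parameters with the public key, the proxy user subtracts them from its own encrypted parameters in the encrypted domain, then decrypts the difference and keeps only its sign. This is a sketch under stated assumptions, not a secure implementation: the two fixed Mersenne primes are far too small for real use, the fixed-point scale `SCALE` is an assumption for encoding real-valued parameters as integers, and all function names are hypothetical. It requires Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import random

SCALE = 10**6  # assumed fixed-point scale for real-valued parameters


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy key pair from two known Mersenne primes (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Paillier encryption with g = n + 1: c = (1 + m*n) * r^n mod n^2."""
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = (u - 1) // n * mu % n
    return m - n if m > n // 2 else m


def encrypted_difference(n, c_a, c_b):
    """Ciphertext of (a - b), computed without decrypting either input."""
    n2 = n * n
    return c_a * pow(c_b, -1, n2) % n2


def sign_deviation(n, priv, proxy_params, offline_ciphertexts, rng):
    """Proxy side: sign of (proxy - disconnected) for each parameter."""
    signs = []
    for x, c_off in zip(proxy_params, offline_ciphertexts):
        c_proxy = encrypt(n, round(x * SCALE), rng)
        d = decrypt(n, priv, encrypted_difference(n, c_proxy, c_off))
        signs.append((d > 0) - (d < 0))
    return signs


# Usage: the disconnected user encrypts, the proxy user compares.
rng = random.Random(0)
n, priv = keygen()
offline = [0.1, 0.4, 0.3]
ciphertexts = [encrypt(n, round(x * SCALE), rng) for x in offline]
deviation = sign_deviation(n, priv, [0.5, -0.2, 0.3], ciphertexts, rng)
```

Only the sign vector `deviation` would be sent back to the disconnected user, which matches the claim's description of fine-tuning the gradient from a distribution deviation rather than from the proxy user's raw parameters.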

Description

Heterogeneous circuit federal consistency diagnosis method for defending composite security attack

Technical Field

The application relates to the field of federated learning, and in particular to a heterogeneous circuit federal consistency diagnosis method for defending composite security attack.

Background

In recent years, artificial intelligence technology has become increasingly widespread. Deep learning algorithms in particular have been widely applied in the railway industry and, thanks to their high accuracy and adaptability, have produced a series of notable results. For example, by collecting and analyzing historical data, intelligent algorithms can automatically detect potential faults in infrastructure such as rails and vehicles, enabling preventive maintenance and reducing the incidence of accidents. They can also predict the service life of equipment, so that procurement cycles can be planned sensibly and costs reduced.
However, applying deep learning algorithms to the actual maintenance and overhaul of railway trains may run into problems such as insufficient data volume, low sample quality, and inaccurate labels. Taking the train bogie as an example, several groups of sensors are deployed in the vertical, lateral, and longitudinal directions of the bogie to collect acceleration and displacement data from its key components, so that the running state of the bogie can be monitored in real time and its actual running characteristics analyzed comprehensively. Consequently, when a sensor group monitoring a critical component fails, the corresponding dimension of the monitoring data is lost. Important features of that component may then be missed, its state can no longer be resolved from all directions, and efficient, accurate operation and maintenance becomes difficult.
Integrating multi-channel, multi-dimensional information to produce a comprehensive and highly adaptive model is an important trend in artificial intelligence in the big data era. In real railway scenarios, running environments and line conditions differ, so train bogies on different running lines often have their own characteristic data reflecting the faults that occur most frequently on that line. Such fault data are typically small in volume and fragmented, so a model trained on a single line generalizes poorly and is difficult to adapt to other lines. Moreover, missing dimensions in the monitoring data further aggravate the heterogeneity among lines, making the differences between the sensor data features of different lines more pronounced. In addition, data barriers may exist between trains on different lines: data is often tightly protected and may not leave the local storage device. Centralizing multi-party data for training may not only raise data security and privacy concerns but also greatly increase economic cost. Integrating fault feature knowledge from trains on multiple lines into a comprehensive, highly adaptive model is therefore difficult to achieve in practice. To train heterogeneous users cooperatively and obtain an adaptive fault diagnosis model with strong generalization capability, researchers have proposed federated learning, which suits intelligent industrial scenarios with high data security requirements and aims to solve these problems through secure collaborative modeling among multiple heterogeneous data holders. Information security is thus the primary guarantee for cooperative diagnosis across heterogeneous railway trains.
Federated learning is an intelligent framework based on distributed learning. Although the data of every user stays local throughout the process, the local trains and the central server must still exchange parameter information to train cooperatively, so external attacks such as hijacking of transmitted information and communication interruption cannot be ruled out. More seriously, malicious users may exist within the framework and launch Byzantine attacks, transmitting erroneous information to the central server to interfere with global model aggregation and thereby degrading the training efficiency of the whole framework. Under such composite security attacks, efficient collaborative modeling of multi-line trains becomes very difficult. Specifically, the influence of the composite security attack on federal consistency diagnosis among heterogeneous circuits is reflected in the following two aspects: (1) The malicious parameters can influence the central server's aggregation of the global model and produce a reverse aggregation effect, and after the server transmits the global model back to all users, the server can furthe