CN-121980480-A - Fault diagnosis method and system based on differential attention double-branch network and large language model interpretation


Abstract

The invention discloses a fault diagnosis method and system based on a differential attention double-branch network and large language model interpretation, and relates to the technical field of intelligent operation and maintenance of industrial systems and multi-sensor fault diagnosis. The method constructs paired inputs consisting of a window to be diagnosed and a reference window and performs differential modeling on them. In the differential branch, a channel cross-attention mechanism modulated by a channel selection probability prior adaptively focuses on key sensor channels and reassigns their contributions. In parallel, a shared representation branch captures working-condition commonality and patterns that are stable across samples, and is fused with the differential branch. The key sensors, together with diagnostic evidence such as their statistical features and channel attention weights, are input into a large language model to generate a natural language diagnosis report. The method maintains high diagnostic accuracy under complex working conditions and outputs stable, re-checkable fault diagnosis results.

Inventors

LIU YUEHUA
ZHU YIFAN
XIN LIMING
KONG HAO
SHENG BIN
LI PEIYING
GU JIAHAO

Assignees

Shanghai University (上海大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-04-08

Claims (10)

1. A fault diagnosis method based on a differential attention double-branch network and large language model interpretation, characterized by comprising the following steps: acquiring multi-sensor time series data, constructing a window to be diagnosed and a reference window to form a paired input, and calculating differential information of the window to be diagnosed relative to the reference window; extracting statistical deviation features of each sensor channel based on the differential information, generating channel importance scores, and obtaining channel selection probabilities through an end-to-end trainable channel selection module; using the channel selection probabilities to guide the channel cross-attention calculation so as to strengthen key channels and obtain differential discriminative features, extracting information common to the working conditions to obtain shared discriminative features, and fusing the classification outputs corresponding to the differential discriminative features and the shared discriminative features to obtain a fault category and a confidence; outputting a key channel set and its corresponding statistical features, and outputting channel attention weights, to form structured diagnostic evidence; and inputting the structured diagnostic evidence, the fault category and the confidence into a large language model to generate and output a natural language diagnosis report.
2. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 1, wherein the window representation and residual construction comprise the following steps: slicing the multi-sensor time series with window length $T$ to obtain a window to be diagnosed $X \in \mathbb{R}^{C \times T}$ and a reference window $X^{\mathrm{ref}} \in \mathbb{R}^{C \times T}$, where $C$ denotes the number of channels and $T$ denotes the window length; the reference window is retrieved from a data pool according to working-condition similarity, and the working-condition similarity is based on at least one of environmental variables, load variables and control quantities; after the paired input is constructed, the differential information is calculated in residual form, $\Delta X = X - X^{\mathrm{ref}}$, wherein the reference window provides a normal baseline under the same working condition, so that the model focuses on deviations rather than absolute values.
3. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 2, wherein the statistics-guided channel prior of the differential branch is obtained by the following steps: the differential branch first constructs a channel anomaly prior by computing statistical deviation features from the mean and variance information of the differential sequence; for each channel $i$, a deviation amplitude statistic is computed from the differential sequence $x_i - x_i^{\mathrm{ref}}$ using a standard deviation operation and an aggregate norm over the time dimension, where $x_i$ denotes the data of the $i$-th channel of the window to be diagnosed and $x_i^{\mathrm{ref}}$ denotes the data of the $i$-th channel of the reference window; a learnable channel guidance score with learnable parameters is then introduced, comprising a guidance score of the $i$-th channel at each time step $t$ and a guidance score of the $i$-th channel within the window, which supplements the statistical deviation features and guides the channel prior; finally, the channel anomaly prior score is obtained.
4. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 3, wherein the channel selection probability is obtained as follows: a continuously relaxed Top-K selection is used, in which an adaptive threshold $b$ is solved so that the soft selection values of the channels, produced by a continuous gate $F$ in the form of a cumulative distribution function, correspond to the desired number of key channels $K$; a smoothing coefficient controls the sharpness of the continuous relaxation; each channel $i$ receives a soft selection value, and normalizing the soft selection values gives the channel selection probability $m$, which represents the importance probability of the key sensors.
5. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 4, wherein: in the differential channel cross-attention, each channel is treated as a token, the target window $X$ serves as the query source and the reference window serves as the key and value source, and a selection prior is introduced into the differential attention logits to highlight key channels; $Q$ denotes the set of query vectors obtained by linear projection of the target window $X$; $K$ and $V$ denote the sets of key vectors and value vectors obtained by linear projection of the reference window; $m$ denotes the normalized channel selection probability; and the channel selection prior is mapped into logit space using a temperature coefficient and a numerical stability constant.
6. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 5, wherein the fault category and the confidence are obtained as follows: the shared branch extracts information common to the working conditions using channel cross-attention, yielding a shared-branch channel attention weight matrix and a shared-branch attention output, where $d$ denotes the feature dimension of the query or key vectors; residual-preserving fusion, weighted by a fusion coefficient, is performed in the differential branch to enhance the deviation components and suppress common interference; temporal features are extracted by multi-branch one-dimensional convolution and aggregated by temporal attention pooling into a feature vector; the classifier takes a cosine classification form with class prototype vectors and a learnable scale factor; the differential branch and the shared branch each output logits, and the fault category and the confidence are obtained by weighted fusion, the confidence being the maximum class probability obtained by applying softmax to the fused logits.
7. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 6, wherein the structured diagnostic evidence is formed as follows: channel contributions are aggregated from the differential attention tensor over the attention heads and query channels, where $H$ denotes the number of attention heads and each entry of the tensor is the attention weight assigned to key channel $i$ by query channel $q$ under attention head $h$; the Top-K channels with the largest contributions are selected as the key sensor set; the mean and standard deviation of each key sensor are calculated in the target window and in the reference window respectively; these, together with the fault category and the confidence, form the structured diagnostic evidence, which is expressed as a table, key-value pairs or JSON and includes at least the fault category, the confidence, the key channel identifiers and the key channel statistical features.
8. The method of claim 1, wherein the natural language diagnosis report includes an interpretation of the fault category, the anomaly evidence of the key channels, and possible causes and handling recommendations.
9. The fault diagnosis method based on a differential attention double-branch network and large language model interpretation as claimed in claim 1, wherein, when the consistency check fails, the following procedure is performed: filtering, namely deleting or masking inconsistent content, including invalid sensor names, statistics inconsistent with the evidence and fault category descriptions inconsistent with the evidence; correcting, namely replacing the content with the corresponding fields of the structured diagnostic evidence, or calling the large language model to rewrite the marked passages under the constraint of the evidence; and rechecking, namely performing the consistency check again on the corrected report and, if it still fails, outputting a simplified report consisting only of verifiable evidence fields.
10. A fault diagnosis system based on a differential attention double-branch network and large language model interpretation, characterized by comprising: a paired window differential modeling module, used for constructing a window to be diagnosed and a reference window to form a paired input and calculating differential information describing the deviation, wherein the reference window is retrieved from a normal data pool according to working-condition similarity, and the working-condition similarity is based on at least one of environmental variables, load variables and control quantities; a statistics-guided channel selection module, used for extracting statistical deviation features of each channel based on the differential sequence to form channel importance scores and outputting channel selection probabilities through an end-to-end trainable channel selection module; a differential attention module, used for guiding the channel cross-attention calculation with the channel selection probabilities so that key channels are strengthened in attention matching and aggregation, thereby forming differential discriminative features; a shared representation and fusion classification module, used for extracting the information common to the working conditions in parallel and fusing it with the differential branch to output the fault category and the confidence; and a structured evidence and large language model interpretation module, used for outputting the key channel set together with its statistical features and attention weights to form structured diagnostic evidence, inputting the evidence into a large language model to generate a natural language diagnosis report, introducing an equipment knowledge base for retrieval augmentation, and performing consistency checking, filtering and correction on the output.
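
As an illustration of the continuously relaxed Top-K selection of claim 4 and the prior-guided channel cross-attention of claim 5, the following is a minimal NumPy sketch under stated assumptions, not the patented implementation. The logistic CDF as the gate F, the bisection solver for the threshold b, the additive log-prior term in the logits, and all function and variable names are assumptions chosen for illustration; the claims do not fix these details.

```python
import numpy as np

def logistic_cdf(z):
    """Logistic CDF, used here as the continuous gate F (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))

def relaxed_top_k(scores, k, tau=0.1, iters=60):
    """Soft Top-K: find b with sum_i F((s_i - b) / tau) == k by bisection."""
    scores = np.asarray(scores, dtype=float)
    if not 0 < k < scores.size:
        raise ValueError("k must satisfy 0 < k < number of channels")
    # The gate sum decreases monotonically in b, so bracket the root and bisect.
    lo = scores.min() - 20.0 * tau   # gate sum is close to C here (> k)
    hi = scores.max() + 20.0 * tau   # gate sum is close to 0 here (< k)
    for _ in range(iters):
        b = 0.5 * (lo + hi)
        if logistic_cdf((scores - b) / tau).sum() > k:
            lo = b   # too many channels pass the gate: raise the threshold
        else:
            hi = b
    b = 0.5 * (lo + hi)
    soft = logistic_cdf((scores - b) / tau)   # soft selection value per channel
    m = soft / soft.sum()                     # normalized selection probability
    return b, soft, m

def prior_modulated_attention(q, k, v, m, temp=1.0, eps=1e-6):
    """Channel cross-attention with a log-prior bias on key channels (assumed form).

    q: (C, d) queries from the target window; k, v: keys/values from the
    reference window; m: (C,) normalized channel selection probability.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d) + np.log(m + eps)[None, :] / temp
    logits -= logits.max(axis=-1, keepdims=True)   # numerically stable softmax
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v, w

# Hypothetical channel anomaly prior scores for six sensor channels.
scores = np.array([0.2, 1.5, 0.1, 2.3, 0.4, 0.3])
b, soft, m = relaxed_top_k(scores, k=2, tau=0.1)
```

A smaller `tau` pushes the soft selection values toward a hard 0/1 mask, while a larger one spreads probability across more channels; because the gate is smooth, gradients can flow back to the scores during end-to-end training.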

Description

Fault diagnosis method and system based on differential attention double-branch network and large language model interpretation

Technical Field

The invention relates to the technical field of intelligent operation and maintenance of industrial systems and multi-sensor fault diagnosis, and in particular to a fault diagnosis method and system based on a differential attention double-branch network and large language model interpretation.

Background

Industrial systems (e.g., refrigeration units, HVAC systems and industrial process equipment) are usually equipped with a large number of sensors for monitoring and control. Under the influence of load changes, environmental disturbances and control compensation, multi-sensor time series exhibit strong coupling, strong noise and pronounced working-condition drift. Existing data-driven diagnostic methods can output fault categories, but generally have the following disadvantages:

1) Changes in working conditions cause fault features to overlap with normal variations, so robustness is insufficient;
2) The model output is mainly a classification result and lacks a transparent evidence chain of key sensor, key time segment and evidence index, which makes rechecking difficult;
3) Importance or attention scores fluctuate greatly between samples, so stable and consistent sensor attribution results are difficult to obtain;
4) Natural language interpretation and handling suggestions for operation and maintenance personnel are lacking, which hinders practical deployment.

Therefore, there is a need for a fault diagnosis scheme that maintains high diagnostic accuracy under complex working conditions and outputs stable, re-checkable sensor-level evidence together with user-friendly interpretation.
Disclosure of Invention

The technical problem to be solved by the invention is to provide a method that maintains high diagnostic accuracy under complex working conditions and outputs stable, re-checkable fault diagnosis results.

To solve this technical problem, the invention adopts the following technical scheme: a fault diagnosis method based on a differential attention double-branch network and large language model interpretation, comprising the following steps:

acquiring multi-sensor time series data, constructing a window to be diagnosed and a reference window to form a paired input, and calculating differential information of the window to be diagnosed relative to the reference window;

extracting statistical deviation features of each sensor channel based on the differential information, generating channel importance scores, and obtaining channel selection probabilities through an end-to-end trainable channel selection module;

using the channel selection probabilities to guide the channel cross-attention calculation so as to strengthen key channels and obtain differential discriminative features, extracting information common to the working conditions to obtain shared discriminative features, and fusing the classification outputs corresponding to the differential discriminative features and the shared discriminative features to obtain a fault category and a confidence;

outputting a key channel set and its corresponding statistical features, and outputting channel attention weights, to form structured diagnostic evidence;

and inputting the structured diagnostic evidence, the fault category and the confidence into a large language model to generate and output a natural language diagnosis report.
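
The evidence-forming step above, and the kind of consistency check recited in claim 9, can be sketched as follows. This is a hedged illustration only: averaging the attention tensor over heads and query channels, the JSON field names, the sensor names, and the substring-based check are all assumptions made for the example and are not specified by the source.

```python
import json
import numpy as np

def build_evidence(x, x_ref, attn, names, fault, confidence, k=2):
    """x, x_ref: (C, T) windows; attn: (H, C, C) weights indexed [head, query, key]."""
    # Contribution of key channel i: mean over heads and query channels (assumed).
    contribution = attn.mean(axis=(0, 1))
    top = np.argsort(contribution)[::-1][:k]   # indices of the k largest contributions
    channels = []
    for i in top:
        channels.append({
            "name": names[i],
            "attention": round(float(contribution[i]), 4),
            "target_mean": round(float(x[i].mean()), 4),
            "target_std": round(float(x[i].std()), 4),
            "reference_mean": round(float(x_ref[i].mean()), 4),
            "reference_std": round(float(x_ref[i].std()), 4),
        })
    return {"fault_class": fault, "confidence": float(confidence),
            "key_channels": channels}

def find_inconsistencies(report, evidence, all_names):
    """List sensors named in the report but absent from the evidence, plus a
    marker if the evidence's fault class is not mentioned (simple substring check)."""
    allowed = {c["name"] for c in evidence["key_channels"]}
    issues = [n for n in all_names if n in report and n not in allowed]
    if evidence["fault_class"] not in report:
        issues.append("fault_class")
    return issues

# Hypothetical three-channel example.
names = ["T_supply", "P_suction", "I_compressor"]
x = np.arange(12, dtype=float).reshape(3, 4)       # window to be diagnosed
x_ref = np.zeros((3, 4))                            # reference window
attn = np.tile(np.array([0.1, 0.2, 0.7]), (2, 3, 1))  # (H=2, C=3, C=3)
evidence = build_evidence(x, x_ref, attn, names, "refrigerant_leak", 0.93)
prompt = "Write a diagnosis report using only this evidence:\n" + json.dumps(evidence, indent=2)
```

Serializing the evidence as JSON gives the language model a fixed set of fields to draw on, and the same fields can then be used to flag or replace report content that the evidence does not support.
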
The invention also discloses a fault diagnosis system based on a differential attention double-branch network and large language model interpretation, comprising: a paired window differential modeling module, used for constructing a window to be diagnosed and a reference window to form a paired input and calculating differential information describing the deviation, wherein the reference window is retrieved from a normal data pool according to working-condition similarity, and the working-condition similarity is based on at least one of environmental variables, load variables and control quantities; a statistics-guided channel selection module, used for extracting statistical deviation features of each channel based on the differential sequence to form channel importance scores and outputting channel selection probabilities through an end-to-end trainable channel selection module; a differential attention module, used for guiding the channel cross-attention calculation with the channel selection probabilities so that key channels are strengthened in attention matching and aggregation, thereby forming differential discriminative features; a shared representation and fusion classification module, used for extracting the common information of the worki