EP-4738037-A1 - IMPROVING THE EXPLAINABILITY OF ALARMS IN INDUSTRIAL PLANTS
Abstract
A method (100) for monitoring and/or controlling an industrial process (2) that is executed on an industrial plant (1), comprising the steps of: • obtaining (110) predictions (3a) for future values of one or more process variables (3) that characterize the state, and/or the behavior, of the industrial process (2); • determining (120), based at least in part on these predictions (3a) for future values, a prediction (4a) as to which one or more alarms (4) from a given set of possible alarms are likely to be raised within a predetermined time frame reaching into the future; and • identifying (130) at least one process variable (3*) whose predicted evolution (3a) over time is responsible for the likely raising of the one or more alarms (4a).
Inventors
- STREM, Nika
- KOTRIWALA, Arzam Muzaffar
- BUELOW, Fabian
- HUEHNERBEIN, Ruben
- DIX, Marcel
- MACZEY, Sylvia
- ZHANG, Yanqing
- ZIOBRO, Dawid
- BRORSSON, Emmanuel
- MANCA, Gianluca
- BHATTACHARYA, Nilavra
Assignees
- ABB SCHWEIZ AG
Dates
- Publication Date: 20260506
- Application Date: 20241031
Claims (19)
- A method (100) for monitoring and/or controlling an industrial process (2) that is executed on an industrial plant (1), comprising the steps of: • obtaining (110) predictions (3a) for future values of one or more process variables (3) that characterize the state, and/or the behavior, of the industrial process (2); • determining (120), based at least in part on these predictions (3a) for future values, a prediction (4a) as to which one or more alarms (4) from a given set of possible alarms are likely to be raised within a predetermined time frame reaching into the future; and • identifying (130) at least one process variable (3*) whose predicted evolution (3a) over time is responsible for the likely raising of the one or more alarms (4a).
- The method (100) of claim 1, wherein the predictions (3a) for future values are obtained (111) by means of a sequence-to-sequence machine learning model that takes a time series of one or more process variables (3) as input and outputs a time series of one or more process variables (3) that at least partially reaches into the future.
- The method (100) of claim 2, wherein the sequence-to-sequence machine learning model comprises (111a) a long short-term memory, LSTM, network, a recurrent neural network, RNN, a temporal convolution network, TCN, or other regression model, and/or a transformer network.
- The method (100) of any one of claims 1 to 3, wherein the obtaining (110) of predictions (3a) for future values of one or more process variables comprises: • encoding (112), by means of a trained encoder model (5), a sequence of one or more values of the one or more process variables (3) into a representation (3#); and • decoding (113), by means of a trained decoder model (6), the sought predictions (3a) from this representation (3#).
- The method (100) of claim 4, wherein the representation (3#) depends (112a) on fewer independent variables than the sequence of the one or more values of the one or more process variables (3).
- The method (100) of any one of claims 1 to 5, wherein the determining (120) of the prediction (4a) as to which alarms (4) are likely to be raised comprises: • providing (121) predicted future values (3a) to a classifier machine learning model (7) that determines classification scores (7a) with respect to one or more classes corresponding to alarms; and • determining (122) the likelihood that a particular alarm (4) is raised based at least in part on the corresponding classification score (7a).
- The method (100) of claim 6, wherein at least one process variable (3*) whose predicted evolution over time is responsible for the likely raising of a given alarm (4a) is identified (139) based at least on a saliency analysis of the output (7a, 7b) of the classifier machine learning model (7).
- The method (100) of any one of claims 1 to 7, wherein the identifying (130) of at least one process variable (3*) whose predicted evolution over time is responsible for the likely raising of a given alarm (4) comprises: • determining (131), based on information about the topology of the industrial plant (1), out of a complete set of assets (8), a subset (8a) of assets (8) that have a potential to contribute to the raising of the given alarm (4a); • identifying (132), from a complete set of process variables (3) of the industrial process (1), a subset (3b) of process variables (3) that are indicative of the operating state of at least one asset (8) from the subset (8a) of assets (8); and • analyzing (133), for at least one process variable (3) from the subset (3b) of process variables (3), whether this process variable (3) is responsible for the likely raising of the given alarm 4a.
- The method (100) of any one of claims 1 to 8, wherein the identifying at least one process variable (3*) whose predicted evolution over time is responsible for the likely raising of a given alarm (4a) comprises: • modifying (134) the predicted evolution (3a) of at least one candidate process variable (3) by removing, from this predicted evolution (3a), at least one suspected anomaly; • simulating (135) the behavior (2a) of the industrial process (2) given this modified evolution (3a*) of the at least one candidate process variable (3); • determining (136), based on the outcome (2a) of this simulation, whether the given alarm (4a) is still likely to be raised; and • if the given alarm (4) is no longer likely to be raised, identifying (137) the candidate process variable (3) as one process variable (3*) whose predicted evolution over time is responsible for the likely raising of the given alarm (4).
- The method (100) of any one of claims 1 to 9, further comprising: in response to determining that an alarm (4a) is likely to be raised, • determining (140) a remedy (4b) for the anomalous state of the industrial process, and/or of the industrial plant, indicated by the alarm (4a); and • executing (150) this remedy (4b) by means of physical interaction with the industrial process (2), and/or with the industrial plant (1); and/or • proposing (160) this remedy (4b) to an operator (O) of the industrial plant (1).
- The method (100) of any one of claims 1 to 10, further comprising: • rating (170), by a predetermined loss function (9), how well predictions (3a) for values of one or more process variables (3) at particular future points in time are in agreement with actual values of these process variables (3) acquired later for these points in time; and • optimizing (180) parameters (10) that characterize how the future values (3a) of one or more process variables (3) are predicted towards the goal of improving the rating (9a) by the loss function (9).
- The method (100) of any one of claims 1 to 11, further comprising: • rating (190), by a predetermined loss function (11), how well predicted likely alarms (4a) for time frames in the future correspond to alarms (4) that have later actually been raised in these time frames; and • optimizing (200) parameters (12) that characterize how likely alarms (4a) are predicted towards the goal of improving the rating (11a) by the loss function (11).
- The method (100) of claim 12, wherein, out of the complete set of all alarms (4) available in the industrial plant (1), only a given subset (4c) of important alarms (4) is considered (191) for the rating (11a) by the loss function (11).
- The method (100) of any one of claims 11 to 13, wherein the optimizing (180, 200) of the parameters (10, 12) is further based (181, 201) on feedback (F) from an operator (O) of the industrial plant (1) as to which process variables (3) are relevant for the likelihood of a particular alarm (4) or not.
- The method (100) of any one of claims 1 to 14, further comprising: determining (210), based at least in part on the identification of at least one process variable (3*) whose predicted evolution over time is responsible for the likely raising of an alarm (4a), a likely root cause (4a*) for the alarm (4a) in at least one asset (8) of the industrial plant (1).
- The method (100) of any one of claims 1 to 15, wherein the future values (3a) of one or more process variables (3), and/or the alarms (4a) that are likely to occur, are predicted (123, 138) for a temporal horizon reaching at most 10 minutes, preferably at most 5 minutes, into the future.
- A computer program, comprising machine-readable instructions that, when executed by one or more computers and/or compute instances, cause the one or more computers and/or compute instances to perform the method (100) of any one of claims 1 to 16.
- A non-transitory machine-readable data carrier, and/or a download product, with the computer program of claim 17.
- One or more computers and/or compute instances with the computer program of claim 17, and/or with the machine-readable data carrier and/or download product of claim 18.
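The three steps of claim 1, together with the counterfactual check of claim 9, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: a linear extrapolation stands in for the sequence-to-sequence model, a toy pressure formula stands in for the classifier and the process simulation, and the names `forecast`, `predicted_alarms`, `responsible_variables`, `PRESSURE_LIMIT`, `temperature` and `feed_flow` are hypothetical.

```python
from typing import Dict, List, Set

Series = Dict[str, List[float]]

PRESSURE_LIMIT = 10.0  # hypothetical alarm limit for the toy process


def forecast(history: Series, horizon: int) -> Series:
    """Stand-in for step 110: extrapolate each variable linearly
    from its last two samples (assumption, not a trained model)."""
    predictions = {}
    for name, values in history.items():
        slope = values[-1] - values[-2]
        predictions[name] = [values[-1] + slope * (k + 1) for k in range(horizon)]
    return predictions


def predicted_alarms(predictions: Series) -> Set[str]:
    """Stand-in for step 120: evaluate a toy alarm rule on the predictions.
    The pressure formula is an invented example of a priori knowledge."""
    pressure = [
        0.02 * t + 0.5 * f
        for t, f in zip(predictions["temperature"], predictions["feed_flow"])
    ]
    return {"HIGH_PRESSURE"} if max(pressure) > PRESSURE_LIMIT else set()


def responsible_variables(history: Series, predictions: Series, alarm: str) -> List[str]:
    """Stand-in for step 130, in the spirit of claim 9: hold one candidate
    variable at its last observed value, re-evaluate the alarm rule, and
    report the variable if the alarm is no longer predicted."""
    responsible = []
    for name in predictions:
        modified = dict(predictions)
        modified[name] = [history[name][-1]] * len(predictions[name])
        if alarm not in predicted_alarms(modified):
            responsible.append(name)
    return responsible


if __name__ == "__main__":
    history = {
        "temperature": [300.0, 300.0, 300.0],  # steady
        "feed_flow": [6.0, 6.5, 7.0],          # rising trend
    }
    predictions = forecast(history, horizon=5)
    for alarm in predicted_alarms(predictions):
        print(alarm, responsible_variables(history, predictions, alarm))
```

Holding a variable at its last observed value is only one simple way to "remove a suspected anomaly" from a predicted evolution. A real system would more plausibly re-run a process simulation or the trained classifier on the modified prediction.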
Description
FIELD OF THE INVENTION
The invention relates to the management of alarms that accrue in distributed control systems, DCS, of industrial plants during the execution of industrial processes.

BACKGROUND
Industrial plants executing industrial processes are controlled by distributed control systems, DCS. A DCS allows an operator of the plant to make a decision regarding the running of the plant on a rather abstract level, and translates this decision into setpoints for low-level controllers and other instructions for the assets that are physically involved in the execution of the industrial process.

In a large industrial plant, it is inevitable that not everything works as designed or planned at all times. Therefore, a DCS usually has an alarm system for reporting everything in the industrial plant that is out of the ordinary. Such an alarm system helps plant operators to keep execution of the process stable and safe.

In a large DCS, a large number of alarms will accrue. This creates a large amount of "noise", or "nuisance alarms". It is difficult for a plant operator to filter the really important alarms that require prompt action out of this "noise", which may cause critical alarms to be overlooked.

To aid the operator in noticing important alarms, machine learning models may be used to predict, based on the data available in the industrial plant, important alarms that are likely to be raised. This can give the operator an advance warning of the important alarm, so that, if this alarm is actually raised as predicted, the operator may take immediate action.

When an important alarm is predicted by a machine learning system, this may come as a complete surprise to the operator. The reason for the prediction may not be evident to the operator. This may cause the operator to dismiss the prediction as a false prediction, negating the effect of the advance warning.
OBJECTIVE OF THE INVENTION
It is therefore an objective of the invention to improve the explainability of predictions of alarms that will likely be raised in an industrial plant executing an industrial process. This objective is achieved by a method according to the independent claim. Further advantageous embodiments are detailed in the dependent claims.

DISCLOSURE OF THE INVENTION
The invention provides a method for monitoring and/or controlling an industrial process that is executed on an industrial plant. The industrial plant may be composed of many different assets that participate in the execution of the industrial process. All these assets may be under the control of a distributed control system.

The distributed control system may monitor many process variables that characterize the state, and/or the behavior, of the industrial process. Each process variable may relate to one or more individual assets of the industrial plant. Examples of such process variables are quantities measured by sensors or other measuring instruments in the plant, such as a temperature, a pressure, a voltage, a mass flow, a fill level, or the chemical composition of a mixture of substances. But process variables may also relate to the industrial process, and/or to the industrial plant, as a whole. Examples of such process variables are the total throughput of the industrial process, or an overall energy usage of the industrial plant.

In the course of the method, predictions for future values of one or more process variables that characterize the state, and/or the behavior, of the industrial process are obtained. These predictions may be obtained in any suitable manner. In particular, the predictions may be obtained using a machine learning model. But alternatively or in combination with this, other sources may be used.
For example, if a formula or other law for the dependency of a process variable on some other quantity is known, then this formula or other law may be used in the prediction as well. That is, any available a priori knowledge may be put to use to create or improve the predictions for future values of one or more process variables.

A trainable regression model that has been trained on training examples of time series or other sequences of values of process variables is one example of a further model that may predict future values of process variables. A simulation model is another example of a model that may be used. In particular, a simulation model may incorporate a lot of a priori knowledge about the industrial process, and/or about the industrial plant.

Based at least in part on these predictions for future values, a prediction is determined as to which one or more alarms from a given set of possible alarms are likely to be raised within a predetermined time frame reaching into the future. This given set of possible alarms may comprise the complete set of alarms available in the industrial plant. But in order to highlight only important alarms, the given set of possible alarms may just as well be limited to these important alarms only. For predicting likely a