US-12626094-B2 - Anomaly detection in time-based events via pattern mining

Abstract

Anomaly detection in neural networks is provided. The method comprises extracting, from different layers of a recurrent neural network (RNN) for a specified time interval, a number of on-the-fly node activations produced by sequences of event data from a number of data sources and sensors. The method references known node activations of the RNN produced by normal sequences of event data, and for each layer of the RNN, calculates a maximum nonparametric divergence of the on-the-fly node activations from the known node activations. For each layer of the RNN, the method determines a subset of nodes that most contribute to the maximum nonparametric divergence for that layer for a given time window and identifies data sources or sensors from among the number of data sources and sensors responsible for activating the subset of nodes that most contribute to the maximum nonparametric divergence for each layer.

Inventors

Celia Cintas
Girmaw Abebe Tadesse
Skyler Speakman
Komminist Weldemariam

Assignees

INTERNATIONAL BUSINESS MACHINES CORPORATION

Dates

Publication Date: 2026-05-12
Application Date: 2023-06-20

Claims (20)

1. A computer-implemented method of anomaly detection in neural networks, the method comprising: extracting, from different layers of a recurrent neural network (RNN) for a specified time interval, a number of on-the-fly node activations produced by sequences of event data from a number of data sources and sensors; referencing known node activations of the RNN produced by normal sequences of event data; for each layer of the RNN, calculating a maximum nonparametric divergence of the on-the-fly node activations from the known node activations; for each layer of the RNN, determining a subset of nodes that most contribute to the maximum nonparametric divergence for that layer for a given time window; and identifying data sources or sensors from among the number of data sources and sensors responsible for activating the subset of nodes that most contribute to the maximum nonparametric divergence for each layer.
2. The method of claim 1, wherein the recurrent neural network comprises one of: a recurrent autoencoder; a gated recurrent network; or a long short-term memory network.
3. The method of claim 1, wherein the number of on-the-fly node activations corresponds to sequential overlapping series of events for a defined time window size.
4. The method of claim 1, wherein the data sources comprise multi-modal data sources.
5. The method of claim 1, further comprising identifying a subset of layers with the RNN that contribute most to an overall maximum deviation of the whole RNN.
6. The method of claim 1, wherein the data sources are integrated at different layers of the RNN.
7. The method of claim 1, further comprising using input perturbations to enhance detection of anomalous samples among the on-the-fly node activations.
8. A system for anomaly detection in neural networks, the system comprising: a storage device that stores program instructions; one or more processors operably connected to the storage device and configured to execute the program instructions to cause the system to: extracting, from different layers of a recurrent neural network (RNN) for a specified time interval, a number of on-the-fly node activations produced by sequences of event data from a number of data sources and sensors; referencing known node activations of the RNN produced by normal sequences of event data; for each layer of the RNN, calculating a maximum nonparametric divergence of the on-the-fly node activations from the known node activations; for each layer of the RNN, determining a subset of nodes that most contribute to the maximum nonparametric divergence for that layer for a given time window; and identifying data sources or sensors from among the number of data sources and sensors responsible for activating the subset of nodes that most contribute to the maximum nonparametric divergence for each layer.
9. The system of claim 8, wherein the recurrent neural network comprises one of: a recurrent autoencoder; a gated recurrent network; or a long short-term memory network.
10. The system of claim 8, wherein the number of on-the-fly node activations corresponds to sequential overlapping series of events for a defined time window size.
11. The system of claim 8, wherein the data sources comprise multi-modal data sources.
12. The system of claim 8, wherein the program instructions further cause the system to identify a subset of layers with the RNN that contribute most to an overall maximum deviation of the whole RNN.
13. The system of claim 8, wherein the data sources are integrated at different layers of the RNN.
14. The system of claim 8, wherein the program instructions further cause the system to use input perturbations to enhance detection of anomalous samples among the on-the-fly node activations.
15. A computer program product for anomaly detection in neural networks, the computer program product comprising: a persistent storage medium having program instructions configured to cause one or more processors to: extracting, from different layers of a recurrent neural network (RNN) for a specified time interval, a number of on-the-fly node activations produced by sequences of event data from a number of data sources and sensors; referencing known node activations of the RNN produced by normal sequences of event data; for each layer of the RNN, calculating a maximum nonparametric divergence of the on-the-fly node activations from the known node activations; for each layer of the RNN, determining a subset of nodes that most contribute to the maximum nonparametric divergence for that layer for a given time window; and identifying data sources or sensors from among the number of data sources and sensors responsible for activating the subset of nodes that most contribute to the maximum nonparametric divergence for each layer.
16. The computer program product of claim 15, wherein the recurrent neural network comprises one of: a recurrent autoencoder; a gated recurrent network; or a long short-term memory network.
17. The computer program product of claim 15, wherein the number of on-the-fly node activations corresponds to sequential overlapping series of events for a defined time window size.
18. The computer program product of claim 15, wherein the data sources comprise multi-modal data sources.
19. The computer program product of claim 15, further comprising instructions for identifying a subset of layers with the RNN that contribute most to an overall maximum deviation of the whole RNN.
20. The computer program product of claim 15, wherein the data sources are integrated at different layers of the RNN.
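
Illustrative note (not part of the claims): claims 3, 10, and 17 refer to sequential overlapping series of events for a defined time window size. One plausible way to form such series is a sliding window over an ordered event stream. The function name `overlapping_windows` and the `stride` parameter below are assumptions made for this sketch and do not appear in the patent text.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def overlapping_windows(events: Sequence[T],
                        window_size: int,
                        stride: int = 1) -> List[List[T]]:
    """Split an ordered event sequence into fixed-size windows.

    Consecutive windows overlap whenever stride < window_size.
    Both names are hypothetical; the source only mentions a
    "defined time window size".
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    last_start = len(events) - window_size
    return [
        list(events[start:start + window_size])
        for start in range(0, last_start + 1, stride)
    ]


# Five events with a window of three give three overlapping series.
print(overlapping_windows(["e1", "e2", "e3", "e4", "e5"], window_size=3))
```

Each resulting window would then be fed to the RNN so that node activations can be extracted per window.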

Description

BACKGROUND
The present disclosure relates generally to an improved computing system, and more specifically to identifying anomalies in artificial neural networks.
Many typical modern life functions, ranging from banking systems to utility consumption control, rely on a series of heterogeneous computing systems. Anomaly detection and the identification of early indicators of system malfunction, such as system failures, abnormalities, and carbon emissions, are critical components of building a robust and sustainable system. The primary purpose of a system log or sensor readings is to record system states at various essential points to help monitor and detect these malfunctions, provide recommendations for asset maintenance, or perform root cause analysis. These types of countermeasure solutions reduce system or asset unavailability, critical failures, and the associated carbon footprint through effective asset utilization for energy production. As systems and applications become increasingly complex (code, dependency interactions, or sensor data complexity), they are subject to more bugs and vulnerabilities.
SUMMARY
An illustrative embodiment provides a method of anomaly detection in neural networks. The method comprises extracting, from different layers of a recurrent neural network (RNN) for a specified time interval, a number of on-the-fly node activations produced by sequences of event data from a number of data sources and sensors. The method references known node activations of the RNN produced by normal sequences of event data, and for each layer of the RNN, calculates a maximum nonparametric divergence of the on-the-fly node activations from the known node activations.
For each layer of the RNN, the method determines a subset of nodes that most contribute to the maximum nonparametric divergence for that layer for a given time window and identifies data sources or sensors from among the number of data sources and sensors responsible for activating the subset of nodes that most contribute to the maximum nonparametric divergence for each layer. According to other illustrative embodiments, a computer system and a computer program product for anomaly detection in neural networks are provided. The features and functions can be achieved independently in various embodiments of the present disclosure or may be combined in yet other embodiments, further details of which can be seen with reference to the following description and drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The novel features believed characteristic of the illustrative embodiments are set forth in the appended claims. The illustrative embodiments, however, as well as a preferred mode of use, further objectives and features thereof, will best be understood by reference to the following detailed description of an illustrative embodiment of the present disclosure when read in conjunction with the accompanying drawings, wherein:
FIG. 1 depicts a pictorial representation of a computing environment in which illustrative embodiments may be implemented;
FIG. 2 depicts a block diagram for anomaly detection in accordance with an illustrative embodiment;
FIG. 3 depicts an overview of anomaly detection in accordance with an illustrative embodiment;
FIG. 4 depicts a flowchart of a process for anomaly detection in neural networks in accordance with an illustrative embodiment; and
FIG. 5 depicts a flowchart of a process for anomaly scoring for temporal samples in accordance with an illustrative embodiment.
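
As an illustration only, the per-layer scoring outlined in the Summary could be sketched as follows. The text above does not name a particular divergence, so this sketch assumes one common nonparametric choice: empirical p-values of each node's activation against the known (normal) activations, followed by a Berk-Jones-style score maximised over node subsets. All function names, the `alpha_max` parameter, and the toy data are hypothetical, and attributing the selected nodes back to data sources or sensors is left out because the text above does not say how that mapping is made.

```python
import math
from typing import Dict, List, Sequence, Tuple


def empirical_p_values(background: Sequence[Sequence[float]],
                       observed: Sequence[float]) -> List[float]:
    """Per-node one-sided empirical p-value (assumed formulation).

    background[j] holds node j's activations on normal sequences;
    observed[j] is node j's on-the-fly activation for one window.
    """
    p_values = []
    for node_background, value in zip(background, observed):
        at_least = sum(1 for b in node_background if b >= value)
        p_values.append((at_least + 1) / (len(node_background) + 1))
    return p_values


def max_divergence_subset(p_values: Sequence[float],
                          alpha_max: float = 0.5) -> Tuple[float, List[int]]:
    """Maximise a Berk-Jones-style score over subsets of nodes.

    For a threshold alpha, the best subset is every node with
    p <= alpha, and its score reduces to |subset| * log(1 / alpha).
    Each distinct p-value up to alpha_max is tried as alpha.
    """
    order = sorted(range(len(p_values)), key=lambda j: p_values[j])
    best_score, best_size = 0.0, 0
    for k, j in enumerate(order, start=1):
        alpha = p_values[j]
        if alpha > alpha_max:
            break
        if k < len(order) and p_values[order[k]] == alpha:
            continue  # evaluate each tie group once, at its last member
        score = k * math.log(1.0 / alpha)
        if score > best_score:
            best_score, best_size = score, k
    return best_score, sorted(order[:best_size])


def scan_layers(background_by_layer: Dict[str, Sequence[Sequence[float]]],
                observed_by_layer: Dict[str, Sequence[float]]
                ) -> Dict[str, Tuple[float, List[int]]]:
    """Return (maximum score, contributing node indices) for each layer."""
    return {
        layer: max_divergence_subset(
            empirical_p_values(background_by_layer[layer], observed))
        for layer, observed in observed_by_layer.items()
    }


# Toy data: one layer with three nodes and four normal activations each.
normal = [[0.1, 0.2, 0.3, 0.4]] * 3
result = scan_layers({"layer_1": normal}, {"layer_1": [0.9, 0.25, 0.95]})
print(result)
```

In this toy input, nodes 0 and 2 exceed every normal activation and are returned as the contributing subset, while node 1 falls within the normal range and is excluded.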
DETAILED DESCRIPTION
Various aspects of the present disclosure are described by narrative text, flowcharts, block diagrams of computer systems and/or block diagrams of the machine logic included in computer program product (CPP) embodiments. With respect to any flowcharts, depending upon the technology involved, the operations can be performed in a different order than what is shown in a given flowchart. For example, again depending upon the technology involved, two operations shown in successive flowchart blocks may be performed in reverse order, as a single integrated step, concurrently, or in a manner at least partially overlapping in time.
A computer program product embodiment (“CPP embodiment” or “CPP”) is a term used in the present disclosure to describe any set of one, or more, storage media (also called “mediums”) collectively included in a set of one, or more, storage devices that collectively include machine readable code corresponding to instructions and/or data for performing computer operations specified in a given CPP claim. A “storage device” is any tangible device that can retain and store instructions for use by a computer processor. Without limitation, the computer readable storage medium may be an electronic storage medium, a magnetic storage medium, an optical storage medium, an electromagnetic storage medium, a semiconductor storage medium, a mechanical storage medium, or any suitable combination of the foregoing. Some known types of storage devi