US-12626104-B2 - Methods of training variational autoencoders to recognize anomalous data in distributed systems
Abstract
Methods of training a variational autoencoder (VAE) to recognize anomalous data in a distributed system are provided. Input image data representative of devices/processes in a distributed system is provided to an encoder of a VAE on a processor. The input image data is compressed, via the processor, using a first plurality of weights with the encoder. A normal distribution of the compressed image data is created in a latent space of the VAE. The compressed image data from the latent space is decompressed using a second plurality of weights with a decoder of the VAE. The decompressed image data from the decoder is optimized. At least the first and second plurality of weights are updated, via the processor, based on the loss detected in the optimized decompressed image data. The above steps are iterated until the decompressed image data possesses substantially the same statistical properties as the input image data.
Inventors
- Mark Rahmes
- Kevin Fox
- Jennifer Spicer
- Shoaib Shaikh
- Macaulay Osaisai
- Michael Fischer
- Ziad Chaudhry
Assignees
- L3HARRIS TECHNOLOGIES, INC.
Dates
- Publication Date
- 20260512
- Application Date
- 20220203
Claims (16)
- 1. A method of training a variational autoencoder to recognize anomalous data in a distributed system, comprising the step of: a) providing input image data representative of one or more devices or processes in the distributed system to be analyzed to an encoder of a variational autoencoder on a processor; b) compressing, via the processor, the input image data using a first plurality of weights with the encoder; c) creating, via the processor, a normal distribution of the compressed image data in a latent space of the variational autoencoder, wherein the latent space includes an n-D point scatter plot; d) decompressing, via the processor, the compressed image data from the latent space using a second plurality of weights with a decoder of the variational autoencoder; e) optimizing, via the processor, the decompressed image data from the decoder; and f) updating, via the processor, at least the first and second plurality of weights based on loss detected in the optimized decompressed image data, wherein the optimizing step e) further comprises the steps of: generating, via the processor, a plurality of probabilistic models of the decoded image data; and determining, via the processor, which of the plurality of models is optimal, and wherein the step of determining which of the plurality of models is optimal further comprises the steps of: applying a game theoretic optimization to the plurality of models; and selecting which of the plurality of models to use to generate the n-D point scatter plot in latent space.
- 2. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 1, wherein the latent space includes a 3-D point scatter plot that includes hidden vector values.
- 3. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 1, wherein the updating step f) further comprises the steps of: applying a game theoretic optimization to the plurality of models; and selecting which of the plurality of models to use to update the first and second plurality of weights.
- 4. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 1, wherein the plurality of models includes at least two of Adam, SGDM, or RMSProp.
- 5. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 1, further comprising the step of preprocessing the decompressed image prior to optimizing the decompressed image data via an image gradient Sobel edge detector.
- 6. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 5, further comprising the step of returning a floating-point edge metric via the image gradient Sobel edge detector.
- 7. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 3, further comprising the step of g) iterating at least steps b)-f) until the decompressed image data possesses substantially the same statistical properties as the input image data.
- 8. A method of training a variational autoencoder to recognize anomalous data in a distributed system according to claim 1, wherein the distributed system includes at least one of an antenna system, an air traffic control system, or a computer network.
- 9. A non-transitory computer-readable storage medium, comprising one or more programs for executing a model of training a variational autoencoder to recognize anomalous data in a distributed system, wherein the model is configured to: a) provide image input data representative of one or more devices or processes in the distributed system to be analyzed to an encoder of a variational autoencoder on a processor; b) compress, via the processor, the image input data using a first plurality of weights with the encoder; c) create, via the processor, a normal distribution of the compressed image data in a latent space of the variational autoencoder, wherein the latent space includes an n-D point scatter plot; d) decompress, via the processor, the compressed image data from the latent space using a second plurality of weights with a decoder of the variational autoencoder; e) optimize, via the processor, the decompressed image data from the decoder; and f) update, via the processor, at least the first and second plurality of weights based on the loss detected in the optimized decompressed image data, wherein the model is configured to optimize, via the processor, the decompressed image data from the decoder by being further configured to: generate, via the processor, a plurality of probabilistic models of the decoded image data; and determine, via the processor, which of the plurality of models is optimal, and wherein the model is further configured to: apply a game theoretic optimization to the plurality of models; and select which of the plurality of models to use to generate the n-D point scatter plot in latent space.
- 10. A non-transitory computer-readable storage medium according to claim 9, wherein the latent space includes a 3-D point scatter plot that includes hidden vector values.
- 11. A non-transitory computer-readable storage medium according to claim 9, wherein the model is further configured to: apply a game theoretic optimization to the plurality of models; and select which of the plurality of models to use to update the first and second plurality of weights.
- 12. A non-transitory computer-readable storage medium according to claim 9, wherein the plurality of models includes at least two of Adam, SGDM, or RMSProp.
- 13. A non-transitory computer-readable storage medium according to claim 9, the model being further configured to preprocess the decompressed image prior to optimizing the decompressed image data via an image gradient Sobel edge detector.
- 14. A non-transitory computer-readable storage medium according to claim 13, the model being further configured to return a floating-point edge metric via the image gradient Sobel edge detector.
- 15. A non-transitory computer-readable storage medium according to claim 11, the model being further configured to: g) iterate at least steps b)-f) until the decompressed image data possesses substantially the same statistical properties as the input image data.
- 16. A non-transitory computer-readable storage medium according to claim 9, wherein the distributed system includes at least one of an antenna system, an air traffic control system, or a computer network.
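Claims 5, 6, 13, and 14 recite preprocessing the decompressed image with an image-gradient Sobel edge detector that returns a floating-point edge metric. The claims do not define the metric itself, so the sketch below assumes one plausible definition (the mean gradient magnitude over the image interior); the function and kernel names are illustrative, not taken from the patent:

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve2d(image, kernel):
    """'Valid' 2-D sliding-window correlation (sufficient for a
    magnitude-based metric, where kernel flipping does not matter)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def sobel_edge_metric(image):
    """Return a single floating-point edge metric: the mean Sobel
    gradient magnitude over the image interior."""
    gx = convolve2d(image, SOBEL_X)
    gy = convolve2d(image, SOBEL_Y)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return float(magnitude.mean())
```

A uniform image yields a metric of zero, while an image with a sharp intensity step yields a strictly positive value, which is the behavior such a preprocessing step would rely on.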
Description
BACKGROUND OF THE INVENTION

Field of the Invention

The invention is directed to variational autoencoders (VAEs). More specifically, the invention is directed to methods for efficiently and effectively training VAEs to recognize anomalous data, e.g., in networked devices and/or distributed systems such as antenna systems, air traffic control systems, computer networks, and the like.

Description of Related Art

Machine learning (ML) and artificial intelligence (AI) often require the use of vast amounts of training data for scoring an ML model. In many cases, the data needed for this work is either hard to acquire or has access restrictions due to the sensitivity of the data (e.g., asset locations, proprietary messages within a data stream, configuration of devices including Internet Protocol (IP) addresses, personally identifiable information, etc.). There has been an ongoing debate within the AI research community about best practices for protecting sensitive data, and synthetic data generation is the preeminent solution. It is, however, difficult to synthetically create or augment an existing dataset in a way that still retains the underlying statistical properties of the original "truth" dataset. The ability to synthesize sensitive datasets will enable the use of more accessible compute resources that are currently not permitted. Additionally, the ability to synthesize data will facilitate more robust validation testing of machine learning models, which is essential for safety-critical applications. Data for edge or corner cases may not be available but could easily be generated synthetically to provide a more robust training set. Similarly, anomalies could be created synthetically and injected into test and validation datasets for enhanced sensitivity analysis. For various ML use cases, including prediction, anomaly detection, and others, training data is either non-existent, sensitive in nature, or insufficient for modeling purposes.
Current data generation techniques are overly simplistic and select observations to duplicate at random. This leads to missed opportunities in the training framework for edge cases and validation testing. There are currently no other viable techniques for creating statistically sound training datasets that can either augment existing datasets or create new ones altogether. Current approaches to synthetic data generation include rudimentary methods such as random oversampling to augment training datasets. For an imbalanced dataset, the minority sample observations are randomly oversampled; this oversampling copies or duplicates the entire feature set of the minority class. While this leads to a more balanced dataset, the quality of the data remains in question, as it does not provide a statistically meaningful representation of the original dataset as a whole.

In the area of anomaly detection, one example is a network or array of antennae. If each antenna is inspected by comparing current data of, e.g., coverage, with known normal data, one can find anomalies. Unfortunately, some antenna networks/arrays are huge and complex, e.g., comprising thousands of antennae. Comparing normal data against current data for thousands of antennae is impractical. Due to the cumbersome nature of this task, months often go by without anyone looking for (let alone finding) anomalies.

One tool that can be of use in ML/AI endeavors is the variational autoencoder. VAEs, like other autoencoders, each include an encoder, a decoder, and a latent space. In a typical autoencoder, the encoder learns to compress (reduce) the input data into an encoded representation, the decoder learns to reconstruct the original data from the encoded representation as closely as possible to the original input, and the latent space is the layer that contains the compressed representation of the input data.
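The encoder-decoder structure just described can be illustrated with a minimal linear autoencoder. This is an illustrative sketch only, not the claimed method: the toy data, the dimensions, and the plain gradient-descent update are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for flattened image data: 200 samples lying on a 2-D
# subspace of R^8, so a 2-D latent space can represent them well.
basis = rng.normal(size=(2, 8))
X = rng.normal(size=(200, 2)) @ basis           # shape (200, 8)

d, k = 8, 2                                      # input dim, latent dim
W_enc = rng.normal(scale=0.1, size=(d, k))       # encoder weights ("first plurality")
W_dec = rng.normal(scale=0.1, size=(k, d))       # decoder weights ("second plurality")

def reconstruction_loss(X, W_enc, W_dec):
    # Mean over samples of the squared reconstruction error
    return float(np.mean(np.sum((X - X @ W_enc @ W_dec) ** 2, axis=1)))

initial_loss = reconstruction_loss(X, W_enc, W_dec)

lr = 0.01
for _ in range(5000):
    Z = X @ W_enc                                # compress into latent space
    X_hat = Z @ W_dec                            # decompress / reconstruct
    g = 2 * (X_hat - X) / len(X)                 # d(loss)/d(X_hat)
    g_dec = Z.T @ g                              # gradient w.r.t. decoder weights
    g_enc = X.T @ (g @ W_dec.T)                  # gradient w.r.t. encoder weights
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)
```

After training, the reconstruction loss is substantially lower than at initialization, which is the "learns to reconstruct" behavior described above; a VAE replaces the fixed latent code Z with a learned probability distribution.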
VAEs differ from regular autoencoders in that they do not use the encoding-decoding process simply to reconstruct an input. Instead, they impose a probability distribution on the latent space and learn that distribution so that the distribution of outputs from the decoder matches that of the observed data. They then sample from this distribution to generate new data. A VAE assumes that the source data has some underlying probability distribution (such as a Gaussian) and then attempts to find the parameters of that distribution. A variational autoencoder is a generative system and serves a similar purpose to a generative adversarial network. One main use of a variational autoencoder is to generate new data related to the original source data; in the case of the instant invention, the new data is used for additional training and testing analysis. There is a long-felt need for an effective way to generate synthetic data for training a VAE in applications such as anomaly detection, and for an effective way to generate synthetic data that maintains the underlying statistical properties of the original dataset.
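The distributional behavior described above, sampling the latent code via the reparameterization trick and penalizing divergence from a standard normal prior, can be sketched as follows. The linear encoder/decoder weights, the Gaussian prior, and all names are illustrative assumptions, not the claimed method:

```python
import numpy as np

rng = np.random.default_rng(1)

def vae_forward(x, W_mu, W_logvar, W_dec):
    """One forward pass of a minimal VAE.

    Returns the reconstruction and the training loss
    (reconstruction error + KL divergence from the N(0, I) prior).
    """
    mu = x @ W_mu                            # mean of the latent Gaussian
    log_var = x @ W_logvar                   # log-variance of the latent Gaussian
    eps = rng.normal(size=mu.shape)          # noise drawn outside the network
    z = mu + np.exp(0.5 * log_var) * eps     # reparameterization trick
    x_hat = z @ W_dec                        # decode a *sample*, not a fixed code
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1), per sample
    kl = np.mean(-0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    return x_hat, float(recon + kl)

# Example: 4-D inputs, 2-D latent space
x = rng.normal(size=(10, 4))
W_mu = rng.normal(scale=0.1, size=(4, 2))
W_logvar = np.zeros((4, 2))                  # log_var = 0 -> unit variance
W_dec = rng.normal(scale=0.1, size=(2, 4))
x_hat, loss = vae_forward(x, W_mu, W_logvar, W_dec)
```

Because the decoder sees a sample from the latent distribution rather than a deterministic code, minimizing this loss forces the decoder's output distribution toward that of the observed data, which is what makes the trained VAE usable for generating new, statistically similar data.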