JP-7855450-B2 - Unknown data detection device, method, and program

Inventors

  • 上松 和樹
  • 春木 耕祐
  • 木村 光宏
  • 中川 英之
  • 瀧本 崇博

Assignees

  • 株式会社東芝

Dates

Publication Date: 2026-05-08
Application Date: 2022-07-26
Priority Date: 2021-10-06

Claims (12)

  1. An unknown data detection device comprising: a data acquisition unit that acquires data to be monitored; an intermediate output calculation unit that applies a trained model to the monitored data and calculates an intermediate output; a noise impact evaluation unit that calculates the degree of noise impact of the intermediate output in an arbitrary layer using only those parameters, among the parameters constituting the trained model, that contribute highly to the task; and a discrimination unit that determines whether the monitored data is unknown data based on the degree of noise impact, wherein the noise impact evaluation unit comprises a projection component calculation unit that calculates, as the degree of noise impact, the projection component of the intermediate output onto the parameters constituting the trained model, and a projection matrix determination unit that determines a projection matrix based on a matrix decomposition of the parameters, and the projection component calculation unit calculates the projection component by applying the projection matrix to the intermediate output.
  2. The unknown data detection device according to claim 1, wherein the projection matrix determination unit calculates the projection matrix by singular value decomposition of the weight parameters constituting the trained model.
  3. The unknown data detection device according to claim 2, wherein the projection matrix determination unit removes, from among the matrix components included in the projection matrix, the matrix components with small singular values as components with low contribution to the task.
  4. The unknown data detection device according to claim 3, wherein the projection matrix determination unit searches for matrix components that satisfy a predetermined condition based on the change in the performance of the trained model on the task when the position and/or number of the matrix components included in the projection matrix is changed.
  5. The unknown data detection device according to claim 1, wherein the discrimination unit converts the degree of noise impact into a one-dimensional discrimination variable and determines whether the monitored data is unknown data based on a comparison between the discrimination variable and a threshold.
  6. The unknown data detection device according to claim 5, wherein the discrimination unit treats, as unknown data, those data among a plurality of training data that were not used to train the trained model, and sets the threshold using that unknown data.
  7. The unknown data detection device according to claim 6, wherein the discrimination unit sets the threshold so that outliers among the training data used to train the trained model are classified as unknown data.
  8. The unknown data detection device according to claim 5, wherein the discrimination unit calculates, as the discrimination variable, the norm of the degree of noise impact or the ratio of the norm of the intermediate output to the norm of the degree of noise impact.
  9. The unknown data detection device according to claim 5, wherein the discrimination unit determines a threshold for each layer of the trained model and determines whether the monitored data is unknown data based on a comparison between the discrimination variable and the threshold for each layer.
  10. The unknown data detection device according to claim 9, wherein the discrimination unit determines the threshold based on the rank of the parameters for each layer.
  11. An unknown data detection method in which a computer performs: an acquisition step of acquiring data to be monitored; an intermediate output calculation step of applying a trained model to the monitored data and calculating an intermediate output; a noise impact evaluation step of calculating the degree of noise impact of the intermediate output in an arbitrary layer using only those parameters, among the parameters constituting the trained model, that contribute highly to the task; and a discrimination step of determining whether the monitored data is unknown data based on the degree of noise impact, wherein the noise impact evaluation step comprises a projection component calculation step of calculating, as the degree of noise impact, the projection component of the intermediate output onto the parameters constituting the trained model, and a projection matrix determination step of determining a projection matrix based on a matrix decomposition of the parameters, and the projection component calculation step calculates the projection component by applying the projection matrix to the intermediate output.
  12. An unknown data detection program that causes a computer to implement: an acquisition function of acquiring data to be monitored; an intermediate output calculation function of applying a trained model to the monitored data and calculating an intermediate output; a noise impact evaluation function of calculating the degree of noise impact of the intermediate output in an arbitrary layer using only those parameters, among the parameters constituting the trained model, that contribute highly to the task; and a discrimination function of determining whether the monitored data is unknown data based on the degree of noise impact, wherein the noise impact evaluation function comprises a projection component calculation function of calculating, as the degree of noise impact, the projection component of the intermediate output onto the parameters constituting the trained model, and a projection matrix determination function of determining a projection matrix based on a matrix decomposition of the parameters, and the projection component calculation function calculates the projection component by applying the projection matrix to the intermediate output.
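
As a rough illustration of the projection-based evaluation recited in claims 1 to 3, 5, and 8, the following Python sketch builds a projection matrix from the singular value decomposition of one layer's weights, keeps only the components with the largest singular values, and turns the projection component into a one-dimensional discrimination variable. It is a minimal sketch under stated assumptions, not the patented implementation: the function names, the use of right singular vectors, the number of kept components `k`, and the rule that a larger ratio indicates unknown data are all hypothetical choices made for this example.

```python
import numpy as np


def projection_matrix(weight: np.ndarray, k: int) -> np.ndarray:
    """Projector onto the span of the k right singular vectors of `weight`
    with the largest singular values (assumed here to be the
    high-contribution components)."""
    # weight has shape (out_features, in_features). The rows of vt are the
    # right singular vectors, sorted by descending singular value.
    _, _, vt = np.linalg.svd(weight, full_matrices=False)
    v_k = vt[:k].T          # shape (in_features, k)
    return v_k @ v_k.T      # shape (in_features, in_features)


def discrimination_variable(z: np.ndarray, proj: np.ndarray) -> float:
    """Ratio of the norm of the intermediate output z to the norm of its
    projection component proj @ z."""
    component = proj @ z
    # The small constant only guards against division by zero.
    return float(np.linalg.norm(z) / (np.linalg.norm(component) + 1e-12))


def is_unknown(z: np.ndarray, proj: np.ndarray, threshold: float) -> bool:
    # Assumption for this sketch: a larger ratio means less of z lies in the
    # kept subspace, which is treated as evidence of unknown data.
    return discrimination_variable(z, proj) > threshold


rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 16))      # hypothetical layer weights
proj = projection_matrix(weight, k=4)

z_inside = proj @ rng.standard_normal(16)  # lies entirely in the kept subspace
z_random = rng.standard_normal(16)         # generic intermediate output

ratio_inside = discrimination_variable(z_inside, proj)
ratio_random = discrimination_variable(z_random, proj)
print(ratio_inside, ratio_random)
```

Because an orthogonal projection cannot increase a vector's norm, the ratio computed this way is never meaningfully below 1, and it stays near 1 only when the intermediate output lies almost entirely in the kept subspace. In practice the threshold would be set from held-out data, as claims 6 and 7 describe.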

Description

Embodiments of the present invention relate to an unknown data detection device, method, and program.
The performance of machine learning depends heavily not only on the model used but also on the training dataset and on the data encountered in operation. For example, if the input data distribution changes while a system is in operation, the trained model may not perform as initially expected because of the difference between the datasets. Performance therefore degrades as the input data distribution drifts away from the training data distribution over time. In particular, deep learning models, which have been adopted rapidly in recent years, have been reported to behave on data from a completely different dataset (OOD: Out-Of-Distribution) much as they do on the training data. For example, a deep neural network (DNN) model trained on a classification task should ideally assign a low classification probability to every class for an OOD sample, but in practice it outputs a high classification probability that is not significantly different from that for training data, which makes unknown data difficult to detect.
Various approaches have been taken to improve the accuracy of OOD detection. Non-patent document 1 approximates, with a Gaussian distribution, the intermediate outputs obtained from each intermediate layer of the model when training data is input, and performs OOD detection using the Mahalanobis distance from the center of each class as the indicator.
Non-patent document 1: Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin, “A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks,” in Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018).
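
For orientation, the Mahalanobis-distance indicator attributed to Non-patent document 1 can be sketched in Python as follows. This is a simplified, single-layer illustration under stated assumptions: the function names and the synthetic features are hypothetical, a single covariance matrix is shared across classes, and the score is simply the distance to the closest class center. The cited method additionally combines scores from several layers, which is not shown here.

```python
import numpy as np


def fit_class_gaussians(features: np.ndarray, labels: np.ndarray):
    """Estimate one mean per class and a covariance shared by all classes."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center every sample on the mean of its own class.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    precision = np.linalg.pinv(cov)   # pseudo-inverse tolerates a singular cov
    return means, precision


def mahalanobis_score(x: np.ndarray, means: np.ndarray, precision: np.ndarray) -> float:
    """Mahalanobis distance from x to the closest class center."""
    diff = means - x                  # shape (num_classes, dim)
    d2 = np.einsum("cd,de,ce->c", diff, precision, diff)
    return float(np.sqrt(max(d2.min(), 0.0)))


# Synthetic stand-in for intermediate outputs of two classes (hypothetical data).
rng = np.random.default_rng(0)
features = np.concatenate([
    rng.normal(0.0, 1.0, (200, 4)),
    rng.normal(5.0, 1.0, (200, 4)),
])
labels = np.repeat([0, 1], 200)

means, precision = fit_class_gaussians(features, labels)
near = mahalanobis_score(np.zeros(4), means, precision)       # close to class 0
far = mahalanobis_score(np.full(4, 50.0), means, precision)   # far from both classes
print(near, far)
```

A sample is flagged as OOD when its score exceeds a threshold. Note that this approach stores a mean for every class and a covariance matrix for every layer that is scored.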
  • An example configuration of the unknown data detection device according to this embodiment.
  • An example configuration of the noise impact evaluation unit for the method that evaluates the degree of noise impact by adding noise.
  • An example configuration of the noise impact evaluation unit for the method that evaluates the degree of noise impact by calculating projection components.
  • A schematic diagram illustrating the processing of unknown data.
  • An example of the processing in the evaluation phase.
  • The relationship between the network structure of a trained model and its intermediate outputs.
  • A schematic illustration of the data processing of the noise impact evaluation unit in the method that evaluates the degree of noise impact by adding noise.
  • A schematic illustration of the data processing of the noise impact evaluation unit in the method that evaluates the degree of noise impact by calculating projection components.
  • An example of the processing in the preprocessing phase.
  • A schematic diagram illustrating the unknown data detection process according to Application Example 1.
  • The relationship between layer depth and stable rank in Application Example 2.
  • Performance metrics for each unknown data (OOD) detection method.
  • The memory usage of the method using the Mahalanobis distance and of the method according to this embodiment.
The unknown data detection device, method, and program according to this embodiment will be described below with reference to the drawings.
Figure 1 shows an example configuration of the unknown data detection device 100 according to this embodiment. As shown in Figure 1, the unknown data detection device 100 is a computer having a processing circuit 1, a storage device 2, an input device 3, a communication device 4, and a display device 5.
Data communication between the processing circuit 1, the storage device 2, the input device 3, the communication device 4, and the display device 5 is performed via a bus. The processing circuit 1 includes a processor such as a CPU (Central Processing Unit) and memory such as RAM (Random Access Memory). The processing circuit 1 has an acquisition unit 11, an intermediate output calculation unit 12, a noise impact evaluation unit 13, a discrimination unit 14, and an output control unit 15, and implements each of the units 11 to 15 by executing an unknown data detection program. The unknown data detection program is stored in a non-transitory computer-readable recording medium such as the storage device 2. The program may be implemented as a single program describing all the functions of the units 11 to 15, or as multiple modules divided by function. The units 11 to 15 may also be implemented using integrated circuits such as Application Specific Integrated Circuits (ASICs), in which case they may be mounted on a single integrated circuit or individually on multiple integrated circuits. The acquisition unit 11 acquires the trained model. The trained model is a deep learning model whose parameters have