WO-2026094103-A1 - ACTIVE LEARNING SYSTEM AND METHOD

WO 2026094103 A1

Abstract

Provided is technology related to active learning whereby high performance (in other words, high accuracy of a model) can be achieved regardless of the number of possible annotations, and annotation cost can be reduced. A computer of an active learning system includes: a model inference unit (301) that acquires a result of inference by a machine learning model on the basis of unlabeled data that is input data; a certainty factor calculation unit (304) that calculates a certainty factor related to the unlabeled data on the basis of the result of inference; a prototypicality calculation unit (302) that calculates a prototypicality related to the unlabeled data on the basis of the result of inference; a low certainty factor data selection unit (306) that selects low certainty factor data on the basis of the certainty factor related to the unlabeled data; a high prototypicality data selection unit (305) that selects high prototypicality data on the basis of the prototypicality related to the unlabeled data; and a data determination unit (307) that selects annotation target data from the selected low certainty factor data.

Inventors

  • OKANO, Takuya
  • HAYAKAWA, Miki
  • MINEKAWA, Yohei

Assignees

  • Hitachi High-Tech Corporation (株式会社日立ハイテク)

Dates

Publication Date
2026-05-07
Application Date
2024-10-28

Claims (10)

  1. An active learning system comprising a computer for actively training a machine learning model, wherein the computer comprises: a model inference unit that obtains an inference result from the machine learning model based on unlabeled data that is input data; a confidence calculation unit that calculates a confidence level for the unlabeled data based on the inference result; a typicality calculation unit that calculates a typicality for the unlabeled data based on the inference result; a low-confidence data selection unit that selects low-confidence data based on the confidence level of the unlabeled data; a high-typicality data selection unit that selects high-typicality data based on the typicality of the unlabeled data; and a data determination unit that selects annotation target data from the selected low-confidence data and the selected high-typicality data.
  2. The active learning system according to claim 1, wherein the data determination unit selects the low-confidence data when the confidence level for the high-typicality data is equal to or higher than a threshold, and selects the high-typicality data when the confidence level for the high-typicality data is lower than the threshold.
  3. The active learning system according to claim 1, wherein the confidence calculation unit comprises a plurality of confidence calculation modules that use different confidence calculation methods, and a confidence integration unit that combines the plurality of confidence levels calculated by the plurality of confidence calculation modules into a single integrated confidence level.
  4. The active learning system according to claim 3, wherein the confidence calculation unit has a classification confidence calculation module as one of the plurality of confidence calculation modules, and the classification confidence calculation module calculates a first provisional label and a first confidence level as a classification confidence, using the classification probability based on the inference result.
  5. The active learning system according to claim 4, wherein the confidence calculation unit has a cluster confidence calculation module as one of the plurality of confidence calculation modules, and the cluster confidence calculation module clusters the features obtained as the inference result, and calculates a second provisional label and a second confidence level as a cluster confidence, based on the relative distance between each clustered feature and the center of each cluster.
  6. The active learning system according to claim 5, wherein, when the first provisional label is denoted y1, the first confidence level c1, the second provisional label y2, and the second confidence level c2, the confidence integration unit calculates the confidence level c obtained by combining the first and second confidence levels by the following formula: c = δ(y1, y2)・c1・c2, where δ is an indicator function that returns 1 if y1 and y2 are identical and 0 otherwise.
  7. The active learning system according to claim 1, wherein the computer further comprises a clustering unit that clusters the unlabeled data based on the inference result; the low-confidence data selection unit selects low-confidence data for each cluster of the clustering result, based on the confidence level of the unlabeled data; the high-typicality data selection unit selects high-typicality data for each cluster of the clustering result, based on the typicality of the unlabeled data; and the data determination unit selects the annotation target data for each cluster.
  8. The active learning system according to claim 1, further comprising an annotation screen that displays, based on the annotation target data selected by the data determination unit, the data to be annotated together with the selection result indicating whether it was selected from the low-confidence data or the high-typicality data, and the confidence level or the typicality.
  9. An active learning method performed in an active learning system equipped with a computer for actively training a machine learning model, the method comprising, as steps performed by the computer: a model inference step of obtaining an inference result from the machine learning model based on unlabeled data that is input data; a confidence calculation step of calculating a confidence level for the unlabeled data based on the inference result; a typicality calculation step of calculating a typicality for the unlabeled data based on the inference result; a low-confidence data selection step of selecting low-confidence data based on the confidence level of the unlabeled data; a high-typicality data selection step of selecting high-typicality data based on the typicality of the unlabeled data; and a data determination step of selecting annotation target data from the selected low-confidence data and the selected high-typicality data.
  10. The active learning method according to claim 9, wherein the data determination step selects the low-confidence data when the confidence level for the high-typicality data is equal to or higher than a threshold, and selects the high-typicality data when the confidence level for the high-typicality data is lower than the threshold.
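The selection logic of claims 2 and 6 can be sketched in a few lines. This is a minimal illustration, not the patent's implementation; all function and variable names (`integrate_confidence`, `choose_annotation_target`, `threshold`, etc.) are assumptions for illustration only.

```python
def integrate_confidence(y1, c1, y2, c2):
    # Claim 6: c = delta(y1, y2) * c1 * c2, where delta is an indicator
    # function returning 1 if the two provisional labels agree, else 0.
    delta = 1 if y1 == y2 else 0
    return delta * c1 * c2

def choose_annotation_target(low_conf_item, high_typ_item, high_typ_conf, threshold):
    # Claim 2: take the low-confidence data when the high-typicality data's
    # confidence is at or above the threshold, otherwise take the
    # high-typicality data itself.
    return low_conf_item if high_typ_conf >= threshold else high_typ_item

print(round(integrate_confidence("cat", 0.9, "cat", 0.8), 2))  # labels agree: 0.72
print(integrate_confidence("cat", 0.9, "dog", 0.8))            # labels disagree: 0
print(choose_annotation_target("x_low", "x_typ", 0.3, 0.5))    # x_typ
```

The product form of claim 6 means a disagreement between the classification-based and cluster-based provisional labels forces the integrated confidence to zero, pushing such data toward annotation.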

Description

Active learning system and method

This disclosure relates to machine learning techniques, particularly active learning.

In supervised machine learning, reducing annotation cost is a particularly important challenge. Improving the performance of a machine learning model (i.e., its inference accuracy) requires large amounts of data, but annotating (labeling) all of this data is extremely costly. The challenge is even more serious in specialized fields such as manufacturing and healthcare, where accurate labeling demands enormous cost and time.

One method for reducing annotation cost is active learning. Active learning is a technique that preferentially selects data useful for training, interactively queries the user with it, and has the user annotate it. In active learning, the system first selects, from the unlabeled data, multiple data items useful for improving the performance of the machine learning model. The number of data items that can be selected at this time is referred to as the number of annotatable data items, and the data selected (i.e., the data used to query the user) is referred to as the query. Next, the system interactively presents this query to the user. The user makes a judgment on the query and annotates it. The system adds the annotated data to the labeled data, then trains the machine learning model using the updated labeled data. By repeating this process, the model can be optimized while keeping annotation cost down.

A common active learning method prioritizes data for which the machine learning model's confidence is low. However, simply prioritizing low-confidence data can bias the query toward similar data, and may fail to reduce annotation cost.
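The query-annotate-retrain cycle described above can be sketched as follows. This is a generic illustration of the loop, not the patent's method; the trainer, the confidence score, and names such as `query_size` and `oracle` are stand-ins chosen for the example.

```python
import random

def train(labeled):
    # Stand-in trainer: a real system would fit a model on the labeled set.
    return {"seen": len(labeled)}

def confidence(model, x):
    # Stand-in confidence score in [0, 1]; deterministic for the demo.
    random.seed(x + model["seen"])
    return random.random()

def active_learning(unlabeled, oracle, rounds=3, query_size=2):
    labeled = []
    model = train(labeled)
    for _ in range(rounds):
        # Query: the lowest-confidence samples, up to the annotatable budget.
        query = sorted(unlabeled, key=lambda x: confidence(model, x))[:query_size]
        for x in query:
            labeled.append((x, oracle(x)))  # the user annotates the query
            unlabeled.remove(x)
        model = train(labeled)              # retrain on the grown labeled set
    return model, labeled

model, labeled = active_learning(list(range(10)), oracle=lambda x: x % 2)
print(len(labeled))  # 3 rounds x 2 queries = 6 annotated samples
```

Even this toy loop shows the cost lever: only 6 of the 10 samples are ever annotated, with the model deciding which ones.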
Therefore, Patent Document 1 describes a method for correcting data bias while prioritizing low-confidence data: "The analysis unit calculates the confidence level of each estimation result of a machine learning model that estimates, from a predetermined set of classification destinations, the classification to which each of multiple unsupervised training data items belongs. The selection unit clusters the training data into multiple clusters and, after clustering, selects one or more training data items from those with the lowest confidence levels."

Brief Description of the Drawings

  • A diagram showing the configuration of the active learning system in this embodiment (Embodiment 1).
  • An example configuration of the active learning system (computer system) in this embodiment.
  • The flow of the active learning method using the active learning system of this embodiment.
  • An example configuration of the data selection unit 109 in this embodiment.
  • The processing flow of the data selection unit 109 in this embodiment.
  • An example configuration of the confidence calculation unit 304 in this embodiment.
  • An explanatory diagram of the confidence level calculated by the cluster confidence calculation module 502 in this embodiment.
  • A diagram illustrating the effect of the confidence integration unit 503 in this embodiment.
  • A schematic diagram showing an example of the initial setup screen in this embodiment.
  • A schematic diagram showing an example of the annotation screen in this embodiment.
  • An example hardware configuration of the active learning system (computer system) in this embodiment.

The embodiments of this disclosure will be described in detail below with reference to the drawings. In the drawings, the same parts are generally denoted by the same reference numerals, and repeated descriptions are omitted.
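The cluster-then-select idea quoted from Patent Document 1 (and echoed in claim 7) can be sketched compactly: group the unlabeled data by cluster and take the least-confident item from each group, so the query is spread across regions of the data. This is an illustrative sketch with assumed names; cluster assignments are taken as given rather than computed.

```python
from collections import defaultdict

def select_per_cluster(samples, n_per_cluster=1):
    # samples: list of (cluster_id, confidence, data) triples.
    by_cluster = defaultdict(list)
    for cid, conf, x in samples:
        by_cluster[cid].append((conf, x))
    query = []
    for cid, items in by_cluster.items():
        items.sort()  # ascending confidence within the cluster
        query.extend(x for _, x in items[:n_per_cluster])
    return query

samples = [(0, 0.9, "a"), (0, 0.2, "b"), (1, 0.5, "c"), (1, 0.4, "d")]
print(select_per_cluster(samples))  # ['b', 'd'] -> one low-confidence item per cluster
```

Selecting per cluster rather than globally is what prevents the query from collapsing onto many near-duplicates of a single hard example.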
In the explanations, when processing is described with a program, function, or processing unit as the subject, the principal hardware component is the processor, or the controller, device, computer, or system that includes the processor. The computer executes processing according to a program read into memory by the processor, using resources such as memory and communication interfaces as appropriate, thereby realizing the specified functions and processing units. The processor is composed of semiconductor devices such as a CPU/MPU or GPU. Processing is not limited to software program processing; it can also be implemented with dedicated circuits, such as FPGAs, ASICs, and CPLDs. The program may be installed in advance as data on the target computer, or may be distributed as data to the target computer from a program source. The program source may be a program distribution server on a communication network, or a non-transitory, computer-readable storage medium such as a memory card or disk. The program may consist of multiple modules. The computer system may consist of multiple devices, and may be a client-server system, a cloud computing system, an IoT system, or the like. Various types of data and information are str