KR-102962926-B1 - SYSTEM AND METHOD FOR TRAINING ARTIFICIAL INTELLIGENCE MODEL

Abstract

An electronic device for training an artificial intelligence model is disclosed. A method by which the disclosed electronic device trains an artificial intelligence model comprises: obtaining unlabeled data points, that is, data points to which no label indicating an attribute of the data point has been assigned, from a dataset containing a plurality of data points; selecting a predetermined number of candidate data points from among the unlabeled data points; outputting an input request for labeling the selected candidate data points together; obtaining labeled candidate data points by labeling the selected candidate data points based on a user input responding to the input request; and updating the artificial intelligence model using the labeled candidate data points.
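The steps in the abstract can be read as one round of batch active learning. The following is a minimal sketch of such a round, not the patented implementation: the function name `active_learning_round` and the three callbacks (`select`, `request_labels`, `update`) are hypothetical placeholders introduced only for illustration.

```python
from typing import Any, Callable, List, Sequence, Tuple


def active_learning_round(
    unlabeled: Sequence[Any],
    select: Callable[[Sequence[Any], int], List[int]],
    request_labels: Callable[[List[Any]], List[Any]],
    update: Callable[[List[Tuple[Any, Any]]], None],
    batch_size: int,
) -> Tuple[List[Tuple[Any, Any]], List[Any]]:
    """Run one select -> label together -> update round (illustrative only)."""
    # Select a predetermined number of candidate data points (as indices).
    candidates = select(unlabeled, batch_size)

    # Issue ONE input request covering all candidates, rather than one per point.
    labels = request_labels([unlabeled[i] for i in candidates])

    # Pair each candidate with the label supplied by the user.
    labeled = [(unlabeled[i], y) for i, y in zip(candidates, labels)]

    # Update the model with the whole labeled batch at once.
    update(labeled)

    # Everything that was not selected stays in the unlabeled pool.
    chosen = set(candidates)
    remaining = [x for j, x in enumerate(unlabeled) if j not in chosen]
    return labeled, remaining
```

Passing the labeling step as a single callback over the whole batch mirrors the "labeling the selected candidate data points together" wording; how the request is actually presented to the user is not specified here.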

Inventors

  • 이병찬
  • 나리스닉 에릭
  • 고든 조나단
  • 허난데즈 로바토 호세 미구엘
  • 핀슬러 로버트

Assignees

  • 삼성전자주식회사 (Samsung Electronics Co., Ltd.)

Dates

Publication Date
2026-05-08
Application Date
2019-08-30

Claims (20)

  1. A method for training an artificial intelligence model by an electronic device, the method comprising: obtaining unlabeled data points, to which no label indicating an attribute of the data point has been assigned, from a dataset containing a plurality of data points; obtaining data point vectors corresponding to the unlabeled data points by converting the contribution of each of the unlabeled data points to an update of the artificial intelligence model into a vector in a vector space; obtaining a vector representing the dataset from the data point vectors; selecting a predetermined number of data point vectors based on the angle between the data point vectors and the vector representing the dataset; determining, among the unlabeled data points, the data points corresponding to the selected data point vectors as candidate data points; outputting an input request for labeling the candidate data points together; obtaining labeled candidate data points by labeling the candidate data points based on a user input responding to the input request; and updating the artificial intelligence model using the labeled candidate data points.
  2. The method of claim 1, wherein the artificial intelligence model predicts the label of an unlabeled data point using a parameter having a probability distribution, and wherein updating the artificial intelligence model comprises updating the probability distribution of the parameter using the labeled candidate data points.
  3. (deleted)
  4. (deleted)
  5. (deleted)
  6. (deleted)
  7. (deleted)
  8. (deleted)
  9. (deleted)
  10. (deleted)
  11. The method of claim 1, wherein updating the artificial intelligence model comprises: adding the labeled candidate data points to the dataset all at once; and updating the artificial intelligence model by inputting the labeled candidate data points into the artificial intelligence model together with other previously labeled data points included in the dataset.
  12. An electronic device for training an artificial intelligence model, comprising: a memory storing one or more instructions; and a processor configured to execute the one or more instructions, wherein the processor, by executing the one or more instructions: obtains unlabeled data points, to which no label indicating an attribute of the data point has been assigned, from a dataset containing a plurality of data points; obtains data point vectors corresponding to the unlabeled data points by converting the contribution of each of the unlabeled data points to an update of the artificial intelligence model into a vector in a vector space; obtains a vector representing the dataset from the data point vectors; selects a predetermined number of data point vectors based on the angle between the data point vectors and the vector representing the dataset; determines, among the unlabeled data points, the data points corresponding to the selected data point vectors as candidate data points; outputs an input request for labeling the candidate data points together; obtains labeled candidate data points by labeling the candidate data points based on a user input responding to the input request; and updates the artificial intelligence model using the labeled candidate data points.
  13. The electronic device of claim 12, wherein the artificial intelligence model predicts the label of an unlabeled data point using a parameter having a probability distribution, and wherein the processor, by executing the one or more instructions, updates the probability distribution of the parameter using the labeled candidate data points.
  14. (deleted)
  15. (deleted)
  16. (deleted)
  17. (deleted)
  18. (deleted)
  19. (deleted)
  20. (deleted)
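
The vector-selection steps recited in claims 1 and 12 can be illustrated with a deliberately simplified sketch. Two things here are assumptions rather than anything the claims state: the vector representing the dataset is taken to be the element-wise sum of the data point vectors, and "based on the angle" is read as "rank by cosine similarity to that vector and keep the top k". The procedure shown in the patent's figures may differ. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: Vector) -> float:
    return math.sqrt(dot(a, a))


def dataset_vector(vectors: Sequence[Vector]) -> List[float]:
    # ASSUMPTION: represent the dataset by the element-wise sum of all vectors.
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) for d in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    # Cosine of the angle between a and b; 0.0 if either vector is zero.
    denom = norm(a) * norm(b)
    return dot(a, b) / denom if denom > 0 else 0.0


def select_by_angle(vectors: Sequence[Vector], k: int) -> List[int]:
    """Return indices of the k vectors with the smallest angle to the dataset vector."""
    target = dataset_vector(vectors)
    # ASSUMPTION: a smaller angle (larger cosine) means a better candidate.
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(vectors[i], target),
        reverse=True,
    )
    return ranked[:k]
```

The returned indices identify which unlabeled data points become candidate data points. How each data point's contribution to the model update is converted into a vector is left open in this sketch.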

Description

System and Method for Training Artificial Intelligence Model

The present disclosure relates to a system and method for training an artificial intelligence model, and more specifically, to an electronic device and method for training an artificial intelligence model using unlabeled data points.

Artificial intelligence (AI) systems are computer systems that implement human-level intelligence. Unlike existing rule-based smart systems, they learn, make judgments, and become smarter on their own. Because the recognition rate of AI systems improves and they come to understand user preferences more accurately with continued use, existing rule-based smart systems are gradually being replaced by deep learning-based AI systems.

Artificial intelligence technology consists of machine learning (deep learning) and elemental technologies that utilize machine learning. Machine learning is an algorithmic technology that classifies and learns the features of input data on its own. Elemental technology mimics functions of the human brain, such as cognition and judgment, by utilizing machine learning algorithms such as deep learning, and covers technical fields such as linguistic understanding, visual understanding, reasoning/prediction, knowledge representation, and motion control.

Meanwhile, machine learning generally requires training data that contains correct information about the attributes of the data. A labeling process is therefore required to assign such information to the input data.

Active learning is a machine learning method that can query external information sources, such as users, about unlabeled data. When a large number of unlabeled data points are provided, the repeated process of selecting data points to label, labeling them, and training an AI model with the labeled data points consumes significant time and computing resources. Therefore, to perform active learning efficiently, it is necessary to optimize the process of selecting data points to label.

FIG. 1 is a schematic diagram of a system for training an artificial intelligence model according to one embodiment.
FIG. 2 is a block diagram of an apparatus for training an artificial intelligence model using unlabeled data points according to one embodiment.
FIG. 3 is a flowchart of a method for training an artificial intelligence model using unlabeled data points according to one embodiment.
FIG. 4 is a flowchart of a method for selecting a predetermined number of candidate data points among unlabeled data points according to one embodiment.
FIG. 5 is a flowchart of a method for selecting a predetermined number of data point vectors among data point vectors according to one embodiment.
FIGS. 6a to 6f are drawings illustrating a method for selecting a predetermined number of data point vectors among data point vectors according to one embodiment.
FIG. 7 is a block diagram of an apparatus for training an artificial intelligence model using unlabeled data points according to one embodiment.
FIG. 8 is a flowchart of a method for training an artificial intelligence model using unlabeled data points according to one embodiment.
FIG. 9 is a block diagram of an electronic device according to one embodiment.

The terms used in this specification will be briefly explained, and the present disclosure will then be described in detail.

The terms used in this disclosure are, as far as possible, general terms in wide current use, selected in view of their functions within this disclosure; however, they may vary depending on the intent of those skilled in the art, legal precedent, the emergence of new technologies, and so on. In specific cases, terms have also been selected at the applicant's discretion, and in such cases their meanings are described in detail in the relevant part of the description. Therefore, terms used in this disclosure should be defined not merely by their names, but based on their meanings and the overall content of this disclosure.

When a part of the specification is described as "including" a certain component, this means that, unless specifically stated otherwise, other components are not excluded and may additionally be included. Furthermore, terms such as "…part" or "module" as used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Functions related to artificial intelligence according to the present disclosure are performed through a processor and memory. The processor may consist of one or more processors. In this case, the one or more processors may be general-purpose processors such as CPUs, APs, and DSPs (Digital Signal Processors), graphics-dedicated processors such as GPUs and VPUs (Vision Processing Units), or artificial intelligence-dedicated processors such as NPUs. The one or more processors may be con