
CN-122021777-A - Semi-supervised training of machine learning models using label guesses

CN 122021777 A

Abstract

The application relates to semi-supervised training of machine learning models using label guesses. Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model. One of the methods includes receiving an unlabeled batch and a labeled batch, and generating a processed unlabeled batch and a processed labeled batch from the unlabeled batch and the labeled batch, the generating including, for each unlabeled training input of the plurality of unlabeled training inputs: generating a plurality of augmented unlabeled training inputs from the unlabeled training input; processing each of the augmented unlabeled training inputs using the machine learning model to generate a respective model output for each of the augmented unlabeled training inputs; generating a guessed model output from the model outputs of the augmented unlabeled training inputs; and associating the guessed model output with each of the augmented unlabeled training inputs. The method further includes training the machine learning model on the processed labeled batch and the processed unlabeled batch.

Inventors

  • David Berthelot

Assignees

  • Google LLC

Dates

Publication Date
2026-05-12
Application Date
2020-05-06
Priority Date
2019-05-06

Claims (12)

  1. A method of training a machine learning model having a plurality of parameters to perform a machine learning task, wherein the machine learning model is configured to receive an input and process the input according to the parameters to generate a model output, the method comprising: receiving an unlabeled batch comprising a plurality of unlabeled training inputs; receiving a labeled batch comprising a plurality of labeled training inputs and, for each labeled training input, a ground truth output that should be generated by the machine learning model by performing the particular machine learning task on the labeled training input; generating a processed unlabeled batch and a processed labeled batch from the unlabeled batch and the labeled batch, the generating comprising: for each unlabeled training input of the plurality of unlabeled training inputs: generating a plurality of augmented unlabeled training inputs from the unlabeled training input, and associating a guessed model output with each of the augmented unlabeled training inputs; and for each of the plurality of labeled training inputs: generating an augmented labeled training input from the labeled training input, and associating the augmented labeled training input with the ground truth output of the labeled training input; and training the machine learning model on the processed labeled batch and the processed unlabeled batch to adjust the current values of the parameters.
  2. The method of claim 1, wherein generating a processed unlabeled batch and a processed labeled batch from the unlabeled batch and the labeled batch further comprises, for each unlabeled training input of the plurality of unlabeled training inputs: processing each of the augmented unlabeled training inputs according to the current values of the parameters using the machine learning model to generate a respective model output for each augmented unlabeled training input; generating the guessed model output from the model outputs of the augmented unlabeled training inputs; and associating the guessed model output with each of the augmented unlabeled training inputs.
  3. The method of claim 1, wherein the input of the machine learning model is an image and the model output is a probability distribution over a set of object classes.
  4. The method of claim 1, wherein the input of the machine learning model is one or more video frames and the model output is a probability distribution over a set of object classes or a set of topics.
  5. The method of claim 1, wherein the input of the machine learning model is text and the model output is a probability distribution over a set of topics.
  6. The method of claim 1, wherein the input of the machine learning model is an audio signal and the model output is a probability distribution over a set of pieces of natural language text.
  7. The method of claim 2, wherein generating the guessed model output from the model outputs of the augmented unlabeled training inputs comprises: calculating an average of the model outputs of the augmented unlabeled training inputs.
  8. The method of claim 7, wherein generating the guessed model output from the model outputs of the augmented unlabeled training inputs further comprises: applying a sharpening function to the average of the model outputs to reduce the uncertainty of the average.
  9. The method of claim 2, wherein generating a processed unlabeled batch and a processed labeled batch from the unlabeled batch and the labeled batch further comprises: generating, for each particular augmented labeled input and associated ground truth output, a processed labeled input associated with a processed ground truth output, comprising: selecting an input-output pair from the set of (i) augmented labeled inputs and associated ground truth outputs and (ii) augmented unlabeled inputs and associated guessed outputs; performing a convex combination of the augmented labeled input and the input in the selected pair to generate a processed input; performing a convex combination of the ground truth output associated with the augmented labeled input and the output in the selected pair to generate a processed output; and associating the processed input with the processed output.
  10. The method of claim 2, wherein generating a processed unlabeled batch and a processed labeled batch from the unlabeled batch and the labeled batch further comprises: generating, for each particular augmented unlabeled input and associated guessed output, a processed unlabeled input associated with a processed guessed output, comprising: selecting an input-output pair from the set of (i) augmented labeled inputs and associated ground truth outputs and (ii) augmented unlabeled inputs and associated guessed outputs; performing a convex combination of the augmented unlabeled input and the input in the selected pair to generate a processed input; performing a convex combination of the guessed output associated with the augmented unlabeled input and the output in the selected pair to generate a processed output; and associating the processed input with the processed output.
  11. A system comprising one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform the operations of the respective method of any one of claims 1-10.
  12. One or more computer storage media storing instructions that, when executed by one or more computers, cause the one or more computers to perform the operations of the respective method of any one of claims 1-10.
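Claims 7-10 describe two operations in detail: sharpening the averaged model outputs, and forming convex combinations of input-output pairs. The patent does not fix a particular sharpening function; the sketch below is one common choice (temperature sharpening), with function names and the temperature parameter `T` chosen here purely for illustration:

```python
import numpy as np

def sharpen(p, T=0.5):
    """Sharpen a probability distribution to reduce its uncertainty (claim 8).

    Temperature sharpening is one common choice: raise each probability
    to the power 1/T (with T < 1) and renormalize, which concentrates
    mass on the most likely classes.
    """
    q = np.asarray(p, dtype=float) ** (1.0 / T)
    return q / q.sum()

def convex_combine(pair_a, pair_b, lam):
    """Convex combination of two (input, output) pairs (claims 9 and 10).

    lam is the mixing coefficient in [0, 1]; both the inputs and the
    associated outputs are mixed with the same coefficient.
    """
    (xa, ya), (xb, yb) = pair_a, pair_b
    x = lam * xa + (1.0 - lam) * xb
    y = lam * ya + (1.0 - lam) * yb
    return x, y
```

For example, `sharpen([0.6, 0.4], T=0.5)` renormalizes the squared probabilities and returns approximately `[0.69, 0.31]`, a less uncertain distribution than the input.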

Description

Semi-supervised training of machine learning models using label guesses

Description of the Division

This application is a divisional application of Chinese patent application 202080033626.6, filed May 6, 2020.

Cross Reference to Related Applications

This application claims priority from U.S. provisional patent application No. 62/843,806, filed May 6, 2019, the entire contents of which are incorporated herein by reference.

Technical Field

The present description relates to training machine learning models.

Background

A machine learning model receives input and generates an output, e.g., a predicted output, based on the received input and on the parameter values of the model. Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input. Some neural networks include one or more hidden layers in addition to the output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer. Each layer of the network generates an output from the received input based on the current values of a respective set of parameters.

Disclosure of Invention

The present specification describes a system, implemented as computer programs on one or more computers in one or more locations, that trains a machine learning model through semi-supervised learning to perform a machine learning task, i.e., by training the machine learning model on training data that includes both unlabeled training inputs and labeled training inputs. Labeled training inputs are inputs for which a ground truth output is available, i.e., the output that should be generated by the machine learning model by performing the particular machine learning task on the labeled training input. Unlabeled training inputs are training inputs for which ground truth outputs are not available.
The system trains the machine learning model in part by generating guessed model outputs for the unlabeled training inputs in the training data. To generate a guessed model output, the system generates a plurality of augmented unlabeled training inputs from the unlabeled training input. The system then processes the plurality of augmented unlabeled training inputs using the machine learning model to generate a respective model output for each of the augmented unlabeled training inputs. The system then generates the guessed model output from the respective model outputs of the augmented unlabeled training inputs.

Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. The described system is able to use limited labeled data to train a machine learning model to perform well on a machine learning task. In particular, by using "label guesses", i.e., guessed model outputs generated for unlabeled training inputs, the described system is able to train a machine learning model to high performance at lower ratios of labeled data to unlabeled data than conventional techniques. Given the same amount of labeled and unlabeled data, the system is able to train the machine learning model to better accuracy than conventional techniques. As a specific example, the described techniques can be used to train a machine learning model to achieve state-of-the-art performance on various image classification tasks. Additionally, the system can train the machine learning model to be robust to input variability, e.g., to effectively handle variations in the input. For example, a machine learning model that has been trained in accordance with the described techniques will be able to effectively classify input images even when the images have occlusions or blur, different degrees of skew, different degrees of rotation, and so on.
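The label-guessing procedure above can be sketched as follows. Here `model` and `augment` stand in for the classifier and the stochastic augmentation function (neither is specified in code by the patent), and `K`, the number of augmented copies, is an illustrative choice:

```python
import numpy as np

def guess_label(unlabeled_input, model, augment, K=2):
    """Generate a guessed model output for an unlabeled training input.

    Creates K augmented copies of the input, runs the model on each copy,
    and averages the resulting predicted distributions to form the guess.
    """
    outputs = [model(augment(unlabeled_input)) for _ in range(K)]
    return np.mean(outputs, axis=0)
```

The averaged guess can then be associated with each of the K augmented copies as their training target, and, per the description, optionally sharpened before use.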
The details of one or more embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

Drawings

FIG. 1 illustrates an example machine learning model training system. FIG. 2 is a flow chart of an example process for training a machine learning model. FIG. 3A is a flow chart of an example process for training a machine learning model on a batch of unlabeled training inputs and a batch of labeled training inputs. FIG. 3B is a schematic diagram showing the generation of a processed unlabeled batch. FIG. 4 shows the performance of the described technique relative to other semi-supervised learning techniques. Like reference numbers and designations in the various drawings indicate like elements.

Detailed Description

FIG. 1 illustrates an example machine learning model training system 100. The machine learning model training system 100 is an example of a system implemented as a computer program