EP-4738196-A1 - CONTROL PROGRAM, CONTROL METHOD, AND INFORMATION PROCESSING DEVICE

EP 4738196 A1

Abstract

A control program that causes a computer to execute a process, the process including: generating an intermediate representation corresponding to input data, the intermediate representation being generated using a trained model; training a distribution of latent representations corresponding to the generated intermediate representation, using a predetermined encoder generating the latent representation from the intermediate representation and a predetermined decoder corresponding to the predetermined encoder and generating an intermediate representation different from the intermediate representation; selecting, from the trained distribution, a sample of the latent representation based on a probability distribution; generating, using the predetermined decoder, a new intermediate representation corresponding to the selected sample; and generating, using the trained model, output data corresponding to the generated new intermediate representation.
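The five steps of the claimed process can be illustrated with a minimal numpy sketch. Everything here is a hypothetical stand-in, not the patent's actual models: `trained_model_encode`/`trained_model_decode` represent the frozen trained model, and the encoder/decoder pair is reduced to plain linear maps whose weights would in practice be learned as described in the claims.

```python
import numpy as np

rng = np.random.default_rng(0)

D, Z = 8, 2                      # intermediate-representation and latent sizes
W_enc = rng.normal(size=(Z, D))  # encoder weights (would be learned)
W_dec = rng.normal(size=(D, Z))  # decoder weights (would be learned)

def trained_model_encode(x):
    # Frozen trained model mapping input data to an intermediate
    # representation; its parameters are never changed.
    return np.tanh(x)

def trained_model_decode(h):
    # Frozen trained-model head producing output data from an
    # intermediate representation (a scalar here, for illustration).
    return h.sum()

def encoder(h):
    return W_enc @ h             # latent representation

def decoder(z):
    return W_dec @ z             # new intermediate representation

x = rng.normal(size=D)           # input data
h = trained_model_encode(x)      # 1. generate intermediate representation
z_mean = encoder(h)              # 2. (after training) latent distribution
z = z_mean + rng.normal(size=Z)  # 3. sample from the latent distribution
h_new = decoder(z)               # 4. generate new intermediate representation
y = trained_model_decode(h_new)  # 5. generate output data via the frozen model
```

The key design point of the abstract is visible in the sketch: only the small encoder/decoder pair is trained, while novelty in the output comes from sampling the latent distribution and pushing the decoded intermediate representation back through the unchanged trained model.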

Inventors

  • YOSHIKAWA, HIYORI
  • TOMA, MITSUNORI
  • YAMAZAKI, KIMIHIRO
  • WADA, YUICHIRO
  • WADA, MUTSUYO
  • WAIDA, HIROKI
  • ISHII, YOSHIYUKI
  • KATOH, TAKASHI
  • NAKAGAWA, AKIRA

Assignees

  • FUJITSU LIMITED

Dates

Publication Date
2026-05-06
Application Date
2025-10-24

Claims (10)

  1. A computer-readable recording medium storing therein a control program that causes a computer to execute a process, the process comprising: generating an intermediate representation corresponding to input data, the intermediate representation being generated using a trained model; training a distribution of latent representations corresponding to the generated intermediate representation, using a predetermined encoder generating the latent representation from the intermediate representation and a predetermined decoder corresponding to the predetermined encoder and generating an intermediate representation different from the intermediate representation; selecting, from the trained distribution, a sample of the latent representation based on a probability distribution; generating, using the predetermined decoder, a new intermediate representation corresponding to the selected sample; and generating, using the trained model, output data corresponding to the generated new intermediate representation.
  2. The control program according to claim 1, wherein the training includes: employing training data used in training the trained model as the input data; and training the distribution of the latent representation according to the predetermined encoder and the predetermined decoder, using an objective function that includes a likelihood of output data corresponding to the input data, a reconstruction error of the intermediate representation corresponding to the input data, and a KL divergence related to the distribution of the latent representation.
  3. The control program according to claim 2, wherein the predetermined encoder includes: a first generator that generates a first vector by fully concatenating a plurality of vectors representing the intermediate representation and then multiplying the fully concatenated vectors by a first weight matrix, and a first multilayer perceptron that generates the latent representation corresponding to the generated first vector, and the predetermined decoder includes: a second multilayer perceptron that generates a second vector corresponding to the latent representation, and a second generator that restores the plurality of vectors representing the intermediate representation from a third vector obtained by multiplying the generated second vector by a second weight matrix.
  4. The control program according to claim 2, wherein the predetermined encoder is a model that generates the latent representation by repeating a convolution operation on the intermediate representation and that identifies parameters representing a distribution of the latent representation, the predetermined decoder is a model that generates the intermediate representation by repeating a deconvolution operation on the latent representation, and the training includes identifying the distribution of the latent representation by identifying the parameters.
  5. The control program according to claim 2, wherein the predetermined encoder is a model that includes repeating a convolution operation on an intermediate representation a plurality of times and that identifies a first parameter representing the distribution of each of a plurality of hierarchical latent representations according to the results of each of the convolution operations, the predetermined decoder is a model that includes performing a deconvolution operation on each of the plurality of latent representations and that fixes a second parameter representing the distribution of the latent representation in a bottom layer to identify a second parameter representing the distribution of each of the latent representations other than the bottom layer, and the process of training includes: training the first parameter according to the predetermined encoder based on the input data; and training the second parameter according to the predetermined decoder based on the first parameter, thereby training the distribution of each of the latent representations represented by the second parameter.
  6. The control program according to claim 1, the process further comprising: generating a first vector corresponding to the generated intermediate representation using a first model; adopting the training data used in training the trained model as the input data and utilizing an objective function including the likelihood of output data corresponding to the input data and the KL divergence related to the distribution of the latent representation, to train the distribution of latent representations corresponding to the generated first vector according to a first encoder generating the latent representation from the first vector and a first decoder corresponding to the first encoder and generating a vector different from the first vector and train a second model that converts the different vector into output data; selecting a sample of the latent representation according to a probability distribution from the trained distribution; and generating a second vector corresponding to the selected sample using the first decoder; and generating output data corresponding to the generated second vector using the trained second model.
  7. The control program according to any one of claims 1 to 6, wherein the trained model is a deep learning model that, with an amino acid sequence as the input data, outputs the output data representing a protein structure.
  8. The control program according to any one of claims 1 to 6, wherein the trained model is a model that, with sequence information representing a sentence as the input data, outputs sequence information representing another sentence as the output data.
  9. A control method executed by a computer, the method comprising: generating an intermediate representation corresponding to input data, the intermediate representation being generated using a trained model; training a distribution of latent representations corresponding to the generated intermediate representation, using a predetermined encoder generating the latent representation from the intermediate representation and a predetermined decoder corresponding to the predetermined encoder and generating an intermediate representation different from the intermediate representation; selecting, from the trained distribution, a sample of the latent representation based on a probability distribution; generating, using the predetermined decoder, a new intermediate representation corresponding to the selected sample; and generating, using the trained model, output data corresponding to the generated new intermediate representation.
  10. An information processing device, comprising: a memory; a processor coupled to the memory, the processor configured to: generate an intermediate representation corresponding to input data, the intermediate representation being generated using a trained model; train a distribution of latent representations corresponding to the generated intermediate representation, using a predetermined encoder generating the latent representation from the intermediate representation and a predetermined decoder corresponding to the predetermined encoder and generating an intermediate representation different from the intermediate representation; select, from the trained distribution, a sample of the latent representation based on a probability distribution; generate, using the predetermined decoder, a new intermediate representation corresponding to the selected sample; and generate, using the trained model, output data corresponding to the generated new intermediate representation.
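Claim 2's objective function combines three terms: a likelihood of the output data, a reconstruction error on the intermediate representation, and a KL divergence on the latent distribution. The following is a minimal sketch of such an objective, assuming a Gaussian posterior against a standard-normal prior; the function names, the squared-error reconstruction term, and the `beta` weight are illustrative assumptions, not the patent's specified formulation.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ) -- the KL-divergence
    # term of claim 2, under the assumed Gaussian posterior/prior.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def vae_objective(h, h_recon, mu, logvar, nll=0.0, beta=1.0):
    # Objective sketch for claim 2: a likelihood term (supplied here as
    # a precomputed negative log-likelihood of the output data), the
    # reconstruction error of the intermediate representation, and the
    # KL divergence on the latent distribution.
    recon = np.sum((h - h_recon) ** 2)
    return nll + recon + beta * gaussian_kl(mu, logvar)

loss = vae_objective(
    h=np.ones(4), h_recon=np.zeros(4),   # intermediate rep. and its reconstruction
    mu=np.zeros(2), logvar=np.zeros(2),  # posterior equal to the prior
)
# With the posterior equal to the prior the KL term vanishes, so the
# loss reduces to the reconstruction error of 4.0.
```

Training the encoder/decoder pair then amounts to minimizing this objective over the training data used for the original trained model, leaving the trained model's own parameters untouched.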

Description

FIELD

The embodiments discussed herein are related to a control program, a control method, and an information processing device.

BACKGROUND

Conventionally, deep learning models that handle sequence information, such as Transformer models, have demonstrated high performance in understanding input information and predicting structured outputs (for example, refer to Vaswani, Ashish, et al., "Attention is all you need." Advances in Neural Information Processing Systems 30 (2017)). Large-scale trained models with high expressive capabilities are also known, such as AlphaFold2, which predicts protein structures (for example, refer to Jumper, John, et al., "Highly accurate protein structure prediction with AlphaFold." Nature 596.7873 (2021): 583-589). There are cases where it is desirable to take advantage of the expressive capabilities of such trained models to generate high-quality, diverse outputs without changing the parameters of the trained model.

SUMMARY

It is an object in one aspect of the embodiments to at least solve the above problems in the conventional technologies.
According to an aspect of an embodiment, a control program causes a computer to execute a process, the process including: generating an intermediate representation corresponding to input data, the intermediate representation being generated using a trained model; training a distribution of latent representations corresponding to the generated intermediate representation, using a predetermined encoder generating the latent representation from the intermediate representation and a predetermined decoder corresponding to the predetermined encoder and generating an intermediate representation different from the intermediate representation; selecting, from the trained distribution, a sample of the latent representation based on a probability distribution; generating, using the predetermined decoder, a new intermediate representation corresponding to the selected sample; and generating, using the trained model, output data corresponding to the generated new intermediate representation.

BRIEF DESCRIPTION OF DRAWINGS

Fig. 1 is an explanatory diagram depicting an example of a control method according to an embodiment.
Fig. 2 is an explanatory diagram depicting an example of a system configuration of an information processing system 200.
Fig. 3 is a block diagram depicting an example of a hardware configuration of a controller 201.
Fig. 4 is an explanatory diagram depicting a specific example of input data 400.
Fig. 5 is an explanatory diagram depicting a specific example of an intermediate representation 500.
Fig. 6 is a block diagram depicting an example of a functional configuration of the controller 201.
Fig. 7 is an explanatory diagram depicting a first operation example of the controller 201.
Fig. 8 is an explanatory diagram depicting the first operation example of the controller 201.
Fig. 9 is an explanatory diagram depicting a second operation example of the controller 201.
Fig. 10 is an explanatory diagram depicting the second operation example of the controller 201.
Fig. 11 is an explanatory diagram depicting the second operation example of the controller 201.
Fig. 12 is an explanatory diagram depicting the second operation example of the controller 201.
Fig. 13 is an explanatory diagram depicting a third operation example of the controller 201.
Fig. 14 is an explanatory diagram depicting the third operation example of the controller 201.
Fig. 15 is an explanatory diagram depicting a fourth operation example of the controller 201.
Fig. 16 is a flowchart depicting an example of a procedure of a training process.
Fig. 17 is a flowchart depicting an example of a procedure of a generation process.

DESCRIPTION OF EMBODIMENTS

First, problems associated with the conventional techniques are discussed. In the conventional techniques, when some kind of operation is performed on an intermediate representation of a trained model to change the output, the range of operations on the intermediate representation that corresponds to valid output is not clear, making it difficult to determine what kind of operation will result in valid output. Embodiments of a control program, a control method, and an information processing device according to the present disclosure are described in detail with reference to the accompanying drawings.

Fig. 1 is an explanatory diagram depicting an example of a control method according to an embodiment. The information processing device 100 is a computer for controlling operations on an intermediate representation corresponding to input data to a trained model. The information processing device 100 is, for example, a server or a personal computer (PC). Here, the trained model is a machine learning model trained by machine learning such as deep learning. Deep learning is also called deep-layer learning. The trained model is, for example, information that combines trained parameters and an algorithm for deriving output