CN-122029609-A - Generating predictions related to chemical processes using machine learning models
Abstract
In some embodiments, a computer-implemented method of generating predictions related to a chemical workflow comprising a process performed on at least one chemical is provided. A computing system encodes a representation of the at least one chemical using a chemical encoder to create at least one latent chemical representation. The computing system creates a latent experiment representation based on the at least one latent chemical representation. The computing system decodes the latent experiment representation using an experiment decoder to predict one or more properties of an output of the chemical workflow, to generate an updated representation of the at least one chemical, or to generate an updated representation of the process.
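The data flow summarized in the abstract (encode each chemical, combine the encoded (latent) vectors into one experiment-level vector, then decode a property) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in rather than the disclosed implementation: `toy_chemical_encoder` hashes text instead of using a learned model, the experiment representation is a simple element-wise mean, and `toy_experiment_decoder` is a linear readout with caller-supplied weights.

```python
import hashlib
from typing import List, Sequence

Vector = List[float]


def toy_chemical_encoder(representation: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for a learned chemical encoder.

    Hashes the text to a deterministic vector with entries in [0, 1].
    """
    digest = hashlib.sha256(representation.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def make_experiment_representation(latents: Sequence[Vector]) -> Vector:
    """Pool per-chemical latent vectors into one vector by element-wise mean."""
    if not latents:
        raise ValueError("at least one chemical is required")
    return [sum(column) / len(latents) for column in zip(*latents)]


def toy_experiment_decoder(experiment: Vector, weights: Sequence[float]) -> float:
    """Hypothetical decoder: a linear readout that predicts one scalar property."""
    return sum(w * x for w, x in zip(weights, experiment))


def predict_property(chemicals: Sequence[str], weights: Sequence[float]) -> float:
    """Encode each chemical, build the experiment representation, then decode."""
    latents = [toy_chemical_encoder(text) for text in chemicals]
    experiment = make_experiment_representation(latents)
    return toy_experiment_decoder(experiment, weights)
```

Calling `predict_property(["CCO", "O"], [1.0, 1.0, 1.0, 1.0])` returns a single number. Because the pooling step is a mean, the result does not depend on the order in which the chemicals are listed, which is one reason a symmetric pooling function might be chosen in a sketch like this.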
Inventors
- JONATHAN WELCH
- Tejas Barna
- Nicholas Tulkens
Assignees
- Albert Invent Corp. (艾尔伯特发明公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20241011
- Priority Date
- 20231013
Claims (20)
- 1. A computer-implemented method of generating predictions related to a chemical workflow, the chemical workflow comprising a process performed on at least one chemical, the method comprising: encoding, by a computing system, a representation of the at least one chemical using a chemical encoder to create at least one latent chemical representation; creating a latent experiment representation based on the at least one latent chemical representation; and decoding, by the computing system, the latent experiment representation using an experiment decoder to predict one or more properties of an output of the chemical workflow, to generate an updated representation of the at least one chemical, or to generate an updated representation of the process.
- 2. The computer-implemented method of claim 1, wherein predicting one or more properties of the output of the chemical workflow comprises predicting one or more acoustic, chemical, electrical, magnetic, mechanical, optical, or thermal properties.
- 3. The computer-implemented method of claim 1, the method further comprising: encoding, by the computing system, a representation of the process using a process encoder to create a latent process representation; wherein creating the latent experiment representation based on the at least one latent chemical representation comprises: encoding, by the computing system, the at least one latent chemical representation and the latent process representation using an experiment encoder to create the latent experiment representation.
- 4. The computer-implemented method of claim 3, the method further comprising: encoding, by the computing system, context data associated with the chemical workflow to create a latent context representation.
- 5. The computer-implemented method of claim 4, the method further comprising: fusing, by the computing system, the latent context representation with at least one of the latent chemical representation, the latent process representation, or the latent experiment representation.
- 6. The computer-implemented method of claim 4, the method further comprising: using, by the computing system, the latent context representation in creating at least one of the latent chemical representation, the latent process representation, or the latent experiment representation.
- 7. The computer-implemented method of claim 3, wherein at least one of the chemical encoder, the process encoder, and the experiment encoder is a transformer model.
- 8. The computer-implemented method of claim 7, wherein the transformer model is a Bidirectional Encoder Representations from Transformers (BERT) model.
- 9. The computer-implemented method of claim 7, wherein the transformer model is a Uni-Mol model.
- 10. The computer-implemented method of claim 7, wherein at least one of the chemical encoder, the process encoder, and the experiment encoder comprises a plurality of domain-specific encoders arranged as a proxy network.
- 11. The computer-implemented method of claim 3, wherein the experiment decoder is a Gaussian process.
- 12. The computer-implemented method of claim 11, wherein the Gaussian process comprises a radial basis function kernel.
- 13. The computer-implemented method of claim 3, wherein the representation of the process comprises a structured representation of the process.
- 14. The computer-implemented method of claim 13, wherein the structured representation of the process is a structured text string having one or more key-value pairs, wherein the one or more key-value pairs include one or more key-value pairs representing one or more process steps and one or more key-value pairs representing one or more parameters.
- 15. The computer-implemented method of claim 13, wherein the one or more key-value pairs representing one or more process steps comprise a representation of at least one of: step names; the type of the step; the sequence of steps; an equipment identifier of a machine for performing the process steps; or set values for settings of the machine for performing the process steps.
- 16. The computer-implemented method of claim 13, wherein the one or more key-value pairs representing one or more parameters comprise a representation of at least one of: parameter names; the sequence of steps; a machine equipment identifier associated with the parameter; attributes of the machine associated with the parameter; a set value of the machine associated with the parameter; or a step condition value.
- 17. The computer-implemented method of claim 3, further comprising training at least one of the experiment decoder, the experiment encoder, the process encoder, or the chemical encoder by: comparing, by the computing system, the one or more predicted properties with one or more ground truth properties measured from an executed instance of the chemical workflow to determine a value of a loss function; determining, by the computing system, a gradient of the loss function; and updating, by the computing system, the at least one of the experiment decoder, the experiment encoder, the process encoder, or the chemical encoder using the gradient.
- 18. The computer-implemented method of claim 1, wherein the representation of the at least one chemical comprises a structured representation of the at least one chemical.
- 19. The computer-implemented method of claim 1, wherein the structured representation of the at least one chemical is a structured text string having one or more key-value pairs, wherein the one or more key-value pairs represent one or more of: a chemical name; a chemical identifier; a chemical line notation; a concentration; properties; a unit; chemical properties; or unstructured text associated with the at least one chemical.
- 20. The computer-implemented method of claim 19, wherein the chemical line notation comprises a simplified molecular-input line-entry system (SMILES) string, a polymer SMILES (PSMILES) string, an International Chemical Identifier (InChI) string, a Wiswesser line notation (WLN) string, a SMILES arbitrary target specification (SMARTS) string, or a SYBYL line notation (SLN) string.
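The structured text strings recited in the claims (key-value pairs for a chemical, and key-value pairs for process steps and parameters) could take many concrete forms. The sketch below assumes JSON purely for illustration; the key names, the identifiers such as `CHEM-0001` and `MIXER-7`, and the helper functions `chemical_to_text` and `process_to_text` are hypothetical and are not taken from the figures or the claims.

```python
import json
from typing import Any, Dict, List


def chemical_to_text(name: str, identifier: str, smiles: str,
                     concentration: float, unit: str) -> str:
    """Serialize one chemical as a key-value text string (JSON is an assumed format)."""
    record: Dict[str, Any] = {
        "name": name,
        "identifier": identifier,
        "line_notation": {"type": "SMILES", "value": smiles},
        "concentration": concentration,
        "unit": unit,
    }
    return json.dumps(record, sort_keys=True)


def process_to_text(steps: List[Dict[str, Any]],
                    parameters: List[Dict[str, Any]]) -> str:
    """Serialize process steps and parameters, ordered by their sequence number."""
    ordered_steps = sorted(steps, key=lambda step: step["sequence"])
    ordered_parameters = sorted(parameters, key=lambda p: p["sequence"])
    return json.dumps({"steps": ordered_steps, "parameters": ordered_parameters},
                      sort_keys=True)


# Illustrative inputs only; none of these values come from the disclosure.
ethanol = chemical_to_text("ethanol", "CHEM-0001", "CCO", 0.25, "mol/L")
process = process_to_text(
    steps=[
        {"sequence": 2, "name": "cure", "type": "heating", "equipment_id": "OVEN-1"},
        {"sequence": 1, "name": "mix", "type": "mixing", "equipment_id": "MIXER-7"},
    ],
    parameters=[
        {"sequence": 1, "name": "speed", "equipment_id": "MIXER-7", "set_value": 300},
        {"sequence": 2, "name": "temperature", "equipment_id": "OVEN-1", "set_value": 80},
    ],
)
```

Sorting keys and ordering steps by sequence number makes the same workflow always serialize to the same string, which may be convenient when the string is later passed to a text-based encoder.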
Description
Generating predictions related to chemical processes using machine learning models
Cross Reference to Related Applications
The present application claims the benefit of Provisional Application No. 63/590341, filed on October 13, 2023, the entire disclosure of which is hereby incorporated by reference for all purposes.
Background
Machine learning is a rapidly evolving field. Recently, new techniques for processing text have been created that allow novel text outputs to be predicted based on input prompts. These techniques are typically based on artificial neural networks and use an architecture such as a transformer. At a high level, a transformer is typically trained on a text corpus and encodes the corpus into latent representations. A decoder is then typically used to generate novel text from the latent representations. While the ability to generate novel outputs based on an artificial understanding of a training corpus has significant utility, these tools and techniques are underutilized in areas other than text processing. What is desired are techniques for understanding and generating content in fields other than language processing using similar machine learning models.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In some embodiments, a computer-implemented method of generating predictions related to a chemical workflow comprising a process performed on at least one chemical is provided. A computing system encodes a representation of the at least one chemical using a chemical encoder to create at least one latent chemical representation.
The computing system creates a latent experiment representation based on the at least one latent chemical representation. The computing system decodes the latent experiment representation using an experiment decoder to predict one or more properties of an output of the chemical workflow, to generate an updated representation of the at least one chemical, or to generate an updated representation of the process.
In some embodiments, a non-transitory computer-readable medium having computer-executable instructions stored thereon is provided. The instructions, in response to execution by one or more processors of a computing system, cause the computing system to perform the method as described above.
In some embodiments, a computing system configured to perform a method as described above is provided.
Drawings
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
FIG. 1 is a schematic diagram of a non-limiting example embodiment of a system for predicting a property of a chemical process using machine learning, according to various aspects of the present disclosure.
FIG. 2 is a block diagram illustrating aspects of a non-limiting example embodiment of a material property prediction computing system, according to various aspects of the present disclosure.
FIG. 3 is a schematic diagram of a non-limiting example embodiment of an encoder/decoder that processes workflow information, according to various aspects of the present disclosure.
FIGS. 4A-4C are schematic diagrams of techniques for predicting properties of an output of a chemical process, according to various aspects of the present disclosure.
FIG. 5 is an illustration of a non-limiting example embodiment of a structured chemical representation, according to various aspects of the present disclosure.
FIGS. 6A and 6B are illustrations of non-limiting example embodiments of structured process representations, according to various aspects of the present disclosure.
FIGS. 7A and 7B are flowcharts illustrating non-limiting example embodiments of methods of predicting a property of an output of a chemical process, according to various aspects of the present disclosure.
FIG. 8 is a schematic diagram of a non-limiting example embodiment of an encoder/decoder, according to various aspects of the present disclosure, in which context information is optionally encoded and injected into the encoding/decoding process at various points.
FIGS. 9A-9C illustrate three non-limiting example techniques that may be used for latent space fusion, according to various aspects of the present disclosure.
FIG. 10 illustrates a non-limiting example embodiment of a proxy network, according to various aspects of the present disclosure.
Detailed Description
By storing large amounts of chemical experimental data generated by research and development work, embodiments of the present disclosure are able to