KR-20260065057-A - SYSTEM FOR GENERATING 3D MOLECULES BASED ON PHYSICAL KNOWLEDGE FOR DRUG CANDIDATE DISCOVERY AND METHOD THEREOF
Abstract
The present invention relates to a method and system for generating three-dimensional molecules based on physical knowledge for deriving new drug candidates. Using a diffusion model and an SE(3)-equivariant neural network grounded in physicochemical principles, it generates and optimizes a three-dimensional molecule that minimizes the binding free energy between a protein and a ligand. The system comprises: a data input unit that receives three-dimensional data of a protein pocket; a physical knowledge-based artificial intelligence model that sets an initial-state ligand structure by injecting Gaussian noise based on the data and, by applying SE(3)-equivariance in the reverse process of removing noise from the initial-state ligand structure, generates a three-dimensional ligand structure optimized with respect to binding free energy; and a molecular structure output unit that finally outputs the three-dimensional structure of the ligand according to an evaluation of the binding affinity and structural similarity of the generated ligand structure. The physical knowledge-based artificial intelligence model is implemented using a diffusion model.
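The Gaussian noise injection named in the abstract can be illustrated with the standard closed-form forward step of a denoising diffusion model. This is a minimal sketch, not the patented implementation: the linear noise schedule, the step count, the function names `linear_beta_schedule` and `noise_ligand_coords`, and the 12-atom toy ligand are all assumptions made for illustration, since the source does not specify them.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear variance schedule; the source does not state one."""
    return np.linspace(beta_start, beta_end, num_steps)

def noise_ligand_coords(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) for ligand atom coordinates x0 of shape (N, 3).

    Uses the closed form x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps,
    where a_bar_t is the cumulative product of (1 - beta) up to step t.
    """
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)  # Gaussian noise, one 3-vector per atom
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
ligand_x0 = rng.standard_normal((12, 3))  # hypothetical 12-atom ligand
x_t, eps = noise_ligand_coords(ligand_x0, t=999, betas=betas, rng=rng)
```

At the last step the cumulative signal weight is close to zero, so `x_t` is almost pure noise. That corresponds to the initial-state ligand structure from which the reverse (denoising) process would start.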
Inventors
- 박상현
- 최승연
- 서상민
Assignees
- 연세대학교 산학협력단
Dates
- Publication Date
- 20260508
- Application Date
- 20241031
Claims (12)
- A method for generating the three-dimensional molecular structure of a ligand using physical knowledge and a diffusion model for deriving new drug candidates, the method comprising: receiving three-dimensional data of a protein pocket; establishing an initial-state ligand structure by injecting Gaussian noise based on the protein pocket data using a physical knowledge-based artificial intelligence model; optimizing the ligand structure to minimize its binding free energy by applying SE(3)-equivariance in the reverse process of removing noise from the initial-state ligand structure using the physical knowledge-based artificial intelligence model; and finally outputting the structure of the ligand according to an evaluation result, wherein the physical knowledge-based artificial intelligence model is implemented using a diffusion model.
- The method of claim 1, wherein the step of optimizing to minimize the binding free energy secures binding stability between the protein and the ligand by calculating the Lennard-Jones potential energy.
- The method of claim 2, wherein the step of optimizing to minimize the binding free energy generates the entire structure of the ligand at once using a non-autoregressive sampling method.
- The method of claim 3, wherein the non-autoregressive sampling method generates the molecule while maintaining its geometric consistency, including bond angles and bond lengths.
- The method of claim 1, further comprising: a step of evaluating the binding affinity of the ligand structure; and an evaluation step of verifying the stability of the molecule by evaluating its binding affinity with a target protein in the binding affinity evaluation step.
- The method of claim 5, wherein, in the final output step, the ligand structure with minimized binding free energy is output as a new drug candidate.
- A system for generating the three-dimensional molecular structure of a ligand using physical knowledge and a diffusion model for deriving new drug candidates, the system comprising: a data input unit for inputting three-dimensional data of a protein pocket; a physical knowledge-based artificial intelligence model that establishes an initial-state ligand structure by injecting Gaussian noise based on the data, and generates a three-dimensional structure of the ligand optimized to minimize the binding free energy of the ligand structure by applying SE(3)-equivariance in the reverse process of removing noise from the initial-state ligand structure; and a molecular structure output unit that finally outputs the three-dimensional structure of the ligand based on an evaluation of the binding affinity and structural similarity of the generated ligand structure, wherein the physical knowledge-based artificial intelligence model is implemented using a diffusion model.
- The system of claim 7, wherein the physical knowledge-based artificial intelligence model secures binding stability between the protein and the ligand by calculating the Lennard-Jones potential energy.
- The system of claim 8, wherein the physical knowledge-based artificial intelligence model generates the entire structure of the ligand at once using a non-autoregressive sampling method.
- The system of claim 9, wherein the non-autoregressive sampling method generates the molecule while maintaining its geometric consistency, including bond angles and bond lengths.
- The system of claim 7, wherein the physical knowledge-based artificial intelligence model evaluates the binding affinity of the ligand structure, and verifies the stability of the molecule by evaluating its binding affinity with a target protein in the binding affinity evaluation.
- The system of claim 11, wherein the molecular structure output unit outputs the ligand structure with minimized binding free energy as a new drug candidate.
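The Lennard-Jones potential energy referred to in claims 2 and 8 has the standard pairwise 12-6 form V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The sketch below sums it over all protein-ligand atom pairs and checks numerically that the total does not change when the whole complex is rotated and translated together. It is only an illustration under stated assumptions: the single shared `epsilon` and `sigma` values, the helper names, and the random toy coordinates are hypothetical, and the source does not describe how the term enters the model.

```python
import numpy as np

def lennard_jones_energy(protein_xyz, ligand_xyz, epsilon=1.0, sigma=3.5):
    """Sum of pairwise 12-6 Lennard-Jones energies between two atom sets.

    protein_xyz: (P, 3) array, ligand_xyz: (L, 3) array.
    epsilon and sigma are placeholder values shared by every atom pair;
    a real force field would use per-atom-type parameters.
    """
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    r = np.linalg.norm(diff, axis=-1)        # (L, P) pairwise distances
    sr6 = (sigma / r) ** 6
    return float(np.sum(4.0 * epsilon * (sr6 ** 2 - sr6)))

def random_rotation(rng):
    """Random proper rotation matrix (determinant +1) from a QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]                   # flip one axis to remove a reflection
    return q

rng = np.random.default_rng(0)
pocket = rng.uniform(5.0, 10.0, size=(20, 3))   # hypothetical pocket atoms
ligand = rng.uniform(-1.0, 1.0, size=(6, 3))    # hypothetical ligand atoms

rot, shift = random_rotation(rng), rng.standard_normal(3)
e_before = lennard_jones_energy(pocket, ligand)
e_after = lennard_jones_energy(pocket @ rot.T + shift, ligand @ rot.T + shift)
print(np.isclose(e_before, e_after))
```

Because the energy depends only on interatomic distances, it is unchanged by rigid motions of the complex. That property is what makes a term of this kind compatible with an SE(3)-equivariant model.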
Description
System for Generating 3D Molecules Based on Physical Knowledge for Drug Candidate Discovery

The present invention relates to a method and system for generating three-dimensional molecules based on physical knowledge for deriving new drug candidates, and more specifically, to a technology for generating and optimizing three-dimensional molecules that minimize the binding free energy between proteins and ligands using a diffusion model and an SE(3)-equivariant neural network based on physicochemical principles.

The binding of proteins to ligands is a critical process in new drug development, and designing ligands suited to the protein's active site is key. Conventional identification of new drug candidates has relied primarily on molecular modeling based on geometric information. These models evaluated molecular binding characteristics using one-dimensional strings or two-dimensional structure-based prediction techniques, and therefore fail to adequately reflect the three-dimensional interactions between proteins and ligands.

In particular, existing technologies predicted and designed molecular structures by relying solely on geometric information, without adequately considering physical interactions. Consequently, when the generated molecules bind to actual proteins, the binding free energy is not sufficiently minimized, which can reduce binding stability. Furthermore, conventional autoregressive methods generate each part of the molecule sequentially, which increases the likelihood of cumulative errors and limits the ability to maintain structural consistency across the entire molecule.

Moreover, existing 1D or 2D-based molecular generation methods fail to reflect interactions in three-dimensional space and thus have not been able to properly model the complex interatomic interactions occurring within the binding pockets of proteins.
Since physical and chemical principles are not reflected, the generated molecules may not remain stable in actual binding.

These limitations point to the need, in the design of new drug candidates, for technology that can generate more realistic three-dimensional molecular structures and maintain binding stability. A new approach is required to generate optimized three-dimensional structures that consider the geometric properties of molecules and minimize the binding free energy between proteins and ligands based on physical interactions. Therefore, there is a need for a system capable of generating three-dimensional molecular structures with minimized binding free energy by reflecting physical and chemical principles. In addition, there is an urgent need to develop a technology that can be applied to structure-based drug design (SBDD) by efficiently evaluating and optimizing protein-ligand interactions.

FIG. 1 is a schematic diagram illustrating a method and system for generating three-dimensional molecules based on physical knowledge for deriving new drug candidates.
FIG. 2 is a diagram illustrating the configuration of a physical knowledge-based 3D molecule generation system according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating a method for generating and training a physical knowledge-based artificial intelligence model according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a method in which a physical knowledge-based artificial intelligence model according to an embodiment of the present invention generates a three-dimensional structure of a ligand.
FIG. 5 is a reference diagram illustrating a formula associated with physical knowledge-based optimization in a physical knowledge-based artificial intelligence model according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating the operation method of a physical knowledge-based artificial intelligence model (PIDiff) according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the operation sequence of a molecule generation system according to an embodiment of the present invention.
FIG. 8 is a reference diagram evaluating the performance of a molecule generated by an artificial intelligence model (PIDiff) of a physical knowledge-based 3D molecule generation system for deriving new drug candidates according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating a computing device that implements a descriptor generation method and a generation device according to an embodiment of the present invention.

The present invention will be described below with reference to the attached drawings. However, the present invention may be implemented in various different forms and is therefore not limited to the embodiments described herein. Furthermore, in order to clearly explain the present invention in the drawings, parts unrelated to the explanation have been omitted, and similar parts throughou