CN-121999960-A - Breast cancer gene drug screening method and system based on deep learning


Abstract

The application relates to the technical field of breast cancer analysis and prediction, and in particular to a deep-learning-based method and system for screening drugs against breast cancer gene mutations. The screening method comprises: obtaining breast cancer gene mutation information; performing a matching search in a MySQL database to determine whether the mutation information already exists; if not, performing a sequence search and a consistency check to obtain mutant sequence information; performing protein structure prediction on the mutant sequence information with a deep learning model to obtain three-dimensional conformation data of the mutant amino acid; performing a multidimensional quantitative functional evaluation of the three-dimensional conformation data against small-molecule drugs to obtain functional influence coefficient analysis data; grading the small-molecule drugs; and calculating the degree of matching between the graded drugs and the breast cancer gene mutation according to the functional influence coefficient analysis data to obtain a drug recommendation index. The screening method can improve the accuracy of drug screening for breast cancer gene mutations.

Inventors

HU JIANPING
NIE YUANSHENG
SHI QUANSHAN
SHI HUBING

Assignees

成都英特格诺生物科技有限公司

Dates

Publication Date: 20260508
Application Date: 20251231

Claims (10)

1. A deep-learning-based method for screening breast cancer gene drugs, characterized by comprising the following steps: acquiring breast cancer gene mutation information; performing a matching search for the breast cancer gene mutation information in a MySQL database, and determining whether the breast cancer gene mutation information is existing information; if not, performing a sequence search according to the breast cancer gene mutation information to obtain double-chain sequence information containing the wild type and the mutant type; performing a consistency check on the double-chain sequence information containing the wild type and the mutant type to obtain mutant sequence information; performing protein structure prediction on the mutant sequence information based on a deep learning model to obtain three-dimensional conformation data of the mutant amino acid; performing a multidimensional quantitative functional evaluation of the three-dimensional conformation data of the mutant amino acid based on small-molecule drugs to obtain functional influence coefficient analysis data, wherein the functional influence coefficient analysis data comprises a first fusion coefficient, a second fusion coefficient, a third fusion coefficient and a fourth fusion coefficient; grading the small-molecule drugs to obtain graded drugs; and calculating the degree of matching between the graded drugs and the breast cancer gene mutation according to the functional influence coefficient analysis data, so as to obtain a drug recommendation index.
2. The screening method according to claim 1, wherein performing the multidimensional quantitative functional evaluation of the three-dimensional conformation data of the mutant amino acid based on the small-molecule drugs to obtain the functional influence coefficient analysis data comprises the steps of: using the smina/codockpp tool to calculate the receptor-ligand binding energy for the three-dimensional conformation data of the mutant amino acid, and obtaining a docking score for each small-molecule drug; predicting, from the docking score and a standard score value, a scoring result for the functional change of the breast cancer gene mutation and the binding capacity of the small-molecule drug; constructing a coupled energy-structure model based on the FoldX tool; evaluating, according to the coupled model, a disturbance result of the breast cancer gene mutation information on the thermodynamic stability of the protein; and performing feature fusion on the scoring result and the disturbance result to obtain the functional influence coefficient analysis data.
3. The screening method according to claim 2, wherein predicting the scoring result for the functional change of the breast cancer gene mutation and the binding capacity of the small-molecule drug from the docking score and the standard score value comprises the steps of: if the docking score is < -7, determining that the binding capacity of the small-molecule drug is strong; if the docking score is from -7 to -5, determining that the binding capacity of the small-molecule drug is medium; if the docking score is > -5, determining that the binding capacity of the small-molecule drug is weak; setting a standard score value, and calculating a functional change difference X between the docking score and the standard score value; if the functional change difference X < -2, determining that the functional change of the breast cancer gene mutation is weaker; if the functional change difference X is from -2 to 2, determining that the functional change of the breast cancer gene mutation is enhanced; and if the functional change difference X > 2, determining that the breast cancer gene mutation causes no functional change.
4. The screening method according to claim 3, wherein evaluating the disturbance result of the breast cancer gene mutation information on the thermodynamic stability of the protein according to the coupled model comprises the steps of: calculating an actual decomposition energy term value for the breast cancer gene mutation information using the BuildModel module of FoldX; calculating a standard decomposition energy term value for the wild type of the breast cancer gene using the BuildModel module of FoldX; and determining the disturbance result of the breast cancer gene mutation information on the thermodynamic stability of the protein according to the stability disturbance difference Y between the actual decomposition energy term value and the standard decomposition energy term value.
5. The screening method according to claim 4, wherein determining the disturbance result of the breast cancer gene mutation information on the thermodynamic stability of the protein according to the stability disturbance difference Y between the actual decomposition energy term value and the standard decomposition energy term value comprises: if the stability disturbance difference Y > 0, determining that the breast cancer gene mutation information enhances the thermodynamic stability of the protein; if the stability disturbance difference Y = 0, determining that the breast cancer gene mutation information has no influence on the thermodynamic stability of the protein; and if the stability disturbance difference Y < 0, determining that the breast cancer gene mutation information weakens the thermodynamic stability of the protein.
6. The screening method according to claim 5, wherein performing feature fusion on the scoring result and the disturbance result to obtain the functional influence coefficient analysis data comprises: performing feature fusion on the scoring result and the disturbance result, and setting the functional influence coefficient types; setting the docking score as the first fusion coefficient; setting, as the second fusion coefficient, the docking score values below -7 of the small-molecule drugs matched with the mutant sequence information; setting the functional change difference X as the third fusion coefficient; and setting the stability disturbance difference Y as the fourth fusion coefficient.
7. The screening method according to claim 1 or 6, wherein calculating the degree of matching between the graded drugs and the breast cancer gene mutation according to the functional influence coefficient analysis data to obtain the drug recommendation index comprises the steps of: if the graded drugs are ranked by the first fusion coefficient, the smaller the first fusion coefficient, the higher the drug recommendation index; if the graded drugs are ranked by the second fusion coefficient, the smaller the second fusion coefficient, the higher the drug recommendation index; and if the graded drugs are ranked by the third fusion coefficient and the fourth fusion coefficient, directly outputting the functional change difference X and the stability disturbance difference Y as the drug recommendation index.
8. The screening method according to claim 1, wherein grading the small-molecule drugs to obtain the graded drugs comprises: grading the small-molecule drugs according to whether each small-molecule drug is in a clinical stage, is approved, or is a pathway inhibitor, so as to obtain the graded drugs; if the small-molecule drug is in a clinical stage and is FDA approved, the small-molecule drug is a primary drug; if the small-molecule drug is in a clinical stage and is not FDA approved, the small-molecule drug is a secondary drug; if the small-molecule drug is in a non-clinical stage and is not FDA approved but is a pathway inhibitor, the small-molecule drug is a tertiary drug; and if the small-molecule drug is in a non-clinical stage and is neither FDA approved nor a pathway inhibitor, the small-molecule drug is a quaternary drug.
9. The screening method according to claim 1, wherein performing the matching search for the breast cancer gene mutation information in the MySQL database and determining whether the breast cancer gene mutation information is existing information comprises the steps of: matching the breast cancer gene mutation information against the MySQL database by gene name, mutation site and analysis parameters, and determining whether the breast cancer gene mutation information is existing information; if the breast cancer gene mutation information matches a record in the MySQL database, directly outputting the recommended drugs; and if the breast cancer gene mutation information does not match any record in the MySQL database, performing the sequence search according to the breast cancer gene mutation information to obtain the double-chain sequence information containing the wild type and the mutant type.
10. A deep-learning-based breast cancer gene drug screening system, characterized in that the screening system is adapted to the screening method of any one of claims 1 to 9, the screening system comprising: an information collection module for collecting breast cancer gene mutation information; a retrieval module for searching the MySQL database and determining whether the breast cancer gene mutation information is existing information; an automatic analysis module for executing the steps of sequence search, consistency check, protein structure prediction, multidimensional quantitative functional evaluation, grading and matching degree calculation to obtain the drug recommendation index; a task scheduling module for balancing the information collection module, the retrieval module and the automatic analysis module to ensure the analysis efficiency of the screening system; and a self-learning module for automatically optimizing the automatic analysis module and improving the accuracy of the drug recommendation index produced by the automatic analysis module.
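
The threshold rules recited in claims 3, 5, 7 and 8 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: every function, class and drug name below is hypothetical, the labels returned by function_change follow the wording of claim 3 as published, and the grading function returns None for the one combination (non-clinical but FDA approved) that claim 8 does not cover.

```python
from dataclasses import dataclass
from typing import List, Optional


def binding_level(docking_score: float) -> str:
    """Claim 3: < -7 strong, -7 to -5 medium, > -5 weak."""
    if docking_score < -7:
        return "strong"
    if docking_score <= -5:
        return "medium"
    return "weak"


def function_change(x: float) -> str:
    """Claim 3, labels as worded in the claim: X < -2, -2 to 2, X > 2."""
    if x < -2:
        return "weaker"
    if x <= 2:
        return "enhanced"
    return "no functional change"


def stability_effect(y: float) -> str:
    """Claim 5: sign of the stability disturbance difference Y."""
    if y > 0:
        return "enhanced"
    if y < 0:
        return "weakened"
    return "no influence"


def drug_grade(clinical: bool, fda_approved: bool,
               pathway_inhibitor: bool) -> Optional[int]:
    """Claim 8: grades 1 to 4; None for a combination the claim omits."""
    if clinical:
        return 1 if fda_approved else 2
    if not fda_approved:
        return 3 if pathway_inhibitor else 4
    return None  # non-clinical but approved: not addressed by claim 8


@dataclass
class Candidate:
    name: str             # hypothetical drug label
    docking_score: float  # first fusion coefficient (claim 6)


def rank_by_first_coefficient(cands: List[Candidate]) -> List[Candidate]:
    """Claim 7: a smaller first fusion coefficient ranks higher."""
    return sorted(cands, key=lambda c: c.docking_score)


def rank_by_second_coefficient(cands: List[Candidate]) -> List[Candidate]:
    """Claims 6 and 7: keep only scores below -7, smallest first."""
    return rank_by_first_coefficient(
        [c for c in cands if c.docking_score < -7])


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    demo = [Candidate("drug_a", -6.1), Candidate("drug_b", -8.4),
            Candidate("drug_c", -7.5)]
    for c in rank_by_first_coefficient(demo):
        print(c.name, c.docking_score, binding_level(c.docking_score))
```

Keeping each rule in its own small function mirrors the way the claims separate binding capacity, functional change, stability disturbance and drug grade, so any one threshold can be adjusted without touching the others.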

Description

Breast cancer gene drug screening method and system based on deep learning

Technical Field

The application relates to the technical field of breast cancer analysis and prediction, and in particular to a deep-learning-based method and system for screening drugs against breast cancer gene mutations.

Background

Breast cancer is the most common malignancy in women. Of all breast cancer cases, 5% to 10% are related to genetic factors, and BRCA1/2 gene mutations are the main genetic susceptibility factor: women carrying BRCA1/2 gene mutations have a lifetime breast cancer risk of 50% to 85% and an ovarian cancer risk of 12% to 60%. With the clinical use of PARP inhibitors (e.g., olaparib), gene mutation detection has become a key element of precision therapy. The 2025 CSCO guideline raises the recommendation level for BRCA testing to level 1A, the NCCN guideline recommends testing for all HER2-negative patients, and testing indications for several further genes such as PIK3CA and AKT1 have been added, which places higher demands on the comprehensiveness, timeliness and accuracy of mutation analysis.
However, the current methods for detecting BRCA1/2 gene mutations in breast cancer have the following limitations: (1) one method is suitable only for small-scale, single-mutation detection and has no batch-processing capacity; (2) next-generation sequencing (NGS) can analyze multiple genes in parallel and improves detection efficiency, and a comprehensive NGS panel can increase targeted-therapy information by 20%, but the detection process is complex, takes a long time (5 to 7 days), and does not support reuse of historical results; (3) PCR is used to screen specific mutation sites and has a lower cost, but its coverage is limited and it cannot perform joint analysis of multiple mutation types; (4) existing bioinformatics tools (such as PolyPhen-2 and SIFT) can only predict mutation function, do not integrate protein structure prediction, molecular docking analysis or drug recommendation, and are not chained into an automated workflow. Current detection methods therefore struggle to meet the demand for efficient testing of large populations. In addition, existing evaluation systems cannot score the degree of matching between drugs and mutations across multiple dimensions, so the drug grading standards for different mutations are inaccurate. Finally, current detection methods require manual switching between several systems, such as the sequencing platform, the analysis tools and the database queries, which makes operation complex, easily introduces human error, and affects detection accuracy.

Disclosure of Invention

The application provides a deep-learning-based method and system for screening breast cancer gene drugs, which aim to solve the technical problem of how to improve, in a simple and convenient way, the accuracy of drug screening for breast cancer gene mutations.
In a first aspect, an embodiment of the present application provides a deep-learning-based method for screening breast cancer gene drugs, the screening method comprising: acquiring breast cancer gene mutation information; performing a matching search for the breast cancer gene mutation information in a MySQL database, and determining whether the breast cancer gene mutation information is existing information; if not, performing a sequence search according to the breast cancer gene mutation information to obtain double-chain sequence information containing the wild type and the mutant type; performing a consistency check on the double-chain sequence information containing the wild type and the mutant type to obtain mutant sequence information; performing protein structure prediction on the mutant sequence information based on a deep learning model to obtain three-dimensional conformation data of the mutant amino acid; performing a multidimensional quantitative functional evaluation of the three-dimensional conformation data of the mutant amino acid based on small-molecule drugs to obtain functional influence coefficient analysis data, wherein the functional influence coefficient analysis data comprises a first fusion coefficient, a second fusion coefficient, a third fusion coefficient and a fourth fusion coefficient; grading the small-molecule drugs to obtain graded drugs; and calculating the degree of matching between the graded drugs and the breast cancer gene mutation according to the functional influence coefficient analysis data, so as to obtain a drug recommendation index. Optionally, performing the multidimensional quantitative functional evaluation of the three-dimensional conformation data of the mutant amino acid based on the small-molecule drugs to obtain the functional influence coefficient analysis data, wherein the meth