CN-121983120-A - Pancreatic cancer risk prediction method and system based on pancreatic cancer related genome markers

Abstract

The invention discloses a pancreatic cancer risk prediction method and system based on pancreatic cancer related genome markers. The method comprises: collecting basic information and a peripheral blood sample from a subject to be evaluated; performing genetic testing on the peripheral blood sample to obtain genome marker data comprising single nucleotide polymorphism sites and pancreatic cancer related rare germline mutation genes; encoding and preprocessing the genome marker data and the basic information; screening and weighting the data in layers based on the mechanism of pancreatic carcinogenesis to form a risk feature set for pancreatic cancer risk evaluation; inputting the risk feature set into a risk prediction model to obtain a pancreatic cancer risk evaluation result; performing risk stratification according to the evaluation result; and outputting corresponding health management or medical intervention suggestions. The method provides a quantitative prediction of the lifetime risk of pancreatic cancer, can effectively distinguish populations at different risk levels, and has good stability and clinical application value.

Inventors

  • WANG XIAOYI
  • ZHANG QINGYUN
  • MIAO WENTENG
  • TAO BAIAN
  • LI JI
  • JIN CHEN
  • FU DELIANG

Assignees

  • Huashan Hospital, Fudan University (复旦大学附属华山医院)

Dates

Publication Date
2026-05-05
Application Date
2025-12-26

Claims (10)

  1. A pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers, comprising the steps of:
     S1, collecting basic information of a subject to be evaluated, wherein the basic information at least comprises gender, age, and whether the subject has a personal history of tumors or a family history of tumors;
     S2, collecting a peripheral blood sample from the subject, and performing genetic testing on the peripheral blood sample to obtain genome marker data related to pancreatic cancer;
     S3, performing data encoding and preprocessing on the genome marker data and the basic information, and screening and weighting the data based on a pancreatic cancer specific data screening algorithm to form a pancreatic cancer risk feature set;
     S4, inputting the pancreatic cancer risk feature set into a pre-constructed pancreatic cancer risk prediction model to obtain a pancreatic cancer risk assessment result for the subject;
     S5, calculating a pancreatic cancer risk ratio based on the risk assessment result, and performing risk stratification of the subject according to a preset threshold; and
     S6, outputting corresponding health management or medical intervention suggestions according to the risk stratification result.
  2. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 1, wherein the pancreatic cancer-related genomic markers in step S2 comprise single nucleotide polymorphism detection sites comprising at least one of the following sites: rs401681, rs2255280, rs2317900, rs6971499, rs167020, rs1561927, rs9581943, rs4885093, rs9543325, rs7190458, rs372883, rs1547374, rs16986825, rs5768709.
  3. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 1 or 2, wherein the pancreatic cancer-related genomic markers further comprise rare germline mutation genes comprising at least one of the following genes: AR, ATM, ATR, BARD1, BRAF, BRCA1, BRCA2, BRIP1, CDH1, CDK12, CHEK1, CHEK2, ERBB2, ESR1, FANCA, FANCL, HDAC2, HOXB13, KRAS, MRE11, NBN, NRAS, PALB2, PIK3CA, PPP2R2A, PTEN, RAD51B, RAD51C, RAD51D, RAD54L, STK11, TP53.
  4. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 1, wherein the pancreatic cancer specific data screening algorithm in step S3 comprises: mapping the genome marker data and the basic information into a common genetic susceptibility layer, a rare mutation layer, and a clinical regulation layer according to their roles in the mechanism of pancreatic carcinogenesis, wherein:
     the common genetic susceptibility layer is used to characterize the cumulative genetic susceptibility of multiple single nucleotide polymorphism sites;
     the rare mutation layer is used to characterize the risk of high-impact pancreatic cancer-related germline mutations; and
     the clinical regulation layer is used to characterize the modulating effect of age and tumor history on genetic risk.
  5. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 4, wherein in the common genetic susceptibility layer, the genotypes of the single nucleotide polymorphism sites are encoded and weighted as follows:
     first, each site is given a graded code based on its allele status, reflecting the cumulative number of variant alleles;
     then, a corresponding confidence correction factor is assigned to each site according to the stability of that site in the tested population; and
     finally, the graded code is combined with the corresponding confidence correction factor, so as to reduce the influence of sites with higher detection variability on the overall risk assessment.
  6. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 4, wherein in the rare mutation layer, rare germline mutations are treated according to the following rules:
     when no pancreatic cancer-associated rare germline mutation is detected, the rare mutation layer is marked as a low-risk state;
     when at least one pancreatic cancer-associated rare germline mutation is detected, the rare mutation layer as a whole is mapped to a single high-risk marker, so as to avoid undesired cumulative amplification of multiple rare germline mutations during risk assessment.
  7. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 1, wherein in step S3, the data encoding and preprocessing at least comprises:
     replacing missing detection results in the genotype data with a representative genotype based on population distribution characteristics;
     centering and scaling the age information to eliminate the influence of differing scales on the risk assessment; and
     recording the personal history of tumors or family history of tumors as present or absent.
  8. The pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to claim 1, wherein in step S4, the pancreatic cancer risk prediction model is constructed based on a multi-model evaluation strategy, and the model comprises at least one of a logistic regression model, a support vector machine model, a random forest model, or a gradient boosting tree model.
  9. A pancreatic cancer risk prediction system based on pancreatic cancer-related genomic markers, comprising:
     a basic information acquisition module, used for acquiring the gender, age, and personal history of tumors or family history of tumors of a subject to be evaluated;
     a genetic testing data acquisition module, used for acquiring genome marker detection data from a peripheral blood sample of the subject;
     a data encoding and preprocessing module, used for encoding the genome marker data and the basic information, handling missing values, and performing standardization;
     a pancreatic cancer specific data screening module, used for layering, screening, and weighting the data according to pancreatic cancer-related genetic mechanisms to generate a pancreatic cancer risk feature set;
     a risk prediction model module, used for outputting a pancreatic cancer risk assessment result based on the pancreatic cancer risk feature set;
     a risk assessment and stratification module, used for calculating the pancreatic cancer risk ratio and performing risk classification; and
     an intervention suggestion output module, used for outputting corresponding health management or medical intervention suggestions according to the risk level;
     wherein the modules cooperate to implement the method of claim 1.
  10. A computer readable storage medium, wherein the storage medium stores a plurality of instructions adapted to be loaded by a processor to perform the steps of the pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers according to any one of claims 1 to 8.
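
As an illustration only, the feature construction described in claims 5 to 7 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the confidence correction factors, the use of the most frequent cohort genotype as the "representative genotype", the cohort data, and all function and field names are hypothetical, since the claims give no numeric values or formulas.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical confidence correction factors; the claims specify no values.
CONFIDENCE = {"rs401681": 1.0, "rs2255280": 0.8, "rs9543325": 0.6}


def representative_genotype(cohort_codes):
    """Assumed rule: most frequent non-missing variant-allele count (0, 1 or 2)."""
    observed = [code for code in cohort_codes if code is not None]
    return Counter(observed).most_common(1)[0][0]


def common_layer_score(genotypes, fallback):
    """Sum of graded codes, each multiplied by its site's confidence factor.

    A missing call (None) is replaced by the site's representative genotype.
    """
    score = 0.0
    for site, factor in CONFIDENCE.items():
        code = genotypes.get(site)
        if code is None:
            code = fallback[site]
        score += factor * code
    return score


def rare_layer_flag(mutated_genes):
    """Single high-risk flag: 1 if any rare germline mutation, else 0 (never a count)."""
    return 1 if mutated_genes else 0


def scale_age(age, cohort_ages):
    """Center and scale age using the cohort mean and population standard deviation."""
    return (age - mean(cohort_ages)) / pstdev(cohort_ages)


def build_feature_set(subject, fallback, cohort_ages):
    """Assemble the three layers into one feature dictionary (field names assumed)."""
    return {
        "common_layer": common_layer_score(subject["genotypes"], fallback),
        "rare_layer": rare_layer_flag(subject["rare_mutations"]),
        "age_scaled": scale_age(subject["age"], cohort_ages),
        "tumor_history": 1 if subject["tumor_history"] else 0,
    }


# Made-up reference cohort: variant-allele counts per site, None = missing call.
cohort = {
    "rs401681": [0, 1, 1, 2, None],
    "rs2255280": [0, 0, 1, None, 0],
    "rs9543325": [2, 2, 1, 1, 2],
}
cohort_ages = [40, 50, 60, 70, 80]
fallback = {site: representative_genotype(codes) for site, codes in cohort.items()}

# Made-up subject with one missing genotype and two rare germline mutations.
subject = {
    "genotypes": {"rs401681": 2, "rs2255280": None, "rs9543325": 1},
    "rare_mutations": ["BRCA2", "ATM"],
    "age": 60,
    "tumor_history": True,
}
features = build_feature_set(subject, fallback, cohort_ages)
print(features)
```

In this sketch the two rare mutations collapse to a single flag, which mirrors the rule in claim 6 that multiple rare germline mutations must not accumulate. The resulting dictionary would then be passed to whichever model from claim 8 is used.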

Description

Pancreatic cancer risk prediction method and system based on pancreatic cancer related genome markers

Technical Field

The invention relates to the technical field of pancreatic cancer genomics analysis, in particular to a pancreatic cancer risk prediction method and a pancreatic cancer risk prediction system based on pancreatic cancer related genome markers.

Background

Pancreatic cancer is a common tumor of the digestive tract. Pancreatic ductal adenocarcinoma is the major pathological type, accounting for 85%-95% of all pancreatic cancers (hereinafter, "pancreatic cancer" refers to pancreatic ductal adenocarcinoma). According to Global Cancer Observatory data published by the World Health Organization in 2018, there were 458,918 new cases of pancreatic cancer worldwide and 432,242 deaths, making it the seventh leading cause of cancer death in both men and women. According to 2020 projections from the American Cancer Society, the United States was expected to see 57,600 new cases of pancreatic cancer and 47,050 deaths in 2020. The number of deaths approaches the number of new cases. The incidence and mortality of pancreatic cancer are higher in developing countries than in developed countries. According to China's 2015 cancer statistics, there were 90,100 new pancreatic cancer cases that year and 79,400 deaths. Mortality is high, and the average 5-year relative survival rate is only about 6%. Pancreatic cancer imposes a heavy treatment burden, severely impairs patients' quality of life, and is a serious threat to life and health.
With the implementation of the Human Genome Project (HGP), the creation of the haplotype map (HapMap) database, and the rapid development of second-generation sequencing technologies, genomic studies have begun to play an important role in cancer research. Genetic markers can serve as diagnostic markers and can also be used to study the molecular mechanisms of pancreatic cancer and to search for novel pancreatic cancer-specific tumor markers, offering new research methods and opportunities for explaining its pathogenesis. Genomic studies cover single nucleotide polymorphisms (SNPs), copy number variations (CNVs), and germline mutations, among others. Genome markers are stably inherited in the human genome and can be detected directly from a peripheral blood specimen. Genome markers that have been shown to be associated with disease can be used for disease risk assessment. It is therefore entirely feasible to use genome markers for pancreatic cancer risk prediction. However, conventional research on pancreatic cancer genome markers faces practical problems: the available sample size is small, the risk conferred by individual genome markers is no higher than in other tumors, and the germline mutation rate is low. By incorporating artificial intelligence algorithms, a pancreatic cancer risk prediction model suitable for practical clinical reference can be constructed.

Disclosure of Invention

The invention aims to provide a pancreatic cancer risk prediction method and a pancreatic cancer risk prediction system based on pancreatic cancer related genome markers, that is, an artificial intelligence-assisted risk prediction model built on pancreatic cancer related genome markers, thereby addressing the lack of a risk prediction model for pancreatic cancer.
In order to achieve the above object, in one aspect, the present invention provides a pancreatic cancer risk prediction method based on pancreatic cancer-related genomic markers, comprising the steps of: S1, collecting basic information of a subject to be evaluated, wherein the basic information at least comprises gender, age, and whether the subject has a personal history of tumors or a family history of tumors; S2, collecting a peripheral blood sample from the subject, and performing genetic testing on the peripheral blood sample to obtain genome marker data related to pancreatic cancer; S3, performing data encoding and preprocessing on the genome marker data and the basic information, and screening and weighting the data based on a pancreatic cancer specific data screening algorithm to form a pancreatic cancer risk feature set; S4, inputting the pancreatic cancer risk feature set into a pre-constructed pancreatic cancer risk prediction model to obtain a pancreatic cancer risk assessment result for the subject; S5, calculating a pancreatic cancer risk ratio based on the risk assessment result, and performing risk stratification of the subject according to a