CN-121999861-A - Method and system for detecting whole genome replication and chromosome instability

Abstract

The invention relates to a marker for whole genome duplication (WGD) and chromosomal instability (CIN), together with a detection method and a detection system, and belongs to the technical field of medical molecular biology. The invention provides scoring methods for WGD and CIN. The marker can be used to score WGD and CIN effectively, and the calculated results agree well with results obtained by whole genome sequencing (WGS).

Inventors

  • LIU RUI
  • BAO HUA
  • CHANG ZHILI
  • WU XUE
  • SHAO YANG

Assignees

  • 南京世和基因生物技术股份有限公司
  • 南京世和医疗器械有限公司
  • 南京世和医学检验有限公司

Dates

Publication Date
2026-05-08
Application Date
2026-01-21

Claims (10)

  1. A method for calculating indices of whole genome duplication (WGD) and chromosomal instability (CIN), for non-therapeutic and diagnostic purposes, comprising the steps of:
     a) performing targeted sequencing on a biological sample to obtain sequencing depth and allele frequency information at a set of single nucleotide polymorphism (SNP) sites, wherein the SNP sites are distributed across the whole genome of the species to which the sample to be tested belongs;
     b) dividing the genome of the sample to be tested into a plurality of genome segments by allele-specific copy number analysis based on the sequencing depth and allele frequency information obtained in step a), and determining the primary copy number and the secondary copy number of each genome segment;
     c) calculating a WGD index and a CIN index based on the result of step b), wherein:
        the WGD index is the ratio obtained by summing the lengths of all genome segments whose primary copy number is greater than or equal to 2 and dividing that sum by the total length of all genome segments;
        the CIN index is calculated by defining the copy number state of the longest genome segment in the genome as a baseline state, wherein the copy number state is jointly determined by the primary copy number and the secondary copy number of the genome segment;
     d) outputting the calculated WGD index and CIN index.
  2. A system for detecting whole genome duplication (WGD) and chromosomal instability (CIN), comprising:
     a sequencing module for performing targeted sequencing on a biological sample to obtain sequencing depth and allele frequency information at a set of single nucleotide polymorphism (SNP) sites, wherein the SNP sites are distributed across the whole genome of the species to which the sample to be tested belongs;
     a copy number calculation module for dividing the genome of the sample to be tested into a plurality of genome segments by allele-specific copy number analysis based on the sequencing depth and allele frequency information obtained by the sequencing module, and determining the primary copy number and the secondary copy number of each genome segment;
     a WGD index and CIN index output module for calculating the WGD index and the CIN index, wherein the WGD index is the ratio obtained by summing the lengths of all genome segments whose primary copy number is greater than or equal to 2 and dividing that sum by the total length of all genome segments, and the CIN index is calculated by defining the copy number state of the longest genome segment in the genome as a baseline state, wherein the copy number state is jointly determined by the primary copy number and the secondary copy number of the genome segment;
     a result output module for outputting the calculated WGD index and CIN index.
  3. The detection system of claim 2, wherein the set of SNP sites satisfies one or more of the following characteristics in a specific target population of the species to which the sample to be tested belongs: an allele frequency in the target population within a predetermined range; a position in the genome outside simple repeat sequences, low-complexity regions, centromeric regions, and regions known to be difficult to sequence; and an approximately uniform distribution across the genome.
  4. The detection system of claim 2, wherein the approximately uniform distribution is achieved by dividing the whole genome into a plurality of consecutive windows of preset length and selecting, within each window, at least one SNP site satisfying the characteristic, preferably a SNP site having higher heterozygosity.
  5. The detection system of claim 2, wherein the determination of the SNP sites further comprises the steps of:
     i. performing a technical feasibility assessment of candidate SNP sites, the assessment comprising analyzing whether each site and its adjacent sequence are suitable for designing an oligonucleotide capture probe that aligns uniquely; and
     ii. among the SNP sites that pass the technical feasibility assessment, preferentially selecting sites with higher expected heterozygosity in the target population, subject to the requirement of an approximately uniform distribution.
  6. The detection system of claim 2, wherein the window of preset length is 100 bp to 1 kb long, and wherein selecting the SNP site having higher heterozygosity within each window specifically means selecting, in each window, the one SNP site with the highest expected heterozygosity from all candidate SNP sites satisfying the characteristic.
  7. The detection system of claim 2, wherein the number of SNP sites in the set is 5000 to 50000.
  8. The detection system of claim 2, wherein evaluating the WGD status of the biological sample comprises comparing the calculated WGD index with a preset WGD-positive threshold and determining that the sample is WGD positive when the WGD index is greater than the threshold, and wherein evaluating the CIN status of the biological sample comprises comparing the calculated CIN index with a preset CIN-positive threshold and determining that the sample is CIN positive when the CIN index is greater than the threshold.
  9. A detection system for assessing the status of whole genome duplication (WGD) and chromosomal instability (CIN) in a biological sample, comprising: a data acquisition module for acquiring sequencing depth and allele frequency information at a set of single nucleotide polymorphism (SNP) sites from a targeted sequencing result of the biological sample; a processor; and a memory storing computer program instructions which, when executed by the processor, implement the method of claim 1.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of claim 1.
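
The calculations recited in claims 1, 6 and 8 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the `Segment` and `Snp` record layouts, all function names, and the use of 2p(1-p) as the expected heterozygosity (which assumes biallelic sites in Hardy-Weinberg equilibrium) are introduced here for illustration. The claims define only the baseline state for the CIN index, so the sketch stops at that step rather than guessing the rest of the CIN calculation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Segment:
    """One genome segment from allele-specific copy number analysis (hypothetical layout)."""
    chrom: str
    start: int         # inclusive start, in bp
    end: int           # exclusive end, in bp
    primary_cn: int    # primary copy number of the segment
    secondary_cn: int  # secondary copy number of the segment

    @property
    def length(self) -> int:
        return self.end - self.start


@dataclass(frozen=True)
class Snp:
    """Candidate SNP site (hypothetical layout)."""
    chrom: str
    pos: int
    allele_freq: float  # allele frequency in the target population

    @property
    def expected_het(self) -> float:
        # Assumption: biallelic site in Hardy-Weinberg equilibrium.
        return 2.0 * self.allele_freq * (1.0 - self.allele_freq)


def wgd_index(segments: Sequence[Segment]) -> float:
    """Length of segments with primary copy number >= 2, divided by total segment length."""
    total = sum(s.length for s in segments)
    if total <= 0:
        raise ValueError("segments must have positive total length")
    doubled = sum(s.length for s in segments if s.primary_cn >= 2)
    return doubled / total


def baseline_state(segments: Sequence[Segment]) -> tuple[int, int]:
    """Copy number state (primary, secondary) of the single longest segment."""
    longest = max(segments, key=lambda s: s.length)
    return (longest.primary_cn, longest.secondary_cn)


def is_positive(index: float, threshold: float) -> bool:
    """Positive call when the index is strictly greater than the preset threshold."""
    return index > threshold


def pick_snp_per_window(snps: Iterable[Snp], window: int) -> list[Snp]:
    """Keep, in each fixed-length window, the candidate with the highest expected heterozygosity."""
    best: dict[tuple[str, int], Snp] = {}
    for snp in snps:
        key = (snp.chrom, snp.pos // window)
        if key not in best or snp.expected_het > best[key].expected_het:
            best[key] = snp
    return sorted(best.values(), key=lambda s: (s.chrom, s.pos))
```

For example, given three segments of 100 bp, 50 bp and 50 bp with primary copy numbers 2, 1 and 3, `wgd_index` returns 150 / 200 = 0.75, and `baseline_state` returns the state of the 100 bp segment. The threshold passed to `is_positive` is left to the caller, since the claims recite a preset threshold without giving a value.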

Description

Method and system for detecting whole genome replication and chromosome instability

Technical Field

The invention relates to a marker for whole genome duplication and chromosomal instability, together with a detection method and a detection system, and belongs to the technical field of medical molecular biology.

Background

Whole genome duplication (WGD) and chromosomal instability (CIN) are important biological phenomena in the structural evolution of the cellular genome and the maintenance of its stability. WGD refers to an event in which a cell or individual undergoes doubling of its entire chromosome complement during evolution or disease. The process is widespread in plant evolution and is considered an important driving force for species differentiation and adaptive evolution. Although rare in mammalian and human cells, WGD has been shown in recent studies to be common in a variety of tumor types and closely related to chromosomal instability. WGD events cause a transient increase in the number of chromosomes in the cell, providing a "hotbed" for subsequent instability such as aneuploidy, genomic rearrangements, and chromosome breaks. WGD is therefore considered an important early step in driving CIN and one of the key markers of the occurrence and progression of malignant tumors.

CIN usually manifests as an abnormality in chromosome number (aneuploidy) or chromosome structure (deletion, amplification, translocation, etc.). Evidence shows that cell populations that have undergone WGD are more prone to CIN, which promotes clonal evolution, tumor heterogeneity, and the development of drug resistance. Clinically, the presence of WGD and CIN is often associated with poor prognosis and with the efficacy of radiotherapy, chemotherapy, and targeted therapies. The combined detection of WGD and CIN levels is therefore of considerable value in guiding clinical practice.

Currently, detection methods for WGD and CIN mainly include:
(1) Cytogenetic methods, such as flow cytometry (DNA content analysis), karyotyping, and fluorescence in situ hybridization (FISH). These can identify euploid and aneuploid states, but they have limited resolution and struggle to characterize complex genomic variation accurately.
(2) Molecular detection methods, such as comparative genomic hybridization (CGH), whole genome sequencing (WGS), and single-cell sequencing. These can identify WGD events and related copy number variations at higher resolution, but they have long turnaround times, high cost, and a heavy dependence on data analysis, which hinders rapid clinical adoption.
(3) Indirect functional indicators, such as DNA damage markers (γ-H2AX) and cyclin expression. These may suggest a CIN-related state but do not specifically reflect the occurrence and extent of WGD.

The prior art still has the following defects:
(1) Detection of WGD depends on sequencing, so fast, low-cost routine detection is difficult to achieve.
(2) There is no joint marker system capable of reflecting WGD and CIN states simultaneously and dynamically.
(3) There is currently no unified, standardized detection platform, so data from different laboratories are difficult to compare and to translate into clinical practice.

Therefore, developing a novel marker, detection method, and detection system that can effectively identify whole genome duplication events and jointly evaluate chromosomal instability is of great significance for genomics research, early tumor diagnosis, prognosis evaluation, and monitoring of drug efficacy.

Disclosure of Invention

The invention provides a method and system for detecting whole genome duplication events and chromosomal instability based on genome-wide single nucleotide polymorphism information.

Using high-throughput sequencing, the method incorporates SNP data specific to East Asian and Chinese populations to achieve efficient and accurate detection of WGD and CIN, and provides a reliable technical means for early diagnosis, prognosis evaluation, drug resistance mechanism research, and personalized treatment of tumors.

The technical scheme of the invention comprises the following key steps. A detection method for assessing the status of whole genome duplication (WGD) and chromosomal instability (CIN) in a biological sample comprises the steps of:
a) performing targeted sequencing on the biological sample to obtain sequencing depth and allele frequency information at a set of single nucleotide polymorphism (SNP) sites, wherein the SNP sites are distributed across the whole genome of the species to which the sample to be tested belongs;
b) dividing the genome of the sample to be tested into a plurality of genome segments by allele-specific copy number (ASCN) analysis based on the sequencing depth and allele frequency information obtained in step a), and determining a