EP-4455309-B1 - METHODS FOR ANALYSING NUCLEIC ACID SEQUENCE INFORMATION USING GC BIAS, OPTIONALLY FOR DETECTING FETAL NUCLEIC ACID ABNORMALITIES
Inventors
- LAPIDUS, STANLEY, N.
- THOMPSON, JOHN, F.
- LIPSON, DORON
- MILOS, PATRICE, M.
- EFCAVITCH, J., WILLIAM
- LETOVSKY, STANLEY
Dates
- Publication Date
- 20260513
- Application Date
- 20110209
Claims (14)
- A method for chromosomal counting or chromosomal segment counting with reduced or eliminated effects of GC bias in sequence information, wherein GC bias results in skewing the counting such that chromosomes or chromosome segments with extreme GC content appear to have more or fewer copy numbers than their real copy number, the method comprising: sequencing a sample to obtain nucleic acid sequence information; and performing chromosomal counting or chromosomal segment counting using counts of a selected subset of bins, wherein the subset is selected according to a given range of GC content values such that average GC content per chromosome or chromosome segment is equalized or less skewed, thereby correcting the sequence information to account for GC bias.
- The method according to claim 1, comprising: comparing corrected sequence information to a reference sequence; identifying fetal nucleic acid in the sample; and determining whether the fetus has an abnormality based on the comparison to the reference sequence.
- The method according to claim 1 or 2, wherein the bins in the selected subset of bins have a range of GC content of about 0.42 to 0.48.
- The method according to claim 1 or 2, wherein the selected subset of bins represents about 25% of a reference genome.
- The method according to any one of claims 1 to 4, wherein the size of a bin is about 1000 kilobase pairs (kb).
- The method according to any one of claims 1 to 5, wherein the sample is a tissue or body fluid.
- The method according to claim 6, wherein the sample is suspected to contain fetal nucleic acid.
- The method according to claim 7, wherein the body fluid is maternal blood, blood plasma, or serum.
- The method according to any one of claims 1 to 8, wherein sequencing is single molecule sequencing, sequencing by synthesis, and/or sequencing by nanopore detection.
- The method according to claim 2, wherein the reference sequence is selected from a maternal reference sequence, a fetal reference sequence, or a consensus human genomic sequence.
- The method according to claim 10, wherein the maternal reference sequence is selected from a sequence obtained from a buccal sample, a saliva sample, a urine sample, a breast nipple aspirate sample, a sputum sample, a tear sample, and an amniotic fluid sample.
- The method according to claim 2, wherein the fetal nucleic acid is cell free circulating fetal nucleic acid.
- The method according to claim 2, wherein prior to the sequencing step, the method further comprises enriching for fetal nucleic acid in the sample.
- The method according to claim 2, wherein the identifying step comprises a technique selected from sparse allele calling, targeted gene sequencing, identification of Y chromosomal material, enumeration, copy number analysis, and inversion analysis.
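By way of illustration only, the bin selection recited in claim 1 can be sketched as follows. The sketch assumes the reference genome has already been divided into bins, each annotated with its GC fraction and the number of reads aligned to it. The `Bin` type, the function name `gc_restricted_counts`, the toy values, and the choice to report the mean count per selected bin are all hypothetical and are not taken from the claims. The default range of 0.42 to 0.48 follows claim 3.

```python
from collections import defaultdict
from typing import NamedTuple


class Bin(NamedTuple):
    chrom: str   # chromosome (or chromosome segment) the bin belongs to
    gc: float    # GC fraction of the reference sequence in this bin
    count: int   # number of reads aligned to this bin


def gc_restricted_counts(bins, gc_min=0.42, gc_max=0.48):
    """Mean read count per chromosome, using only bins whose GC
    fraction lies in [gc_min, gc_max].

    Averaging over the selected bins is an assumption of this sketch;
    it keeps chromosomes with different numbers of selected bins comparable.
    """
    totals = defaultdict(int)
    n_selected = defaultdict(int)
    for b in bins:
        if gc_min <= b.gc <= gc_max:
            totals[b.chrom] += b.count
            n_selected[b.chrom] += 1
    return {chrom: totals[chrom] / n_selected[chrom] for chrom in totals}


# Toy data, invented for illustration only.
bins = [
    Bin("chr1", 0.45, 100),
    Bin("chr1", 0.60, 160),   # outside the GC range, ignored
    Bin("chr1", 0.43, 104),
    Bin("chr21", 0.44, 150),
    Bin("chr21", 0.30, 60),   # outside the GC range, ignored
    Bin("chr21", 0.47, 156),
]

per_chrom = gc_restricted_counts(bins)
ratio = per_chrom["chr21"] / per_chrom["chr1"]
print(per_chrom, ratio)
```

Because every chromosome is counted only over bins of similar GC content, the bins with extreme GC content (0.60 and 0.30 in the toy data) do not contribute to the comparison between chromosomes.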
Description
Field of the Invention
The invention generally relates to methods for chromosome and chromosomal segment counting with reduced GC bias.
Background
Fetal aneuploidy (e.g., Down syndrome, Edwards syndrome, and Patau syndrome) and other chromosomal aberrations affect 9 of 1,000 live births (Cunningham et al. in Williams Obstetrics, McGraw-Hill, New York, p. 942, 2002). Chromosomal abnormalities are generally diagnosed by karyotyping of fetal cells obtained by invasive procedures such as chorionic villus sampling or amniocentesis. Those procedures are associated with potentially significant risks to both the fetus and the mother. Noninvasive screening using maternal serum markers or ultrasound is available but has limited reliability (Fan et al., PNAS, 105(42):16266-16271, 2008).
Since the discovery of intact fetal cells in maternal blood, there has been intense interest in trying to use those cells as a diagnostic window into fetal genetics (Fan et al., PNAS, 105(42):16266-16271, 2008). The discovery that certain amounts (between about 3% and about 6%) of cell-free fetal nucleic acids exist in maternal circulation has led to the development of noninvasive PCR-based prenatal genetic tests for a variety of traits.
A problem with those tests is that PCR-based assays trade off sensitivity for specificity, making it difficult to identify particular mutations. Further, due to the stochastic nature of PCR, a population of molecules that is present in a small amount in the sample, such as fetal nucleic acid in a sample from a maternal tissue or body fluid, is often overlooked. In fact, if rare nucleic acid is not amplified in the first few rounds of amplification, it becomes increasingly unlikely that the rare event will ever be detected. Additionally, fetal nucleic acid in a maternal sample may be degraded and not amenable to PCR amplification due to the small size of the nucleic acid.
There is a need for methods that can noninvasively detect fetal nucleic acids and diagnose fetal abnormalities.
Alkan et al. (Nat Genet. 41(10):1061-1067, 2009) disclose an algorithm for mapping next generation sequence reads. The authors utilized a statistical correction to correct GC biases. Chu et al. (Bioinformatics 25(10):1244-50, 2009) disclose a whole genome sequencing method for diagnosis of aneuploidy. The authors developed a statistical model to automatically correct GC bias in whole genome sequencing data. WO 2010/033578 A2 discloses a method for analysing a maternal sample which is collected non-invasively. The method involves generating a large number of short sequence reads of maternal and fetal DNA. The method further involves steps for correcting sequence tag density bias.
Summary
The invention is defined by the appended claims.
Methods of the invention take advantage of sequencing technologies, particularly single molecule sequencing-by-synthesis technologies, to detect fetal nucleic acid in maternal tissues or body fluids. Methods of the invention are highly sensitive and allow for the detection of the small population of fetal nucleic acids in a maternal sample, generally without the need for amplification of the nucleic acid in the sample.
Methods of the invention involve sequencing nucleic acid obtained from a maternal sample and distinguishing between maternal and fetal nucleic acid. Distinguishing between maternal and fetal nucleic acid identifies fetal nucleic acid, thus allowing the determination of abnormalities based upon sequence variation. Such abnormalities may be determined as single nucleotide polymorphisms, variant motifs, inversions, deletions, additions, or any other nucleic acid rearrangement or abnormality.
Methods of the invention are also used to determine the presence of fetal nucleic acid in a maternal sample by identifying nucleic acid that is unique to the fetus.
For example, one can look for differences between the obtained sequence and a maternal reference sequence, or identify Y chromosomal material in the sample. The maternal sample may be a tissue or body fluid. In particular embodiments, the body fluid is maternal blood, maternal blood plasma, or maternal serum. The invention also provides a way to confirm the presence of fetal nucleic acid in a maternal sample by, for example, looking for unique sequences or variants.
The sequencing reaction may be any sequencing reaction. In particular embodiments, the sequencing reaction is a single molecule sequencing reaction. Single-molecule sequencing is shown, for example, in Lapidus et al. (U.S. patent number 7,169,560), Lapidus et al. (U.S. patent application number 2009/0191565), Quake et al. (U.S. patent number 6,818,395), Harris (U.S. patent number 7,282,337), Quake et al. (U.S. patent application number 2002/0164629), and Braslavsky et al., PNAS (USA), 100:3960-3964 (2003). Briefly, in some implementations, a single-stranded nucleic acid (e.g., DNA or cDNA) is hybridized to oligonucleotide