US-12624400-B2 - Systems and methods to detect rare mutations and copy number variation
Abstract
The present disclosure provides a system and method for the detection of rare mutations and copy number variations in cell free polynucleotides. Generally, the systems and methods comprise sample preparation, or the extraction and isolation of cell free polynucleotide sequences from a bodily fluid; subsequent sequencing of cell free polynucleotides by techniques known in the art; and application of bioinformatics tools to detect rare mutations and copy number variations as compared to a reference. The systems and methods also may contain a database or collection of different rare mutations or copy number variation profiles of different diseases, to be used as additional references in aiding detection of rare mutations, copy number variation profiling or general genetic profiling of a disease.
Inventors
- AmirAli Talasaz
Assignees
- GUARDANT HEALTH, INC.
Dates
- Publication Date: 20260512
- Application Date: 20250827
Claims (20)
- 1. A method for monitoring residual disease in a subject, the method comprising: (a) providing a first sample from the subject comprising genomic polynucleotides and sequencing the genomic polynucleotides or amplicons thereof to generate a first set of sequence data, wherein the first sample is obtained from a tumor biopsy from the subject; (b) determining a frequency of cancer mutations which are single base substitutions from the first set of sequence data from the first sample; (c) providing a second sample from the subject comprising cell-free deoxyribonucleic acid (cfDNA) molecules, enriching cfDNA molecules, or amplicons thereof, which comprise target regions of interest comprising the cancer mutations to provide enriched molecules and sequencing the enriched molecules or amplicons thereof to produce a second set of sequence data, wherein the second sample is obtained from the subject after the subject has undergone a course of treatment for cancer; (d) determining a frequency of the cancer mutations discovered in the first sample from the second set of sequence data from the second sample; and (e) determining a presence or absence of cancer in the subject based on an analysis of the frequency of the cancer mutations from the second set of sequence data from the second sample, thereby monitoring for the residual disease in the subject.
- 2. The method of claim 1, wherein the second sample is a blood sample.
- 3. The method of claim 2, wherein the second sample comprises between 1 nanogram (ng) to 100 ng of cfDNA molecules.
- 4. The method of claim 3, wherein adapters are ligated to a plurality of the cfDNA molecules prior to sequencing to generate tagged polynucleotides.
- 5. The method of claim 4, wherein the adapters comprise binding sites for universal amplification primers.
- 6. The method of claim 5, wherein the tagged polynucleotides are amplified to generate amplified progeny polynucleotides.
- 7. The method of claim 6, wherein the enrichment is of the amplified progeny polynucleotides.
- 8. The method of claim 7, wherein the enrichment comprises an amplification-based enrichment.
- 9. The method of claim 7, wherein the enrichment comprises oligonucleotide probes that selectively hybridize to the target regions of interest.
- 10. The method of claim 7, wherein the target regions of interest comprise exon regions, non-coding regions, or both.
- 11. The method of claim 1, wherein the cancer is colorectal cancer, lung cancer, or breast cancer.
- 12. The method of claim 7, wherein the first set of sequence data and the second set of sequence data comprise sequencing reads.
- 13. The method of claim 12, wherein the sequencing reads from the second set of sequencing data from the second sample are mapped to a reference sequence.
- 14. The method of claim 13, wherein a plurality of sequencing reads that map to the reference sequence are grouped into families having a same start and stop base position.
- 15. The method of claim 13, wherein the adapters further comprise molecular barcodes, and wherein a plurality of sequencing reads that map to the reference sequence are grouped into families having a same start and stop base position and the same molecular barcode sequence.
- 16. The method of claim 13, wherein cancer mutations in the second set of sequence data from the second sample are determined by comparing the sequencing reads to the reference sequence.
- 17. The method of claim 14, wherein cancer mutations in the second set of sequence data from the second sample are determined by collapsing sequencing reads within each family to yield a base call at a genetic locus and determining a frequency of one or more bases called at the locus from among the families.
- 18. The method of claim 15, wherein cancer mutations in the second set of sequence data from the second sample are determined by collapsing sequencing reads within each family to yield a base call at a genetic locus and determining a frequency of one or more bases called at the locus from among the families.
- 19. The method of claim 1, wherein the first set of sequence data comprises sequencing reads from the genomic polynucleotides, wherein the frequency of cancer mutations from the first set of sequence data from the first sample is determined by comparing sequencing reads to a reference sequence.
- 20. The method of claim 1, wherein the course of treatment is selected based on the frequency of cancer mutations determined from the first sample.
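Illustrative note (not part of the claims): the family grouping and collapsing recited in claims 14 to 18 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The read representation (a dict with `start`, `stop`, `barcode`, and `seq` keys, with `stop` inclusive), the function names, and the simple majority-vote collapsing rule are all hypothetical choices made for illustration; the claims do not specify how the collapsing is performed.

```python
from collections import Counter, defaultdict


def group_into_families(reads):
    """Group mapped reads that share start, stop, and barcode (assumed keys)."""
    families = defaultdict(list)
    for read in reads:
        key = (read["start"], read["stop"], read["barcode"])
        families[key].append(read)
    return families


def collapse_family(family, locus):
    """Return a majority-vote base call at `locus`, or None if uncovered or tied.

    Majority voting is an assumption; other consensus rules are possible.
    """
    bases = Counter()
    for read in family:
        offset = locus - read["start"]
        if 0 <= offset < len(read["seq"]):
            bases[read["seq"][offset]] += 1
    if not bases:
        return None
    ranked = bases.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous family, no call
    return ranked[0][0]


def base_frequencies(reads, locus):
    """Frequency of each base called at `locus`, counted once per family."""
    calls = Counter()
    for family in group_into_families(reads).values():
        call = collapse_family(family, locus)
        if call is not None:
            calls[call] += 1
    total = sum(calls.values())
    return {base: n / total for base, n in calls.items()} if total else {}


# Hypothetical toy data: four families covering reference position 102.
reads = [
    {"start": 100, "stop": 104, "barcode": "AC", "seq": "ACGTA"},
    {"start": 100, "stop": 104, "barcode": "AC", "seq": "ACGTA"},
    {"start": 100, "stop": 104, "barcode": "AC", "seq": "ACTTA"},  # likely read error
    {"start": 100, "stop": 104, "barcode": "GT", "seq": "ACTTA"},
    {"start": 100, "stop": 104, "barcode": "GT", "seq": "ACTTA"},
    {"start": 101, "stop": 105, "barcode": "AC", "seq": "CGTAC"},
    {"start": 100, "stop": 104, "barcode": "TT", "seq": "ACGTA"},
]
freqs = base_frequencies(reads, locus=102)
print(freqs)
```

In this toy input, the single discordant read in the first family is outvoted, so each family contributes one base call, and the reported frequencies are computed over families rather than over raw reads.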
Description
CROSS-REFERENCE
This application is a continuation of U.S. patent application Ser. No. 19/088,591, filed Mar. 24, 2025, which is a continuation of U.S. patent application Ser. No. 18/930,072, filed Oct. 29, 2024, (now U.S. Pat. No. 12,319,972), which is a continuation of U.S. patent application Ser. No. 18/426,665, filed Jan. 30, 2024, (now U.S. Pat. No. 12,252,749, issued Mar. 3, 2025), which is a continuation of U.S. patent application Ser. No. 18/333,436, filed Jun. 12, 2023, (now U.S. Pat. No. 12,049,673, issued Jul. 30, 2024), which is a continuation of U.S. patent application Ser. No. 17/696,524, filed Mar. 16, 2022, (now U.S. Pat. No. 11,879,158, issued Jan. 24, 2023), which is a continuation of U.S. patent application Ser. No. 17/386,338, filed Jul. 27, 2021, (now U.S. Pat. No. 11,319,598, issued May 3, 2022), which is a continuation of U.S. patent application Ser. No. 17/370,941, filed Jul. 8, 2021, (now U.S. Pat. No. 11,319,597, issued May 3, 2022), which is a continuation of U.S. patent application Ser. No. 17/210,191, filed Mar. 23, 2021, (now U.S. Pat. No. 12,054,783, issued Aug. 6, 2024), which is a continuation of U.S. patent application Ser. No. 16/709,437, filed Dec. 10, 2019 (now U.S. Pat. No. 10,961,592, issued Mar. 30, 2021), which is a continuation of U.S. patent application Ser. No. 16/593,633, filed Oct. 4, 2019, (now U.S. Pat. No. 10,822,663, issued Nov. 3, 2020), which is a continuation of U.S. patent application Ser. No. 16/575,128, filed Sep. 18, 2019 (now U.S. Pat. No. 10,793,916, issued Oct. 6, 2020), which is a continuation of U.S. patent application Ser. No. 16/283,635, filed Feb. 22, 2019 (now U.S. Pat. No. 10,494,678, issued Dec. 3, 2019), which is a continuation of U.S. patent application Ser. No. 15/872,831, filed Jan. 16, 2018 (now U.S. Pat. No. 10,457,995, issued Oct. 29, 2019), which is a continuation application of U.S. patent application Ser. No. 15/828,099, filed Nov. 30, 2017 (now U.S. Pat. No. 10,837,063, issued Nov. 17, 2020), which is a continuation application of U.S. patent application Ser. No. 15/467,570, filed Mar. 23, 2017 (now U.S. Pat. No. 9,840,743, issued Dec. 12, 2017), which is a continuation application of U.S. patent application Ser. No. 14/425,189, filed Mar. 2, 2015 (now U.S. Pat. No. 10,041,127, issued Aug. 7, 2018), which is a national stage entry of international Application No. PCT/US2013/058061, filed Sep. 4, 2013, which claims priority to 61/845,987, filed Jul. 13, 2013, and 61/793,997, filed Mar. 15, 2013, and 61/704,400, filed Sep. 21, 2012, and 61/696,734, filed Sep. 4, 2012, each of which is entirely incorporated herein by reference for all purposes.
BACKGROUND OF THE INVENTION
The detection and quantification of polynucleotides is important for molecular biology and medical applications such as diagnostics. Genetic testing is particularly useful for a number of diagnostic methods. For example, disorders that are caused by rare genetic alterations (e.g., sequence variants) or changes in epigenetic markers, such as cancer and partial or complete aneuploidy, may be detected or more accurately characterized with DNA sequence information. Early detection and monitoring of genetic diseases, such as cancer, is often useful and needed in the successful treatment or management of the disease. One approach may include the monitoring of a sample derived from cell free nucleic acids, a population of polynucleotides that can be found in different types of bodily fluids. In some cases, disease may be characterized or detected based on detection of genetic aberrations, such as a change in copy number variation and/or sequence variation of one or more nucleic acid sequences, or the development of other certain rare genetic alterations. Cell free DNA (“cfDNA”) has been known in the art for decades, and may contain genetic aberrations associated with a particular disease.
With improvements in sequencing and techniques to manipulate nucleic acids, there is a need in the art for improved methods and systems for using cell free DNA to detect and monitor disease.
SUMMARY OF THE INVENTION
The disclosure provides for a method for detecting copy number variation comprising: a) sequencing extracellular polynucleotides from a bodily sample from a subject, wherein each of the extracellular polynucleotides is optionally attached to unique barcodes; b) filtering out reads that fail to meet a set threshold; c) mapping sequence reads obtained from step (a) to a reference sequence; d) quantifying/counting mapped reads in two or more predefined regions of the reference sequence; e) determining a copy number variation in one or more of the predefined regions by (i) normalizing the number of reads in the predefined regions to each other and/or the number of unique barcodes in the predefined regions to each other; and (ii) comparing the normalized numbers obtained in step (i) to normalized numbers obtained from a control sample. The disclosure also provides for a method for d