EP-4735890-A1 - METHODS FOR DETERMINING SURVEILLANCE AND THERAPY FOR DISEASES

Abstract

Disclosed herein are methods, compositions, and devices for use in early detection of cancer. The methods include sequencing a panel of regions in cell-free nucleic acid molecules and detecting one or more biomarkers that are indicative of a cancer.

Inventors

ELTOUKHY, HELMY

Assignees

Guardant Health, Inc.

Dates

Publication Date: 20260506
Application Date: 20240628

Claims (20)

1. A method, comprising: determining a state of biological molecules obtained from a sample derived from a human subject; testing for minimal residual disease (MRD); determining the likelihood of recurrence based on the MRD test; and generating a schedule for one or more additional MRD tests based on the determination of the likelihood of recurrence.
2. The method of claim 1, wherein the biological molecules are one or more of: DNA, methylated DNA, RNA, methylated RNA, proteins, and peptides.
3. The method of claim 1, wherein testing for MRD comprises combining a plurality of nucleic acid molecules derived from a subject with a solution including an amount of methyl binding domain (MBD) proteins to produce a nucleic acid-MBD protein solution; and performing a plurality of washes of the nucleic acid-MBD protein solution with a salt solution to produce a number of nucleic acid fractions, individual nucleic acid fractions having a threshold number of methylated cytosines in regions of the plurality of nucleic acids having at least the threshold cytosine-guanine content.
4. The method of claim 3, wherein a wash of the plurality of washes is performed with a solution having a concentration of sodium chloride (NaCl) and produces a nucleic acid fraction of the number of nucleic acid fractions having a range of binding strengths to MBD proteins.
5. The method of claim 3, comprising: determining that a first nucleic acid fraction is associated with a first partition of a plurality of partitions of nucleic acids, the first partition corresponding to a first range of binding strengths to MBD proteins; attaching a first molecular barcode to nucleic acids of the first nucleic acid fraction, the first molecular barcode being included in a first set of molecular barcodes associated with the first partition; determining that a second nucleic acid fraction is associated with a second partition of the plurality of partitions of nucleic acids, the second partition corresponding to a second range of binding energies to MBD proteins different from the first range of binding strengths to MBD proteins; and attaching a second molecular barcode to nucleic acids of the second nucleic acid fraction, the second molecular barcode being included in a second set of molecular barcodes associated with the second partition.
6. The method of claim 3, comprising: combining at least a portion of the number of nucleic acid fractions with an amount of restriction enzyme that cleaves molecules with one or more unmethylated cytosines to produce at least a portion of the plurality of samples used to produce the sequencing reads; wherein the threshold amount of methylated cytosines corresponds to a minimum frequency of methylated cytosines within a region having at least the threshold cytosine-guanine content.
7. The method of claim 3, comprising: combining at least a portion of the number of nucleic acid fractions with an amount of a restriction enzyme that cleaves molecules with one or more methylated cytosines to produce at least a portion of the plurality of samples used to produce the sequencing reads; wherein the threshold amount of unmethylated cytosines corresponds to a maximum frequency of methylated cytosines that are not cleaved within a region having at least the threshold cytosine-guanine content.
8. The method of claim 1, wherein testing for MRD comprises sequencing nucleic acid molecules derived from a sample obtained from a subject; analyzing sequence reads derived from the sequencing to identify one or more driver mutations in the nucleic acid molecules; and using information about the presence, absence, or amount of the one or more driver mutations in the nucleic acid molecules to identify a tumor in the subject.
9. The method of any one of claims 3-8, wherein the nucleic acid molecules comprise cell-free DNA.
10. The method of any of the preceding claims, wherein the sample is at least one of blood, serum, plasma or tissue.
11. The method of any of the preceding claims, comprising determination of treatment for the subject.
12. The method of any of the preceding claims, wherein a limit of detection for the model to determine tumor fraction of samples is no greater than 0.05%.
13. The method of any of the preceding claims, wherein the one or more driver mutations comprises a somatic variant detected at a mutant allele frequency (MAF) of no more than 0.05%.
14. The method of any of the preceding claims, wherein the one or more driver mutations comprises a fusion detected at a mutant allele frequency (MAF) of no more than 0.1%.
15. The method of any of the preceding claims, further comprising detecting mutation distributions for each of one or more driver mutations, wherein the mutation distribution for each of the one or more driver mutations is detected with a correlation of at least 0.99 to a mutation distribution of the driver mutation detected in a cohort of the subject by tissue genotyping.
16. The method of any of the preceding claims, wherein the method detects the tumor in the subject with a sensitivity of at least 85%, a specificity of at least 99%, and a diagnostic accuracy of at least 99%.
17. The method of any of the preceding claims, comprising identifying circulating tumor DNA (ctDNA) and one or more driver mutations in the ctDNA.
18. A method comprising: obtaining, by a computing system having one or more hardware processors and memory, testing sequence data from a subject, the testing sequence data including testing sequencing reads derived from a sample of the subject; analyzing, by the computing system, the testing sequencing reads to determine a first quantitative measure derived from the testing sequencing reads to genomic regions of a reference genome; analyzing, by the computing system, the testing sequencing reads to determine a second quantitative measure derived from the testing sequencing reads to genomic regions of a reference genome; determining, by the computing system, a metric based on the first quantitative measure and the second quantitative measure; generating, by the computing system, an input vector that includes the metric; and determining, by the computing system, an indication of cancer status in the subject by providing the input vector to a model that implements one or more machine learning techniques to generate indications of cancer status in subjects, the model including weights for individual classification regions of a plurality of classification regions and at least a portion of the weights of the individual classification regions being different from one another.
19. The method of claim 18, wherein the individual testing sequencing reads include a nucleotide sequence corresponding to a fragment of a nucleic acid included in the sample and the individual testing sequencing reads correspond to molecules having a threshold amount of methylated cytosines included in regions of the nucleotide sequence having at least the threshold cytosine-guanine content; the first quantitative measure is derived from the testing sequencing reads that correspond to individual classification regions of a plurality of classification regions, at least a portion of the individual classification regions of the plurality of classification regions corresponding to genomic regions of a reference genome that have the threshold amount of methylated cytosines in subjects in which cancer is detected and that have at least the threshold cytosine-guanine content; and the second quantitative measure is derived from the testing sequencing reads that correspond to individual control regions of a plurality of control regions, individual control regions of the plurality of control regions corresponding to additional genomic regions of the reference genome that have at least the threshold cytosine-guanine content and that have at least the threshold amount of methylated cytosines in subjects in which cancer is detected and in additional subjects in which cancer is not detected.
20. The method of claim 18 or 19, comprising: obtaining, by the computing system having one or more hardware processors and memory, training sequence data including training sequencing reads derived from a plurality of samples of a plurality of training subjects, individual training sequencing reads including a nucleotide sequence corresponding to a fragment of a nucleic acid included in a sample of the plurality of samples and individual training sequencing reads corresponding to molecules having a threshold amount of methylated cytosines included in regions of the nucleotide sequence having at least a threshold cytosine-guanine content; analyzing, by the computing system, the training sequencing reads to determine an additional first quantitative measure derived from the training sequencing reads that corresponds to individual classification regions of the plurality of classification regions; analyzing, by the computing system, the training sequencing reads to determine an additional second quantitative measure derived from the training sequencing reads that correspond to a plurality of control regions; determining, by the computing system, an additional metric for the individual classification regions of the plurality of classification regions based on the additional first quantitative measure for the individual classification regions and the additional second quantitative measure for the plurality of control regions; generating, by the computing system, training data that includes the additional metric for the individual classification regions of the plurality of classification regions for the training sequence reads from samples of the plurality of training subjects; and implementing, by the computing system and using the training data, one or more machine learning algorithms to generate the model to determine the indications of cancer status in subjects based on amounts of methylated cytosines in at least a portion of the plurality of classification regions.
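
The computation recited in claims 18 to 20 can be pictured with a minimal sketch. The claims do not say how the metric is formed from the two quantitative measures or which machine learning technique is used, so everything specific here is an assumption made for illustration: the metric is taken to be the read count in each classification region divided by the mean read count across control regions, the model is taken to be a logistic function over per-region weights, and the names `region_metrics` and `cancer_score` are hypothetical.

```python
import math
from typing import Dict, List


def region_metrics(class_counts: Dict[str, int],
                   control_counts: Dict[str, int]) -> List[float]:
    """Build the input vector: one metric per classification region.

    Assumption: metric = classification-region read count divided by the
    mean control-region read count. The claims only require a metric
    "based on" the two quantitative measures.
    """
    control_mean = sum(control_counts.values()) / len(control_counts)
    if control_mean <= 0:
        raise ValueError("control regions contain no reads")
    # Sort region names so the vector order is stable across samples.
    return [class_counts[region] / control_mean
            for region in sorted(class_counts)]


def cancer_score(metrics: List[float],
                 weights: List[float],
                 bias: float) -> float:
    """Apply per-region weights and squash to a 0..1 score.

    Assumption: a logistic model stands in for the unspecified
    "one or more machine learning techniques".
    """
    if len(metrics) != len(weights):
        raise ValueError("one weight is needed per classification region")
    z = bias + sum(w * m for w, m in zip(weights, metrics))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical read counts for two classification and two control regions.
vector = region_metrics({"regionA": 30, "regionB": 10},
                        {"control1": 20, "control2": 20})
score = cancer_score(vector, weights=[2.0, -1.0], bias=-2.5)
```

Normalizing by control regions, which are methylated in subjects with and without detected cancer, is one way to make the metric comparable between samples with different sequencing depth. The differing weights mirror the claim language that at least some region weights differ from one another.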

Description

METHODS FOR DETERMINING SURVEILLANCE AND THERAPY FOR DISEASES

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. provisional patent application no. 63/511,082 filed June 29, 2023, which is incorporated by reference herein in its entirety.

BACKGROUND

[0002] Cancer is a major cause of disease worldwide. Each year, tens of millions of people are diagnosed with cancer around the world, and more than half of the patients eventually die from it. In many countries, cancer ranks as the second most common cause of death, following cardiovascular diseases. Early detection is associated with improved outcomes for many cancers.

[0003] To detect cancer, several screening tests are available. A physical exam and history survey general signs of health, including checking for signs of disease, such as lumps or other unusual physical symptoms. A history of a patient’s health habits and past illnesses and treatments will also be taken. Laboratory tests are another type of screening test and may include medical procedures to procure samples of tissue, blood, urine, or other substances in the body before conducting laboratory testing. Imaging procedures screen for cancer by generating visual representations of areas inside the body. Genetic tests detect certain deleterious gene mutations linked to some types of cancer. Genetic testing is particularly useful for a number of diagnostic methods.

SUMMARY OF INVENTION

[0004] Described herein is a method, including: determining a state of biological molecules obtained from a sample derived from a human subject, testing for minimal residual disease (MRD), determining the likelihood of recurrence based on the MRD test, and generating a schedule for one or more additional MRD tests based on the determination of the likelihood of recurrence. In other embodiments, the biological molecules are one or more of DNA, methylated DNA, RNA, methylated RNA, proteins, and peptides.
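
The scheduling step of the method just described can be sketched as follows. The text does not state how a likelihood of recurrence maps to a test schedule, so the cutoffs (0.5 and 0.1), the intervals (30, 90, and 180 days), and the names `retest_interval_days` and `mrd_schedule` are all hypothetical placeholders chosen only to show the shape of such a rule.

```python
from datetime import date, timedelta
from typing import List


def retest_interval_days(recurrence_likelihood: float) -> int:
    """Map a recurrence likelihood in [0, 1] to days between MRD tests.

    Assumption: the cutoffs and intervals below are illustrative
    placeholders, not values taken from the source.
    """
    if not 0.0 <= recurrence_likelihood <= 1.0:
        raise ValueError("likelihood must be between 0 and 1")
    if recurrence_likelihood >= 0.5:
        return 30   # higher likelihood: test more often
    if recurrence_likelihood >= 0.1:
        return 90
    return 180      # lower likelihood: test less often


def mrd_schedule(last_test: date,
                 recurrence_likelihood: float,
                 n_tests: int = 3) -> List[date]:
    """Return dates for the next n_tests additional MRD tests."""
    step = timedelta(days=retest_interval_days(recurrence_likelihood))
    return [last_test + step * (i + 1) for i in range(n_tests)]


# Hypothetical subject with a 0.6 likelihood of recurrence.
upcoming = mrd_schedule(date(2024, 1, 1), 0.6, n_tests=2)
```

A step function is the simplest choice here. A continuous mapping, or one that also depends on cancer type or time since treatment, would fit the same interface.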
In other embodiments, testing for MRD includes combining a plurality of nucleic acid molecules derived from a subject with a solution including an amount of methyl binding domain (MBD) proteins to produce a nucleic acid-MBD protein solution; and performing a plurality of washes of the nucleic acid-MBD protein solution with a salt solution to produce a number of nucleic acid fractions, individual nucleic acid fractions having a threshold number of methylated cytosines in regions of the plurality of nucleic acids having at least the threshold cytosine-guanine content. In other embodiments, a wash of the plurality of washes is performed with a solution having a concentration of sodium chloride (NaCl) and produces a nucleic acid fraction of the number of nucleic acid fractions having a range of binding strengths to MBD proteins. In other embodiments, the method includes determining that a first nucleic acid fraction is associated with a first partition of a plurality of partitions of nucleic acids, the first partition corresponding to a first range of binding strengths to MBD proteins, attaching a first molecular barcode to nucleic acids of the first nucleic acid fraction, the first molecular barcode being included in a first set of molecular barcodes associated with the first partition, determining that a second nucleic acid fraction is associated with a second partition of the plurality of partitions of nucleic acids, the second partition corresponding to a second range of binding energies to MBD proteins different from the first range of binding strengths to MBD proteins, and attaching a second molecular barcode to nucleic acids of the second nucleic acid fraction, the second molecular barcode being included in a second set of molecular barcodes associated with the second partition.
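
The partition-specific barcoding described above can be sketched in a few lines. The sketch assumes that the NaCl concentration at which a fraction elutes serves as a proxy for its range of binding strengths to MBD proteins. The partition names, concentration boundaries, barcode sequences, and the functions `partition_for` and `tag_fraction` are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Partition:
    name: str
    min_nacl_mm: float            # inclusive lower bound (mM NaCl)
    max_nacl_mm: float            # exclusive upper bound (mM NaCl)
    barcodes: Tuple[str, ...]     # barcode set reserved for this partition


# Assumption: three partitions with illustrative, non-overlapping salt
# ranges and disjoint barcode sets.
PARTITIONS = (
    Partition("hypo", 0.0, 500.0, ("ACGT", "TGCA")),
    Partition("intermediate", 500.0, 1000.0, ("GATC", "CTAG")),
    Partition("hyper", 1000.0, float("inf"), ("AAGG", "CCTT")),
)


def partition_for(elution_nacl_mm: float) -> Partition:
    """Find the partition whose salt range contains the elution point."""
    for partition in PARTITIONS:
        if partition.min_nacl_mm <= elution_nacl_mm < partition.max_nacl_mm:
            return partition
    raise ValueError("no partition covers this NaCl concentration")


def tag_fraction(molecules: List[str],
                 elution_nacl_mm: float) -> List[Tuple[str, str]]:
    """Pair each molecule in a fraction with a barcode from its partition."""
    partition = partition_for(elution_nacl_mm)
    codes = partition.barcodes
    # Cycle through the partition's barcode set in order.
    return [(codes[i % len(codes)], molecule)
            for i, molecule in enumerate(molecules)]


# Hypothetical fraction eluted at 750 mM NaCl.
tagged = tag_fraction(["mol1", "mol2", "mol3"], 750.0)
```

Because the barcode sets are disjoint, the partition of origin can be recovered from a sequencing read after fractions are pooled.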
In other embodiments, the method includes combining at least a portion of the number of nucleic acid fractions with an amount of restriction enzyme that cleaves molecules with one or more unmethylated cytosines to produce at least a portion of the plurality of samples used to produce the sequencing reads, wherein the threshold amount of methylated cytosines corresponds to a minimum frequency of methylated cytosines within a region having at least the threshold cytosine-guanine content.

[0005] In other embodiments, the method includes combining at least a portion of the number of nucleic acid fractions with an amount of a restriction enzyme that cleaves molecules with one or more methylated cytosines to produce at least a portion of the plurality of samples used to produce the sequencing reads, wherein the threshold amount of unmethylated cytosines corresponds to a maximum frequency of methylated cytosines that are not cleaved within a region having at least the threshold cytosine-guanine content. In other embodiments, testing for MRD includes sequencing nucleic acid molecules derived from a sample obtained from a subject, analyzing sequence reads derived from the seq