EP-4738371-A2 - METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF CHROMOSOME ALTERATIONS
Abstract
Provided herein are methods, processes, systems, machines and apparatuses for non-invasive assessment of chromosome alterations.
Inventors
- KIM, SUNG
- JENSEN, TAYLOR JACOB
- EHRICH, MATHIAS
Assignees
- Sequenom, Inc.
Dates
- Publication Date
- 20260506
- Application Date
- 20141003
Claims (15)
- A method for determining the presence or absence of one or more chromosome alterations in sample nucleic acid comprising: (a) identifying discordant read pairs from paired-end sequence reads obtained from a biological sample of a subject, thereby identifying discordant read mates; (b) characterizing the mappability of a plurality of sequence read subsequences of each discordant read mate aligned to a reference genome, each subsequence being of a different length; (c) selecting a subset of the discordant read mates according to a change in mappability, wherein the subset comprises reads comprising a candidate breakpoint; (d) comparing (i) the number of discordant read mates from the sample associated with a candidate breakpoint and optionally one or more substantially similar breakpoints, to (ii) the number of discordant read mates from a reference associated with the candidate breakpoint and optionally the one or more substantially similar breakpoints, for the discordant read mates in the subset selected in (c), thereby generating a comparison; and (e) determining the presence or absence of one or more chromosome alterations for the sample according to the comparison in (d).
- The method of claim 1, wherein the biological sample (a) comprises circulating, cell-free nucleic acid obtained from plasma or serum and preferably wherein the paired-end sequence reads are reads of circulating, cell-free nucleic acid from a test subject sample, (b) is tumor biopsy, or (c) is buffy coat.
- The method of claim 1, wherein the sequence reads are single-end sequence reads, the sequence reads are discordant reads, and the change in mappability is determined for the discordant reads.
- The method of claim 1, wherein the sequence reads are paired-end sequence reads and the sequence reads are discordant read pairs.
- The method of any one of claims 1 to 3, wherein the one or more chromosome alterations comprise a chromosome translocation, a chromosome deletion, a chromosome inversion, or a heterologous insertion.
- The method of any one of claims 1 to 3, comprising determining the position of one or more candidate breakpoints.
- The method of any one of claims 1 to 4, wherein the characterizing in (b) comprises generating a fitted relationship between the mappability and the length of each of the sequence read subsequences of each discordant read mate.
- The method of any one of claims 1 to 7, wherein the change in mappability comprises a slope of the fitted relationship.
- The method of any one of claims 1 to 8, wherein the selecting in (c) is according to a mappability threshold.
- The method of any one of claims 1 to 9, wherein the method comprises filtering reads, wherein the filtering preferably (a) comprises removing one or both of the discordant read mates and/or (b) is chosen from one or more of (i) removing low quality reads, (ii) removing concordant reads, (iii) removing PCR duplicated reads, (iv) removing reads mapped to mitochondrial DNA, (v) removing reads mapped to repetitive elements, (vi) removing unmappable reads, (vii) removing reads comprising step-wise multiple alignments, (viii) removing reads mapped to a centromere, (ix) removing one or more singleton events, and (x) removing the subset of reads identified in (b) in instances where the number of each of the sequence reads in the subset from the sample are substantially similar to the number of each of the sequence reads in the subset from the reference.
- The method of any one of claims 1 to 10, wherein the location of the breakpoint is identified at a single base resolution.
- The method of any one of claims 1 to 11, wherein the presence of a balanced translocation or an unbalanced translocation is determined in (e).
- The method of any one of claims 1 to 12, wherein determining the presence of the translocation in (e) comprises identifying a substantially greater number of sequence reads from the sample compared to the reference in the comparison of (d).
- The method of any one of claims 1 to 13, wherein a first breakpoint and a second breakpoint are identified according to the comparison in (d), wherein preferably the presence of a chromosome alteration is identified in (e) according to the first and second breakpoints.
- The method of any one of claims 1 to 14, wherein the comparison in (d) comprises determining a level of confidence, wherein determining the level of confidence preferably comprises determining a p value or a z-score.
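The characterizing and selecting steps recited in the claims (mappability of subsequences of different lengths, a fitted relationship, its slope, and a threshold) can be illustrated with a minimal sketch. Nothing below is taken from the application itself: the function names, the use of read prefixes as subsequences, the ordinary least-squares line as the "fitted relationship", the use of the absolute slope, and the example scores are all assumptions chosen for illustration. In practice the per-length mappability values would come from an aligner, which is not modelled here.

```python
from typing import Dict, List, Sequence, Tuple


def subsequence_lengths(read_length: int, min_length: int, step: int) -> List[int]:
    """Return increasing subsequence lengths, always ending at the full read length.

    Assumption: subsequences are prefixes of the read that grow by `step` bases.
    """
    lengths = list(range(min_length, read_length, step))
    lengths.append(read_length)
    return lengths


def fit_slope(lengths: Sequence[float], scores: Sequence[float]) -> float:
    """Slope of an ordinary least-squares line of mappability score against length."""
    n = len(lengths)
    if n < 2 or n != len(scores):
        raise ValueError("need at least two (length, score) pairs of equal count")
    mean_x = sum(lengths) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("subsequence lengths must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, scores))
    return sxy / sxx


def select_candidate_reads(
    profiles: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    slope_threshold: float,
) -> List[str]:
    """Keep reads whose fitted slope magnitude meets the (hypothetical) threshold.

    Assumption: a large change in mappability in either direction flags a read
    as possibly containing a candidate breakpoint.
    """
    selected = []
    for read_id, (lengths, scores) in profiles.items():
        if abs(fit_slope(lengths, scores)) >= slope_threshold:
            selected.append(read_id)
    return selected


if __name__ == "__main__":
    lengths = subsequence_lengths(read_length=50, min_length=20, step=10)
    # Made-up scores: read_A maps equally well at every length, while the
    # score for read_B collapses as the subsequence gets longer.
    profiles = {
        "read_A": (lengths, [60.0, 60.0, 60.0, 60.0]),
        "read_B": (lengths, [60.0, 60.0, 30.0, 0.0]),
    }
    print(select_candidate_reads(profiles, slope_threshold=1.0))
```

The threshold here is applied to the slope only; a real pipeline might also require a minimum goodness of fit or a minimum number of subsequences per read, neither of which is specified in the text above.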
Description
Related Patent Applications
This patent application claims the benefit of U.S. provisional patent application no. 61/887,801 filed on October 7, 2013, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF CHROMOSOME ALTERATIONS, naming Sung Kim, Taylor Jacob Jensen, and Mathias Ehrich as inventors, and designated by attorney docket no. SEQ-6074-PV. The entire content of the foregoing application is incorporated herein by reference, including all text, tables and drawings.
Field
Technology provided herein relates in part to methods, processes, machines and apparatuses for non-invasive assessment of chromosome alterations.
Background
Genetic information of living organisms (e.g., animals, plants and microorganisms) and other forms of replicating genetic information (e.g., viruses) is encoded in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Genetic information is a succession of nucleotides or modified nucleotides representing the primary structure of chemical or hypothetical nucleic acids. In humans, the complete genome contains about 30,000 genes located on twenty-four (24) chromosomes (see The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene encodes a specific protein, which after expression via transcription and translation fulfills a specific biochemical function within a living cell.
Identifying one or more chromosome alterations can lead to diagnosis of, or determining predisposition to, a particular medical condition. Identifying a chromosome alteration can result in facilitating a medical decision and/or employing a helpful medical procedure. In certain embodiments, identification of one or more chromosome alterations involves the analysis of cell-free DNA.
Cell-free DNA (CF-DNA) is composed of DNA fragments that originate from cell death and circulate in peripheral blood. High concentrations of CF-DNA can be indicative of certain clinical conditions such as cancer, trauma, burns, myocardial infarction, stroke, sepsis, infection, and other illnesses. Additionally, cell-free fetal DNA (CFF-DNA) can be detected in the maternal bloodstream and used for various noninvasive prenatal diagnostics.
The presence of fetal nucleic acid in maternal plasma allows for non-invasive prenatal diagnosis through the analysis of a maternal blood sample. For example, quantitative abnormalities of fetal DNA in maternal plasma can be associated with a number of pregnancy-associated disorders and genetic diseases associated with fetal chromosomal alterations. Hence, fetal nucleic acid analysis in maternal plasma can be a useful mechanism for the monitoring of feto-maternal well-being.
Summary
Provided herein, in certain aspects, is a system comprising memory and one or more microprocessors, which memory comprises instructions and which one or more microprocessors are configured to perform, according to the instructions, a process for determining the presence or absence of one or more chromosome alterations in sample nucleic acid, which process comprises (a) characterizing mappability of a plurality of sequence read subsequences for sequence reads, where there are multiple sequence read subsequences for each sequence read, the sequence read subsequences for each sequence read are of different lengths, and the sequence reads are of the sample nucleic acid, (b) identifying a subset of sequence reads for which there is a change in mappability of one or more subsequences, (c) comparing (i) the number of each of the sequence reads in the subset identified in (b) from the sample, to (ii) the number of each of the sequence reads in the subset identified in (b) from a reference, thereby generating a comparison; and (d) determining the presence or absence of one or more chromosome alterations for the sample according to the comparison in (c).
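The comparing and determining steps of the process described above can be sketched as a count comparison between a sample and a reference. This is only an illustration under stated assumptions: the reference is taken to be a set of several reference samples, the level of confidence is expressed as a z-score against their mean and standard deviation, "substantially similar" breakpoints are approximated as positions within a fixed base-pair tolerance on the same chromosome, and the cutoff of 3.0 is arbitrary. All function names, parameters and numbers are hypothetical and are not taken from the application.

```python
from statistics import mean, stdev
from typing import Iterable, Sequence, Tuple

Breakpoint = Tuple[str, int]  # (chromosome name, position); a hypothetical representation


def count_supporting_reads(
    read_breakpoints: Iterable[Breakpoint],
    candidate: Breakpoint,
    tolerance: int,
) -> int:
    """Count reads whose breakpoint lies within `tolerance` bases of the candidate.

    Assumption: positions on the same chromosome within the tolerance are
    treated as the same (substantially similar) breakpoint.
    """
    chrom, pos = candidate
    return sum(
        1 for c, p in read_breakpoints if c == chrom and abs(p - pos) <= tolerance
    )


def z_score(sample_count: float, reference_counts: Sequence[float]) -> float:
    """Z-score of the sample count against the reference count distribution."""
    if len(reference_counts) < 2:
        raise ValueError("need at least two reference counts")
    sd = stdev(reference_counts)
    if sd == 0:
        raise ValueError("reference counts have zero variance")
    return (sample_count - mean(reference_counts)) / sd


def alteration_present(
    sample_count: float,
    reference_counts: Sequence[float],
    z_threshold: float = 3.0,
) -> bool:
    """Call an alteration when the sample count is well above the reference.

    The default threshold is an illustrative assumption, not a recommended value.
    """
    return z_score(sample_count, reference_counts) >= z_threshold


if __name__ == "__main__":
    # Made-up breakpoint positions for reads from one sample.
    sample_reads = [("chr9", 1000), ("chr9", 1002), ("chr9", 5000), ("chr22", 1001)]
    n = count_supporting_reads(sample_reads, candidate=("chr9", 1001), tolerance=2)
    # Made-up counts for the same candidate in five reference samples.
    print(n, alteration_present(n, [0, 1, 2, 1, 1]))
```

Raw counts are compared directly here; a real comparison would likely normalize for sequencing depth before computing any statistic, which this sketch omits.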
Also provided herein, in certain aspects, is a machine comprising memory and one or more microprocessors, which memory comprises instructions and which one or more microprocessors are configured to perform, according to the instructions, a process for determining the presence or absence of one or more chromosome alterations in sample nucleic acid, which process comprises (a) characterizing mappability of a plurality of sequence read subsequences for sequence reads, where there are multiple sequence read subsequences for each sequence read, the sequence read subsequences for each sequence read are of different lengths, and the sequence reads are of the sample nucleic acid, (b) identifying a subset of sequence reads for which there is a change in mappability of one or more subsequences, (c) comparing (i) the number of each of the sequence reads in the subset identified in (b) from the sample, to (ii) the number of each of the sequence reads in the subset identified in (b) from a reference, thereby generating a comparison; and (d) determining the presence or absence of one or more chromosome alterations for the sample according to the comparison in (c). Also pr