CN-114724627-B - Methods and processes for non-invasive assessment of genetic variation
Abstract
Provided herein are methods and processes, as well as operations, systems, devices, and apparatuses, for non-invasive assessment of genetic variation.
Inventors
- Sung K. Kim
- G. Hannum
- J. Geis
- C. Deciu
Assignees
- Sequenom, Inc.
Dates
- Publication Date
- 20260508
- Application Date
- 20140620
- Priority Date
- 20130621
Claims (18)
- 1. A system comprising one or more microprocessors and a memory, wherein the memory comprises instructions executable by the one or more microprocessors, and wherein the instructions executable by the one or more microprocessors are for: (a) obtaining counts of sequence reads mapped to portions of a reference genome, wherein the sequence reads are reads of circulating cell-free (CCF) nucleic acid from a test sample from a pregnant female, (b) selecting a subset of the portions, thereby providing a subset of counts, wherein (i) the portions are selected according to portions to which an increased number of reads from fetal nucleic acid are mapped, and (ii) the portions to which an increased number of reads from fetal nucleic acid are mapped are determined according to a ratio of X to Y, where X is the number of reads from CCF fragments shorter than a first selected fragment length and Y is the number of reads from CCF fragments shorter than a second selected fragment length, and (c) assessing the fraction of fetal nucleic acid in the test sample based on the subset of counts.
- 2. The system of claim 1, wherein the ratio is an average ratio of a plurality of samples.
- 3. The system of claim 2, wherein the portions selected are those whose average ratio is greater than the average ratio averaged over the portions.
- 4. The system of claim 1, wherein the first selected fragment length is about 140 to about 160 bases and the second selected fragment length is about 500 to about 700 bases.
- 5. The system of claim 4, wherein the first selected fragment length is about 150 bases and the second selected fragment length is about 600 bases.
- 6. The system of any one of claims 1 to 5, wherein the count is a normalized count.
- 7. The system of claim 6, wherein the count is normalized to guanine-cytosine (GC) content.
- 8. The system of any one of claims 1 to 7, wherein the portions in the subset are portions of one or more autosomes.
- 9. The system of claim 8, wherein the portions in the subset are portions of one or more euploid chromosomes.
- 10. A non-transitory computer-readable storage medium containing instructions that, when executed, cause a microprocessor to: (a) obtain counts of sequence reads mapped to portions of a reference genome, wherein the sequence reads are reads of circulating cell-free (CCF) nucleic acid from a test sample from a pregnant female, (b) select a subset of the portions, thereby providing a subset of counts, wherein (i) the portions are selected according to portions to which an increased number of reads from fetal nucleic acid are mapped, and (ii) the portions to which an increased number of reads from fetal nucleic acid are mapped are determined according to a ratio of X to Y, where X is the number of reads from CCF fragments shorter than a first selected fragment length and Y is the number of reads from CCF fragments shorter than a second selected fragment length, and (c) assess the fraction of fetal nucleic acid in the test sample based on the subset of counts.
- 11. The non-transitory computer-readable storage medium of claim 10, wherein the ratio is an average ratio of a plurality of samples.
- 12. The non-transitory computer-readable storage medium of claim 11, wherein the portions selected are those whose average ratio is greater than the average ratio averaged over the portions.
- 13. The non-transitory computer-readable storage medium of claim 10, wherein the first selected fragment length is about 140 to about 160 bases and the second selected fragment length is about 500 to about 700 bases.
- 14. The non-transitory computer-readable storage medium of claim 13, wherein the first selected fragment length is about 150 bases and the second selected fragment length is about 600 bases.
- 15. The non-transitory computer-readable storage medium of any one of claims 10 to 14, wherein the count is a normalized count.
- 16. The non-transitory computer-readable storage medium of claim 15, wherein the count is normalized to guanine-cytosine (GC) content.
- 17. The non-transitory computer-readable storage medium of any one of claims 10 to 16, wherein the portions in the subset are portions of one or more autosomes.
- 18. The non-transitory computer-readable storage medium of claim 17, wherein the portions in the subset are portions of one or more euploid chromosomes.
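The portion-selection step recited in claims 1 to 5 can be illustrated with a minimal sketch. It is not the patentee's implementation: the function names (`fragment_ratio`, `select_portions`), the input layout (one dictionary per sample mapping a portion label to the fragment lengths of its reads), and the example data are all hypothetical. It assumes cutoffs of 150 and 600 bases, as in claim 5, and selects portions whose ratio averaged over samples exceeds the mean of those per-portion averages, as in claims 2 and 3. How the fetal fraction is then assessed from the selected counts is not specified by the claims and is not shown.

```python
from statistics import mean


def fragment_ratio(lengths, first_len=150, second_len=600):
    """Return X / Y for one portion of one sample.

    X = number of reads from fragments shorter than first_len,
    Y = number of reads from fragments shorter than second_len.
    Returns 0.0 when Y is zero (an assumption made for this sketch).
    """
    x = sum(1 for n in lengths if n < first_len)
    y = sum(1 for n in lengths if n < second_len)
    return x / y if y else 0.0


def select_portions(samples, first_len=150, second_len=600):
    """Select portions whose average ratio across samples is greater
    than the mean of all per-portion average ratios.

    samples: non-empty list of dicts, each mapping a portion label to
    a list of fragment lengths (hypothetical layout).
    """
    portions = sorted(set().union(*samples))
    average_ratio = {
        p: mean(
            fragment_ratio(s.get(p, []), first_len, second_len)
            for s in samples
        )
        for p in portions
    }
    overall = mean(average_ratio.values())
    return [p for p in portions if average_ratio[p] > overall]


# Hypothetical toy data: two samples, two portions.
example_samples = [
    {"chr1:0": [120, 140, 170, 300], "chr1:1": [160, 180, 200, 650]},
    {"chr1:0": [130, 145, 149, 400], "chr1:1": [140, 190, 210, 220]},
]
selected = select_portions(example_samples)
print(selected)
```

With this toy data the per-portion average ratios are 0.625 and 0.125, so only `chr1:0` lies above their mean of 0.375 and is selected. Counts from the selected portions would then form the subset of counts used in step (c).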
Description
Methods and processes for non-invasive assessment of genetic variation
Related patent application
The present patent application claims the benefit of U.S. provisional patent application 61/838,048, filed on June 21, 2013, entitled "METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS", naming Sung K. Kim et al. as inventors, and designated by docket No. SEQ-6071-PV. The entire contents of the aforementioned patent application are incorporated herein by reference, including the text, tables, and figures thereof.
Field
The technology provided herein relates in part to methods, processes, and apparatus for non-invasive assessment of genetic variation.
Background
Genetic information of living organisms (e.g., animals, plants, and microorganisms) and other forms of replicating genetic information (e.g., viruses) is encoded in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Genetic information is a sequence of nucleotides or modified nucleotides representing the primary structure of chemical or hypothetical nucleic acids. The complete human genome comprises about 30,000 genes located on twenty-four (24) chromosomes (see The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene encodes a specific protein that, after expression by transcription and translation, fulfills a specific biochemical function in living cells.
Many medical conditions result from one or more genetic variations. Certain genetic variations cause medical conditions including, for example, hemophilia, thalassemia, Duchenne Muscular Dystrophy (DMD), Huntington's Disease (HD), Alzheimer's disease, and Cystic Fibrosis (CF) (Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993). Such genetic diseases may be caused by the addition, substitution, or deletion of a single nucleotide in the DNA of a specific gene.
Some birth defects are caused by chromosomal abnormalities (also known as aneuploidies), such as trisomy 21 (Down's syndrome), trisomy 13 (Patau syndrome), trisomy 18 (Edwards syndrome), monosomy X (Turner's syndrome), and certain sex chromosome aneuploidies such as Klinefelter's syndrome (XXY). Another genetic variation is fetal gender, which can generally be determined based on the sex chromosomes X and Y. Some genetic variations predispose an individual to, or cause, any of a number of diseases, such as diabetes, arteriosclerosis, obesity, various autoimmune diseases, and cancers (e.g., colorectal cancer, breast cancer, ovarian cancer, lung cancer).
Identification of one or more genetic variations or changes may be useful in diagnosing a particular medical condition, or in determining a predisposition to a particular medical condition. Identification of genetic variations can facilitate medical decisions and/or the use of beneficial medical procedures. In certain embodiments, the identification of one or more genetic variations or changes involves analysis of cell-free DNA. Cell-free DNA (CF-DNA) consists of DNA fragments that originate from cell death and circulate in peripheral blood. High concentrations of CF-DNA can be indicative of certain clinical conditions such as cancer, trauma, burns, myocardial infarction, stroke, sepsis, infection, and other diseases. In addition, cell-free fetal DNA (CFF-DNA) can be detected in the maternal bloodstream and used for a variety of non-invasive prenatal diagnoses.
The presence of fetal nucleic acid in maternal plasma allows for non-invasive prenatal diagnosis by analysis of maternal blood samples. For example, quantitative abnormalities of fetal DNA in maternal plasma can be associated with a variety of pregnancy-related disorders, including preeclampsia, preterm labor, antepartum hemorrhage, invasive placentation, fetal Down syndrome, and other fetal chromosomal aneuploidies.
Thus, analysis of fetal nucleic acid in maternal plasma can be a useful mechanism for monitoring the health of mother and fetus.
Summary
In certain aspects, provided herein is a method of assessing the fraction of fetal nucleic acid in a test sample from a pregnant female, the method comprising (a) obtaining counts of sequence reads mapped to portions of a reference genome, wherein the sequence reads are reads of circulating cell-free nucleic acid from the test sample from the pregnant female, (b) using a microprocessor, weighting (i) the counts of sequence reads mapped to each portion, or (ii) another portion-specific parameter, to a portion-specific fraction of fetal nucleic acid according to a weighting factor independently associated with each portion, thereby providing portion-specific fetal fraction estimates according to the weighting factors, wherein each weighting factor has been determined from a fitted relation, for each portion, between (i) the fraction of fetal nucleic acid of each of a plurality of samples and (ii) the counts of sequence reads mapped to each portion, or the other portion-specific parameter, for the plurality of samples, and (c) assessing the fraction of fetal nucleic acid of the test