EP-4118652-B1 - METHOD FOR THE ANALYSIS OF GENETIC MATERIAL

Inventors

  • VERDYCK, PIETER

Dates

Publication Date: 2026-05-06
Application Date: 2021-03-09

Claims (15)

  1. A computer-implemented method for the analysis of genetic material in a subject, said method comprising:
     - obtaining unphased genotype information of polymorphic variants of a first and second parent of the subject;
     - obtaining the genomic location of the polymorphic variants;
     - selection of the polymorphic variants based on one or more of the following criteria:
       - polymorphic variants for which the first and second parent are homozygous or hemizygous for a different allele (category 1 polymorphic variants);
       - polymorphic variants for which the first parent is homozygous for a specific allele and the second parent is heterozygous for said specific allele (category 2 polymorphic variants); or
       - polymorphic variants for which the second parent is homozygous for a specific allele and the first parent is heterozygous for said specific allele (category 3 polymorphic variants);
     - obtaining the allele frequency (AF) values for said selected polymorphic variants in genetic material of the subject;
     - selection of one allele per polymorphic variant and subcategorization of its corresponding AF value of the subject in one of the following subcategories:
       - AF values of the category 1 polymorphic variants, representing the AF values for alleles present in homozygous or hemizygous state in the first parent (subcategory 1A);
       - AF values of the category 1 polymorphic variants, representing the AF values for alleles present in homozygous or hemizygous state in the second parent (subcategory 1B);
       - AF values of the category 2 polymorphic variants, representing the AF values for alleles present in homozygous or hemizygous state in the first parent (subcategory 2A);
       - AF values of the category 2 polymorphic variants, representing the AF values for alleles heterozygous in the second parent and absent in the first parent (subcategory 2B);
       - AF values of the category 3 polymorphic variants, representing the AF values for alleles present in homozygous or hemizygous state in the second parent (subcategory 3A); or
       - AF values of the category 3 polymorphic variants, representing the AF values for alleles heterozygous in the first parent and absent in the second parent (subcategory 3B);
     - calculation of the mean AF values, the trimmed mean AF values or the median AF values of the polymorphic variants for each of the given subcategories, wherein the polymorphic variants are located between two genomic locations on a chromosome; and
     - evaluating whether a genetic anomaly is present in the genetic material of the subject based on the AF values of the polymorphic variants in one or more of the subcategories and the genomic location of said polymorphic variants;
     and wherein a genetic anomaly is present in the genetic material of the subject when the AF value deviates from 0.5; in particular when the AF value deviates from 0.5 with a value of at least and about 0.025; more in particular when the AF value deviates from 0.5 with a value of at least and about 0.045.
  2. The method according to claim 1, further comprising:
     - calculation of the difference between the median AF values, the mean AF values or the trimmed mean AF values of subcategories 1A and 1B, or between the median AF values, the mean AF values or the trimmed mean AF values of subcategories 2A and 2B, or between the median AF values, the mean AF values or the trimmed mean AF values of subcategories 3A and 3B, said difference being indicated as 'delta AF';
     - evaluating whether a genetic anomaly is present in the genetic material of the subject based on the 'delta AF' values observed between said genomic locations;
     and wherein a genetic anomaly is present when the delta AF value deviates from 0; in particular when the delta AF value deviates from 0 with a value of at least and about 0.05; more in particular when the delta AF value deviates from 0 with a value of at least and about 0.09.
  3. The method according to claim 1 or 2, wherein polymorphic variants or SNPs that are located less than 50 kb, preferably less than 20 kb, from each other are removed from further analysis.
  4. The method according to any of the preceding claims wherein delta AF values are used to calculate a value for the parental contribution between said genomic locations and evaluating whether a genetic anomaly is present in the genetic material of the subject based on said value for parental contribution observed between said genomic locations.
  5. The method according to claim 4, wherein calculating a value for the parental contribution between said genomic locations, and in particular a value for the percentage parental contribution (%Mat or %Pat), is based on a second order generalized linear model between the delta AF values and the percentage parental contribution (%Mat or %Pat) across said genomic locations; and wherein a parental contribution deviating from 50%, in particular a deviation of at least and about 3%, is indicative of a chromosomal anomaly.
  6. The method according to claim 4, wherein a %Mat or %Pat between and about 44.4% and 55.6% is indicative of a normal disomy; wherein a %Mat or %Pat between and about 63.6% and 72.7% is indicative of a trisomy; and wherein a %Mat or %Pat between and about 0% and 3.3% is indicative of a monosomy.
  7. The method according to any of the preceding claims wherein the selected allele per polymorphic variant is an allele with a specific feature, said feature selected from the A allele, the B allele, the allele with the higher allele frequency in a given population, the allele with the lower allele frequency in a given population, a reference allele in a given reference genome, the allele present in homozygous state in parent 1, the allele present in homozygous state in parent 2, the allele present in heterozygous state in parent 1 but absent in parent 2, or the allele present in heterozygous state in parent 2 but absent in parent 1; preferably wherein the selected allele per polymorphic variant is the B allele; even more preferably wherein the selected allele per polymorphic variant is the B allele comprising a single nucleotide polymorphism (SNP).
  8. The method according to any of the preceding claims wherein the selected AF values of the polymorphic variants of the subcategories 2A, 2B, 3A and/or 3B are converted into discrete genotype calls and wherein it is evaluated whether homozygous or heterozygous allele frequency values are underrepresented or overrepresented between two particular genomic locations.
  9. The method according to any of the preceding claims wherein the genetic material of the subject is isolated from a sample comprising a low amount of genetic material of said subject; such as a sample comprising only one or few cells of said subject; or a plasma sample obtained from a mother pregnant with said subject.
  10. The method of any of the preceding claims, wherein the genetic anomaly comprises a numerical or structural chromosomal abnormality present in all (non-mosaic) or only a part of the biopsied cells (mosaic); in particular a numerical or structural chromosomal abnormality selected from a monosomy, uniparental disomy, trisomy, tetrasomy, a duplication, a deletion, respectively a mosaic monosomy, mosaic disomy, mosaic trisomy, mosaic tetrasomy, a mosaic tandem duplication, a mosaic deletion and combinations thereof.
  11. The method according to claim 10, wherein when the delta AF value deviates from 0 but stays below the threshold of about 0.09 (in particular below 0.0924), the value is considered normal (normal disomy); when the delta AF value deviates from 0 above a threshold of about 0.2 (in particular above 0.234) and an increased copy number is observed, the value indicates a full trisomy or duplication; and when the delta AF value deviates from 0 between both threshold values and an increased copy number is observed, the sample is categorized as mosaic trisomy/disomy or mosaic duplication.
  12. The method according to claim 10, wherein when the delta AF value deviates from 0 but stays below the threshold of about 0.09 (in particular below 0.0924), the value is considered normal (normal disomy); when the delta AF value deviates from 0 above a threshold of about 0.9 and a decreased copy number is observed, the value indicates a full monosomy or deletion; and when the delta AF value deviates from 0 between both threshold values and a decreased copy number is observed, the sample is categorized as mosaic monosomy/disomy or mosaic deletion.
  13. The method of any of the preceding claims, wherein the polymorphic variants are selected from single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs); preferably selected from single nucleotide polymorphisms.
  14. A computer program product which is capable, when executed on a processing engine, of performing the method of any of the preceding claims.
  15. A non-transitory machine-readable storage medium storing the computer program product of claim 14.
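
The categorization and delta AF steps of claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes biallelic variants with parental genotypes written as 'AA', 'AB' or 'BB', the B allele as the selected allele (the preferred option in claim 7), the median as the summary statistic, and the 0.09 delta AF threshold named in claim 2. Hemizygous genotypes are not handled. The function names, the input tuple layout and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def subcategory(p1, p2):
    """Return the claim 1 subcategory of the B allele for one variant.

    p1, p2: unphased genotypes of the first and second parent
    ('AA', 'AB' or 'BB'). Returns None when the variant is not selected.
    """
    p1 = "".join(sorted(p1))  # unphased: 'BA' and 'AB' are the same genotype
    p2 = "".join(sorted(p2))
    homozygous = {"AA", "BB"}
    if p1 in homozygous and p2 in homozygous and p1 != p2:
        # Category 1: parents homozygous for different alleles
        return "1A" if p1 == "BB" else "1B"
    if p1 in homozygous and p2 == "AB":
        # Category 2: first parent homozygous, second parent heterozygous
        return "2A" if p1 == "BB" else "2B"
    if p2 in homozygous and p1 == "AB":
        # Category 3: second parent homozygous, first parent heterozygous
        return "3A" if p2 == "BB" else "3B"
    return None


def median_af_by_subcategory(variants, start, end):
    """Median subject B allele frequency per subcategory between two positions.

    variants: iterable of (position, p1_genotype, p2_genotype, subject_baf).
    """
    groups = defaultdict(list)
    for position, p1, p2, baf in variants:
        if start <= position <= end:
            label = subcategory(p1, p2)
            if label is not None:
                groups[label].append(baf)
    return {label: median(values) for label, values in groups.items()}


def delta_af(medians, first="1A", second="1B"):
    """Difference between two subcategory medians ('delta AF' in claim 2)."""
    return medians[first] - medians[second]


# Made-up values resembling a region with an extra copy from the first parent.
variants = [
    (100_000, "BB", "AA", 0.66),
    (200_000, "AA", "BB", 0.34),
    (300_000, "BB", "AA", 0.68),
    (400_000, "AA", "BB", 0.32),
    (500_000, "BB", "AA", 0.64),
    (600_000, "AA", "BB", 0.35),
    (700_000, "AB", "AB", 0.50),  # both parents heterozygous: not selected
]

medians = median_af_by_subcategory(variants, 0, 1_000_000)
delta = delta_af(medians)
anomaly_suspected = abs(delta) >= 0.09  # threshold taken from claim 2
print(medians, round(delta, 3), anomaly_suspected)
```

In this made-up region the 1A and 1B medians move away from 0.5 in opposite directions, so the delta AF is well above the 0.09 threshold. In a balanced disomic region both medians would be expected near 0.5 and the delta AF near 0.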

Description

FIELD OF THE INVENTION

The present invention is directed to a method for the analysis of genetic material in a subject. More in particular, the present invention relates to a method for the analysis of genetic material using unphased genotype information of polymorphic variants of a first and second parent of a subject in combination with the allele frequency of the polymorphic variants in the genetic material of the subject. In a further aspect, the method of the present invention is particularly useful for the analysis of genetic material isolated from a sample comprising a low amount of genetic material and/or detection of low level chromosomal mosaicism.

BACKGROUND TO THE INVENTION

Chromosome anomalies include a wide variety of anomalies such as errors in ploidy, aneuploidy, structural chromosomal rearrangements and uniparental disomy (UPD). Errors in ploidy are defined by the presence of an abnormal number of complete chromosome sets. In humans and other mammals this corresponds to the presence of only one (monoploidy or haploidy) or more than two (3: triploidy, 4: tetraploidy, >2: polyploidy) complete sets of chromosomes instead of two (diploidy) in somatic cells. Aneuploidy refers to chromosomal abnormalities in which too few or too many copies of one or more chromosomes are observed (nullisomy, monosomy, trisomy, tetrasomy, etc.). Structural chromosomal rearrangements are abnormalities in which the structure of one or more chromosomes is altered. These include reciprocal translocations, Robertsonian translocations, deletions, duplications, insertions, inversions, etc. Structural chromosomal rearrangements can be either (apparently) balanced (no loss or gain of genetic material) or unbalanced (loss or gain of genetic material). Last, uniparental disomy (UPD) is caused by the inheritance of two chromosome copies from the same parent which, depending on the chromosome at hand and the parental origin of the chromosomes, can cause disease.

Chromosomal anomalies are very important in medicine as they cause a multitude of disorders and syndromes (Schinzel 2001). In addition, they play a role in infertility and miscarriage. To date many techniques have been developed to detect chromosomal anomalies. They include karyotyping, Fluorescence In Situ Hybridization (FISH), array comparative genome hybridization (aCGH), single nucleotide polymorphism (SNP) array, (quantitative-) polymerase chain reaction (PCR) and more recently sequencing based methods. These methods all have their own strengths and weaknesses but generally work well on relatively large amounts of genomic DNA or a large number of cells. Diagnosis is much more difficult when analyzing DNA obtained from only a few cells. In this case amplification methods are required prior to molecular analysis, which introduces a bias in locus or allelic representation. By chance a given locus can be over- or underrepresented after amplification. In addition, one allele can be overrepresented compared to the other(s). These phenomena make the data noisier and interpretation more difficult. Chromosomal mosaicism, a phenomenon in which not all analyzed cells are chromosomally identical, also adds to the complexity and can make interpretation more difficult.

We created a new method for improved detection of chromosomal anomalies using data generated by high throughput genotyping technologies such as SNP array and sequencing. When high throughput genotyping technologies such as SNP array or sequencing are used for detection of chromosomal anomalies, the sample of the subject is mostly analyzed by itself without the use of genotype data from both parents of the subject.
By taking into account the genotype data from both parents and using the allele frequency values obtained in the sample of the subject, the methods described in the current invention allow a more accurate detection of aneuploidy in samples with low amounts of target DNA and improve the detection of chromosomal mosaicism. In addition, in some embodiments of the invention, the method allows discrimination between meiotic and mitotic chromosome anomalies. In the present invention, methods have been identified for the analysis of genetic material of a subject without the need for phased genotype data. In particular, typical for the present invention is that only unphased genotype information of polymorphic variants of a first and a second parent of the subject is needed. As a result, the present invention requires only a DNA or cell sample from both parents and the subject, without a phasing reference sample. This is different from other SNP based methods which rely on haplotyping such as siCHILD (e.g. WO2015028576) (Zamani Esteki et al. 2015) and karyomapping (Handyside et al. 2010, Natesan et al. 2014). In the methods already known in the prior art, the use of a DNA sample from a closely related family member is required from each parent to perform reliable aneuploidy detection. This is typically a