EP-3450569-B1 - DNA AMPLIFICATION METHOD

Inventors

GAO, Fangfang
LU, Sijia
REN, Jun

Dates

Publication Date: 20260513
Application Date: 20170426

Claims (14)

A method of amplifying genomic DNA, said method comprising: (a) providing a first reaction mixture, wherein the first reaction mixture comprises a sample containing the genomic DNA, primers, a mixture of nucleotide monomers, and a nucleic acid polymerase, wherein the primers consist of first primers and third primers, wherein the first primers consist of, in a 5' to 3' orientation, a common sequence and a first variable sequence, or the first primers consist of, in a 5' to 3' orientation, a common sequence, a first spacer sequence and a first variable sequence, wherein the first primers are a mixture of primers comprising the same common sequence and different first variable sequences, wherein each first variable sequence consists of a first random sequence and a fixed sequence at its 3' end, wherein the first random sequence is, in a 5' to 3' orientation, sequentially X a1 X a2 ......X an , and X ai (i=1-n) of the first random sequence all belong to a same set, said set is selected from B, or D, or H, or V, wherein B={T, G, C}, D={A, T, G}, H={T, A, C}, V={A, C, G}, wherein X ai represents the i th nucleotide from 5' end of the first random sequence, n is a positive integer selected from 4-16, and wherein the first spacer sequence is Y a1 ......Y am , wherein Y aj (j=1-m) ∈ {A, T, G, C}, wherein Y aj represents the j th nucleotide from 5' end of the first spacer sequence, m is a positive integer selected from 1-3; wherein the third primers consist of, in a 5' to 3' orientation, the common sequence and a third variable sequence, or wherein the third primers consist of, in a 5' to 3' orientation, the common sequence, a third spacer sequence and a third variable sequence, wherein the third primers are a mixture of primers comprising the same common sequence and different third variable sequences, wherein each third variable sequence consists of a third random sequence and a fixed sequence at its 3' end, wherein the third random sequence is, in a 5' to 3' orientation, sequentially X b1 X b2 ......X bn , and X bi (i=1-n) of the third random sequence all belong to a same set, said set is selected from B, or D, or H, or V, wherein B={T, G, C}, D={A, T, G}, H={T, A, C}, V={A, C, G}, and X bi (i=1-n) and X ai (i=1-n) belong to different sets, wherein X bi represents the i th nucleotide from 5' end of the third random sequence, n is a positive integer selected from 4-16, and wherein the third spacer sequence is Y b1 ......Y bm , wherein Y bj (j=1-m) ∈ {A, T, G, C}, wherein Y bj represents the j th nucleotide from 5' end of the third spacer sequence, m is a positive integer selected from 1-3; (b) placing the first reaction mixture in a first thermal cycle program for pre-amplification, to obtain a pre-amplification product; (c) providing a second reaction mixture, wherein said second reaction mixture comprises said pre-amplification product obtained from step (b), a second primer, a mixture of nucleotide monomers, and a nucleic acid polymerase, wherein the second primer comprises or consists of, in a 5' to 3' orientation, a specific sequence and the common sequence; (d) placing the second reaction mixture in a second thermal cycle program for amplification, to obtain an amplification product.
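The primer layout recited in the claim above (common sequence, optional spacer of 1-3 nucleotides, random sequence of 4-16 nucleotides drawn from a single set, fixed sequence at the 3' end) can be illustrated with a short Python sketch. The function names, the use of Python's random module, and the example arguments (n=5, a one-nucleotide spacer, fixed sequence TGGG, SEQ ID NO: 6 as common sequence) are assumptions chosen for illustration only; for simplicity the sketch also gives both primers the same n, which the claims do not require.

```python
import random

# Nucleotide sets exactly as defined in the claim.
NUCLEOTIDE_SETS = {"B": "TGC", "D": "ATG", "H": "TAC", "V": "ACG"}


def make_primer(common, set_name, n, fixed, spacer_len=0, rng=random):
    """Return common + spacer (0-3 nt) + random sequence (n nt) + fixed 3' sequence."""
    if not 4 <= n <= 16:
        raise ValueError("n must be between 4 and 16")
    if not 0 <= spacer_len <= 3:
        raise ValueError("spacer length must be 0 (no spacer) or 1-3")
    # Spacer positions may be any of A, T, G, C.
    spacer = "".join(rng.choice("ATGC") for _ in range(spacer_len))
    # Every position of the random sequence is drawn from the same set.
    random_part = "".join(rng.choice(NUCLEOTIDE_SETS[set_name]) for _ in range(n))
    return common + spacer + random_part + fixed


def make_primer_pair(common, first_set, third_set, n, fixed, spacer_len=0, rng=random):
    """Return one first primer and one third primer; their sets must differ."""
    if first_set == third_set:
        raise ValueError("first and third random sequences must use different sets")
    first = make_primer(common, first_set, n, fixed, spacer_len, rng)
    third = make_primer(common, third_set, n, fixed, spacer_len, rng)
    return first, third


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the illustration repeatable
    first, third = make_primer_pair(
        "GCTCTTCCGATCT", "B", "D", n=5, fixed="TGGG", spacer_len=1, rng=rng
    )
    print(first)
    print(third)
```

Each call returns a single primer, whereas the claimed first and third primers are mixtures; drawing many primers with the same arguments would approximate such a mixture.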
The method of claim 1, (a) wherein X ai (i=1-n) of the first random sequence all belong to set B, X bi (i=1-n) of the third random sequence all belong to set D; or (b) wherein the fixed sequence is individually selected from the group consisting of CCC, AAA, TGGG, GTTT, GGG, TTT, TNTNG or GTGG; or (c) wherein the first variable sequence is selected from X a1 X a2 ......X an TGGG or X a1 X a2 ......X an GTTT, the third variable sequence is selected from X b1 X b2 ......X bn TGGG or X b1 X b2 ......X bn GTTT; or (d) wherein the common sequence is selected from SEQ ID NO: 1 [TTGGTAGTGAGTG], SEQ ID NO: 2 [GAGGTGTGATGGA], SEQ ID NO: 3 [GTGATGGTTGAGGTA], SEQ ID NO: 4 [AGATGTGTATAAGAGACAG], SEQ ID NO: 5 [GTGAGTGATGGTTGAGGTAGTGTGGAG] or SEQ ID NO: 6 [GCTCTTCCGATCT]; or (e) wherein the method further comprises a step of sequencing an amplification product obtained in step (d), wherein the second primer comprises a sequence complementary or identical to part of or whole of a primer used for sequencing; or (f) wherein the second primer comprises a primer mixture having identical common sequences and different specific sequences, said different specific sequences are complementary or identical to part of or whole of different primers in sequencing primer pairs used in a same sequencing, respectively; or (g) wherein the second primer comprises a mixture of sequences set forth in SEQ ID NO: 35 [AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT] and SEQ ID NO: 36 [CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT]; or (h) wherein step (b) enables the variable sequence of the first primers to pair with the genomic DNA and the genomic DNA is amplified to obtain a genomic pre-amplification product, wherein the genomic pre-amplification product comprises the common sequence at its 5' end and a complementary sequence of the common sequence at its 3' end; wherein the first thermal cycle program comprises: (b1) a thermal program capable of opening the DNA double strands to obtain a single-strand DNA template; (b2) a thermal program that enables binding of the first primers and the third primers to the single-strand DNA template; (b3) a thermal program that enables extension of the length of the first primers that binds to the single-strand DNA template under the action of the nucleic acid polymerase, to produce a pre-amplification product; (b4) repeating steps (b1) to (b3) to a designated first cycle number, wherein the designated first cycle number is more than 1; or (i) wherein the step (d) enables the common sequence of the second primer to pair with 3' end of the genomic pre-amplification product and the genomic pre-amplification product is amplified to obtain an extended genomic amplification product; wherein the step (d) comprises: (d1) a thermal program capable of opening DNA double strands; (d2) a thermal program further capable of opening DNA double strands; (d3) a thermal program that enables binding of the second primer to single strand of the genomic pre-amplification product obtained in step (b); (d4) a temperature program that enables extension of the length of the second primer that binds to the single strand of the genomic pre-amplification product, under the action of the nucleic acid polymerase; (d5) repeating steps (d2) to (d4) to a designated second cycle number, wherein the designated second cycle number is more than 1; or (j) further comprising analyzing the amplification product to identify disease- or phenotype-associated sequence features.
The method of claim 1, wherein the first primers comprise GCTCTTCCGATCTY a1 X a1 X a2 X a3 X a4 X a5 TGGG, GCTCTTCCGATCTY a1 X a1 X a2 X a3 X a4 X a5 GTTT, or a combination thereof, the third primers comprise GCTCTTCCGATCTY b1 X b1 X b2 X b3 X b4 X b5 TGGG, GCTCTTCCGATCTY b1 X b1 X b2 X b3 X b4 X b5 GTTT, or a combination thereof, wherein Y a1 ∈ {A, T, G, C}, Y b1 ∈ {A, T, G, C}, said X ai (i=1-5) ∈ {T, G, C}, said X bi (i=1-5) ∈ {A, T, G}.
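Because the claim above fixes every part of the primer except one spacer nucleotide and five random positions, the full set of sequences it covers is small enough to enumerate. The Python sketch below does so; the function name enumerate_pool and the variable names are hypothetical, and the sketch only counts the distinct sequences the wording allows. It says nothing about the proportions in which they would occur in an actual primer mixture.

```python
from itertools import product

COMMON = "GCTCTTCCGATCT"   # common sequence recited in the claim (SEQ ID NO: 6)
FIXED = ("TGGG", "GTTT")   # the two fixed 3' sequences recited in the claim


def enumerate_pool(random_set, n=5, spacer_alphabet="ATGC"):
    """Yield every primer of the form COMMON + Y + X1..Xn + fixed sequence."""
    for y in spacer_alphabet:                       # one spacer nucleotide Y
        for xs in product(random_set, repeat=n):    # n random positions from one set
            for fixed in FIXED:
                yield COMMON + y + "".join(xs) + fixed


first_pool = list(enumerate_pool("TGC"))   # X ai drawn from {T, G, C}
third_pool = list(enumerate_pool("ATG"))   # X bi drawn from {A, T, G}

if __name__ == "__main__":
    # 4 spacer choices x 3**5 random sequences x 2 fixed sequences = 1944 each
    print(len(first_pool), len(third_pool))
```

Every enumerated primer is 23 nucleotides long: 13 for the common sequence, 1 for the spacer, 5 for the random sequence and 4 for the fixed sequence.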
The method of claim 2(e), wherein (a) the common sequence comprises a sequence complementary to or identical to part of or whole of a primer used for sequencing; or (b) the specific sequence of the second primer comprises a sequence complementary to or identical to part of or whole of a primer used for sequencing.
The method of claim 4(b), wherein (a) the specific sequence of the second primer further comprises a sequence complementary to or identical to part of or whole of a capture sequence of a sequencing platform; or (b) the sequence which is comprised in the specific sequence of the second primer and complementary or identical to part of or whole of a primer used for sequencing comprises or consists of SEQ ID NO: 31 [ACACTCTTTCCCTACACGAC], or SEQ ID NO: 32 [GTGACTGGAGTTCAGACGTGT].
The method of claim 5(b), wherein (a) the sequence which is comprised in the specific sequence of the second primer and complementary or identical to part of or whole of a capture sequence of a sequencing platform, comprises or consists of SEQ ID NO: 33 [AATGATACGGCGACCACCGAGATCT], or SEQ ID NO: 34 [CAAGCAGAAGACGGCATACGAGAT]; or (b) the specific sequence of the second primer further comprises a barcode sequence, said barcode sequence is located between the sequence complementary or identical to part of or whole of a capture sequence of a sequencing platform and the sequence complementary or identical to part of or whole of a primer used for sequencing.
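The layout of the second primer described in the preceding claims (capture sequence, optional barcode, sequencing-primer sequence, then the common sequence, in 5' to 3' order) can be checked against the sequences quoted in the claims with a few lines of Python. In this sketch the function name second_primer is hypothetical, and SEQ ID NO: 35 and 36 are written without internal whitespace. The six bases CGTGAT are simply what remains of SEQ ID NO: 36 once the other parts are accounted for; reading them as a barcode is an interpretation, not something the claims state.

```python
# Sequences quoted in the claims.
COMMON = "GCTCTTCCGATCT"                  # SEQ ID NO: 6
SEQ_PRIMER_1 = "ACACTCTTTCCCTACACGAC"     # SEQ ID NO: 31
SEQ_PRIMER_2 = "GTGACTGGAGTTCAGACGTGT"    # SEQ ID NO: 32
CAPTURE_1 = "AATGATACGGCGACCACCGAGATCT"   # SEQ ID NO: 33
CAPTURE_2 = "CAAGCAGAAGACGGCATACGAGAT"    # SEQ ID NO: 34
SEQ_ID_35 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
SEQ_ID_36 = "CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

# Assumption: the bases of SEQ ID NO: 36 not covered by SEQ ID NO: 34, 32 and 6
# are treated here as a barcode.
ASSUMED_BARCODE = "CGTGAT"


def second_primer(capture, sequencing, common, barcode=""):
    """Assemble 5'-capture + barcode + sequencing-primer sequence + common-3'."""
    return capture + barcode + sequencing + common


if __name__ == "__main__":
    p35 = second_primer(CAPTURE_1, SEQ_PRIMER_1, COMMON)
    p36 = second_primer(CAPTURE_2, SEQ_PRIMER_2, COMMON, barcode=ASSUMED_BARCODE)
    print(p35 == SEQ_ID_35)
    print(p36 == SEQ_ID_36)
```

Under this reading, a different barcode passed to second_primer would give another primer with the same layout as SEQ ID NO: 36.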
The method of claim 2(h), wherein when undergoing the first cycle, the DNA double strands in step (b1) are genomic DNA double strands, the thermal program comprises a denaturing reaction at a temperature between 90-95°C for 1-20 minutes and after the first cycle, the thermal program in step (b1) comprises a melting reaction at a temperature between 90-95°C for 3-50 seconds; wherein when after undergoing a second cycle, the pre-amplification product comprises a genomic pre-amplification product comprising the common sequence at its 5' end and a complementary sequence of the common sequence at its 3' end.
The method of claim 2(h), wherein after step (b1) and prior to step (b2), said method does not comprise an additional step of placing the first reaction mixture in a suitable thermal program such that the 3' end and 5' end of the genomic pre-amplification product hybridize to form a hairpin structure.
The method of claim 2(j), wherein the disease- or phenotype-associated sequence features include chromosomal abnormalities, chromosomal translocation, aneuploidy, partial or complete chromosomal deletion or duplication, fetal HLA haplotypes and paternal mutations, or the disease or phenotype is selected from the group consisting of: beta-thalassemia, Down's syndrome, cystic fibrosis, sickle cell disease, Tay-Sachs disease, Fragile X syndrome, spinal muscular atrophy, hemoglobinopathy, Alpha-thalassemia, X-linked diseases (diseases dominated by genes on the X chromosome), spina bifida, anencephaly, congenital heart disease, obesity, diabetes, cancer, fetal sex, and fetal RHD.
The method of claim 2(j), wherein the genomic DNA is derived from a blastomere, blastula trophoblast layer, cultured cells, extracted gDNA or blastula culture medium.
The method of claim 1, wherein the first random sequence and the third random sequence have the same length, or wherein the first random sequence and the third random sequence have different lengths.
The method of claim 1, wherein steps (b) and (d) are as follows: (b) placing the first reaction mixture in a first thermal cycle program, such that the first variable sequence of the first primers and the third variable sequence of the third primers are capable of pairing with the genomic DNA and the genomic DNA is amplified to obtain a genomic pre-amplification product, wherein the genomic pre-amplification product comprises the common sequence at its 5' end and a complementary sequence of the common sequence at its 3' end; wherein the first thermal cycle program comprises: (b1) for the first cycle, reacting at a first denaturing temperature at a temperature between 90-95°C for 1-20 min; for the cycle following the first cycle, reacting at a first denaturing temperature at a temperature between 90-95°C for 3-50 s; (b2) reacting at a first annealing temperature between 10-20°C for 3-60 s, reacting at a second annealing temperature between 20-30°C for 3-50 s, and reacting at a third annealing temperature between 30-50°C for 3-50 s; (b3) reacting at a first extension temperature between 60-80°C for 10 s-15 min; (b4) repeating steps (b1) to (b3) for 2-40 cycles; and (d) placing the second reaction mixture in a second thermal cycle program, such that the common sequence of the second primer is capable of pairing with 3' end of the genomic pre-amplification product and the genomic pre-amplification product is amplified to obtain an extended genomic amplification product, wherein the second thermal cycle program comprises: (d1) reacting at a second denaturing temperature between 90-95°C for 5 s-20 min; (d2) reacting at a second melting temperature between 90-95°C for 3-50 s; (d3) reacting at a fourth annealing temperature between 45-65°C for 3-50 s; (d4) reacting at a second extension temperature between 60-80°C for 10 s-15 min; (d5) repeating steps (d2) to (d4) for 2-40 cycles.
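The temperature and time windows recited in the claim above lend themselves to a simple range check. The Python sketch below encodes the claimed windows as data and tests a candidate program against them. The dictionary keys, the function name out_of_range and, in particular, the example settings are assumptions: the settings are arbitrary values picked inside the claimed windows to exercise the check, not a protocol taken from the patent.

```python
# Claimed windows: step -> (min temp C, max temp C, min seconds, max seconds).
FIRST_PROGRAM_RANGES = {
    "b1_first_cycle": (90, 95, 60, 1200),   # 1-20 min
    "b1_later_cycles": (90, 95, 3, 50),
    "b2_anneal_1": (10, 20, 3, 60),
    "b2_anneal_2": (20, 30, 3, 50),
    "b2_anneal_3": (30, 50, 3, 50),
    "b3_extend": (60, 80, 10, 900),         # 10 s-15 min
}
SECOND_PROGRAM_RANGES = {
    "d1_denature": (90, 95, 5, 1200),       # 5 s-20 min
    "d2_melt": (90, 95, 3, 50),
    "d3_anneal": (45, 65, 3, 50),
    "d4_extend": (60, 80, 10, 900),         # 10 s-15 min
}
CYCLE_RANGE = (2, 40)


def out_of_range(settings, ranges, cycles):
    """Return a list of problems; an empty list means every value is in its window."""
    problems = []
    for step, (t_lo, t_hi, s_lo, s_hi) in ranges.items():
        if step not in settings:
            problems.append(f"{step}: missing")
            continue
        temp, seconds = settings[step]
        if not t_lo <= temp <= t_hi:
            problems.append(f"{step}: {temp} C outside {t_lo}-{t_hi} C")
        if not s_lo <= seconds <= s_hi:
            problems.append(f"{step}: {seconds} s outside {s_lo}-{s_hi} s")
    if not CYCLE_RANGE[0] <= cycles <= CYCLE_RANGE[1]:
        problems.append(f"cycles: {cycles} outside {CYCLE_RANGE[0]}-{CYCLE_RANGE[1]}")
    return problems


# Hypothetical settings (temperature in C, duration in s), for illustration only.
EXAMPLE_FIRST = {
    "b1_first_cycle": (94, 180),
    "b1_later_cycles": (94, 20),
    "b2_anneal_1": (15, 40),
    "b2_anneal_2": (25, 40),
    "b2_anneal_3": (40, 40),
    "b3_extend": (70, 120),
}
EXAMPLE_SECOND = {
    "d1_denature": (94, 30),
    "d2_melt": (94, 20),
    "d3_anneal": (58, 30),
    "d4_extend": (72, 180),
}

if __name__ == "__main__":
    print(out_of_range(EXAMPLE_FIRST, FIRST_PROGRAM_RANGES, cycles=8))
    print(out_of_range(EXAMPLE_SECOND, SECOND_PROGRAM_RANGES, cycles=17))
```

Keeping the windows as plain data separates the claimed limits from any particular instrument settings, so the same check can be reused for both the first and the second thermal cycle program.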
The method of claim 12, wherein the common sequence comprises or consists of SEQ ID NO: 6; X ai (i=1-n) of the first random sequence all belong to D, X bi (i=1-n) of the third random sequence all belong to B.
A kit for amplifying genomic DNA, said kit comprises primers, wherein the primers consist of first primers and third primers, wherein the first primers consist of, in a 5' to 3' orientation, a common sequence and a first variable sequence, or wherein the first primers consist of, in a 5' to 3' orientation, a common sequence, a first spacer sequence and a first variable sequence, wherein the first primers are a mixture of primers comprising the same common sequence and different first variable sequences, wherein each first variable sequence consists of a first random sequence and a fixed sequence at its 3' end, wherein the first random sequence is, in a 5' to 3' orientation, sequentially X a1 X a2 ......X an , and X ai (i=1-n) of the first random sequence all belong to a same set, said set is selected from B, or D, or H, or V, wherein B={T, G, C}, D={A, T, G}, H={T, A, C}, V={A, C, G}, wherein X ai represents the i th nucleotide from 5' end of a first random sequence, n is a positive integer selected from 4-16, and wherein the first spacer sequence is Y a1 ......Y am , wherein Y aj (j=1-m) ∈ {A, T, G, C}, wherein Y aj represents the j th nucleotide from 5' end of the first spacer sequence, m is a positive integer selected from 1-3, wherein the third primers consist of, in a 5' to 3' orientation, the common sequence and a third variable sequence, or wherein the third primers consist of, in a 5' to 3' orientation, the common sequence, a third spacer sequence and a third variable sequence, wherein the third primers are a mixture of primers comprising the same common sequence and different third variable sequences, wherein each third variable sequence consists of a third random sequence and a fixed sequence at its 3' end, wherein the third random sequence is, in a 5' to 3' orientation, sequentially X b1 X b2 ......X bn , and X bi (i=1-n) of the third random sequence all belong to a same set, said set is selected from B, or D, or H, or V, wherein B={T, G, C}, D={A, T, G}, H={T, A, C}, V={A, C, G}, and X bi (i=1-n) and X ai (i=1-n) belong to different sets, wherein X bi represents the i th nucleotide from 5' end of the third random sequence, n is a positive integer selected from 4-16, and wherein the third spacer sequence is Y b1 ......Y bm , wherein Y bj (j=1-m) ∈ {A, T, G, C}, wherein Y bj represents the j th nucleotide from 5' end of the third spacer sequence, m is a positive integer selected from 1-3.

Description

FIELD OF THE INVENTION

The present invention relates to a method of amplifying DNA, in particular a method for amplifying and sequencing single-cell whole-genome DNA.

BACKGROUND

Single-cell whole-genome sequencing is a new technique for amplifying and sequencing the whole genome at the single-cell level. Its principle is to amplify the minute amount of whole-genome DNA isolated from a single cell and, once high coverage of the complete genome has been obtained, to perform high-throughput sequencing. Currently, there are four major types of whole-genome amplification techniques: Primer Extension Preamplification-Polymerase Chain Reaction (referred to as PEP-PCR; for the detailed method see Zhang L, Cui X, Schmitt K, Hubert R, Navidi W, Arnheim N. 1992. Whole genome amplification from a single cell: implications for genetic analysis. Proc Natl Acad Sci U S A. 89(13):5847-51), Degenerate Oligonucleotide-Primed Polymerase Chain Reaction (referred to as DOP-PCR; for the detailed method see Telenius H, Carter NP, Bebb CE, Nordenskjöld M, Ponder BA, Tunnacliffe A. 1992. Degenerate oligonucleotide-primed PCR: general amplification of target DNA by a single degenerate primer. Genomics 13:718-25), Multiple Displacement Amplification (referred to as MDA; for the detailed method see Dean FB, Nelson JR, Giesler TL, Lasken RS. 2001. Rapid amplification of plasmid and phage DNA using phi29 DNA polymerase and multiply-primed rolling circle amplification. Genome Res. 11:1095-99), and Multiple Annealing and Looping Based Amplification Cycles (referred to as MALBAC; for the detailed method see PCT patent application No. WO2012166425).

Gene sequencing technology has gone through three stages of development. The first-generation DNA sequencing technology includes chemical degradation, the dideoxy chain termination method, and various sequencing technologies developed on the basis thereof, the most representative being the chain termination method proposed by Sanger and Coulson in 1975.
The first-generation technology offers high accuracy and long reads, and is so far the only method that can perform "head-to-tail" sequencing, but it is costly and slow and is thus not an ideal sequencing method. The succeeding second- and third-generation sequencing technologies share the characteristic of high throughput and are also known as "next-generation sequencing (NGS)" technologies. The second-generation sequencing technology is represented by pyrosequencing technology, sequencing-by-synthesis (SBS) technology, and sequencing-by-ligation technology. After several years of development, pyrosequencing and sequencing-by-ligation are now rarely used, and the mainstream second-generation sequencing technologies today are sequencing-by-synthesis technology, semiconductor sequencing technology and CG sequencing technology. The third-generation sequencing technology is generally divided into two categories: one is single-molecule fluorescence sequencing, represented by TSMS technology and SMRT technology, and the other is nanopore single-molecule technology. Compared with the previous two generations, the major feature of the third-generation sequencing technology is single-molecule sequencing. Although the third-generation sequencing technology has made some progress, the second-generation sequencing technology remains the current mainstream. The whole-genome sequence amplified using current whole-genome amplification technology cannot be applied directly in second-generation sequencing. Therefore, whether that amplified whole-genome sequence is to be used with sequencing-by-synthesis technology, semiconductor sequencing technology or CG sequencing technology, a library preparation process is required before it is loaded for sequencing.
Each sequencing technology has a corresponding library preparation method. Library preparations for the sequencing-by-synthesis platform fall mainly into two categories: one adds a Y-shaped linker or a stem-loop linker to fragmented DNA after end repair, and the other is transposon technology. Library preparations for the semiconductor sequencing platform also fall into two categories: one adds a linker to fragmented DNA after end repair, and the other is transposon technology. The library preparation process for the CG platform is relatively complex: after end repair, fragmented DNA needs to be subjected to enzymatic digestion and two cyclization processes, which is complicated to operate and time-consuming. When products amplified by the current mainstream amplification methods are used for the sequencing technologies described above, either a library needs to be built separately or the sequencing yields poor results. Therefore, at present there is an urgent need for an improved am