CN-122029175-A - Modified nucleotide for improving strand bias, preparation method and application thereof
Abstract
The present invention relates to the field of sequencing. In particular, the present invention relates to modified nucleotides as shown in formula (I-1), a method for preparing them, and their use in sequencing.
Inventors
- JIA MAN
- CHEN LIQIN
- ZHANG FENG
- HUANG SIQIAN
- LUO YUFEN
- XU CHONGJUN
Assignees
- 深圳华大智造科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20240115
Claims (17)
- A modified nucleotide, a salt thereof, or an ester thereof, the modified nucleotide having a structure represented by general formula I-1 or I-2: wherein D is a nucleotide; dye is a fluorescent dye; m1 is selected from 1, 2, 3, 4 and 5; r1 is selected from 1, 2, 3, 4 and 5; and n1 is selected from 1, 2, 3, 4, 5, 6 and 7.
- The modified nucleotide, salt or ester thereof of claim 1, wherein D is a dNTP or rNTP, e.g. dATP, dGTP, dTTP, dCTP, ATP, GTP, CTP or UTP.
- The modified nucleotide, salt or ester thereof according to claim 1 or 2, wherein D is modified with a reversible blocking group, such as an azidomethylene or allyl group at the 3'-O position of the deoxyribose.
- A modified nucleotide, a salt or an ester thereof according to any one of claims 1 to 3, selected from the following structures:
- The modified nucleotide, salt or ester thereof of any one of claims 1-4, wherein each of the fluorescent dyes is independently selected from the group consisting of cyanine dyes, fluorescein-based dyes, rhodamine-based dyes, and AF-series dyes; preferably, the fluorescent dyes are each independently selected from AF532 or Cy5.
- The modified nucleotide, salt or ester thereof of any one of claims 1-5, including but not limited to:
- A method of preparing the modified nucleotide of any one of claims 1-6, the method comprising: 1) synthesizing the linker; 2) coupling the linker to a fluorescent dye derivative (e.g. an active ester thereof); and 3) coupling the product of step 2) with a nucleotide derivative having an ethynylamino modification.
- The method of claim 7, wherein the modified nucleotide is prepared according to the following synthetic route: wherein D is a nucleotide, and dye is a fluorescent dye.
- A method of sequencing, the method of sequencing comprising: (a) incorporating into a polynucleotide at least one modified nucleotide, salt or ester thereof according to any one of claims 1 to 6; and (b) detecting the modified nucleotide, salt or ester thereof incorporated into the polynucleotide by detecting a fluorescent signal from a fluorescent dye attached to the modified nucleotide, salt or ester thereof.
- The method of sequencing of claim 9, comprising the steps of: (a) providing a duplex comprising a growing nucleic acid strand and a nucleic acid molecule to be sequenced; and (b) performing a reaction cycle comprising the following steps (i), (ii) and (iii): step (i) incorporating the nucleotide into the growing nucleic acid strand using a polymerase to form a nucleic acid intermediate comprising a blocking group and a detectable label; step (ii) detecting the detectable label on the nucleic acid intermediate; step (iii) removing the blocking group on the nucleic acid intermediate using a cleavage reagent; optionally, the reaction cycle further comprises step (iv) of removing the detectable label on the nucleic acid intermediate using an excision reagent.
- The method of sequencing of claim 9 or 10, said method employing a two-color sequencing technique, said method comprising using first, second, third and fourth nucleotides that differ from one another, wherein the first nucleotide carries a first fluorescent label detectable at a first emission wavelength; the second nucleotide carries a second fluorescent label detectable at a second emission wavelength; a portion of the third nucleotides carries a fluorescent label detectable at the first emission wavelength but having a different brightness than the first fluorescent label, and another portion of the third nucleotides carries a fluorescent label detectable at the second emission wavelength but having a different brightness than the second fluorescent label; and the fourth nucleotide carries no fluorescent label.
- The method of sequencing of claim 9 or 10, employing a four-color sequencing technique, comprising using first, second, third and fourth nucleotides that differ from one another, wherein the first nucleotide carries a first fluorescent label detectable at a first emission wavelength; the second nucleotide carries a second fluorescent label detectable at the first emission wavelength; the third nucleotide carries a third fluorescent label detectable at a second emission wavelength; and the fourth nucleotide carries a fourth fluorescent label detectable at the second emission wavelength.
- The method of sequencing of claim 9, wherein the nucleic acid molecule to be sequenced is a DNA nanoball.
- A kit comprising one or more nucleotides, wherein at least one nucleotide is a modified nucleotide, salt or ester thereof of any one of claims 1-6; preferably, the kit comprises two or more labelled nucleotides.
- The kit of claim 14, comprising a sequencing reagent; preferably, the sequencing reagent comprises at least one of a dNTP mixture, a nucleic acid polymerase mixture and an eluent.
- The kit of claim 15, wherein the dNTP mixture comprises at least one modified nucleotide, a salt or an ester thereof according to any one of claims 1 to 6; preferably, the eluent contains an excision reagent; optionally, the excision reagent is selected from one or more of endonuclease IV, alkaline phosphatase, an organic phosphine such as tris(3-hydroxypropyl)phosphine (THPP) or tris(2-carboxyethyl)phosphine hydrochloride (TCEP), or a complex of PdCl2 and sulfonated triphenylphosphine.
- Use of a modified nucleotide, salt or ester thereof according to any one of claims 1 to 6, or of a kit according to any one of claims 14 to 16, in sequencing.
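As an illustration only, the two-color sequencing claim above can be read as a simple decoding rule: signal in the first channel alone indicates the first nucleotide, signal in the second channel alone indicates the second, signal in both indicates the third, and no signal indicates the fourth. The sketch below assumes intensities already normalized to the range 0 to 1 and a single fixed threshold. The function name `call_base`, the threshold value and the string labels are hypothetical and do not come from the patent, and the brightness differences mentioned in the claim are ignored.

```python
def call_base(ch1: float, ch2: float, threshold: float = 0.5) -> str:
    """Map two normalized channel intensities to one of four nucleotides.

    ch1 / ch2: intensity at the first / second emission wavelength
    (assumed normalized to 0..1; the threshold is an illustrative value).
    """
    on1 = ch1 >= threshold  # signal present at the first wavelength
    on2 = ch2 >= threshold  # signal present at the second wavelength
    if on1 and on2:
        return "third"   # mixture of labels seen in both channels
    if on1:
        return "first"   # first channel only
    if on2:
        return "second"  # second channel only
    return "fourth"      # unlabeled nucleotide: dark in both channels


if __name__ == "__main__":
    for pair in [(0.9, 0.1), (0.1, 0.9), (0.8, 0.7), (0.05, 0.02)]:
        print(pair, "->", call_base(*pair))
```

A real base caller would also correct for channel crosstalk and for signal carried over from earlier cycles, which this sketch does not attempt.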
Description
Modified nucleotide for improving strand bias, preparation method and application thereof
Technical Field
The present invention relates to the field of sequencing. In particular, the invention relates to a modified nucleotide, a method for its preparation and its use in sequencing.
Background
Although high-throughput sequencing (NGS) produces more, and more informative, data than traditional approaches, the subsequent analysis of NGS data presents many new difficulties. The most important challenge for sequencing data is how to accurately detect SNP variations, i.e., DNA sequence polymorphisms caused by variation of individual nucleotides. In applying high-throughput sequencing to SNP detection, it has been found that when reads are aligned to the genomic reference sequence, the types of variation shown by sense-strand reads and antisense-strand reads sometimes differ significantly; for example, one may indicate a homozygous mutation and the other a heterozygous mutation. This inconsistency is known as bias.
Research has shown that bias falls into two classes, identity bias (orientation bias) and strand bias, both of which result from limitations of existing sequencing technologies. Identity bias is readily recognized by the skilled person and is eliminated as far as possible. For example, in the PCR step of library preparation, the polymerase may incorporate a wrong base during extension. In theory, such amplification errors are equally likely to arise from genomic sense-strand fragments and antisense-strand fragments, meaning that about 50% of all reads with sequencing errors are sense-strand reads and about 50% are antisense-strand reads. The remedy is also relatively simple, for example selecting a high-fidelity polymerase.
The problem of strand bias, by contrast, was not noticed until 2012 (for details, see the academic paper "The effect of strand bias in Illumina short-read sequencing data", doi: 10.1186/1471-2164-13-666). Strand bias manifests as follows: after reads are aligned to the genomic reference sequence, inconsistent base sites appear preferentially on either the sense strand or the antisense strand, meaning that the ratio between the two deviates significantly from 50%:50%. In a sense, strand bias does not characterize the level of the error rate, but rather the uniformity of error events throughout the system. To date, however, there is no consensus on the specific mechanism that produces strand bias, which makes it difficult to propose a corresponding solution.
Disclosure of Invention
The inventors speculate that one possible cause of strand bias is that the N3-linker carried on the dNTP substrates in common use binds strongly to the sequencing enzyme. This improves incorporation efficiency, but can reduce the efficiency with which the enzyme is eluted after base incorporation. For certain sequences, the binding of the enzyme to the DNA template strand is further increased, so that the enzyme cannot be eluted effectively. Moreover, the AF532 dye labeling the A base is negatively charged; when elution is incomplete, the dye readily binds to the positively charged finger region of the enzyme, so that the dye is not well excited, which produces strand bias such as T to G (T->G) and A to C (A->C).
To solve these problems, the invention provides a modified nucleotide with a novel structure, which can be used as a dNTP in sequencing to address T->G and A->C strand bias.
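For illustration, the notion of strand bias described in the Background (inconsistent bases appearing preferentially on one strand, so that the split deviates from 50%:50%) can be quantified for a single site with an exact two-sided binomial test. This is a minimal sketch and not a method disclosed in the patent: the function names are hypothetical, and it assumes that both strands are covered at equal depth at the site. Variant-calling pipelines often use a Fisher exact test on reference and alternate counts per strand instead.

```python
from math import comb


def forward_fraction(forward_alt: int, reverse_alt: int) -> float:
    """Fraction of mismatching reads that map to the sense (forward) strand."""
    n = forward_alt + reverse_alt
    return forward_alt / n if n else 0.5


def strand_bias_p_value(forward_alt: int, reverse_alt: int) -> float:
    """Two-sided exact binomial test against a 50:50 strand split.

    Under p = 0.5 every outcome k has probability C(n, k) / 2**n, so
    outcomes can be compared using the integer binomial coefficients.
    """
    n = forward_alt + reverse_alt
    if n == 0:
        return 1.0
    observed = comb(n, forward_alt)
    # Sum all outcomes that are no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return tail / 2 ** n


if __name__ == "__main__":
    for fwd, rev in [(5, 5), (8, 2), (10, 0)]:
        print(fwd, rev, forward_fraction(fwd, rev), strand_bias_p_value(fwd, rev))
```

A small p-value at a site suggests that the mismatches are unevenly distributed between the strands. It says nothing about the overall error rate, which matches the distinction drawn in the text.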
Modified nucleotides
The application provides a modified nucleotide, a salt or an ester thereof, wherein the modified nucleotide has a structure shown in general formula I-1: wherein D is a nucleotide; dye is a fluorescent dye; m1 is selected from 1, 2, 3, 4 and 5; r1 is selected from 1, 2, 3, 4 and 5; and n1 is selected from 1, 2, 3, 4, 5, 6 and 7.
In some embodiments, m1 is 1, r1 is 1, and n1 is 7.
In some embodiments, D is a dNTP, i.e., a deoxyribonucleoside triphosphate, which can be selected from dATP, dGTP, dTTP and dCTP. In some embodiments, D is an rNTP, i.e., a ribonucleoside triphosphate, which can be selected from ATP, GTP, CTP and UTP.
In some embodiments, D is modified with a reversible blocking group, such as an azidomethylene (-CH2-N3) or allyl group at the 3'-O position of the deoxyribose.
In some embodiments, the modified nucleotide is selected from the following structures:
In some embodiments, the fluorescent dyes are each independently selected from cyanine dyes, fluorescein-based dyes, rhodamine-based dyes, and AF-series dyes. In some embodiments, the fluorescent dyes are each independently selected from AF532 or Cy5.
Dye molecules can be attached to any position on the nucleotide base through a linker, provided Watson-Crick base pairing is still possible. Specific nucleobase labeling sites include the C5 position of a