EP-4735636-A1 - SYSTEMS AND METHODS OF SEQUENCING POLYNUCLEOTIDES WITH FOUR LABELED NUCLEOTIDES

Abstract

The application relates to DNA sequencing systems and methods. Systems and methods for determining the nucleotide sequence of a polynucleotide may include introducing a fourth labeled nucleotide into a two-channel sequencing-by-synthesis system, which leaves encoding space for detecting a fifth labeled nucleotide or for empty-well detection.

Inventors

  • Kagalwalla, Abde Ali Hunaid
  • Vieceli, John Silvio

Assignees

  • Illumina, Inc.

Dates

Publication Date
2026-05-06
Application Date
2024-06-25

Claims (20)

  1. A method of sequencing polynucleotides bound to a flowcell, comprising: detecting fluorescent emissions from a first labeled nucleotide at a first wavelength and a first intensity; detecting fluorescent emissions from a second labeled nucleotide at a second wavelength, wherein the first wavelength is different from the second wavelength; detecting fluorescent emissions from a third labeled nucleotide at the first and second wavelengths; detecting fluorescent emissions from the fourth labeled nucleotide at the first wavelength and a second intensity that is different from the first intensity; and determining the sequence of the polynucleotides based on the detected fluorescent emissions and intensities.
  2. The method of claim 1, wherein the flowcell comprises wells configured to bind polynucleotides; further wherein the incorporation of each of the at least four labeled nucleotide conjugates into a well is detected from at least one signal state.
  3. The method of claim 2, wherein the presence of an empty well is determined from a dark state.
  4. The method of claim 1, wherein the first intensity is approximately double the second intensity.
  5. The method of claim 1, wherein the fourth labeled nucleotide is guanine.
  6. The method of claim 1, wherein the fluorescent emissions from the first, second, third and fourth labeled nucleotides are plotted onto a cloud plot of intensity and wavelength.
  7. The method of claim 1, wherein the fourth labeled nucleotide is a modified nucleotide.
  8. The method of claim 7, wherein detecting the fluorescent emissions from the second labeled nucleotide at a second wavelength comprises detecting the second labeled nucleotide fluorescent emissions at a third intensity, and the method further comprises: detecting fluorescent emissions from a fifth labeled nucleotide at the second wavelength and at fourth intensity which is different from the third intensity.
  9. The method of claim 6, wherein the modified nucleotide is a 5-methylcytosine, a N6-methyladenine, or an inosine.
  10. A system for sequencing polynucleotides bound to a flowcell, comprising: a machine-readable memory; and a processor configured to execute machine-readable instructions, which, when executed by the processor, cause the system to perform steps including: detecting fluorescent emissions from a first labeled nucleotide at a first wavelength and a first intensity; detecting fluorescent emissions from a second labeled nucleotide at a second wavelength, wherein the first wavelength is different from the second wavelength; detecting fluorescent emissions from a third labeled nucleotide at the first and second wavelengths; detecting fluorescent emissions from the fourth labeled nucleotide at the first wavelength and a second intensity that is different from the first intensity; and determining the sequence of the polynucleotides based on the detected fluorescent emission and intensity.
  11. The system of claim 10, wherein the flowcell comprises wells configured to bind polynucleotides; further wherein the incorporation of the at least four labeled nucleotide conjugates into a well is detected from at least one signal state.
  12. The system of claim 11, wherein the presence of an empty well is determined from a dark state.
  13. The system of claim 10, wherein the first intensity is approximately double the second intensity.
  14. The system of claim 10, wherein the fourth labeled nucleotide is guanine.
  15. The system of claim 10, wherein the fluorescent emissions from the first, second, third and fourth labeled nucleotides are plotted onto a cloud plot of intensity and wavelength.
  16. The system of claim 10, wherein the fourth labeled nucleotide is a modified nucleotide.
  17. The system of claim 16, wherein detecting the fluorescent emissions from the second labeled nucleotide at a second wavelength comprises detecting the second labeled nucleotide fluorescent emissions at a third intensity, and the method further comprises: detecting fluorescent emissions from a fifth labeled nucleotide at the second wavelength and at fourth intensity which is different from the third intensity.
  18. The system of claim 17, wherein the modified nucleotide is a 5-methylcytosine, a N6-methyladenine, or an inosine.
  19. A non-transitory computer-readable medium storing a polynucleotide sequencing program including instructions that, when executed by a processor, causes a polynucleotide sequencing apparatus, to: detect fluorescent emissions from a first labeled nucleotide at a first wavelength and a first intensity; detect fluorescent emissions from a second labeled nucleotide at a second wavelength, wherein the first wavelength is different from the second wavelength; detect fluorescent emissions from a third labeled nucleotide at the first and second wavelengths; detect fluorescent emissions from the fourth labeled nucleotide at the first wavelength and a second intensity that is different from the first intensity; and determine the sequence of the polynucleotides based on the detected fluorescent emission and intensity.
  20. The non-transitory computer-readable medium of claim 19, wherein the fourth labeled nucleotide is a modified nucleotide.
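
The signal states recited in claims 1 to 4 can be illustrated with a minimal sketch of a nearest-centroid caller over two-channel intensities. This is not the application's implementation: the centroid coordinates, the normalization to a 0 to 1 scale, the state names, and the functions `call_state` and `call_read` are all hypothetical choices made for illustration. The only features taken from the claims are that the first and fourth labels share the first wavelength at different intensities (the first roughly double the fourth), the second label emits at the second wavelength, the third emits at both, and a dark state indicates an empty well.

```python
import math

# Assumed, normalized (channel 1, channel 2) centroids for each signal state.
# The numeric values are illustrative assumptions, not values from the application.
CENTROIDS = {
    "first": (1.0, 0.0),   # first wavelength, first intensity
    "second": (0.0, 1.0),  # second wavelength
    "third": (1.0, 1.0),   # first and second wavelengths
    "fourth": (0.5, 0.0),  # first wavelength, about half the first intensity
    "empty": (0.0, 0.0),   # dark state, interpreted as an empty well
}


def call_state(ch1: float, ch2: float) -> str:
    """Return the state whose centroid is closest to the measured intensities."""
    return min(CENTROIDS, key=lambda s: math.dist((ch1, ch2), CENTROIDS[s]))


def call_read(cycles):
    """Call one state per sequencing cycle from (ch1, ch2) intensity pairs."""
    return [call_state(ch1, ch2) for ch1, ch2 in cycles]


# Hypothetical intensities for four cycles of one well.
cycles = [(0.95, 0.05), (0.48, 0.03), (0.05, 0.90), (0.90, 1.05)]
print(call_read(cycles))  # ['first', 'fourth', 'second', 'third']
print(call_state(0.02, 0.01))  # 'empty'
```

Because every labeled nucleotide has its own non-dark centroid in this sketch, a dark reading no longer has to be interpreted as a base. Under the same assumptions, a fifth label as in claim 8 would correspond to one more centroid on the channel 2 axis at a different intensity from the second label.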

Description

ILLINC.764WO / IP-2536-PCT

PATENT

SYSTEMS AND METHODS OF SEQUENCING POLYNUCLEOTIDES WITH FOUR LABELED NUCLEOTIDES

INCORPORATION BY REFERENCE TO ANY PRIORITY APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application No. 63/511,364, filed June 30, 2023, the content of which is incorporated by reference in its entirety.

BACKGROUND

Field

[0002] The present disclosure relates to DNA sequencing systems and methods. In particular, this disclosure relates to improved detection methods for detecting four or more nucleotides by labeling four nucleotides.

Background

[0003] Current sequencing technologies involve determining DNA or RNA sequences by deciphering four natural bases in the genome: A, T (U), G, and C. However, many of these DNA or RNA sequences have modified nucleotide bases. These modified bases play essential roles in biological processes such as epigenetic studies, epi-transcriptomics, human diseases, and cancer. A common form of DNA modification is methylated C (5-methylcytosine or 5-MeC) found in CpG dinucleotides. RNA modifications can also arise from noncoding RNA, such as ribosomal RNA and transfer RNA. The current standard for DNA methylation analysis typically uses genome sequencing of bisulfite-converted DNA. Since uracil, read as thymine, will bind to complementary adenosine, 5-MeC can be partially inferred with only four-base detection. However, as the number and complexity of chemical base modifications continue to grow, it may be advantageous to be able to detect more than the four unmodified bases during a DNA sequencing process.

[0004] Current base calling schemes for four bases generally include two- and four-channel base calling. Some sequencing systems, such as those from Illumina, Inc. (San Diego, CA), use onboard real-time analysis (RTA) to turn raw image data into base calls. This process can be massively parallelized in order to occur in real time on the instrument. The number of images fed into the RTA base calling software could be either four images (referred to as four-channel base calling or four-dye base calling) or two images (referred to as two-channel base calling).

[0005] Normally in such systems, clusters of the polynucleotide inserts to be sequenced are formed within wells on a flowcell. One cluster is positioned in each well, so that a single well should contain a cluster of many identical copies of the insert to be sequenced using sequencing-by-synthesis (SBS) methods. In current systems, three of the nucleotides are labeled, and the fourth nucleotide (typically G) remains unlabeled. This is referred to as G being in an "off state." However, during SBS sequencing runs on the flowcell, the wells in the flowcell that are not occupied by a DNA cluster might be erroneously called as a G nucleotide because the empty well will not fluoresce, just as if an unlabeled G nucleotide were present. This issue can also occur in random flowcells without wells due to spurious spots on the flowcell that are assigned as clusters. During a sequencing run, clusters which are dim, empty, or otherwise not fluorescing with enough intensity to be detected by a sequencing system may produce sequencing errors. The sequencing system may assume that the cluster contains an unlabeled G nucleotide, whereas it may actually contain a different nucleotide which is only dimly fluorescent, for example. If these errors are not caught by the sequencing system, then an incorrect sequence may be attributed to the insert bound to the cluster. Without a fluorophore attached to a G nucleotide, these types of errors may occur more often, since any error in fluorescence may be called as a G nucleotide in the sequencing process.

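
The failure mode described in paragraph [0005] can be illustrated with a minimal sketch of a conventional two-channel caller in which G is the unlabeled "off state." The assignment of A, C and T to channels, the threshold value, and the function name `call_base_dark_g` are hypothetical assumptions for illustration only; the property being shown is simply that any sufficiently dark reading is called as G.

```python
# Hypothetical conventional two-channel caller: three labeled bases, G dark.
# The base-to-channel mapping and the cutoff are assumptions, not from the source.
THRESHOLD = 0.5  # assumed normalized intensity cutoff for "signal present"


def call_base_dark_g(ch1: float, ch2: float) -> str:
    """Call a base from two channel intensities when G carries no label."""
    on1 = ch1 >= THRESHOLD
    on2 = ch2 >= THRESHOLD
    if on1 and on2:
        return "A"
    if on1:
        return "C"
    if on2:
        return "T"
    return "G"  # no signal in either channel is interpreted as unlabeled G


# Three hypothetical wells that all look dark to the caller.
well_with_g = (0.03, 0.02)          # cluster that really incorporated G
empty_well = (0.01, 0.02)           # no cluster present at all
dim_cluster_with_c = (0.30, 0.02)   # C incorporated, but fluorescing weakly

for well in (well_with_g, empty_well, dim_cluster_with_c):
    print(call_base_dark_g(*well))  # prints G for all three wells
```

In this sketch the empty well and the dim cluster are indistinguishable from a genuine G, which is the ambiguity that labeling the fourth nucleotide is intended to remove.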
SUMMARY

[0006] An aspect of the disclosure is directed to a method of sequencing polynucleotides bound to a flowcell, including: detecting fluorescent emissions from a first labeled nucleotide at a first wavelength and a first intensity; detecting fluorescent emissions from a second labeled nucleotide at a second wavelength, wherein the first wavelength is different from the second wavelength; detecting fluorescent emissions from a third labeled nucleotide at the first and second wavelengths; detecting fluorescent emissions from the fourth labeled nucleotide at the first wavelength and a second intensity that is different from the first intensity; and determining the sequence of the polynucleotides based on the detected fluorescent emissions and intensities.

[0007] In some embodiments, the flowcell may comprise wells configured to bind polynucleotides. The incorporation of each of the at least four labeled nucleotide conjugates into a well may be detected from at least one signal state. In some embodiments, the presence of an empty well may be determined from a dark state. In some embodiments, the first intensity may be approximately double the second intensity.

[0008] In some embodiments, the fourth labeled nucleotide may be guanine. In some embodiments, the fluorescent emissions from the