US-12624350-B2 - High-throughput single-nuclei and single-cell libraries and methods of making and of using


Abstract

Provided herein are methods for preparing a sequencing library that includes nucleic acids from a plurality of single cells. In one embodiment, the method includes nuclear or cellular hashing which permits increased sample throughput and increased doublet detection at high collision rates. In one embodiment, the method includes normalization hashing which aids in estimating and removing technical noise in cell to cell variation and increases sensitivity and specificity.
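As a rough illustration of how hashing can support doublet detection, the sketch below assigns a cell to a compartment from its hashing-oligo read counts and flags a cell as a likely doublet when two compartment indexes are both well represented. The function name `assign_hash`, the thresholds `min_reads` and `min_ratio`, and the simple ratio rule are all assumptions made for this example; they are not taken from the disclosure, which may use a different statistical test.

```python
from collections import Counter


def assign_hash(hash_counts, min_reads=5, min_ratio=3.0):
    """Label one cell from its hashing-oligo read counts (hypothetical heuristic).

    hash_counts: mapping of compartment index -> hashing reads seen in this cell.
    Returns (label, compartment), where label is 'singlet', 'doublet',
    or 'unassigned'. Thresholds are illustrative assumptions only.
    """
    # Take the two most abundant compartment indexes for this cell.
    ranked = Counter(hash_counts).most_common(2)

    # Too few hashing reads to make any call.
    if not ranked or ranked[0][1] < min_reads:
        return ("unassigned", None)

    top_index, top_count = ranked[0]
    second_count = ranked[1][1] if len(ranked) > 1 else 0

    # Two comparably abundant indexes suggest two cells from different
    # compartments ended up sharing one cell barcode (a collision).
    if second_count > 0 and top_count / second_count < min_ratio:
        return ("doublet", None)

    return ("singlet", top_index)
```

For example, under these assumed thresholds a cell with counts `{"A1": 40, "B2": 2}` would be called a singlet from compartment `A1`, while `{"A1": 20, "B2": 15}` would be flagged as a doublet.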

Inventors

Sanjay Srivatsan
Jose McFaline-Figueroa
Vijay Ramani
Junyue Cao
Gregory Booth
Jay Shendure
Cole Trapnell
Frank J. Steemers

Assignees

ILLUMINA, INC.
UNIVERSITY OF WASHINGTON

Dates

Publication Date: 20260512
Application Date: 20200302

Claims (18)

1 . A method of preparing a sequencing library comprising nucleic acids from a plurality of nuclei or cells, the method comprising: (a) providing a plurality of compartments comprising nuclei or cells; (b) contacting the nuclei or cells in the compartments with a hashing oligo that comprises a compartment specific index to result in absorption of the hashing oligo by the nuclei or cells; and (c) exposing the nuclei or cells to a cross-linking compound to fix hashing oligos to cells or to isolated nuclei, (d) combining the nuclei or cells from different compartments into a second compartment to generate pooled hashed nuclei or pooled hashed cells, wherein at least one copy of the hashing oligo is associated with nuclei or cells.
2 . The method of claim 1 , further comprising exposing the cells of each compartment of step (a) to a predetermined condition or, prior to step (a), providing a plurality of compartments comprising cells, exposing the cells of each compartment to a predetermined condition and then isolating nuclei from the cells.
3 . The method of claim 2 , wherein the predetermined condition comprises exposure to an agent.
4 . The method of claim 3 , wherein the agent comprises a protein, a non-ribosomal protein, a polyketide, an organic molecule, an inorganic molecule, an RNA or RNAi molecule, a carbohydrate, a glycoprotein, a nucleic acid, a drug, or a combination thereof.
5 . The method of claim 1 , wherein the providing further comprises: exposing the plurality of cells of each compartment to a predetermined condition.
6 . The method of claim 5 , wherein the predetermined condition comprises exposure to an agent.
7 . The method of claim 1 , wherein the hashing oligo comprises a single stranded nucleic acid.
8 . The method of claim 1 , further comprising processing the pooled hashed cells or pooled hashed nuclei using a single-cell combinatorial indexing method to result in a sequencing library comprising nucleic acids from the pooled hashed cells or pooled hashed nuclei, wherein the nucleic acids comprise a plurality of indexes.
9 . The method of claim 8 , wherein the single-cell combinatorial indexing comprises: distributing subsets of the pooled hashed cells or pooled hashed nuclei into a second plurality of compartments and contacting each subset with reverse transcriptase or DNA polymerase and a primer, wherein the primer in each compartment comprises a first index sequence that is different from first index sequences in the other compartments to generate indexed nuclei or indexed cells comprising indexed nucleic acid fragments; combining the indexed cells or indexed nuclei to generate pooled indexed cells or pooled indexed nuclei; distributing subsets of the pooled indexed cells or pooled indexed nuclei into a third plurality of compartments and introducing a second index sequence to the indexed nucleic acid fragments to generate dual-indexed cells or dual-indexed nuclei comprising dual-indexed nucleic acid fragments, wherein the introducing comprises ligation, primer extension, amplification, or transposition; combining the dual-indexed cells or dual-indexed nuclei to generate pooled dual-indexed nuclei or pooled dual-indexed cells; distributing subsets of dual-indexed cells or the pooled dual-indexed nuclei into a fourth plurality of compartments and introducing a third index sequence to the dual-indexed nucleic acid fragments to generate triple-indexed cells or triple-indexed nuclei comprising triple-indexed nucleic acid fragments, wherein the introducing comprises ligation, primer extension, amplification, or transposition; and combining the triple-indexed fragments, thereby producing a sequencing library comprising nucleic acids from the pooled hashed cells or pooled hashed nuclei.
10 . The method of claim 9 , wherein distributing subsets of the pooled indexed cells or pooled indexed nuclei into a third plurality of compartments comprises contacting each subset with a transposome complex, wherein the transposome complex in each compartment comprises a transposase and a second index sequence under conditions suitable for ligation of the second index sequence to the ends of the indexed nucleic acid fragments comprising a first index sequence to generate dual-indexed nuclei comprising dual-indexed nucleic acid fragments, wherein the second index sequence is different from second index sequences in the other compartments.
11 . The method of claim 9 , wherein distributing subsets of dual-indexed cells or the pooled dual-indexed nuclei into a fourth plurality of compartments comprises contacting each subset with a primer comprising a third index sequence and a universal primer sequence, wherein the contacting comprises conditions suitable for amplification and incorporation of the third index sequence to the ends of the dual-indexed nucleic acid fragments, wherein the third index sequence is different from third index sequences in the other compartments.
12 . The method of claim 9 , further comprising: providing a surface comprising a plurality of amplification sites, wherein the amplification sites comprise at least two populations of attached single stranded capture oligonucleotides having a free 3′ end, and contacting the surface comprising amplification sites with the triple-indexed fragments under conditions suitable to produce a plurality of amplification sites that each comprise a clonal population of amplicons from an individual fragment comprising a plurality of indexes.
13 . A composition comprising the hashed cells or hashed nuclei of claim 1 .
14 . A composition comprising the pooled hashed cells or pooled hashed nuclei of claim 1 .
15 . A multi-well plate, wherein compartments of the multi-well plate comprise the composition of claim 13 .
16 . A droplet, wherein the droplet comprises the composition of claim 13 .
17 . The method of claim 1 , wherein the hashing oligo comprises a nucleic acid and a hashing index.
18 . The method of claim 17 , wherein the hashing index in each compartment comprises an index sequence that is different from index sequences in the other compartments to generate hashed nuclei or hashed cells.
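The three rounds of split-and-pool indexing recited in claim 9 can be illustrated with a small sketch. It computes the number of distinct index combinations available and, under a simplifying assumption that every cell lands in a uniformly random compartment at each round, the expected fraction of cells that share their full index combination with at least one other cell. The function names, the uniform-distribution model, and the compartment counts used in the example are assumptions for illustration; they are not values or formulas stated in the claims.

```python
import random


def barcode_space(wells_per_round):
    """Number of distinct index combinations across all indexing rounds."""
    total = 1
    for wells in wells_per_round:
        total *= wells
    return total


def expected_collision_fraction(n_cells, wells_per_round):
    """Expected fraction of cells sharing a full index combination with
    at least one other cell, assuming uniform random distribution
    (a simplifying assumption, not a claim limitation)."""
    space = barcode_space(wells_per_round)
    # Probability that none of the other (n_cells - 1) cells picks the
    # same combination is (1 - 1/space) ** (n_cells - 1).
    return 1.0 - (1.0 - 1.0 / space) ** (n_cells - 1)


def simulate_split_pool(n_cells, wells_per_round, seed=0):
    """Give each cell one random compartment index per round and return
    the resulting index tuples, e.g. (first, second, third)."""
    rng = random.Random(seed)
    return [
        tuple(rng.randrange(wells) for wells in wells_per_round)
        for _ in range(n_cells)
    ]
```

With three hypothetical rounds of 96 compartments each, `barcode_space([96, 96, 96])` gives 884,736 combinations, and `expected_collision_fraction(10_000, [96, 96, 96])` is roughly 1.1%. Hashing indexes add a further label per cell, so collisions between cells from different original compartments can be recognized rather than silently merged.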

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Ser. No. 62/812,853, filed Mar. 1, 2019, which is incorporated by reference herein in its entirety. This application is the § 371 U.S. National Stage of International Application No. PCT/US2020/020637, filed 2 Mar. 2020, which claims the benefit of U.S. Provisional Application Ser. No. 62/812,853, filed Mar. 1, 2019, which are incorporated by reference herein in their entireties.

GOVERNMENT FUNDING

This invention was made with government support under Grant Nos. HG007811, HD088158, and R01 HG006283, awarded by the National Institutes of Health, and Grant No. DGE1258485, awarded by the National Science Foundation. The government has certain rights in the invention.

FIELD

Embodiments of the present disclosure relate to sequencing nucleic acids. In particular, embodiments of the methods and compositions provided herein relate to producing indexed single-nuclei and single-cell libraries using hashing oligos and/or normalization oligos and obtaining sequence data therefrom.

SEQUENCE LISTING

This application contains a Sequence Listing electronically submitted via EFS-Web to the United States Patent and Trademark Office as an ASCII text file entitled “IP-1815-PCT_ST25.txt” having a size of 4 kilobytes and created on Feb. 28, 2020. The information contained in the Sequence Listing is incorporated by reference herein.

BACKGROUND

High-throughput screens (HTSs) are a cornerstone of the pharmaceutical drug discovery pipeline (J. R. Broach, J. Thorner, Nature 384 (Suppl), 14-16 (1996); Pereira, J. A. Williams, Br. J. Pharmacol. 152, 53-61 (2007)). However, conventional HTSs have at least two major limitations. First, the readout of most is restricted to gross cellular phenotypes, e.g., proliferation (D. Shum et al., J. Enzyme Inhib. Med. Chem. 23, 931-945 (2008); C. Yu et al., Nat. Biotechnol. 34, 419-423 (2016)), morphology (Z. E. Perlman et al., Science 306, 1194-1198 (2004); Y. Futamura et al., Chem. Biol. 19, 1620-1630 (2012)), or a highly specific molecular readout (J. Kang et al., Nat. Biotechnol. 34, 70-77 (2016); K. L. Huss, P. E. Blonigen, R. M. Campbell, J. Biomol. Screen. 12, 578-584 (2007)). Subtle changes in cell state or gene expression that might otherwise provide mechanistic insights or reveal off-target effects are routinely missed. Second, even when HTSs are performed in conjunction with more comprehensive molecular phenotyping such as transcriptional profiling (C. Ye et al., Nat. Commun. 9, 4307 (2018); E. C. Bush et al., Nat. Commun. 8, 105 (2017); A. Subramanian et al., Cell 171, 1437-1452.e17 (2017); J. Lamb et al., Science 313, 1929-1935 (2006)), a limitation of bulk assays is that even cells ostensibly of the same “type” can exhibit heterogeneous responses (M. B. Elowitz, A. J. Levine, E. D. Siggia, P. S. Swain, Science 297, 1183-1186 (2002); C. Trapnell, Genome Res. 25, 1491-1498 (2015)). Such cellular heterogeneity can be highly relevant in vivo. For example, it remains largely unknown whether the rare subpopulations of cells that survive chemotherapeutics are doing so on the basis of their genetic background, epigenetic state, or some other aspect (S. M. Shaffer et al., Nature 546, 431-435 (2017); S. L. Spencer, S. Gaudet, J. G. Albeck, J. M. Burke, P. K. Sorger, Nature 459, 428-432 (2009)). Moreover, the sparsity and levels of technical noise often make it difficult to extract biologically meaningful information.

In principle, single-cell transcriptome sequencing (scRNA-seq) represents a form of high-content molecular phenotyping that could enable HTSs to overcome both limitations. However, the per-sample and per-cell costs of most scRNA-seq technologies remain high, precluding even modestly sized screens. Recently, several groups have developed “cellular hashing” methods, in which cells from different samples are molecularly labeled and mixed before scRNA-seq. However, current hashing approaches require relatively expensive reagents (e.g., antibodies (M. Stoeckius et al., Genome Biol. 19, 224 (2018)) or chemically modified DNA oligos (J. Gehring, J. H. Park, S. Chen, M. Thomson, L. Pachter, bioRxiv 315333 [Preprint] 5 May 2018. doi.org/10.1101/315333; C. S. McGinnis et al., Nat. Methods 16, 619-626 (2019))), use cell-type-dependent protocols (D. Shin, W. Lee, J. H. Lee, D. Bang, Sci. Adv. 5, eaav2249 (2019)), and/or use scRNA-seq platforms with a high per-cell cost.

SUMMARY OF THE APPLICATION

High cell count single-cell and single-nuclei sequencing with Single-cell Combinatorial Indexed Sequencing (sci-) methods has shown its efficacy in separating populations within cells and complex tissues via transcriptomes, chromatin accessibility, mutational differences, and other differences. One method described herein, nuclear hashing or cellular hashing, uses hashing oligos to increase sample throughput and to increase doublet detection at high collision rates. Another method d