US-20260125752-A1 - METHODS AND PROBES FOR DETECTING POLYNUCLEOTIDE SEQUENCES IN CELLS AND TISSUES

Abstract

Disclosed herein, inter alia, are compositions and methods of use thereof for detecting polynucleotides from and within cells and tissues.

Inventors

Tung Thanh LE
Ryan SHULTZABERGER
Daan Witters

Assignees

Singular Genomics Systems, Inc.

Dates

Publication Date: 2026-05-07
Application Date: 2025-11-04

Claims (20)

1. A method of sequencing an RNA molecule in a cell or tissue, the method comprising: (i) contacting the cell or tissue with a polynucleotide probe and hybridizing a first end of the polynucleotide probe to a first sequence of the RNA molecule, and hybridizing a second end of the polynucleotide probe to a second sequence of the RNA molecule, wherein the RNA molecule comprises a target sequence located between the first sequence and the second sequence; (ii) extending the polynucleotide probe along the target sequence to generate a complement of the target sequence, and ligating the complement of the target sequence to the polynucleotide probe thereby forming a circular oligonucleotide; (iii) amplifying the circular oligonucleotide to generate an amplification product comprising the target sequence, and (iv) sequencing the target sequence of the amplification product, wherein the first end comprises a sequence selected from SEQ ID NO:8 to SEQ ID NO:37 and the second end comprises a sequence selected from SEQ ID NO:38 to SEQ ID NO:67.
2. The method of claim 1, wherein sequencing the target sequence generates a sequencing read, the method further comprises aligning the sequencing read to a reference sequence.
3. The method of claim 1, further comprising repeating (i)-(iv) to generate a plurality of sequencing reads corresponding to different target sequences and computationally grouping the sequencing reads based on sequence similarity.
4. The method of claim 3, further comprising analyzing the groups to determine at least one of: (i) clonotype identity, (ii) clonal expansion, or (iii) repertoire diversity.
5. The method of claim 3, further comprising quantifying the sequencing reads assigned to each group.
6. The method of claim 4, further comprising quantifying the distribution of clonotypes within the tissue.
7. The method of claim 1, wherein the first end comprises locked nucleic acids (LNAs), Bis-locked nucleic acids (bisLNAs), twisted intercalating nucleic acids (TINAs), bridged nucleic acids (BNAs), 2′-O-methyl RNA:DNA chimeric nucleic acids, minor groove binder (MGB) nucleic acids, morpholino nucleic acids, C5-modified pyrimidine nucleic acids, peptide nucleic acids (PNAs), phosphorothioate nucleic acids, Zip nucleic acids (ZNAs), or combinations thereof.
8. The method of claim 1, wherein the first end of the polynucleotide probe comprises an LNA nucleotide and the second end does not comprise an LNA nucleotide.
9. The method of claim 1, wherein the polynucleotide probe comprises an amplification primer binding sequence, a sequencing primer binding sequence, or both an amplification primer binding sequence and a sequencing primer binding sequence.
10. The method of claim 1, wherein the polynucleotide probe comprises a predicted melting temperature of about 60° C.
11. The method of claim 1, wherein sequencing comprises repeated cycles of labeled oligonucleotide hybridization and detection.
12. The method of claim 1, wherein sequencing comprises sequencing by synthesis, sequencing by ligation, sequencing-by-binding, or pyrosequencing.
13. The method of claim 1, wherein sequencing comprises extending a sequencing primer by incorporating a labeled nucleotide, or labeled nucleotide analogue, and detecting the label for each incorporated nucleotide or nucleotide analogue, wherein the sequencing primer is hybridized to the amplification product.
14. The method of claim 1, further comprising contacting the cell or tissue with an antibody comprising a nucleic acid molecule and binding the antibody to the protein; binding a polynucleotide to the nucleic acid molecule; and detecting the polynucleotide, thereby detecting the protein.
15. The method of claim 1, wherein the RNA molecule comprises a sequence encoding for a complementarity-determining region (CDR) of a T cell receptor or a B cell receptor.
16. The method of claim 1, further comprising determining a location of the RNA molecule in the cell or tissue based on the detected sequence.
17. A method of sequencing an agent-mediated nucleic acid sequence of a cell, said method comprising administering a genetically modifying agent to the cell, and sequencing an agent-mediated nucleic acid sequence of the cell in situ according to claim 1.
18. A computer-implemented method for analyzing sequence information detected from nucleic acid sequences in cells or tissues, said method comprising: receiving nucleic acid sequence data from a plurality of cells or tissue regions, wherein the nucleic acid sequence data comprises one or more target sequences detected according to the method of claim 1; computationally processing the nucleic acid sequence data to determine sequence variability across the plurality of cells or tissue regions; generating, using the processed nucleic acid sequence data, a spatial profile of clonotype distributions that maps sequence diversity; and outputting the spatial profile as a digital representation that correlates the nucleic acid sequence data with cellular or tissue locations, thereby creating a spatial map of sequence-based cellular diversity within the cells or tissues.
19. A method of grouping cells, said method comprising amplifying a first target sequence according to claim 1 in a first cell, amplifying a second target sequence according to claim 1 in a second cell, detecting each target sequence, and computationally grouping the cells based on the co-occurrence of the detected target sequences.
20. A kit comprising a plurality of polynucleotide probes, each polynucleotide probe comprising a first end configured to bind to a first sequence and a second end configured to bind to a second sequence, wherein the first sequence comprises a sequence selected from SEQ ID NO:8 to SEQ ID NO:37 and the second sequence comprises a sequence selected from SEQ ID NO:38 to SEQ ID NO:67.
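
By way of non-limiting illustration, the computational grouping and quantification recited in claims 3 to 5 might be sketched as follows. The claims do not specify a similarity metric, a threshold, or a diversity measure; the Hamming distance, the `max_mismatches` cutoff, the greedy assignment strategy, the Shannon index, and all function names below are assumptions made solely for this example.

```python
import math
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length reads."""
    if len(a) != len(b):
        raise ValueError("reads must have equal length")
    return sum(x != y for x, y in zip(a, b))


def group_reads(reads, max_mismatches=1):
    """Greedily group reads by sequence similarity (assumed metric: Hamming).

    Unique sequences are visited from most to least abundant; each one joins
    the first existing group whose representative is within `max_mismatches`,
    otherwise it founds a new group. Returns {representative: read_count}.
    """
    groups = {}
    for seq, n in Counter(reads).most_common():
        for rep in groups:
            if hamming(seq, rep) <= max_mismatches:
                groups[rep] += n  # quantify reads assigned to this group
                break
        else:
            groups[seq] = n
    return groups


def shannon_diversity(groups):
    """Shannon index (natural log) over group read counts; one assumed
    way to summarize repertoire diversity."""
    total = sum(groups.values())
    return -sum((n / total) * math.log(n / total) for n in groups.values())


reads = ["ACGTAC", "ACGTAC", "ACGTAT", "TTGCAA", "TTGCAA", "TTGCAA", "GGGGGG"]
groups = group_reads(reads)
print(groups)
print(round(shannon_diversity(groups), 3))
```

In this sketch, the single-mismatch read `ACGTAT` is absorbed into the group represented by `ACGTAC`, so the per-group read counts approximate clonal abundance. A greedy pass is order-dependent; other clustering strategies could be substituted without changing the overall idea.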

Description

CROSS-REFERENCES TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 63/717,615, filed Nov. 7, 2024, which is incorporated herein by reference in its entirety and for all purposes.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML copy, created on Oct. 27, 2025, is named 00636001US.xml, and is 91,131 bytes in size.

BACKGROUND

Profiling the genome, epigenome, transcriptome, and proteome enables researchers to analyze heterogeneity and distributions of gene and protein co-expression patterns within cells and tissues. This level of information is pivotal for learning how cell co-localization influences microenvironments and ultimately the development of tissue and various diseases. As such, it can have wide-reaching implications for understanding and uncovering new therapies and treatments. Including spatial information while collecting and analyzing multiomics data enables a more comprehensive view than ever before, contributing powerful insights and knowledge that are often overlooked. Combining this data can reveal detailed information as to how diseases can spread and how cell-to-cell communication operates; it can also help in designing targeted treatment approaches. Thus, spatial information is invaluable when learning about multiomics and should not be excluded when fully exploring cell and tissue compositions. Disclosed herein, inter alia, are solutions to these and other problems in the art.

BRIEF SUMMARY

In an aspect is provided a method of forming a circular oligonucleotide in a cell or tissue (e.g., in situ).
In embodiments, the method includes contacting the cell or tissue with a polynucleotide probe and hybridizing a first hybridization sequence of the polynucleotide probe to a first sequence of a target polynucleotide (e.g., RNA molecule), and hybridizing a second hybridization sequence of the polynucleotide probe to a second sequence of the target polynucleotide molecule, wherein the target polynucleotide molecule is in the cell or tissue and comprises a target sequence between the first sequence and the second sequence; extending the polynucleotide probe along the target sequence to generate a complement of the target sequence, and ligating the complement of the target sequence to the polynucleotide probe thereby forming a circular oligonucleotide. In embodiments, the target polynucleotide is in a cell. In embodiments, the target polynucleotide is on a cell. In embodiments, the target polynucleotide is in a tissue (e.g., kidney, lung, breast, colon, skin, or placenta tissue).

In another aspect is provided a computer-implemented method for analyzing sequence information detected from nucleic acid sequences in cells or tissues. In embodiments, the method includes receiving nucleic acid sequence data from a plurality of cells or tissue regions, wherein the nucleic acid sequence data comprises one or more target sequences detected according to the methods described herein; computationally processing the nucleic acid sequence data to determine sequence variability across the plurality of cells or tissue regions; generating, using the processed nucleic acid sequence data, a spatial profile of clonotype distributions that maps sequence diversity; and outputting the spatial profile as a digital representation that correlates the nucleic acid sequence data with cellular or tissue locations, thereby creating a spatial map of sequence-based cellular diversity within the cells or tissues. In embodiments, the method generates one or more data set(s).
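
By way of non-limiting illustration, the receiving, processing, generating, and outputting steps of the computer-implemented method might be sketched as follows. The input record layout `(x, y, clonotype)`, the square-grid binning, the `bin_size` parameter, the output field names, and the JSON serialization are all assumptions introduced for this example and are not taken from the disclosure.

```python
import json
from collections import Counter, defaultdict


def spatial_profile(records, bin_size=100.0):
    """Summarize clonotype composition per spatial bin.

    `records` is an iterable of (x, y, clonotype) tuples, where x and y are
    hypothetical tissue coordinates and clonotype is a label already assigned
    to a detected target sequence. Coordinates are binned on a square grid.
    """
    bins = defaultdict(Counter)
    for x, y, clonotype in records:
        key = (int(x // bin_size), int(y // bin_size))
        bins[key][clonotype] += 1

    profile = []
    for (bx, by), counts in sorted(bins.items()):
        profile.append({
            "bin": [bx, by],                 # grid location
            "n_reads": sum(counts.values()),  # reads detected in this bin
            "n_clonotypes": len(counts),      # simple variability measure
            "dominant": counts.most_common(1)[0][0],
        })
    return profile


records = [(10, 20, "A"), (30, 40, "A"), (50, 60, "B"), (150, 20, "C")]
profile = spatial_profile(records)
# Output the profile as a digital representation (here, JSON text).
print(json.dumps(profile, indent=2))
```

The number of distinct clonotypes per bin is only one possible stand-in for sequence variability; any per-region diversity statistic could populate the same structure.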
For example, the nucleic acid sequence data may be stored as a data set.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Scheme of Direct-Seq™ methodology from probe hybridization, gap extension, and probe ligation to in situ sequencing on G4X™ by SBS chemistry. The hybridization regions are conserved; the center region represents the unknown/variable sequence to be determined by Direct-Seq.

FIGS. 2A-2B. Nuclear staining (FIG. 2A) of a mixture of Ramos, Jurkat, and SupT1 cells overlaid with fluorescent single-base reads obtained through SBS chemistry in cycle 9. The full sequences are presented as SEQ ID NO:81, SEQ ID NO:82, and SEQ ID NO:83. Distinct features indicate different cell types expected from the target sequences. FIG. 2B reports the average sequencing accuracy for each cycle within the Direct-Seq reads.

FIGS. 3A-3B. Sequence conservation frequency from all specific reads decreases as the read enters the more diverse CDR3 for IgH, as expected (FIG. 3A). Intracellular accuracy (i.e., sequencing accuracy within each cell) remains high over 50 cycles of sequencing. Distributions of the mean Hamming Distance (HD) of reads within a cell relative to a cellular consensus (i.e., intracellular) and the minimum HD between each per-cell consensus sequence and all other per-cell consensus sequences (