EP-4737566-A1 - CAS12 PROTEIN AND USE THEREOF
Abstract
Disclosed is a Cas12 protein, a guide polynucleotide, an inactivated Cas12 mutant, a fusion protein or conjugate including the Cas12 protein, an isolated nucleic acid, a CRISPR-Cas12 system, a vector system, a delivery system, a cell, a pharmaceutical composition, and a kit, and uses thereof.
Inventors
- LIANG, Junbin
- HUANG, Liancheng
- SI, Kaiwei
- CHEN, Chongjian
- SUN, YANG
- PAN, Weiye
- CAI, Jinxiu
- LIAO, Qing
Assignees
- Reforgene Medicine
- Zhejiang Synsorbio Technology Co., Ltd.
- Zhejiang Synsorbio Gene Technology Co., Ltd.
Dates
- Publication Date
- 20260506
- Application Date
- 20250523
Claims (16)
- A Cas12 protein, wherein the Cas12 protein conforms to any one of the following (a)-(c): (a) the Cas12 protein is selected from the group consisting of a CLUSTER1 protein, a CLUSTER2 protein, a CLUSTER3 protein, a CLUSTER4 protein, a CLUSTER5 protein, a CLUSTER6 protein, a CLUSTER7 protein, a CLUSTER8 protein, a CLUSTER9 protein, a CLUSTER10 protein, a CLUSTER11 protein, and a CLUSTER12 protein; (b) the Cas12 protein comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of amino acid sequences shown in SEQ ID NOs: 1-35; and (c) the Cas12 protein belongs to a Cas12h subtype, wherein the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to a target nucleic acid in a eukaryotic cell; preferably, the Cas12 protein retains a function of a protein having an amino acid sequence as shown in any one of SEQ ID NOs: 1-35; preferably, a protospacer adjacent motif (PAM) sequence (5'→3') recognized by the Cas12 protein is selected from any one or more of the following: A, C, T, G, TA, TC, GN, AA, AG, TG, AN, GG, CG, TN, NT, NG, GT, NA, CC, AC, GC, AT, CT, GA, TT, CN, NC, CA, NTN, ANN, TTN, ATC, NAC, AGA, TGC, TCT, NGN, CGC, NTC, GCA, TCG, TTT, CCG, GGG, NAG, ACA, CGG, CNG, ACN, GTG, CNT, TTG, TCN, GGT, TNC, CCN, CGT, TGG, CGA, NGG, TCC, AGT, NCA, CAN, TCA, NNG, TAC, CCT, NTG, CGN, TGN, CAT, NGC, GNG, GNC, NNA, GAA, TTC, CTT, ATA, TAT, GCT, NCC, TTA, AGN, GNN, CAA, CAC, AGG, NTT, ANG, GNA, GTT, NGA, TAA, GTA, GGN, GNT, NCG, ATT, CCA, CNN, AAA, AAC, ATN, GAG, CTG, ACG, NAA, TAN, NAT, CNA, GCN, GTC, NCN, CTN, CNC, ANT, NNC, CAG, NAN, ATG, NCT, CCC, AAN, TGT, TNA, ACC, GAT, ACT, AAT, GGA, GAN, ANC, GAC, NNT, CTA, TNN, GCG, GTN, TNT, AAG, TAG, NGT, NTA, ANA, CTC, GCC, 
TGA, GGC, AGC, TNG, NGAA, GANC, GCNC, NTNT, TGGG, AAGG, AAGN, NTNN, TCGT, CNTG, NTGG, CCGN, ATAT, TGCA, NGGT, TGNT, NNTG, NCCG, ACAT, GNTG, CGCG, GACN, NTCG, TCNG, CTGC, TNNC, GGTN, CGNN, TCCA, AGCN, TNAG, GGAC, GATC, AANA, NATG, CCAG, NAAT, TCNT, CACT, CGGC, CGAN, CNCA, ATNT, NNNG, NGCT, CTGG, GGAN, NTNC, ATTC, AATG, CNTC, TGGN, NATC, GTCG, ACNC, GCNN, GACT, CTNT, NCTT, NAGG, NANC, CTTA, GTCT, ANAG, NGCN, CNNA, TCAG, ACAC, NCGG, TNNT, CAAG, ACCT, CCCA, GTNC, ANTC, GACC, AACG, TTAA, TCCG, CGCC, NCCN, TTNA, NCNT, NGCA, AGNN, AATC, GGGA, GNAN, NAGA, CGNA, GTAT, GTNA, ATNC, ACNA, GGAA, NTCC, GGCG, AATN, CNNT, AGGC, GCGN, GTGC, TTGA, AAGC, GAAG, ATNG, TGCT, TACT, CTAN, GGCT, GNGC, GTCN, CGAA, CNAC, GCCT, TAGG, ANGC, TNAA, GANT, NCNA, NCCT, AGAN, GTAA, TTTN, ATGA, TGNA, CANC, ACGA, CCAC, CCGG, CTNG, CNGN, GGTA, NGNC, GTTT, CTAA, TNCT, CTGN, NGAC, TGTA, TANN, GCNT, GCTC, CNCG, AAAN, CCNT, GANA, CACA, CTNA, ANTN, TTNT, CCTG, TNTT, CANA, NTAN, CACG, GGAT, TTTC, GNCG, TACA, GTAC, GAGC, ACNN, ATGG, AANT, ATCC, ACCG, AGNC, TGTT, NCAT, ATTA, GNTT, GAGN, TNAC, GCCG, NTNG, GTGG, GNGN, ACCA, NTAA, ACTN, NCTG, NCTA, TTTT, GCNG, NTAG, CAAA, GGNA, CNTN, TTAG, TCTG, NCTN, TATG, GCGT, TANT, GGGT, NACN, ACTG, CCNG, GNNT, CCAT, GNTA, NANT, TACN, TGTN, ATCT, NCAN, TNGG, CNNN, AAGT, ATTN, GGNN, CAGC, CGTN, GCCC, GCTT, CNAT, NANA, CCNN, GNGA, TNGN, GCAG, CGNG, CCTT, NGAG, NCNG, AANG, GGTC, ACTC, TGAA, NAGN, NNCA, ACGG, TGAC, TCCN, ANNN, TCGN, TAAN, CAGG, TTAN, NGAN, NTGC, CCNC, TNTN, ATGN, GTGN, GCAT, NNGN, NNCC, CCNA, CNAG, GNAC, CGNT, TTCN, TAGN, ANCT, NATN, GTGA, TNGT, CTAT, CCCG, TNCA, NGTA, NNGA, CGTG, TAAT, CGCA, NNCG, NGTC, NAGT, GNAT, TNTC, NCGC, NGGN, CATN, GTTN, AGTA, GNNG, TTNN, TGNC, NAAA, TNCC, CACC, CTCT, TTGN, GCTA, NTTT, TGAN, TNAN, NGAT, CCTN, GAAT, GTCA, NTCN, GCCA, ANTG, TGGC, CAAC, TTTA, TGTC, CGGA, NCGN, AGNT, NCGA, ANCG, ACAA, TAGT, CGAG, NCAA, AATA, AGGG, GNGT, CAGA, AGGT, GGGG, ANAC, TGGT, GTGT, GNCA, GTTA, NGTT, TNNG, NCAG, CACN, GCAN, GAAC, NCCA, TTCC, NCNN, GNNN, 
ANGT, NTNA, CCCT, GNAA, TTNG, GTNN, GGNG, TCTA, NCAC, GANG, TTCG, CCTC, CNGG, ANNA, TCAN, ATCG, NTGA, CGTA, TTAC, GCTN, GCTG, NGTG, TCCC, CANN, NNNA, TAGA, ACGT, AGAT, GATG, GCCN, TGNG, GCGC, CCGA, GNCN, NTTG, NNAT, TNCG, NANG, GGTG, NCCC, GNCC, CAAT, CGCN, CNGA, NTTC, TTCT, NGGA, AGTC, CNNC, NACG, AGTN, NANN, ACAG, GNCT, TACC, CNTA, TGTG, CATC, GACA, TCTT, NTCT, CTGA, AGGA, GATA, TNAT, CCTA, GGAG, ANCC, AANC, GTAN, GCNA, TGNN, TANC, GNTN, AGCG, CTAG, NNAA, AGTT, CTAC, TACG, TTNC, TNTA, ANTT, ATAC, TCCT, TCAC, NGGC, NTTN, NNTC, CANT, ATAA, TGCC, CTCC, TNNA, GTNG, ACGN, GGCA, AAAG, TTGT, NGNA, NAAN, TATN, CGGG, CATA, ATGC, ACGC, ACCN, ATTT, TCNA, TNGC, NACA, NACC, CTCN, GGCC, TANG, AGAA, TNGA, TAGC, CAGN, GGCN, ANNT, NNNC, TCAT, CATT, TAAA, ATGT, TGAG, CGCT, TCGG, GCAC, GTAG, NTCA, NATT, ANTA, CCCN, ACTA, AAAA, GAAN, TATT, NNAC, TGAT, GGGN, CCAA, GNGG, CCAN, GTCC, NNCT, AGNG, CNTT, CNCT, GANN, GGTT, AGCT, CATG, NTAC, TNCN, NNTN, TGGA, GATT, AGCA, TAAG, GCGA, ACTT, ANGN, NTGN, AACN, AACT, TCAA, NTAT, TCGA, NCTC, NNGG, ANGG, NNTT, GTNT, CTNN, CGGN, TAAC, GGNC, GAAA, ACNG, GNAG, TTGG, CTTC, CNGT, TNNN, TNTG, GTTG, TCNN, CGGT, GAGA, CNNG, NCNC, GAGG, AGCC, ATNN, NNNT, AGAC, AACC, ANNC, ANNG, ACAN, GTTC, TATA, GNTC, NCGT, NGNT, CGTC, CCGC, CGAC, GACG, ATTG, GNNC, CNAA, TATC, AGNA, CTNC, TTCA, ANCA, ACCC, AGTG, CCGT, ANAT, CTGT, GGGC, NTTA, NAAG, AANN, CNAN, NNCN, ANAA, ANAN, CTTG, NGNN, AGAG, TANA, TCNC, GCAA, NGNG, NAGC, NATA, ATCN, CGTT, CNGC, GATN, NNTA, AAGA, CTTT, AAAC, AGGN, ACNT, NTGT, CTTN, ATCA, NACT, NNAG, NGTN, NAAC, TGCG, GGNT, ATAN, TTGC, ANCN, CCCC, ANGA, NGCG, TCTC, CTCG, ATNA, AATT, NNAN, NNGT, TCGC, ATAG, CAAN, AACA, TTAT, CAGT, GNNA, TGCN, GCGG, NGGG, CANG, TTTG, GAGT, AAAT, CTCA, CNCN, CNCC, TCTN, CGNC, NGCC, CGAT, and NNGC; preferably, the PAM sequence (5'→3') recognized by the Cas12 protein is selected from any one or more of the following: WYR, BMCTTH, TTN, VNWTV, VNWTC, and VNTTC, wherein W is A or T, Y is C or T, R is A or G, B is C, G, or T, M is 
A or C, H is A, T, or C, N is A, T, C, or G, and V is A, C, or G; preferably, the Cas12 protein forms a complex with the guide polynucleotide; further, the complex specifically binds to the target nucleic acid; further, the complex is capable of cleaving the target nucleic acid, modifying the target nucleic acid, and/or regulating an expression of the target nucleic acid; preferably, the Cas12 protein forms a complex with the guide polynucleotide, wherein the guide polynucleotide comprises a guide sequence that is reverse complementary to the target nucleic acid; further, the guide polynucleotide comprises a scaffold sequence that interacts with the Cas12 protein; further, the scaffold sequence comprises a direct repeat (DR) sequence; preferably, the scaffold sequence does not comprise a tracrRNA sequence; preferably, the Cas12 protein is a nuclease-inactivated mutant; preferably, the Cas12 protein is a dead Cas12 mutant or a nickase Cas12 mutant; preferably, the Cas12 protein has an inactivated RuvC domain; preferably, the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to a target nucleic acid in a eukaryotic cell; further preferably, the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to a target nucleic acid and cleaves the target nucleic acid in a eukaryotic cell; preferably, the Cas12 protein comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity to an amino acid sequence shown in SEQ ID NO: 18; preferably, the Cas12 protein is non-natural or engineered; preferably, the Cas12 
protein forms a complex with a guide polynucleotide, and the complex specifically binds to a target nucleic acid; preferably, the complex specifically binds to and cleaves the target nucleic acid, or the complex specifically binds to the target nucleic acid but does not cleave the target nucleic acid; preferably, the complex is non-natural or engineered; preferably, the guide polynucleotide comprises a guide sequence and a scaffold sequence; preferably, the guide sequence is reverse complementary to the target nucleic acid, and the scaffold sequence interacts with the Cas12 protein; preferably, the scaffold sequence is a direct repeat sequence; preferably, the guide sequence is located at a 5' end or a 3' end of the scaffold sequence; preferably, the guide polynucleotide is non-natural or engineered; preferably, the PAM sequence recognized by the Cas12 protein is 5'-WYR-3', wherein W is A or T, Y is C or T, and R is A or G; preferably, the PAM sequence recognized by the Cas12 protein is 5'-ACA-3', 5'-TCA-3', 5'-ATA-3', 5'-TTA-3', 5'-ACG-3', 5'-TCG-3', 5'-ATG-3', 5'-TTG-3', and/or 5'-TTN-3'; preferably, the scaffold sequence comprises a nucleotide sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of sequences shown in SEQ ID NO: 84-86 or SEQ ID NO: 187-195; preferably, the scaffold sequence comprises a nucleotide sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to a sequence shown in SEQ ID NO: 84; preferably, the Cas12 protein has at least one mutation in at least 1, at 
least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, or at least 12 of amino acid residues corresponding to positions 1-891 of the amino acid sequence shown in SEQ ID NO: 18; preferably, the mutation is a mutation to any other natural amino acid residue; preferably, the mutation is a mutation to residue R, H, K, or A; preferably, the mutation is a mutation to residue R; preferably, the Cas12 protein has at least one mutation at amino acid residues corresponding to any 1, any 2, any 3, any 4, any 5, any 6, any 7, any 8, any 9, any 10, any 11, any 12, any 13, any 14, any 15, any 16, or more positions in the amino acid sequence shown in SEQ ID NO: 18, and the positions are selected from N5, D9, E58, S100, N115, K142, C148, S147, K232, S245, I251, Y263, D279, A297, L300, E303, L337, M378, N394, T396, T443, K458, T468, K533, F537, F548, N550, D697, A706, and I788; preferably, the mutation is a mutation to residue R; preferably, the Cas12 protein has at least one mutation at amino acid residues corresponding to any 1, any 2, or any 3 positions in the amino acid sequence shown in SEQ ID NO: 18, and the positions are selected from D480, E675, and D757; preferably, the mutation is a mutation to any other natural amino acid residue; preferably, the mutation is a mutation to residue A.
- A guide polynucleotide, comprising: (i) a direct repeat sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to a sequence shown in any one of SEQ ID NOs: 36-170 and 187-195, and (ii) a guide sequence engineered to hybridize with a target nucleic acid; wherein the direct repeat sequence is linked to the guide sequence, and the guide polynucleotide forms a complex with a Cas12 protein and guides sequence-specific binding of the complex to the target nucleic acid; preferably, the Cas12 protein is the Cas12 protein of claim 1; preferably, the guide sequence comprises 15-60 nucleotides; preferably, the guide sequence hybridizes with the target nucleic acid, and the guide sequence is mismatched to the target nucleic acid by no more than one nucleotide; preferably, the guide polynucleotide does not comprise or comprises a tracrRNA; preferably, the tracrRNA sequence is linked to the direct repeat sequence; preferably, the tracrRNA comprises 10-200 nucleotides; preferably, the guide sequence is located at a 3' end of the direct repeat sequence; preferably, the guide sequence is located at a 5' end of the direct repeat sequence.
- An inactivated Cas12 mutant, wherein the inactivated Cas12 mutant is a nuclease-inactivated mutant of the Cas12 protein of claim 1; preferably, the inactivated Cas12 mutant is a dead Cas12 mutant or a nickase Cas12 mutant; preferably, the inactivated Cas12 mutant has an inactivated RuvC domain.
- A fusion protein or conjugate, comprising: (1) the Cas12 protein of claim 1, or the inactivated Cas12 mutant of claim 3; and (2) a homologous or heterologous functional domain; preferably, the functional domain has an enzymatic activity for modifying the target nucleic acid sequence; the enzymatic activity comprising a nuclease activity, a methyltransferase activity, a demethylase activity, a DNA repair activity, a DNA damage activity, a deaminase activity, a dismutase activity, an alkylation activity, a depurination activity, an oxidation activity, a pyrimidine dimer formation activity, an integrase activity, a transposase activity, a recombinase activity, a polymerase activity, a ligase activity, a helicase activity, a photolyase activity, a glycosylase activity, a deglycosylation activity, an acetyltransferase activity, a deacetylase activity, a kinase activity, a phosphatase activity, a ubiquitin ligase activity, a deubiquitination activity, an adenylation activity, a deadenylation activity, a SUMOylating activity, a deSUMOylating activity, a myristoylation activity, and/or a demyristoylation activity; preferably, the homologous or heterologous functional domain is selected from one or more of the following: a subcellular localization signal, a DNA binding domain, a protease domain, a transcriptional activation domain, a transcriptional repression domain, a nuclease domain, a deaminase domain, a uracil DNA glycosylase domain (UDG), a uracil DNA glycosylase inhibitor domain (UGI), a methylase, a demethylase, a transcription release factor, a histone acetyltransferase domain, a histone deacetylase domain, a DNA ligase, an affinity tag, a reporter tag, an affinity domain, and a reporter domain; preferably, the nuclease domain comprises a polypeptide with an ssDNA cleavage activity and/or a polypeptide with a dsDNA cleavage activity; preferably, the Cas12 protein or the inactivated Cas12 mutant is directly or indirectly linked to the homologous or heterologous 
functional domain; preferably, the direct linkage is a covalent linkage, and the indirect linkage is a linkage through an amino acid linker or a non-amino acid linker; preferably, the homologous or heterologous functional domain is fused or conjugated at N-terminal, C-terminal, or internally with respect to the Cas12 protein or the inactivated Cas12 mutant; preferably, the Cas12 protein comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity to the amino acid sequence shown in SEQ ID NO: 18; preferably, the Cas12 protein is non-natural or engineered; preferably, the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to a target nucleic acid; preferably, the complex specifically binds to and cleaves the target nucleic acid, or the complex specifically binds to the target nucleic acid but does not cleave the target nucleic acid; preferably, the complex is non-natural or engineered; preferably, the guide polynucleotide comprises a guide sequence and a scaffold sequence; preferably, the guide sequence is reverse complementary to the target nucleic acid, and the scaffold sequence interacts with the Cas12 protein; preferably, the scaffold sequence is a direct repeat sequence; preferably, the guide sequence is located at the 5' end or the 3' end of the scaffold sequence; preferably, the guide polynucleotide is non-natural or engineered; preferably, a PAM sequence recognized by the Cas12 protein is 5'-WYR-3', wherein W is A or T, Y is C or T, and R is A or G; preferably, the PAM sequence recognized by the Cas12 protein is 5'-ACA-3', 
5'-TCA-3', 5'-ATA-3', 5'-TTA-3', 5'-ACG-3', 5'-TCG-3', 5'-ATG-3', 5'-TTG-3', and/or 5'-TTN-3'; preferably, the scaffold sequence comprises a nucleotide sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of sequences shown in SEQ ID NO: 84-86 and 187-195; preferably, the scaffold sequence comprises a nucleotide sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to a sequence shown in SEQ ID NO: 84; preferably, the Cas12 protein has at least one mutation in at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, or at least 12 of amino acid residues corresponding to positions 1-891 of the sequence shown in SEQ ID NO: 18; preferably, the mutation is a mutation to any other natural amino acid residue; preferably, the mutation is a mutation to residue R, H, K, or A; preferably, the mutation is a mutation to residue R; preferably, the Cas12 protein has at least one mutation at amino acid residues corresponding to any 1, any 2, any 3, any 4, any 5, any 6, any 7, any 8, any 9, any 10, any 11, any 12, any 13, any 14, any 15, any 16, or more positions in the amino acid sequence shown in SEQ ID NO: 18, and the positions are selected from N5, D9, E58, S100, N115, K142, C148, S147, K232, S245, I251, Y263, D279, A297, L300, E303, L337, M378, N394, T396, T443, K458, T468, K533, F537, F548, N550, D697, A706, and I788; preferably, the mutation is a mutation to residue R; preferably, the Cas12 protein has at least one 
mutation at amino acid residues corresponding to any 1, any 2, or any 3 positions of the amino acid sequence shown in SEQ ID NO: 18, and the positions are selected from D480, E675, and D757; preferably, the mutation is a mutation to any other natural amino acid residue; preferably, the mutation is a mutation to residue A; preferably, the functional domain has an epigenomic modification activity or an epigenetic modification activity; preferably, the epigenomic modification or the epigenetic modification comprises, but is not limited to, DNA methylation, RNA methylation, RNA interference, nucleosome positioning, chromatin conformation change, chromatin remodeling, histone modification, and modification of long non-coding RNA sequences; preferably, the functional domain is an epigenomic modification functional domain or an epigenetic modification functional domain; preferably, the functional domain has a single-base editing activity; preferably, the functional domain is selected from one or more of the following: a nuclease (e.g., FokI), a DNA methyltransferase, a DNA demethylase, a histone methyltransferase, a histone demethylase, a DNA repair enzyme, a DNA damage enzyme, a base deaminase (including, but not limited to, an adenine deaminase, a cytosine deaminase), a dismutase, an alkylase, a depurinase, an oxidase, a pyrimidine dimer-forming enzyme, an integrase, a transposase, a recombinase, a polymerase, a ligase, a helicase, a photolyase, a glycosylase, a deglycosylase, an acetyltransferase, a deacetylase, a kinase, a phosphatase, a ubiquitin ligase, a deubiquitinating enzyme, an adenylylase, a deadenylase, a SUMOylating enzyme, a deSUMOylating enzyme, a myristoylase, and/or a demyristoylase; preferably, the functional domain is an adenine deaminase or a cytosine deaminase.
- An isolated nucleic acid, wherein the isolated nucleic acid encodes the Cas12 protein of claim 1, the inactivated Cas12 mutant of claim 3, or the fusion protein or conjugate of claim 4; preferably, the nucleic acid is codon optimized for expression in cells; preferably, the nucleic acid is codon optimized for expression in a prokaryotic cell; preferably, the nucleic acid is codon optimized for expression in a eukaryotic cell; preferably, the nucleic acid is codon optimized for expression in a eukaryote, a mammal such as a human or a non-human mammal, a plant, an insect, a bird, a reptile, a rodent (e.g., a mouse, a rat), a fish, a worm/nematode, or a yeast.
- A CRISPR-Cas12 system, comprising: a) the Cas12 protein of claim 1, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, or the isolated nucleic acid according to claim 5; and b) a guide polynucleotide, or a polynucleotide sequence encoding the guide polynucleotide; preferably, the Cas12 protein, the inactivated Cas12 mutant, the fusion protein or the conjugate forms a complex with the guide polynucleotide; and the guide polynucleotide comprises a guide sequence engineered to guide a sequence-specific binding of the complex to a target nucleic acid; preferably, the guide polynucleotide comprises a direct repeat sequence linked to the guide sequence; preferably, the direct repeat sequence has at least 50% sequence identity to any one of sequences shown in SEQ ID NOs: 36-170 and 187-195; preferably, the guide sequence comprises 15-35 nucleotides, and/or the guide sequence hybridizes to the target nucleic acid, the guide sequence is 90%-100% complementary to the target nucleic acid; preferably, the guide sequence is mismatched to the target nucleic acid by no more than one nucleotide;
preferably, the guide sequence comprises 15-60 nucleotides; preferably, the guide sequence hybridizes with the target nucleic acid; preferably, the guide sequence is mismatched to the target nucleic acid by no more than one nucleotide; preferably, the guide sequence is located at the 3' end of the direct repeat sequence; preferably, the guide sequence is located at the 5' end of the direct repeat sequence; preferably, the target nucleic acid is DNA or RNA, preferably, dsDNA or ssDNA; preferably, the DNA is eukaryotic DNA; preferably, the eukaryotic DNA is non-human mammalian DNA, non-human primate DNA, human DNA, plant DNA, insect DNA, bird DNA, reptile DNA, rodent DNA, fish DNA, worm/nematode DNA, or yeast DNA; preferably, the target nucleic acid is a disease or disorder-related gene or a signaling biochemical pathway-related gene, or the target nucleic acid is a reporter gene.
- A vector system, comprising one or more recombinant vectors, wherein the one or more recombinant vectors comprise the isolated nucleic acid of claim 5, or the CRISPR-Cas12 system of claim 6; preferably, the recombinant vector further comprises a regulatory sequence; preferably, a polynucleotide sequence encoding the Cas12 protein, the inactivated Cas12 mutant, or the fusion protein or the conjugate is operably linked to a regulatory sequence, and/or a polynucleotide sequence encoding the guide polynucleotide is operably linked to the regulatory sequence; more preferably, the regulatory sequence is selected from at least one of a promoter, an enhancer, an internal ribosome entry site (IRES), and a transcription termination signal, wherein the promoter comprises a constitutive promoter, an inducible promoter, a broad-spectrum promoter, or a tissue-specific promoter, and/or the transcription termination signal comprises a polyadenylation signal or a poly-U sequence; preferably, a scaffold of the recombinant vector is an adeno-associated virus (AAV) vector, a lentiviral vector, a ribonucleoprotein (RNP) complex, or a virus-like particle; preferably, when the scaffold is the AAV vector, the AAV vector is a recombinant AAV vector of serotype AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV PHP.B, AAV PHP.B2, AAV PHP.B3, AAV PHP.A, AAV PHP.eB, AAV PHP.eS, AAV2.7m8, AAV8.7m8, AAV ShH10, AAVrh10, or AAVrh74; when the scaffold is the lentiviral vector, the lentiviral vector is pseudotyped with an envelope protein; preferably, the isolated nucleic acid is linked to an aptamer sequence; when the scaffold of the recombinant vector is the virus-like particle, the isolated nucleic acid is linked to a gene encoding a gag protein.
- A delivery system, comprising: (1) a delivery tool; and (2) the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, or the vector system of claim 7; preferably, the delivery tool is a virus, a lipid nanoparticle, a nanoparticle, a liposome, an exosome, a microbubble, or a gene gun; preferably, the delivery tool is the lipid nanoparticle, and the lipid nanoparticle comprises the guide polynucleotide and an mRNA encoding the Cas12 protein, the inactivated Cas12 mutant, or the fusion protein or conjugate.
- A cell, comprising the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, or the delivery system of claim 8; preferably, the cell is a prokaryotic cell; preferably, the cell is a eukaryotic cell; preferably, the eukaryotic cell is a mammalian cell.
- A pharmaceutical composition, comprising the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, or the cell of claim 9; preferably, the pharmaceutical composition comprises pharmaceutically acceptable excipients.
- A kit, comprising the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, or the pharmaceutical composition of claim 10.
- A use of the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, the pharmaceutical composition of claim 10, or the kit of claim 11 in preparing a reagent or medicament for diagnosing, treating, and/or preventing a disease or disorder associated with a target nucleic acid; preferably, the disease or disorder is a hematological disease or disorder, an ophthalmic disease or disorder, a neurological disease or disorder, a respiratory disease or disorder, a hepatic disease or disorder, a metabolic disease or disorder, cancer, or an infectious disease; and/or the reagent or medicament is used to: cleave one or more target nucleic acid molecules or introduce nicks into one or more target nucleic acid molecules, activate or upregulate an expression of the one or more target nucleic acid molecules, activate or inhibit transcription of the one or more target nucleic acid molecules, inactivate the one or more target nucleic acid molecules, visualize, label, or detect the one or more target nucleic acid molecules, bind the one or more target nucleic acid molecules, transport the one or more target nucleic acid molecules, and mask the one or more target nucleic acid molecules.
- A method for detecting, binding, or cleaving a target nucleic acid, comprising using the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, the pharmaceutical composition of claim 10, or the kit of claim 11 to contact the target nucleic acid; preferably, the method is for non-diagnostic and/or non-therapeutic purposes; and/or the fusion protein or conjugate comprises a detectable marker, such as, a marker detectable by fluorescence, DNA blotting, or FISH.
- A method for altering a cell state, comprising using the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, the pharmaceutical composition of claim 10, or the kit of claim 11 to contact a cell to alter the cell state; preferably, the method results in one or more of: an increase or decrease in an expression of a specific gene, an induction of cellular senescence in vitro or in vivo, an induction of cellular cycle arrest in vitro or in vivo, a cellular growth promotion and/or cellular growth inhibition in vitro or in vivo, an induction of anergy in vitro or in vivo, an induction of apoptosis in vitro or in vivo, and an induction of necrosis in vitro or in vivo; preferably, the method is for non-diagnostic and/or non-therapeutic purposes.
- A method for diagnosing, treating, or preventing a disease or disorder associated with a target nucleic acid, comprising administering the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, the pharmaceutical composition of claim 10, or the kit of claim 11 to a sample from a subject in need or a subject in need; preferably, the disease or disorder is a hematological disease or disorder, an ophthalmic disease or disorder, a neurological disease or disorder, a respiratory disease or disorder, a hepatic disease or disorder, a metabolic disease or disorder, a cancer, or an infectious disease.
- A use of the Cas12 protein of claim 1, the guide polynucleotide of claim 2, the inactivated Cas12 mutant of claim 3, the fusion protein or conjugate of claim 4, the nucleic acid of claim 5, the CRISPR-Cas12 system of claim 6, the vector system of claim 7, the delivery system of claim 8, the cell of claim 9, the pharmaceutical composition of claim 10, or the kit of claim 11 in diagnosing, treating, or preventing a disease or disorder associated with a target nucleic acid; preferably, the disease or disorder is a hematological disease or disorder, an ophthalmic disease or disorder, a neurological disease or disorder, a respiratory disease or disorder, a hepatic disease or disorder, a metabolic disease or disorder, a cancer, or an infectious disease.
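The degenerate PAM patterns and the guide-to-target mismatch limit recited in the claims can be illustrated with a short sketch. It is only an illustration under stated assumptions: the letter codes are the ones defined in claim 1, but the function names (`expand_pam`, `matches_pam`, `reverse_complement`, `count_mismatches`) and the toy sequences are hypothetical and do not come from the application, and the guide is written in the DNA alphabet for simplicity.

```python
from itertools import product

# Degenerate base codes as defined in claim 1
# (W = A/T, Y = C/T, R = A/G, B = C/G/T, M = A/C, H = A/T/C, N = any, V = A/C/G).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT", "R": "AG", "M": "AC",
    "B": "CGT", "H": "ATC", "V": "ACG", "N": "ATCG",
}


def expand_pam(pattern):
    """List every concrete sequence covered by a degenerate PAM pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]


def matches_pam(pattern, seq):
    """Return True if a concrete sequence (read 5'->3') fits the pattern."""
    return len(seq) == len(pattern) and all(
        base in IUPAC[code] for code, base in zip(pattern, seq)
    )


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(guide, target):
    """Count positions where the guide differs from the reverse complement
    of the target stretch it is meant to pair with (equal lengths assumed)."""
    expected = reverse_complement(target)
    if len(guide) != len(expected):
        raise ValueError("guide and target must have the same length")
    return sum(g != e for g, e in zip(guide, expected))


if __name__ == "__main__":
    # 5'-WYR-3' expands to the eight explicit PAMs listed in claim 1.
    print(sorted(expand_pam("WYR")))
    # ['ACA', 'ACG', 'ATA', 'ATG', 'TCA', 'TCG', 'TTA', 'TTG']
    print(matches_pam("VNTTC", "ACTTC"))  # True
    print(matches_pam("VNTTC", "TCTTC"))  # False, because V excludes T
    print(count_mismatches("CGTA", "AACG"))  # 1 (toy sequences)
```

Under this reading, a guide would satisfy the "no more than one nucleotide" preference when `count_mismatches` returns 0 or 1.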
Description
This application claims priority to Chinese Patent Application No. 202410661837.4, filed on May 24, 2024, and Chinese Patent Application No. 202510611056.9, filed on May 13, 2025, the entire contents of each of which are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure generally relates to the field of CRISPR gene editing, and in particular, to Cas12 proteins and uses thereof.
BACKGROUND
The clustered regularly interspaced short palindromic repeats (CRISPR) and the CRISPR-associated protein system (CRISPR-Cas system) form an adaptive immune defense that bacteria and archaea have evolved over a long period of time to fight invading viruses and exogenous DNA. Based on extensive research, CRISPR-Cas systems are being employed to make changes to gene sequences directly in cells, providing a fast and effective approach for gene editing. It is therefore desirable to develop new Cas12 proteins and CRISPR-Cas12 gene editing systems.
SUMMARY
The present disclosure provides Cas12 proteins and uses thereof.
One or more embodiments of the present disclosure provide a Cas12 protein, and the Cas12 protein is selected from the group consisting of a CLUSTER1 protein, a CLUSTER2 protein, a CLUSTER3 protein, a CLUSTER4 protein, a CLUSTER5 protein, a CLUSTER6 protein, a CLUSTER7 protein, a CLUSTER8 protein, a CLUSTER9 protein, a CLUSTER10 protein, a CLUSTER11 protein, and a CLUSTER12 protein. In some embodiments, the Cas12 protein is the CLUSTER1 protein.
In another aspect, one or more embodiments of the present disclosure provide a Cas12 protein that belongs to a Cas12h subtype (subtype V-H) and specifically binds to a target nucleic acid in a eukaryotic cell. In some embodiments, the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to the target nucleic acid in the eukaryotic cell.
In some embodiments, the Cas12 protein forms a complex with a guide polynucleotide, and the complex specifically binds to and cleaves the target nucleic acid in the eukaryotic cell. In some embodiments, the target nucleic acid is located within a nucleus of the eukaryotic cell. In some embodiments, the target nucleic acid is located within a mitochondrion of the eukaryotic cell. In some embodiments, the target nucleic acid is located within a chloroplast of the eukaryotic cell. In some embodiments, the eukaryotic cell is a mammalian cell. In some embodiments, the eukaryotic cell is a human cell. In some embodiments, the Cas12 protein is the CLUSTER1 protein. In some embodiments, the Cas12 protein comprises an amino acid sequence having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity to the amino acid sequence shown in SEQ ID NO: 3 or SEQ ID NO: 18.
In another aspect, one or more embodiments of the present disclosure provide a Cas12 protein, and the Cas12 protein comprises an amino acid sequence having at least 50% sequence identity to any one of the amino acid sequences shown in SEQ ID NOs: 1-35. Table 1 lists Cas proteins having the amino acid sequences shown in SEQ ID NOs: 1-35, and Table 2 lists a direct repeat (DR) sequence corresponding to each Cas protein.
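By way of illustration only, a percent sequence identity of the kind recited in the thresholds above could be computed as in the following minimal sketch. The sketch assumes a global (Needleman-Wunsch) alignment with hypothetical unit scores (match +1, mismatch -1, gap -1) and takes identity as the number of identical aligned residues divided by the alignment length including gaps. The alignment algorithm, scoring scheme, and denominator are assumptions and are not specified in this passage; practical comparisons commonly use substitution matrices such as BLOSUM62 and established alignment tools. All function names are hypothetical.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment with linear gap cost (assumed scores).

    Returns the two sequences padded with '-' at gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned residues / alignment length (gaps included) * 100."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    gapped_a, gapped_b = global_align(a, b)
    matches = sum(1 for x, y in zip(gapped_a, gapped_b) if x == y and x != "-")
    return 100.0 * matches / len(gapped_a)


def meets_identity_threshold(a: str, b: str, threshold: float = 50.0) -> bool:
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(a, b) >= threshold
```

In use, a candidate amino acid sequence would be compared against a reference sequence with `meets_identity_threshold(candidate, reference, 80.0)`; the choice of denominator (alignment length versus the length of the shorter or the reference sequence) changes the result and would need to follow whatever definition governs the comparison.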
In some embodiments, the at least 50% sequence identity comprises at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity. In some embodiments, the Cas12 protein comprises an amino acid sequence having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence identity to any one of the amino acid sequences shown in SEQ ID NOs: 1-35. In some embodiments, the Cas12 protein comprises an amino acid sequence having at least 80% sequence identity to any one of the amino acid sequences shown in SEQ ID NOs: 1-35. In some embodiments, the Cas12 protein comprises an amino acid sequence having at least 85% sequence identity to any one of the amino acid sequences shown in SEQ ID NOs: 1-35. In some embodiments, the Cas12 protein co