CN-121991923-A - Cas protein, corresponding gene editing system and application thereof
Abstract
Cas proteins, compositions comprising the Cas proteins, and CRISPR-Cas systems are provided, together with methods and applications using these compositions and CRISPR-Cas systems. Also provided are cells comprising the Cas proteins, compositions, or CRISPR-Cas systems.
Inventors
- ZHANG HONGLING
Assignees
- 上海尧唐生物科技股份有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20240511
- Priority Date
- 20230511
Claims (20)
- 1. A Cas protein, wherein the protein is selected from the group consisting of: (a) a polypeptide having the amino acid sequence shown in SEQ ID NO. 1; (b) a polypeptide having at least about 60% (e.g., 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5%) identity to the amino acid sequence set forth in SEQ ID NO. 1; (c) a polypeptide formed by substitution, deletion or addition of one or more (optionally, about 1-50, about 1-10, about 1-40, about 1-30, about 2-25, preferably 2-20) amino acid residues of the amino acid sequence shown in SEQ ID NO. 1.
- 2. The Cas protein of claim 1, wherein the Cas protein comprises a mutation at one or more sites relative to the polypeptide of the amino acid sequence set forth in SEQ ID No.1 such that cleavage activity or preference is altered (e.g., double strand cleavage activity is increased or decreased, single strand cleavage activity is increased or decreased, or no cleavage activity).
- 3. The Cas protein of claim 1 or 2, comprising any amino acid substitution at one or more of the following positions corresponding to SEQ ID No.1 :V7、R8、T11、S12、S110、I149、N150、H151、N152、L153、Q160、E161、Y162、N163、C164、Y165、S166、S167、F168、K195、S196、K197、S198、A199、C215、T216、A217、K219、S225、L227、L229、M231、D234、S240、S241、Q242、E243、I244、S250、F251、E252、K253、V254、K260、T261、E263、N276、Y282、D283、A285、N292、S303、I304、L307、Y308、S309、R315、E316、T317、I318、I319、V348、I349、E350、P351、L354、S355、N356、L357、K358、I373、G416、I417、E418、F419、D420、I455、R456、V458、H486、S488、L501、R509、P510、V511、L512、G513、N514、R515、V516、L525、I526、N527、K528、K529、C633、T634、T635、K636、N637、D638、R639、G640、E641、F642、E648、L650、A651、Y652、A661、T681、N682、E683、S684、G727、K728、N729、E730、S743、E748、G749、S751、K752、K781、L782、G783、E784、C785、S787、K865、P866、Y867、N868、I872、D896、N936、M957、F958、Q960、W961、P1018、S1019、R1020、N1021、S1022; Optionally, the Cas protein comprises the following mutation pattern relative to SEQ ID No. 1: (a) M231K+D234V; (b) S240A+S241R+Q242P+E243L; (c) L227I+L229R、S225P+L227A+L229R; (d) S240L+S241A+Q242S+E243H+I244L+L307R+Y308F+S309A、D283S+A285P+L307R+Y308F+S309A、L307R+Y308F+S309A+E648R+L650I+A651P+Y652F、D283S+A285P+L307R+Y308F+S309A+I373V+E748G+S751G+K752R+S787F、S303T+I304M+L307R+Y308A+S309I、S196N+K197F+S198N+A199T+I304A+L307R+Y308A+S309A、L307R+Y308F+S309A; (e) S250W+F251A+E252A+K253R+V254R; (f) K260R+T261P+E263H; (g) R315L+E316R+T317R+I318V+I319A、Y282F+D283Q+A285T+R315A+E316Q+T317R+I318L+I319Q、Y282F+D283Q+A285T+R315S+E316R+T317K+I318K+I319W+E648S+A651L; (h) L354F+S355G+N356D+L357I+K358R; (i) E648S+A651L、E648S+A651T+Y652W、Y282F+D283Q+A285T+E648T+A651S+Y652F、Y282F+D283Q+A285T+E648L+A651G、N276S+D283N+E648S+A651L; (j) 
T681V+N682G+E683R+S684G、T681P+N682T+E683G+S684R、T681S+N682P+E683R+S684G、T681A+N682H+E683Y+S684R、Y282F+D283Q+A285T+T681P+N682P+E683Q+S684C、Y282F+D283Q+A285T+G416S+I417H+E418Q+F419T+D420V+L501P+T681N+N682P+E683G+S684A、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+T681D+N682A+E683R+S684D、Y165W+S166R+S198N+G416R+I417V+E418Q+F419V+D420T+T681S+N682R+E683R、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+R509N+P510G+V511G+L512S+T681S+N682R+E683R、G416R+I417V+E418Q+F419V+D420T+T681S+N682R+E683R+G727V+K728C+N729G+E730G、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509V+P510R+V511L+L512A+T681S+N682R+E683R、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+R509Q+P510A+V511A+L512R+T681S+N682R+E683R+A661V、S110G+Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+R509A+P510A+V511R+L512Q+T681S+N682R+E683R、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+R509L+P510V+V511S+L512F+T681S+N682R+E683R、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509G+P510K+L512C+T681S+N682R+E683R、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+H486A+S488W+T634V+T635Y+K636S+N637R+T681S+N682R+E683R; (k) Y165W+S166R、Y165W+S166V+S167I+F168I、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T、Y165W+S166R+G416R+I417E+E418Q+F419V+D420A、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+N682G+E683A+D896N、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+G513L+N514A+R515V+V516M+N682G+E683A+S684R+D896N、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+G513C+R515Y+V516M+N682G+E683A+S684R+D896N、Y165W+S166R+G416A+I417R+E418V+F419C+D420Q+H486S+S488W+N682G+E683A+S684R+D896N、Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+G513C+R515Y+V516M+T635S+K636G+N637L+N682G+E683A+S684R+D896N、 Y165W+S166R+G416A+I417R+E418V+F419C+D420Q+H486S+S488W+K781T+G783S+E784F+C785L+D896N、Y165W+S166R+G416A+I417R+E418V+F419C+D420Q+H486S+S488W+D638F+R639P+G640N+E641Y+F642L、 Y165W+S166R+G416R+I417V+E418Q+F419V+D420T+H486A+S488W+N682G+E683A+S684R+P1018L+S1019R+R1020P+N1021R+S1022V; (l) Q160V+E161I+Y162T+N163S+C164V、Q160S+E161S+Y162L+N163C+C164V、Q160P+E161V+Y162H+N163S+C164T、Q160L+E161S+Y162F+N163S+C164A; (m) 
V7H+T11R+S12M、V7F+R8L+T11R+S12V、V7I+T11V+S12M+Y282F+D283Q+A285T; (n) I149M+N150S+H151C+N152T+L153Y; (o) K195R+K197A+S198A+A199G; (p) T216L+A217T+K219R、C215V+T216S+A217S+K219R; (q) Y282F+D283Q+A285T、D283I+A285R+N292L、 Y282F+D283Q+A285T+K865S+P866S+Y867H+N868F+I872L、Y282F+D283Q+A285T+M957V+F958L+Q960V+W961V、Y282F+D283Q+A285T+V348I+I349V+E350L+P351T、Y282F+D283Q+A285T+E748A+G749S+S751G、Y282F+D283Q+A285T+G416S+I417H+E418Q+F419T+D420V、Y282F+D283Q+A285T+G416L+I417Q+E418M+F419R+D420A、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q、Y282F+D283Q+A285T+G416S+I417H+E418Q+F419T+D420V+L501P+N936S、Y282F+D283Q+A285T+G416L+I417Q+E418M+F419R+D420A+I455G+R456E+V458E、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+L525R+I526V+N527D+K528P+K529G+N682G+E683A+S684R+D896N、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+P510D+V511H+L512V、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+N682G+E683A+S684R+G727E+K728H+N729Q+E730R、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+N682G+E683A+S684R+S743T、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+P510D+V511H+L512V+K781H+L782I+G783F+E784R+C785S、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+P510D+V511H+L512V+D638P+E641M+F642V、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+P510D+V511H+L512V+P1018S+S1019R+R1020V+N1021M+S1022P、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+C633N+T634E+T635E+K636G+N637L+N682G+E683A+S684R+S743T、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+C633F+T634P+T635F+K636P+N637C+N682G+E683A+S684R+S743T、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+C633S+T634C+T635G+K636H+N637F+N682G+E683A+S684R+S743T、Y282F+D283Q+A285T+G416A+I417R+E418V+F419C+D420Q+R509C+P510S+L512F+C633R+T634Y+T635L+K636V+N637D+N682G+E683A+S684R+S743T; (r) G416R+I417W+E418T+F419R+D420V、R509C+P510S+L512F+G416R+I417V+E418Q+F419V+D420T+H486A+S488W+N682G+E683A+S684R+D896N; (2) Amino acid substitutions at one or more of the following positions 
corresponding to SEQ ID NO. 1: D592, D643, E820, D992; optionally, the Cas protein comprises a mutation selected from D592A, D643A, E820A and/or D992A relative to SEQ ID NO. 1; preferably, the Cas protein comprises the amino acid sequence set forth in any one of SEQ ID NO. 47-50, or an amino acid sequence having at least about 80% (e.g., at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100%) sequence identity to the amino acid sequence set forth in any one of SEQ ID NO. 47-50.
- 4. A fusion protein comprising the Cas protein of any one of claims 1-3 and one or more functional domains; optionally, the functional domain is selected from the group consisting of a nuclear localization signal (NLS), a nuclear export signal (NES), a reporter protein (e.g., a fluorescent protein), a Cas protein targeting moiety, a DNA binding domain (e.g., LexA DBD, Gal4 DBD, Sp1 DBD), an epitope tag (e.g., His, Myc, V5, FLAG, HA, VSV-G, etc.), a transcriptional activation domain (e.g., VP64, VPR, p65, Rta), a transcriptional repression domain (e.g., KRAB domain, SID domain, NuE domain, NcoR domain, or SID4X domain), a nuclease, a deaminase (e.g., adenosine deaminase or cytidine deaminase), a methylase (e.g., DNA methylase DNMT), a demethylase, a transcriptional release factor, HDAC, a cleavage active polypeptide, a ligase, an integrase, a transposase, a recombinase, a polymerase, an exonuclease (e.g., T5E), and a base excision repair inhibitor (e.g., uracil-DNA glycosylase inhibitor (UGI)). 
Optionally, the functional domain has one or more of the following activities: enzymatic activity on a target sequence, methylase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitination activity, adenylation activity, deadenylation activity, SUMOylation activity, deSUMOylation activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, glycosylation activity (e.g., from an O-GlcNAc transferase) and deglycosylation activity; optionally, the functional domain is selected from an adenosine deaminase catalytic domain or a cytidine deaminase catalytic domain; optionally, the adenosine deaminase catalytic domain or cytidine deaminase catalytic domain comprises one or more of ADAR1, ADAR2, APOBEC, AID, or TAD; optionally, the adenosine deaminase catalytic domain comprises an amino acid sequence having at least 80%, 82%, 85%, 87%, 90%, 92%, 95%, 96%, 97%, 98% or 99% identity to the amino acid sequence shown in SEQ ID NO. 31 (the 005V1 deaminase from CN114634923A, in which its amino acid sequence is SEQ ID NO. 2) or SEQ ID NO. 70 (the 004V1 deaminase from CN114634923A, in which its amino acid sequence is SEQ ID NO. 1), and which retains the deamination activity of the amino acid sequence shown in SEQ ID NO. 31; optionally, the amino acid sequence of the adenosine deaminase catalytic domain has amino acid additions, insertions, deletions and substitutions relative to the amino acid sequence shown in SEQ ID NO. 31 or 70; optionally, the adenosine deaminase catalytic domain comprises a mutant of the amino acid sequence shown in SEQ ID NO. 31 or 70; optionally, the functional domain is a full-length TadA e or a functional fragment thereof; optionally, the localization signals comprise nuclear localization signals (NLS) and/or nuclear export signals (NES); optionally, the sequence of the nuclear localization signal is as shown in any one of SEQ ID NO. 38-45; optionally, the nuclear localization signal is located at or near a terminus (e.g., the N-terminus or C-terminus) of the Cas protein of claim 1; optionally, the nuclear export signal comprises protein tyrosine kinase 2 (e.g., human protein tyrosine kinase 2); optionally, the reporter protein comprises glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, or an autofluorescent protein; optionally, the autofluorescent proteins include green fluorescent proteins (e.g., GFP-2, tagGFP, turboGFP, eGFP, copGFP, aceGFP, etc.), HcRed, DsRed, cyan fluorescent proteins (e.g., eCFP, Cerulean, CyPet, AmCyan1, etc.), yellow fluorescent proteins (e.g., YFP, eYFP, Citrine, Venus, YPet, PhiYFP, etc.), blue fluorescent proteins (e.g., eBFP2, Azurite, mKalama1, GFPuv, Sapphire, T-Sapphire); optionally, the DNA binding domain comprises a methylation binding protein, LexA DBD, Gal4 DBD; optionally, the epitope tag comprises a histidine tag, a V5 tag, a FLAG tag, an influenza virus hemagglutinin tag, a Myc tag, a VSV-G tag, a thioredoxin tag, a streptavidin tag; optionally, the transcriptional activation domain comprises VP64 and/or VPR; optionally, the transcriptional repression domain comprises KRAB and/or SID; optionally, the 
nuclease comprises FokI; optionally, the cleavage active polypeptide comprises a polypeptide having single-stranded RNA cleavage activity, a polypeptide having double-stranded RNA cleavage activity, a polypeptide having single-stranded DNA cleavage activity, or a polypeptide having double-stranded DNA cleavage activity; optionally, the ligase comprises DNA ligase and/or RNA ligase; optionally, the exonuclease is selected from the group consisting of TREX2 protein, TREX1 protein, APE1 protein, Artemis protein, CtIP protein, Exo1 protein, Mre11 protein, RAD1 protein, RAD9 protein, Tp53 protein, WRN protein, exonuclease V, T5 exonuclease, and T7 exonuclease, or a variant thereof; optionally, the exonuclease is a T5 exonuclease; optionally, the T5 exonuclease comprises an amino acid sequence as shown in SEQ ID NO. 81; optionally, the functional domain is linked to the N-terminus and/or the C-terminus of the Cas protein variant; optionally, the functional domain is inserted between the N-terminus and the C-terminus of the Cas protein variant; optionally, the one or more functional domains are linked to the N-terminus and/or C-terminus of the Cas protein variant, optionally through a linker; optionally, the fusion protein comprises the amino acid sequence set forth in any one of SEQ ID NO. 46, 84, 86.
- 5. An isolated polynucleotide encoding the Cas protein of any one of claims 1-3 or the fusion protein of claim 4, preferably wherein the polynucleotide has been codon optimized for expression in a eukaryotic cell; optionally, the polynucleotide comprises a nucleotide sequence having at least about 80% (e.g., at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100%) sequence identity to the nucleotide sequence set forth in any one of SEQ ID NO. 2, 37, 83, 85, 87; optionally, the polynucleotide comprises the nucleotide sequence set forth in any one of SEQ ID NO. 2, 37, 83, 85, 87.
- 6. A guide RNA (gRNA) comprising: (1) a direct repeat (DR) sequence capable of forming a complex with the Cas protein of any one of claims 1-3 or the fusion protein of claim 4; preferably, the direct repeat (DR) sequence comprises the nucleotide sequence set forth in any one of SEQ ID NO. 5, 71, or a nucleotide sequence having at least about 50% (e.g., at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%) identity to the nucleotide sequence set forth in any one of SEQ ID NO. 5, 71, or a stem-loop structure comprising 5'-X₁X₂X₃X₄X₅NNNNNNNX₆X₇X₈X₉X₁₀-3', wherein each of X₁, X₂, X₃, X₄, X₅, X₆, X₇, X₈, X₉ and X₁₀ is any base selected from A, T, C or G, N is any base selected from A, T, C or G, and X₁X₂X₃X₄X₅ and X₆X₇X₈X₉X₁₀ can hybridize to each other to form a stem such that NNNNNNN forms a loop; more preferably, the DR sequence comprises a stem-loop structure near the 3' end of the DR sequence: 5'-CCGTCNNNNNNNGACGG-3' (SEQ ID NO. 80), wherein N is any base selected from A, T, C or G; (2) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA; optionally, the spacer sequence length is at least 15 nt, e.g., 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nt; optionally, the spacer sequence is 15-50 nt in length; optionally, the spacer sequence is 18-41 nt in length; optionally, the spacer sequence is 18-27 nt in length; optionally, the spacer sequence is 18-24 nt in length; optionally, the spacer sequence is 18-22 nt in length; optionally, the 5' end of the direct repeat sequence is connected to the spacer sequence; optionally, the spacer sequence comprises at least 15 consecutive nucleotides of the nucleotide sequence of any one of SEQ ID NO. 6, 11, 53, 55, 57, 59, 61, 101.
- 7. An isolated nucleic acid molecule comprising or consisting of a sequence selected from the group consisting of: (i) a sequence shown in SEQ ID NO. 5 or 71; (ii) a sequence having a substitution, deletion, or addition of one or more bases (e.g., a substitution, deletion, or addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases) as compared to the sequence set forth in SEQ ID NO. 5 or 71; (iii) a sequence having at least 20% (e.g., at least 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100%) sequence identity to the sequence set forth in SEQ ID NO. 5 or 71; (iv) a sequence which hybridizes under stringent conditions to a sequence as set forth in any one of (i) to (iii); (v) a sequence complementary to the sequence described in any one of (i) to (iii); or (vi) a nucleotide sequence encoding the sequence set forth in any one of (i) to (v); and the sequence of any one of (ii) to (vi) substantially retains the biological function of the sequence from which it is derived; for example, the isolated nucleic acid molecule is RNA; for example, the isolated nucleic acid molecule comprises a direct repeat (DR) sequence in a CRISPR/Cas system.
- 8. A vector comprising the polynucleotide of claim 5 and/or a nucleotide encoding the guide RNA of claim 6; optionally, the vector comprises a plasmid or a viral vector; optionally, the viral vector is selected from the group consisting of adeno-associated virus (AAV), adenovirus, lentivirus, retrovirus, herpes virus, SV40, poxvirus, or a combination thereof; optionally, the vector comprises a cloning vector, a transformation vector, an expression vector, a shuttle vector, an integration vector, or a multifunctional vector; optionally, the vector comprises: (1) a first regulatory element operably linked to a nucleotide sequence encoding the Cas protein of any one of claims 1-3 or a nucleotide sequence encoding the fusion protein of claim 4, and (2) a second regulatory element operably linked to a nucleotide sequence encoding a guide RNA comprising: (a) a direct repeat (DR) sequence capable of forming a complex with the Cas protein or fusion protein, and (b) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA; optionally, the first regulatory element and the second regulatory element are on the same or different vectors; in certain embodiments, the first regulatory element and/or the second regulatory element is a promoter, such as an inducible promoter; in certain embodiments, the vector comprises one or more promoters operably linked to the nucleic acid sequence, enhancer, transcription termination signal, polyadenylation sequence, origin of replication, selectable marker, nucleic acid restriction site, and/or homologous recombination site.
- 9. A complex comprising: (i) a protein component selected from the group consisting of the Cas protein of any one of claims 1-3, the fusion protein of claim 4, or a combination thereof, and (ii) the guide RNA of claim 6.
- 10. A CRISPR-Cas system comprising: (i) the Cas protein of any one of claims 1-3 or the fusion protein of claim 4, or a nucleotide encoding the Cas protein or fusion protein, and (ii) the guide RNA of claim 6, or a nucleotide encoding the guide RNA; wherein the guide RNA comprises: (a) a direct repeat (DR) sequence capable of forming a complex with the Cas protein or fusion protein, and (b) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA.
- 11. A CRISPR-Cas composition comprising: (i) a first component selected from the group consisting of the Cas protein of any one of claims 1-3, the fusion protein of claim 4, a nucleotide sequence encoding the Cas protein of any one of claims 1-3 or the fusion protein of claim 4, and any combination thereof, and (ii) a second component comprising one or more guide RNAs of claim 6, or a nucleotide sequence encoding one or more guide RNAs of claim 6; wherein the guide RNA comprises: (a) a direct repeat (DR) sequence capable of forming a complex with the Cas protein or fusion protein, and (b) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA.
- 12. A CRISPR-Cas system comprising one or more vectors, the one or more vectors comprising: (i) a first nucleic acid which is a nucleotide sequence encoding the Cas protein of any one of claims 1-3 or the fusion protein of claim 4, optionally operably linked to a first regulatory element, and (ii) a second nucleic acid which is a nucleotide sequence encoding the guide RNA of claim 6, optionally operably linked to a second regulatory element; wherein the first nucleic acid and the second nucleic acid are present on the same or different vectors; and the guide RNA comprises: (a) a direct repeat (DR) sequence capable of forming a complex with the Cas protein or fusion protein, and (b) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA.
- 13. A kit comprising one or more components selected from the group consisting of the Cas protein of any one of claims 1-3, the fusion protein of claim 4, the polynucleotide of claim 5, the vector of claim 8, the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11, and the CRISPR-Cas system of claim 12.
- 14. A delivery composition comprising a delivery vehicle and one or more selected from the group consisting of the Cas protein of any one of claims 1-3, the fusion protein of claim 4, the polynucleotide of claim 5, the vector of claim 8, the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11, and the system of claim 12.
- 15. A host cell comprising the Cas protein of any one of claims 1-3, the fusion protein of claim 4, the polynucleotide of claim 5, the vector of claim 8, the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11, the system of claim 12, or the delivery composition of claim 14.
- 16. An enzyme preparation comprising the Cas protein of any one of claims 1-3, the fusion protein of claim 4, the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11, or the system of claim 12, or the delivery composition of claim 14.
- 17. A pharmaceutical kit comprising: a first container; and, located in the first container, the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11 or the system of claim 12, or a medicament containing the complex of claim 9, the CRISPR-Cas system of claim 10, the CRISPR-Cas composition of claim 11 or the system of claim 12.
- 18. A pharmaceutical kit comprising: (a1) a first container and, located in the first container, the Cas protein of any one of claims 1-3, or the fusion protein of claim 4, or a coding gene thereof, or an expression vector thereof, or a medicament containing the Cas protein variant of any one of claims 1-3, or the fusion protein of claim 4, or a coding gene thereof, or an expression vector thereof; and (b1) an optional second container and, located in the second container, the guide RNA of claim 6 or an expression vector thereof, or a medicament containing the guide RNA of claim 6 or an expression vector thereof.
- 19. A method of targeting and editing a target gene or cleaving a target gene, comprising contacting the Cas protein of any one of claims 1-3, or the fusion protein of claim 4, or the complex of claim 9, or the CRISPR-Cas system of claim 10, the composition of claim 11, or the system of claim 12, or the delivery composition of claim 14, or the enzyme preparation of claim 16, or the kit of claim 17 or 18, with the target gene, or delivering it into a cell comprising the target gene, wherein a target sequence is present in the target gene.
- 20. A method of inducing a change in the state of a cell, comprising contacting the Cas protein of any one of claims 1-3, or the fusion protein of claim 4, or the complex of claim 9, or the CRISPR-Cas system of claim 10, the composition of claim 11, or the system of claim 12, or the delivery composition of claim 14, or the enzyme preparation of claim 16, or the kit of claim 17 or 18, with a target gene in a cell.
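As an illustration of the stem-loop constraint recited in claim 6, the following minimal Python sketch scans a sequence for windows of the form X1..X5, a 7-base loop, X6..X10, in which the two 5-base arms are reverse complements, and separately checks for the preferred motif 5'-CCGTCNNNNNNNGACGG-3' (SEQ ID NO. 80). The function names and the example sequence are hypothetical, and the sketch assumes that "can hybridize" means strict Watson-Crick pairing at all five stem positions (no G-U wobble); the claim itself does not define hybridization this narrowly.

```python
import re

# Watson-Crick complements; the claim lists the bases as A, T, C or G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

STEM_LEN = 5  # X1..X5 and X6..X10
LOOP_LEN = 7  # NNNNNNN

# Preferred stem-loop motif from claim 6 (SEQ ID NO. 80).
SEQ_ID_80 = re.compile(r"CCGTC[ACGT]{7}GACGG")


def normalize(seq: str) -> str:
    """Upper-case the sequence and write RNA bases (U) as T."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_stem_loops(seq: str) -> list[tuple[int, str, str, str]]:
    """Return (start, 5' arm, loop, 3' arm) for every 17-nt window whose arms pair.

    Assumption: pairing means the 5' arm equals the reverse complement
    of the 3' arm (strict Watson-Crick, all five positions).
    """
    seq = normalize(seq)
    window = 2 * STEM_LEN + LOOP_LEN
    hits = []
    for start in range(len(seq) - window + 1):
        arm5 = seq[start:start + STEM_LEN]
        loop = seq[start + STEM_LEN:start + STEM_LEN + LOOP_LEN]
        arm3 = seq[start + STEM_LEN + LOOP_LEN:start + window]
        if not set(arm5 + loop + arm3) <= set(COMPLEMENT):
            continue  # skip windows containing ambiguous or invalid symbols
        if arm5 == reverse_complement(arm3):
            hits.append((start, arm5, loop, arm3))
    return hits


def has_preferred_motif(seq: str) -> bool:
    """True if the sequence contains 5'-CCGTCNNNNNNNGACGG-3'."""
    return SEQ_ID_80.search(normalize(seq)) is not None


if __name__ == "__main__":
    example = "AAACCGTCAAAAAAAGACGGTT"  # hypothetical sequence, not from the sequence listing
    print(find_stem_loops(example))
    print(has_preferred_motif(example))
```

The general scan and the fixed-motif check are kept separate because the claim recites the generic X1..X10 stem-loop and the specific CCGTC/GACGG stem as alternative, progressively narrower options.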
Description
Cas protein, corresponding gene editing system and application thereof
The present application is a divisional application of the invention patent application filed on May 11, 2024, with application number 202480002205.5, entitled "Cas protein, corresponding gene editing system and application thereof".
Technical Field
The disclosure relates to the field of gene editing, in particular to a Cas protein and variant polypeptides thereof, a corresponding gene editing system, and applications thereof.
Background
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) systems evolved in bacteria and archaea as a defense against invading phage DNA. Of these, the most common is the CRISPR/Cas9 system, in which the Cas9 protein, with the aid of a trans-encoded small RNA (tracrRNA), processes pre-crRNA into mature crRNA that binds to the tracrRNA. It was later found that recognition and cleavage of the target by the Cas9 protein can be effectively mediated by an artificially constructed single-stranded chimeric guide RNA that mimics the crRNA-tracrRNA complex. The 3 bases immediately 3' of the target must have the form 5'-NGG-3' to constitute the PAM (protospacer adjacent motif) required for the Cas/crRNA complex to recognize the target. However, the CRISPR/Cas systems currently available each have different advantages and disadvantages, e.g., different Cas protein sizes, guide RNAs, and PAMs. Thus, there is still a need to develop new Cas proteins and CRISPR-Cas systems to meet diverse application needs.
Disclosure of Invention
The main object of the present invention is to provide a novel Cas protein and variant polypeptides thereof, and a CRISPR-Cas system comprising the same, to meet the above application needs.
In one aspect, the present disclosure provides a Cas protein selected from the group consisting of: (a) a polypeptide having the amino acid sequence shown in SEQ ID NO. 1; (b) a polypeptide having at least about 60% (e.g., 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 99.5%) identity to the amino acid sequence set forth in SEQ ID NO. 1; (c) a polypeptide formed by substitution, deletion or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID NO. 1.
In yet another aspect, the present disclosure provides a fusion protein comprising a Cas protein of the present disclosure and one or more functional domains.
In yet another aspect, the present disclosure provides an isolated polynucleotide encoding a Cas protein of the present disclosure or a fusion protein of the present disclosure. Preferably, the polynucleotide has been codon optimized for expression in eukaryotic cells. Preferably, the polynucleotide comprises a nucleotide sequence having at least about 80% (e.g., at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or 100%) sequence identity to the nucleotide sequence set forth in any one of SEQ ID NO. 2, 37, 83, 85, 87. Preferably, the polynucleotide has a nucleotide sequence as set forth in any one of SEQ ID NO. 2, 37, 83, 85, 87.
In yet another aspect, the present disclosure provides a guide RNA (gRNA) comprising: (1) a direct repeat (DR) sequence capable of forming a complex with the Cas protein of the first aspect or the fusion protein of the second aspect; (2) a spacer sequence capable of hybridizing to a target sequence of a target DNA, thereby directing the complex to the target DNA. Optionally, the spacer sequence length is at least 15 nt, for example, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, or 70 nt. Optionally, the spacer sequence is 15-50 nt in length. 
Optionally, the spacer sequence is 18-41 nt in length. Optionally, the spacer sequence is 18-27 nt in length. Optionally, the spacer sequence is 18-24 nt in length. Optionally, the spacer sequence is 18-22 nt in length. Optionally, the 5' end of the direct repeat sequence is linked to the spacer sequence. Optionally, the spacer sequence comprises at least 15 consecutive nucleotides of the nucleotide sequence of any one of SEQ ID NO. 6, 11, 53, 55, 57, 59, 61, 101.
In yet another aspect, the present disclosure provides an isolated nucleic acid molecule comprising or consisting of a sequence selected from the group consisting of: (i) a sequence shown in SEQ ID NO. 5 or 71; (ii) a sequence having a substitution, deletion, or addition of one or more bases (e.g., a substitution, deletion, or addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 bases) as compared to the sequence set forth in SEQ ID NO. 5 or 71; (iii) a sequence having at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 50