CN-121975772-A - Novel Cas3 protein and AcaI-C system comprising same
Abstract
The present application relates to the field of nucleic acid editing, and in particular to a Cas3 protein in a type I CRISPR/Cas system and a fusion protein comprising the same. The application also relates to novel CRISPR/Cas systems comprising the Cas3 protein or the fusion protein, and to the use of the fusion protein and the novel CRISPR/Cas systems in modifying target genes. The Cas3 protein and the CRISPR/Cas system are capable of editing target genes with high efficiency and high precision.
Inventors
- LAI JINSHENG
- CHEN JIAN
- LI ZHIMENG
- WANG YINGYING
- XIN BEIBEI
- YANG ZHIJIA
Assignees
- 中国农业大学
Dates
- Publication Date
- 20260505
- Application Date
- 20260203
Claims (17)
- 1. A Cas3 protein in a type I CRISPR/Cas system, having the amino acid sequence shown in SEQ ID NO. 1, or a sequence having at least 85% sequence identity compared to SEQ ID NO. 1 and substantially retaining the biological function of the sequence from which it is derived.
- 2. A fusion protein comprising the Cas3 protein of claim 1 and an additional protein or polypeptide; preferably, the additional protein or polypeptide is selected from the group consisting of an epitope tag, a reporter gene sequence, a nuclear localization signal (NLS) sequence, a targeting moiety, a transcriptional activation domain, a transcriptional repression domain, a nuclease domain, and any combination thereof.
- 3. A CRISPR/Cas system comprising a protein component, the protein component comprising the Cas3 protein of claim 1 or the fusion protein of claim 2.
- 4. The system of claim 3, wherein the protein component further comprises a Cas5 protein, a Cas7 protein, a Cas8 protein, a Cas11 protein, or any combination thereof; preferably, the protein component comprises a Cas3 protein, a Cas5 protein, a Cas7 protein, a Cas8 protein, and a Cas11 protein.
- 5. The system of claim 3 or 4, further comprising a nucleic acid component comprising a guide RNA, wherein the guide RNA comprises a guide sequence that hybridizes to a target sequence; preferably, the guide RNA is capable of forming a complex with the protein component; preferably, the guide RNA comprises a direct repeat sequence and a guide sequence in the 5' to 3' direction; preferably, the direct repeat sequence is derived from a bacterium, e.g., having the sequence shown in SEQ ID NO. 6; preferably, the guide RNA is a crRNA comprising a direct repeat, a guide sequence and a direct repeat in the 5' to 3' direction.
- 6. The system of any one of claims 3-5, wherein the protein component and the nucleic acid component are associated with each other to form a complex or are present in separate form; preferably, the Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein bind to each other to form a CRISPR-associated complex (Cascade complex); preferably, the system has one or more features selected from the group consisting of: (1) the Cas8 protein has the amino acid sequence shown in SEQ ID NO. 4, or a sequence that has at least 85% sequence identity compared to SEQ ID NO. 4 and substantially retains the biological function of the sequence from which it is derived; (2) the Cas5 protein has the amino acid sequence shown in SEQ ID NO. 2, or a sequence that has at least 85% sequence identity compared to SEQ ID NO. 2 and substantially retains the biological function of the sequence from which it is derived; (3) the Cas7 protein has the amino acid sequence shown in SEQ ID NO. 3, or a sequence that has at least 85% sequence identity compared to SEQ ID NO. 3 and substantially retains the biological function of the sequence from which it is derived; (4) the Cas11 protein has the amino acid sequence shown in SEQ ID NO. 5, or a sequence that has at least 85% sequence identity compared to SEQ ID NO. 5 and substantially retains the biological function of the sequence from which it is derived.
- 7. The system of any one of claims 3-6, wherein when the target sequence is DNA, the target sequence is located 3' to a Protospacer Adjacent Motif (PAM) and the PAM sequence is TTC.
- 8. The system of any of claims 3-7, wherein the target sequence is a DNA or RNA sequence from a prokaryotic or eukaryotic cell, or the target sequence is a non-naturally occurring DNA or RNA sequence; Preferably, the target sequence is present in a cell, or the target sequence is present in an in vitro nucleic acid molecule (e.g., a plasmid); for example, the target sequence is present in the nucleus or cytoplasm, for example, the cell is a eukaryotic cell, for example, the cell is a prokaryotic cell.
- 9. A nucleic acid molecule or combination comprising: (i) a nucleotide sequence encoding the Cas3 protein of claim 1, or the fusion protein of claim 2, or a protein component in the system of any one of claims 3-8; (ii) a nucleotide sequence for expressing a nucleic acid component in the system of any one of claims 3-8; and/or (iii) a nucleotide sequence comprising (i) and (ii); preferably, the nucleotide sequence set forth in any one of (i)-(iii) is codon optimized for expression in a prokaryotic cell or eukaryotic cell.
- 10. The nucleic acid molecule or combination of claim 9, comprising: (1) a first nucleic acid sequence comprising a nucleotide sequence encoding the Cas3 protein; (2) a second nucleic acid sequence comprising a nucleotide sequence encoding the Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein; (3) a third nucleic acid sequence comprising a nucleotide sequence for expressing the guide RNA; preferably, the first nucleic acid sequence is operably linked to a first regulatory element; preferably, the second nucleic acid sequence is operably linked to a second regulatory element; preferably, the third nucleic acid sequence is operably linked to a third regulatory element.
- 11. The nucleic acid molecule or combination of claim 9 or 10, wherein the first regulatory element, the second regulatory element and the third regulatory element are promoters and are the same or different from each other; preferably, the first nucleic acid sequence comprises the nucleotide sequence shown in SEQ ID NO; preferably, the second nucleic acid sequence comprises the nucleotide sequence shown as SEQ ID NO.
- 12. A vector comprising the nucleic acid molecule or combination of any one of claims 9-11.
- 13. A host cell comprising the nucleic acid molecule or combination of any one of claims 9-11 or the vector of claim 12.
- 14. A delivery composition comprising a delivery vehicle and one or more selected from the group consisting of the Cas3 protein of claim 1, the fusion protein of claim 2, the system of any one of claims 3-8, the nucleic acid molecule or combination of any one of claims 9-11, the vector of claim 12, and the host cell of claim 13; for example, the delivery vehicle is a particle; preferably, the delivery vehicle is selected from phage, liposomes, metal particles, protein particles, exosomes, gene guns, or viral vectors (e.g., replication-defective retroviruses, lentiviruses, adenoviruses, or adeno-associated viruses).
- 15. A method of modifying a target gene, comprising contacting the Cas3 protein of claim 1, or the fusion protein of claim 2, or the system of any one of claims 3-8 with the target gene, or delivering it into a cell comprising the target gene, wherein a target sequence is present in the target gene; for example, the target gene is present in a cell, or the target gene is present in a nucleic acid molecule (e.g., a plasmid) in vitro; for example, the cell is a prokaryotic cell; for example, the cell is a eukaryotic cell; for example, the cell is selected from the group consisting of an animal cell (e.g., a mammalian cell, such as a human cell) and a plant cell; for example, the modification refers to a break of the target sequence, such as a double-strand break of DNA or a single-strand break of RNA; for example, the modification further includes inserting an exogenous nucleic acid into the break.
- 16. A cell or progeny thereof obtained by the method of claim 15, wherein the cell comprises a modification not present in its wild type; Preferably, the cell is an animal cell (e.g., a mammalian cell, such as a human cell).
- 17. Use of the Cas3 protein of claim 1, or the fusion protein of claim 2, or the system of any one of claims 3-8, or the nucleic acid molecule or combination of any one of claims 9-11, the vector of claim 12, or the delivery composition of claim 14 in the preparation of a formulation for (i) nucleic acid editing (e.g., in vitro or ex vivo nucleic acid editing), (ii) in vitro or ex vivo DNA detection, and/or (iii) editing a target sequence in a target locus to modify an organism or a non-human organism; for example, the nucleic acid editing includes gene or genome editing; for example, the gene or genome editing includes modifying a gene, knocking out a gene, altering expression of a gene product, repairing a mutation, and/or inserting a polynucleotide.
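Claims 5 and 7 describe a crRNA laid out as direct repeat, guide sequence, direct repeat in the 5' to 3' direction, and a DNA target located 3' of a TTC PAM. The following is a minimal sketch of how candidate target sites could be enumerated and a crRNA string assembled under those two statements. The function names, the 34-nt default protospacer length and the placeholder repeat are assumptions made for illustration; the claims do not state a guide length, the sequence of SEQ ID NO. 6 is not reproduced here, and the sketch does not decide which DNA strand the guide should match.

```python
from typing import Iterator, NamedTuple

PAM = "TTC"        # PAM sequence named in claim 7
SPACER_LEN = 34    # hypothetical protospacer length; not stated in the claims

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


class Site(NamedTuple):
    strand: str        # "+" for the given strand, "-" for its reverse complement
    pam_start: int     # 0-based PAM position on the scanned strand
    protospacer: str   # sequence immediately 3' of the PAM on that strand


def find_sites(dna: str, pam: str = PAM,
               spacer_len: int = SPACER_LEN) -> Iterator[Site]:
    """Yield every PAM occurrence that has a full-length protospacer 3' of it."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        start = seq.find(pam)
        while start != -1:
            begin = start + len(pam)
            end = begin + spacer_len
            if end <= len(seq):
                yield Site(strand, start, seq[begin:end])
            start = seq.find(pam, start + 1)


def build_crrna(direct_repeat: str, guide: str) -> str:
    """Assemble repeat-guide-repeat (5' to 3') and write it in RNA letters."""
    dna = direct_repeat.upper() + guide.upper() + direct_repeat.upper()
    return dna.replace("T", "U")


if __name__ == "__main__":
    # Toy input with a short spacer length so the example fits on one line.
    for site in find_sites("AATTCGATCGATCG", spacer_len=5):
        print(site, build_crrna("GTC", site.protospacer))  # "GTC" is a placeholder repeat
```

Scanning both strands is a design choice of this sketch, made so that sites on either strand of a double-stranded target are reported; real guide design would add further filters (for example off-target checks) that are outside the scope of the claims.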
Description
Novel Cas3 protein and AcaI-C system comprising same
Technical Field
The present application relates to the field of nucleic acid editing, and in particular to a Cas3 protein in a type I CRISPR/Cas system and a fusion protein containing the same. The application also relates to novel CRISPR/Cas systems comprising the Cas3 protein or the fusion protein, and to the use of the novel CRISPR/Cas systems in modifying target genes.
Background
CRISPR-Cas systems are adaptive immune mechanisms that are widespread in bacteria and archaea and can effectively defend against invading viruses and plasmids. According to the latest classification, CRISPR-Cas systems are divided into two classes, Class I and Class II. Class I systems use multi-subunit effector complexes, composed of multiple Cas proteins and a CRISPR RNA (crRNA), to coordinate target recognition and cleavage, whereas Class II systems rely on a single Cas effector protein to carry out the entire interference process independently. Although Class I systems account for about 90% of CRISPR-Cas loci in nature and are assumed to be the evolutionary progenitors of Class II systems, their application as genome editing tools is still relatively limited.
Class I CRISPR-Cas systems are generally divided into three types, I, III and IV, with the type I system being the most abundant and diverse. The type I system is characterized by a Cascade complex and a Cas3 helicase-nuclease module, which recognize and degrade target DNA under the guidance of the crRNA, resulting in large-fragment cleavage. Compared with the type III system, which targets RNA, and the type IV system, which has a simplified structure but an unclear function, the type I system offers distinct advantages in DNA targeting efficiency and potential for engineering applications. In practical applications of CRISPR systems, a more compact operon structure facilitates transfer and expression in different organisms.
Within the type I family, the type I-C system is relatively compact, and its Cas5 protein can take the place of the conventional Cas6 protein in processing crRNAs, thereby producing mature crRNA. This structural feature makes the type I-C system a more promising gene editing tool. Exploring more versatile I-C systems would therefore help expand the CRISPR gene editing toolbox.
Disclosure of Invention
The inventors of the present application identified for the first time a series of novel Cas proteins and novel CRISPR/Cas systems comprising them, and named the system the Aca I-C system. Further, the inventors have experimentally confirmed that the novel Cas3 protein in this system has nuclease, helicase and exonuclease activities, thereby enabling editing of target nucleic acids. On this basis, the inventors completed the present application.
Cas3 proteins
Accordingly, in a first aspect, the present invention provides a Cas3 protein in a type I CRISPR/Cas system having the amino acid sequence shown in SEQ ID NO. 1, or a sequence having at least 85% sequence identity compared to SEQ ID NO. 1 and substantially retaining the biological function of the sequence from which it is derived.
In certain embodiments, it has the amino acid sequence shown in SEQ ID NO. 1 or an ortholog, homolog, variant or functional fragment thereof, wherein said ortholog, homolog, variant or functional fragment substantially retains the biological function of the sequence from which it is derived.
In certain embodiments, it has a sequence derived from Acetobacter capsid, or an ortholog, homolog, variant or functional fragment thereof, wherein said ortholog, homolog, variant or functional fragment substantially retains the biological function of the sequence from which it is derived.
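The "at least 85% sequence identity" criterion can be illustrated with a short calculation. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and assumes identity is counted as identical columns divided by all alignment columns; this portion of the text does not define how identity is to be computed, and other tools use other denominators. The function names and the toy peptides are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption of this sketch: the denominator is the number of alignment
    columns, skipping any column that is a gap in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True if the pair reaches the given identity threshold (85% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"   # toy peptide, not SEQ ID NO. 1
    candidate = "MKTAYLAKQR"   # one substitution in ten positions
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate))
```

In practice the alignment itself would come from a dedicated alignment tool, and the result depends on the scoring scheme and gap treatment that tool applies.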
In the present invention, the biological functions of the above sequences include, but are not limited to, the activity of binding to a guide RNA, endonuclease activity, and the activity of binding to a specific site of a target sequence under the guidance of a guide RNA and cleaving it.
In the present invention, the biological functions of the above sequences include, but are not limited to, nuclease, helicase and exonuclease activities.
In certain embodiments, the ortholog, homolog or variant has at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity compared to the sequence from which it is derived.
In certain embodiments, the Cas3 protein is an effector protein in a type I CRISPR/Cas system.
In certain embodiments, the system is a type I-A CRISPR-Cas system, a type I-B CRISPR-Cas system, a type I-C CRISPR-Cas system, a type I-D CRISPR-Cas system, a type I-E CRISPR-Cas system, or a type I-F CRISPR-Cas system. In certain preferred embodiments, the system is a type I-C CRISPR-Cas system.
Fusion proteins
The proteins of the invention may be derivatized, e.g., linked to another molecule (e.g., anoth