CN-121975773-A - Novel AcaI-C system and application thereof

Abstract

The present application relates to the field of nucleic acid editing, and in particular to a novel CRISPR/Cas system. The application also relates to nucleic acid molecules comprising nucleotide sequences encoding the effector proteins and/or crRNAs of the system, and to the use of the novel CRISPR/Cas system for modifying a target gene. The novel Aca I-C system provides efficient and controllable editing and has high application potential.

Inventors

  • LAI JINSHENG
  • CHEN JIAN
  • LI ZHIMENG
  • WANG YINGYING
  • XIN BEIBEI
  • YANG ZHIJIA

Assignees

  • China Agricultural University (中国农业大学)

Dates

Publication Date: 2026-05-05
Application Date: 2026-02-03

Claims (16)

  1. A type I CRISPR/Cas system comprising a protein component, wherein the protein component comprises a Cas5 protein, a Cas7 protein, and a Cas11 protein derived from Acidobacterium capsulatum; preferably, the protein component further comprises a Cas3 protein and a Cas8 protein derived from Acidobacterium capsulatum.
  2. The system of claim 1, further comprising a nucleic acid component, wherein the nucleic acid component comprises a guide RNA, the guide RNA comprising a guide sequence that hybridizes to a target sequence; preferably, the guide RNA is capable of forming a complex with the protein component; preferably, the nucleic acid component comprises a first guide RNA that hybridizes to a first target sequence and a second guide RNA that hybridizes to a second target sequence, wherein both the first guide RNA and the second guide RNA are capable of forming a complex with the protein component; preferably, the first target sequence and the second target sequence are present in the same target gene.
  3. The system of claim 1 or 2, wherein the guide RNA comprises a direct repeat sequence and a guide sequence in the 5' to 3' direction; preferably, the direct repeat is derived from a bacterium, e.g., having the sequence shown in SEQ ID NO. 6; preferably, the guide RNA is a crRNA comprising a direct repeat, a guide sequence and a direct repeat in the 5' to 3' direction; preferably, the guide RNA has a length of 26-40 nt (e.g., 35 nt, 34 nt, 33 nt, 32 nt, 31 nt, 30 nt, 29 nt, 28 nt, 27 nt, 26 nt).
  4. The system of any one of claims 1-3, wherein the protein component and the nucleic acid component are associated with each other to form a complex or are present in separate form; preferably, the Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein bind to each other to form a CRISPR-associated complex (Cascade complex); preferably, the system has one or more features selected from the group consisting of: (1) the Cas8 protein has the amino acid sequence shown in SEQ ID NO. 4, or a sequence that has at least 85% sequence identity to SEQ ID NO. 4 and substantially retains the biological function of the sequence from which it is derived; (2) the Cas3 protein has the amino acid sequence shown in SEQ ID NO. 1, or a sequence that has at least 85% sequence identity to SEQ ID NO. 1 and substantially retains the biological function of the sequence from which it is derived; (3) the Cas5 protein has the amino acid sequence shown in SEQ ID NO. 2, or a sequence that has at least 85% sequence identity to SEQ ID NO. 2 and substantially retains the biological function of the sequence from which it is derived; (4) the Cas7 protein has the amino acid sequence shown in SEQ ID NO. 3, or a sequence that has at least 85% sequence identity to SEQ ID NO. 3 and substantially retains the biological function of the sequence from which it is derived; (5) the Cas11 protein has the amino acid sequence shown in SEQ ID NO. 5, or a sequence that has at least 85% sequence identity to SEQ ID NO. 5 and substantially retains the biological function of the sequence from which it is derived.
  5. The system of any one of claims 1-4, wherein when the target sequence is DNA, the target sequence is located 3' to a protospacer adjacent motif (PAM) and the PAM sequence is TTC.
  6. The system of any one of claims 1-5, wherein the target sequence is a DNA or RNA sequence from a prokaryotic or eukaryotic cell, or the target sequence is a non-naturally occurring DNA or RNA sequence; preferably, the target sequence is present in a cell, or the target sequence is present in an in vitro nucleic acid molecule (e.g., a plasmid); for example, the target sequence is present in the nucleus or cytoplasm; for example, the cell is a eukaryotic cell; for example, the cell is a prokaryotic cell.
  7. The system of any one of claims 1-6, wherein the Cas3 protein, Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein in the protein component are each independently linked or not linked to an additional protein or polypeptide; preferably, the additional protein or polypeptide is selected from the group consisting of an epitope tag, a reporter gene sequence, a nuclear localization signal (NLS) sequence, a transcriptional activation domain, a transcriptional repression domain, a nuclease domain, and any combination thereof; preferably, the Cas3 protein, Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein are linked to one or more NLS sequences, e.g., at the N-terminus or C-terminus of the protein.
  8. A nucleic acid molecule or combination comprising: (i) a nucleotide sequence encoding a protein component in the system of any one of claims 1-7; (ii) a nucleotide sequence for expressing a nucleic acid component in the system of any one of claims 1-7; and/or (iii) a nucleotide sequence comprising (i) and (ii); preferably, the nucleotide sequence set forth in any one of (i)-(iii) is codon optimized for expression in a prokaryotic cell or eukaryotic cell.
  9. The nucleic acid molecule or combination of claim 8, comprising: (1) a first nucleic acid sequence comprising a nucleotide sequence encoding the Cas3 protein; (2) a second nucleic acid sequence comprising a second nucleotide sequence encoding the Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein; (3) a third nucleic acid sequence comprising a nucleotide sequence for expressing the guide RNA; preferably, the first nucleic acid sequence is operably linked to a first regulatory element; preferably, the second nucleic acid sequence is operably linked to a second regulatory element; preferably, the third nucleic acid sequence is operably linked to a third regulatory element.
  10. The nucleic acid molecule or combination of claim 8 or 9, wherein the first regulatory element, the second regulatory element and the third regulatory element are promoters and are the same as or different from each other; preferably, the first nucleic acid sequence comprises the nucleotide sequence shown as SEQ ID NO. 38; preferably, the second nucleic acid sequence comprises the nucleotide sequence shown as SEQ ID NO. 39.
  11. A vector comprising the nucleic acid molecule or combination of any one of claims 8-10.
  12. A host cell comprising the nucleic acid molecule or combination of any one of claims 8-10 or the vector of claim 8.
  13. A delivery composition comprising a delivery vehicle and one or more selected from the group consisting of the system of any one of claims 1-7, the nucleic acid molecule or combination of any one of claims 8-10, the vector of claim 11, and the host cell of claim 12; for example, the delivery vehicle is a particle; preferably, the delivery vehicle is selected from phage, liposomes, metal particles, protein particles, exosomes, gene guns, or viral vectors (e.g., replication-defective retroviruses, lentiviruses, adenoviruses, or adeno-associated viruses).
  14. A method of modifying a target gene, comprising contacting the system of any one of claims 1-7 with the target gene or delivering the system to a cell comprising the target gene, wherein a target sequence is present in the target gene; for example, the target gene is present in a cell, or the target gene is present in a nucleic acid molecule (e.g., a plasmid) in vitro; for example, the cell is a prokaryotic cell; for example, the cell is a eukaryotic cell; for example, the cell is selected from animal cells (e.g., mammalian cells, such as human cells) and plant cells; for example, the modification refers to a break in the target sequence, such as a double-strand break in DNA or a single-strand break in RNA; for example, the modification further comprises inserting an exogenous nucleic acid into the break.
  15. A cell or progeny thereof obtained by the method of claim 14, wherein the cell comprises a modification not present in its wild type; preferably, the cell is an animal cell (e.g., a mammalian cell, such as a human cell).
  16. Use of the system of any one of claims 1-7, the nucleic acid molecule or combination of any one of claims 8-10, the vector of claim 11, the host cell of claim 12, or the delivery composition of claim 13 in the preparation of a formulation for (i) nucleic acid editing (e.g., in vitro or ex vivo nucleic acid editing), (ii) in vitro or ex vivo DNA detection, and/or (iii) editing a target sequence in a target locus to modify an organism or a non-human organism; for example, the nucleic acid editing includes gene or genome editing; for example, the gene or genome editing includes modifying a gene, knocking out a gene, altering expression of a gene product, repairing a mutation, and/or inserting a polynucleotide.
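
Illustrative sketch (not part of the claims): claim 5 states that a DNA target sequence lies 3' of a TTC PAM. The Python below shows one way candidate target sites could be enumerated on a given strand under that rule. The function names, the default protospacer length of 34 nt, and the convention of reading the protospacer on the same strand as the PAM are assumptions made for illustration only; the publication text reproduced here does not specify them.

```python
from typing import List, Tuple

PAM = "TTC"  # PAM sequence stated in claim 5

# Translation table used to complement DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_targets(dna: str, spacer_len: int = 34) -> List[Tuple[int, str]]:
    """List (pam_start, protospacer) pairs for every TTC on the given strand.

    The protospacer is the `spacer_len` bases immediately 3' of the PAM.
    PAMs too close to the end of the sequence to be followed by a
    full-length protospacer are skipped. `spacer_len` is a hypothetical
    default, not a value taken from the publication.
    """
    dna = dna.upper()
    hits: List[Tuple[int, str]] = []
    start = dna.find(PAM)
    while start != -1:
        proto_start = start + len(PAM)
        protospacer = dna[proto_start:proto_start + spacer_len]
        if len(protospacer) == spacer_len:
            hits.append((start, protospacer))
        # Continue from the next base so overlapping PAMs are also found.
        start = dna.find(PAM, start + 1)
    return hits


if __name__ == "__main__":
    demo = "AATTCACGTACGTAA"
    print(find_targets(demo, spacer_len=5))
    # Scan the opposite strand by passing the reverse complement.
    print(find_targets(reverse_complement(demo), spacer_len=5))
```

To scan both strands, the function can be called once on the sequence and once on its reverse complement; positions returned for the second call are relative to the reverse-complemented string.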

Description

Novel AcaI-C system and application thereof

Technical Field

The present application relates to the field of nucleic acid editing, and in particular to a novel CRISPR/Cas system. The application also relates to nucleic acid molecules comprising nucleotide sequences encoding the effector proteins and/or crRNAs of the system, and to the use of the novel CRISPR/Cas system for modifying a target gene.

Background

CRISPR-Cas systems are adaptive immune mechanisms that are widespread in bacteria and archaea and defend effectively against invading viruses and plasmids. According to the latest classification, CRISPR-Cas systems are divided into two classes, Class I and Class II. Class I systems use multi-subunit effector complexes, composed of multiple Cas proteins and a CRISPR RNA (crRNA), to coordinate target recognition and cleavage, whereas Class II systems rely on a single Cas effector protein to carry out the entire interference process independently. Although Class I systems account for about 90% of CRISPR-Cas loci in nature and are assumed to be the evolutionary progenitors of Class II systems, their application as genome editing tools remains relatively limited.

Class I CRISPR-Cas systems are generally divided into three types, I, III and IV, of which type I is the most abundant and diverse. The type I system is characterized by a Cascade complex and a Cas3 helicase-nuclease module, which, guided by the crRNA, recognize and degrade target DNA to cleave large fragments. Compared with the type III system, which targets RNA, and the type IV system, which has a simplified structure but an unclear function, the type I system offers distinct advantages in DNA targeting efficiency and potential for engineering applications. In practical applications of CRISPR systems, a more compact operon structure facilitates transfer and expression between different organisms.
Within the type I family, the type I-C system is relatively compact, and its Cas5 protein can take the place of the conventional Cas6 protein in processing crRNA precursors, thereby producing mature crRNA. This structural feature makes the type I-C system a more promising gene editing tool. Exploring additional, more versatile I-C systems would therefore help expand the CRISPR gene editing toolbox.

Disclosure of Invention

The inventors of the present application identified, for the first time, a series of novel Cas proteins and novel CRISPR/Cas systems comprising them in Acidobacterium capsulatum, and named the system the Aca I-C system. Further, the inventors have shown experimentally that the Cas5 protein, Cas7 protein and Cas11 protein in the system can assemble into a Cascade complex and play an important role in nucleic acid editing. The inventors thus completed the present application.

CRISPR/Cas system

Accordingly, in a first aspect, the present invention provides a type I CRISPR/Cas system comprising a protein component, wherein the protein component comprises a Cas5 protein, a Cas7 protein and a Cas11 protein derived from Acidobacterium capsulatum. In certain embodiments, the protein component further comprises a Cas3 protein and a Cas8 protein derived from Acidobacterium capsulatum.

In certain embodiments, the system is a type I-A CRISPR-Cas system, a type I-B CRISPR-Cas system, a type I-C CRISPR-Cas system, a type I-D CRISPR-Cas system, a type I-E CRISPR-Cas system, or a type I-F CRISPR-Cas system. In certain preferred embodiments, the system is a type I-C CRISPR-Cas system. In certain embodiments, the protein component comprises a Cas3 protein and a Cascade complex. In some embodiments, the Cascade complex comprises: (i) Cas6 protein, Cas8 protein, Cas7 protein, and Cas5 protein; (ii) Cas5 protein, Cas8 protein and Cas7 protein; or (iii) Cas5 protein, Cas8 protein, Cas7 protein, and Cas11 protein.
In some embodiments, the Cascade complex in the type I-C CRISPR-Cas system comprises Cas5 protein, Cas8 protein, and Cas7 protein. In certain embodiments, the protein component comprises a Cas3 protein, a Cas5 protein, a Cas7 protein, a Cas8 protein, and a Cas11 protein.

Genomic information for Acidobacterium capsulatum has been disclosed in public databases, for example at NCBI. The inventors of the present application, however, were the first to identify a type I-C CRISPR-Cas system, and the Cas5, Cas7 and Cas11 proteins it contains, in this genome. Based on the present disclosure and the prior art, one of skill in the art is able to confirm the sequences of the Cas5 protein, Cas7 protein, and Cas11 protein. Similarly, it is within the ability of one skilled in the art to confirm the sequences of the Cas3 protein, Cas5 protein, Cas7 protein, Cas8 protein, and Cas11 protein.

In certain embodiments, the system further comprises a nucleic acid component, wherein the nucleic acid component comprises a guide RNA. The guide RNA comprises a guide sequence that hyb