EP-4741497-A1 - NOVEL CRISPR ENZYME, AND SYSTEM AND APPLICATION
Abstract
The present disclosure provides a clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) enzyme and belongs to the field of nucleic acid editing, in particular the technical field of CRISPR. The present disclosure has promising application prospects in gene editing.
Inventors
- DUAN, Zhiqiang
- LI, Shanshan
- LIU, Ruiheng
Assignees
- Shandong Shunfeng Biotechnology Co., Ltd.
Dates
- Publication Date: 20260513
- Application Date: 20240924
Claims (10)
- A clustered regularly interspaced short palindromic repeat (CRISPR)-associated (Cas) protein, wherein the Cas protein is any one of the following I to IV: I, a Cas protein that has an amino acid sequence possessing a sequence identity of at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% with SEQ ID No. 1 and basically retains a biological function of SEQ ID No. 1; II, a Cas protein that has an amino acid sequence comprising substitution, deletion, or addition of one or more amino acids (comprising 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids) relative to SEQ ID No. 1 and basically retains the biological function of SEQ ID No. 1; III, a Cas protein comprising the amino acid sequence shown in SEQ ID No. 1; and IV, a Cas protein that has an amino acid sequence comprising a mutation of any one or more of the 36th, 39th, 65th, 69th, 73rd, 75th, 119th, 122nd, 132nd, 154th, 155th, 156th, 157th, 171st, 186th, 191st, 195th, 208th, 264th, 278th, 281st, 296th, 304th, 342nd, and 344th amino acids relative to the amino acid sequence shown in SEQ ID No. 1.
- A fusion protein, comprising the Cas protein according to claim 1 and another modification moiety.
- An isolated polynucleotide, wherein the polynucleotide is a polynucleotide sequence encoding the Cas protein according to claim 1 or a polynucleotide sequence encoding a fusion protein according to claim 2.
- A vector, comprising the polynucleotide according to claim 3 and a regulatory element operably linked thereto.
- A guide RNA (gRNA), comprising a targeting sequence and a non-targeting sequence, wherein the non-targeting sequence comprises a trans-activating CRISPR RNA (tracrRNA) and a pairing region sequence in a CRISPR RNA (crRNA) that targets the tracrRNA; and preferably, the non-targeting sequence of the gRNA is shown in SEQ ID No. 5; or the non-targeting sequence of the gRNA comprises a base mutation relative to SEQ ID No. 5, wherein the base mutation is selected from any one or more (comprising any 1, 2, 3, 4, or 5) of the following (1) to (20): (1) deletion of 1st to 12th bases in SEQ ID No. 5; (2) deletion of 1st to 26th bases in SEQ ID No. 5; (3) deletion of 13th to 26th and 158th to 172nd bases in SEQ ID No. 5; (4) substitution of A with C at position 29 and substitution of U with G at position 155 in SEQ ID No. 5; (5) substitution of U with C at position 31 and substitution of A with G at position 154 in SEQ ID No. 5; (6) addition of U between U at position 155 and G at position 156 in SEQ ID No. 5; (7) substitution of A with C at position 29, substitution of U with C at position 31, substitution of A with G at position 154, and substitution of U with G at position 155 in SEQ ID No. 5; (8) substitution of A with C at position 74 and substitution of U with G at position 88 in SEQ ID No. 5; (9) substitution of U with C at position 100 and substitution of A with G at position 119 in SEQ ID No. 5; (10) substitution of U with C at position 105 and substitution of A with G at position 114 in SEQ ID No. 5; (11) substitution of U with C at position 100, substitution of U with C at position 105, substitution of A with G at position 114, and substitution of A with G at position 119 in SEQ ID No. 5; (12) substitution of U with G at position 124 and substitution of A with C at position 143 in SEQ ID No. 5; (13) substitution of A with G at position 126 and substitution of U with C at position 141 in SEQ ID No. 5; (14) addition of U between G at position 127 and G at position 128 in SEQ ID No. 5; (15) substitution of U with G at position 124, substitution of A with G at position 126, substitution of U with C at position 141, and substitution of A with C at position 143 in SEQ ID No. 5; (16) deletion of 198th to 201st bases in SEQ ID No. 5; (17) deletion of 192nd to 200th bases in SEQ ID No. 5; (18) deletion of 205th to 209th bases in SEQ ID No. 5; (19) deletion of 205th to 217th bases in SEQ ID No. 5; and (20) deletion of 205th to 222nd bases in SEQ ID No. 5.
- A CRISPR-Cas system, comprising the Cas protein according to claim 1 and at least one gRNA capable of binding to the Cas protein, wherein the gRNA comprises a region that binds to the Cas protein and a targeting sequence that targets a nucleic acid, or the gRNA is the gRNA according to claim 5.
- A composition, comprising: (i) a protein component selected from the Cas protein according to claim 1 or a fusion protein according to claim 2; and (ii) a nucleic acid component selected from a gRNA or a nucleic acid encoding the gRNA or a precursor RNA of the gRNA or a nucleic acid encoding the precursor RNA of the gRNA, wherein the gRNA comprises a region that binds to the Cas protein according to claim 1 and a targeting sequence that targets a nucleic acid, or the gRNA is a gRNA according to claim 5, wherein the protein component is capable of binding to the nucleic acid component to produce a complex.
- An engineered host cell, comprising the Cas protein according to claim 1 or a fusion protein according to claim 2 or a polynucleotide according to claim 3 or a vector according to claim 4 or a gRNA according to claim 5 or a CRISPR-Cas system according to claim 6 or a composition according to claim 7.
- A use of the Cas protein according to claim 1 or a fusion protein according to claim 2 or a polynucleotide according to claim 3 or a vector according to claim 4 or a gRNA according to claim 5 or a CRISPR-Cas system according to claim 6 or a composition according to claim 7 or a host cell according to claim 8 in gene editing, gene targeting, gene cleavage, cleavage of a double-stranded DNA, a single-stranded DNA, or a single-stranded RNA, non-specific cleavage and/or degradation of a branched nucleic acid, non-specific cleavage of a single-stranded nucleic acid, nucleic acid detection, specific editing of a double-stranded nucleic acid, base editing of the double-stranded nucleic acid, or base editing of the single-stranded nucleic acid; or in preparation of a formulation or kit, wherein the formulation or kit is provided for the gene editing, the gene targeting, the gene cleavage, the cleavage of the double-stranded DNA, the single-stranded DNA, or the single-stranded RNA, the non-specific cleavage and/or degradation of the branched nucleic acid, the non-specific cleavage of the single-stranded nucleic acid, the nucleic acid detection, the specific editing of the double-stranded nucleic acid, the base editing of the double-stranded nucleic acid, or the base editing of the single-stranded nucleic acid.
- A method for editing a target nucleic acid, targeting the target nucleic acid, or cleaving the target nucleic acid, comprising: allowing the target nucleic acid to be in contact with the Cas protein according to claim 1 or a fusion protein according to claim 2 or a polynucleotide according to claim 3 or a vector according to claim 4 or a gRNA according to claim 5 or a CRISPR-Cas system according to claim 6 or a composition according to claim 7 or a host cell according to claim 8.
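The base mutations recited in claim 5 are given as 1-based positions in SEQ ID No. 5 (deletions of a range, single-base substitutions, and single-base additions between two positions). The following is a minimal sketch of how such position-based edits could be applied to a reference RNA string. SEQ ID No. 5 is not reproduced in this record, so `TOY_REF` is a hypothetical 20-nt placeholder, and the `Edit` class, the `apply_edits` function, and the example positions are illustrative names and values that do not come from the application. The sketch also assumes that, when several mutations are combined, every position refers to the unmodified reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    kind: str      # "del", "sub", or "ins"
    start: int     # 1-based position in the unmodified reference
    end: int = 0   # 1-based inclusive end (deletions only)
    old: str = ""  # base expected in the reference (substitutions only)
    new: str = ""  # replacement base or inserted base(s)


def apply_edits(ref: str, edits: list[Edit]) -> str:
    """Apply edits whose coordinates all refer to the original reference."""
    seq = list(ref)
    # Work from the 3' end towards the 5' end so that earlier
    # coordinates are not shifted by deletions or insertions.
    for e in sorted(edits, key=lambda e: e.start, reverse=True):
        if e.kind == "del":
            del seq[e.start - 1:e.end]
        elif e.kind == "sub":
            found = seq[e.start - 1]
            if found != e.old:
                raise ValueError(
                    f"position {e.start}: expected {e.old}, found {found}"
                )
            seq[e.start - 1] = e.new
        elif e.kind == "ins":
            # Insert between position `start` and position `start + 1`.
            seq[e.start:e.start] = list(e.new)
        else:
            raise ValueError(f"unknown edit kind: {e.kind}")
    return "".join(seq)


# Hypothetical placeholder; NOT SEQ ID No. 5.
TOY_REF = "GGAUCCAUGCUAGCAUUGCA"

# Hypothetical edits in the same style as the claimed mutations.
TOY_EDITS = [
    Edit("del", 1, end=4),             # deletion of 1st to 4th bases
    Edit("sub", 7, old="A", new="C"),  # substitution of A with C at position 7
    Edit("sub", 16, old="U", new="G"), # substitution of U with G at position 16
    Edit("ins", 17, new="U"),          # addition of U between positions 17 and 18
]

if __name__ == "__main__":
    print(apply_edits(TOY_REF, TOY_EDITS))
```

Checking the expected reference base before each substitution catches off-by-one errors early, which matters when positions are copied by hand from a claim.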
Description
The present application claims priority to Chinese Patent Application CN202311235588.4, filed on September 25, 2023, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present disclosure relates to the field of gene editing, and in particular to the technical field of clustered regularly interspaced short palindromic repeats (CRISPR). Specifically, the present disclosure identifies a novel CRISPR-associated (Cas) enzyme and develops a corresponding gene editing tool based on the novel Cas enzyme, and a use thereof.
BACKGROUND TECHNOLOGY
CRISPR/Cas technology is a widely used gene editing technology. In CRISPR/Cas technology, an RNA guide is used to bind specifically to a target sequence in a genome and cleave the DNA to produce double-strand breaks, and site-directed gene editing is then achieved through the cell's non-homologous end joining or homologous recombination. The CRISPR/Cas9 system is the most common type II CRISPR system; it recognizes a protospacer adjacent motif (PAM) of 3'-NGG and cleaves the target sequence to leave blunt ends. The CRISPR/Cas Type V system is a more recently discovered CRISPR system; its members, such as Cpf1, C2c1, CasX, and CasY, recognize a motif of 5'-TTN and cleave the target sequence to leave sticky ends.
However, the various existing CRISPR/Cas systems have distinct advantages and disadvantages. For example, Cas9, C2c1, and CasX all require two RNAs for RNA guidance, while Cpf1 requires only one guide RNA (gRNA) and can be used for multiplex gene editing. CasX has a size of 980 amino acids, while common Cas9, C2c1, CasY, and Cpf1 typically have a size of about 1,300 amino acids. In addition, the PAM sequences of Cas9, Cpf1, CasX, and CasY are relatively complex and diverse, whereas C2c1 recognizes only the strict 5'-TTN. A target site of C2c1 is therefore more easily predicted than target sites of other systems, which reduces the potential off-target effect of C2c1.
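The PAM constraints described in the background (a 3'-NGG motif for Cas9 and a 5'-TTN motif for Type V systems) can be illustrated with a short scan for candidate target sites. This is only a sketch under stated assumptions: the function name `find_sites`, the `pam_side` labels, and the toy DNA string are hypothetical, only the given strand is scanned, and the protospacer length is a parameter (20 nt is a common choice for Cas9 and is used here as the default, not as a value taken from the application).

```python
import re


def find_sites(dna: str, pam: str, pam_side: str, spacer_len: int = 20):
    """Return (protospacer_start, protospacer, pam) tuples on the given strand.

    pam_side is "3prime" when the PAM follows the protospacer (e.g. NGG)
    and "5prime" when the PAM precedes it (e.g. TTN). Starts are 0-based.
    """
    dna = dna.upper()
    pam_re = pam.upper().replace("N", "[ACGT]")
    spacer_re = rf"[ACGT]{{{spacer_len}}}"
    # A lookahead lets overlapping candidate sites all be reported.
    if pam_side == "3prime":
        pattern = rf"(?=({spacer_re})({pam_re}))"
        spacer_group, pam_group = 1, 2
    elif pam_side == "5prime":
        pattern = rf"(?=({pam_re})({spacer_re}))"
        spacer_group, pam_group = 2, 1
    else:
        raise ValueError("pam_side must be '3prime' or '5prime'")
    return [
        (m.start(spacer_group), m.group(spacer_group), m.group(pam_group))
        for m in re.finditer(pattern, dna)
    ]


if __name__ == "__main__":
    toy = "TTAACGTCAGG"  # hypothetical 11-nt sequence for illustration
    print(find_sites(toy, "NGG", "3prime", spacer_len=5))
    print(find_sites(toy, "TTN", "5prime", spacer_len=5))
```

A stricter PAM yields fewer candidate sites per kilobase, which is the sense in which the background describes target sites as being easier to predict.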
In summary, because the currently available CRISPR/Cas systems are all constrained by some shortcomings, developing a robust novel CRISPR/Cas system with multiple favorable properties is of great significance for biotechnology.
CONTENT OF THE INVENTION
Through extensive experimentation and repeated exploration, the inventors of the present application have unexpectedly discovered a novel endonuclease (Cas enzyme). On the basis of this discovery, the inventors have developed a novel CRISPR/Cas system, together with a gene editing method and a nucleic acid detection method based on the CRISPR/Cas system.
Cas effector protein
In an aspect, the present disclosure provides a Cas protein, which is an effector protein in a CRISPR/Cas system. In the present disclosure, the Cas protein is referred to as Cas-sf6728. An amino acid sequence of the Cas-sf6728 protein is shown in SEQ ID No. 1.
In an embodiment, the Cas protein has an amino acid sequence possessing a sequence identity of at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% with an amino acid sequence shown in SEQ ID No. 1 and basically retains a biological function of SEQ ID No. 1. Preferably, the Cas protein is derived from the same species as that of the Cas-sf6728.
In an embodiment, the Cas protein has an amino acid sequence including substitution, deletion, or addition of one or more amino acids relative to the amino acid sequence shown in SEQ ID No. 1 and basically retains the biological function of SEQ ID No. 1. The substitution, deletion, or addition of one or more amino acids includes substitution, deletion, or addition of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
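The sequence-identity thresholds above can be made concrete with a small calculation. This record does not state how identity is computed, so the sketch below assumes one common convention: the two sequences are already aligned (gaps written as `-`), and identity is the number of identical residue columns divided by the number of alignment columns. The function names, the `TIERS` list as a data structure, and the toy peptides are hypothetical; the peptides are not SEQ ID No. 1.

```python
# Thresholds recited in the description, in percent.
TIERS = [80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
         99.1, 99.2, 99.3, 99.4, 99.5, 99.6, 99.7, 99.8, 99.9]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: denominator = alignment columns, ignoring columns where
    both sequences have a gap. Other conventions give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_tier_met(identity: float):
    """Return the highest recited threshold that `identity` satisfies, or None."""
    met = [t for t in TIERS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    ref = "MKTAYIAKQR"      # hypothetical toy peptide
    variant = "MKTAYLAKQR"  # one substitution relative to ref
    pid = percent_identity(ref, variant)
    print(pid, highest_tier_met(pid))
```

For full-length proteins the alignment itself would normally come from a dedicated alignment tool, and the choice of denominator should be stated alongside any reported identity value.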
In an embodiment, the Cas protein is a derivatized protein with the same biological function as a protein possessing the amino acid sequence shown in SEQ ID No. 1.
In an embodiment, the Cas protein has the amino acid sequence shown in SEQ ID No. 1.
In an embodiment, the Cas protein includes a mutation of any one or more (for example, any 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) of 36th, 39th, 65th, 69th, 73rd, 75th, 119th, 122nd, 132nd, 154th, 155th, 156th, 157th, 171st, 186th, 191st, 195th, 208th, 264th, 278th, 281st, 296th, 304th, 342nd, and 344th amino acids relative to the amino acid sequence shown in SEQ ID No. 1.
In an embodiment, the Cas protein includes a mutation of 65th and 75th amino acids relative to the amino acid sequence shown in SEQ ID No. 1.
In an embodiment, an amino acid sequence of the Cas protein has a sequence identity of at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at