EP-3858990-B1 - ENGINEERED CRISPR-CAS9 NUCLEASES WITH ALTERED PAM SPECIFICITY

Inventors

JOUNG, J. KEITH
KLEINSTIVER, Benjamin

Dates

Publication Date: 20260506
Application Date: 20160303

Claims (15)

An isolated nucleic acid encoding a protein, wherein the protein comprises a variant Staphylococcus aureus Cas9 (SaCas9) protein comprising an amino acid sequence that has at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, wherein the variant SaCas9 protein has mutations at one or more of the following positions: E782, N968 and/or R1015, wherein the amino acid corresponding to E782 is mutated to K or G, the amino acid corresponding to N968 is mutated to K, and/or the amino acid corresponding to R1015 is mutated to Q, H, E, L, or M, and wherein the variant SaCas9 binds to a guide RNA and a target DNA.
The isolated nucleic acid of claim 1, wherein the variant Staphylococcus aureus Cas9 (SaCas9) protein comprises an amino acid sequence that has at least 85%, at least 90% or at least 95% sequence identity to the amino acid sequence of SEQ ID NO: 2, and/or wherein the amino acid corresponding to E782 is mutated to K, the amino acid corresponding to N968 is mutated to K, and/or the amino acid corresponding to R1015 is mutated to Q or H in the variant SaCas9 protein, or, wherein the amino acid corresponding to E782 is mutated to K, the amino acid corresponding to N968 is mutated to K, and the amino acid corresponding to R1015 is mutated to Q or H in the variant SaCas9 protein, and optionally wherein the amino acid corresponding to E735 of SEQ ID NO: 2 is mutated to K, the amino acid corresponding to K929 of SEQ ID NO: 2 is mutated to R, the amino acid corresponding to A1021 of SEQ ID NO: 2 is mutated to T, or the amino acid corresponding to K1044 of SEQ ID NO: 2 is mutated to N in the variant SaCas9 protein, and/or, wherein the amino acid corresponding to D10 of SEQ ID NO: 2, the amino acid corresponding to D556 of SEQ ID NO: 2, the amino acid corresponding to H557 of SEQ ID NO: 2, and/or the amino acid corresponding to N580 of SEQ ID NO: 2 is mutated in the variant SaCas9 protein, preferably, wherein the amino acid corresponding to D10 is mutated to A, the amino acid corresponding to D556 is mutated to A, the amino acid corresponding to H557 is mutated to A, and/or the amino acid corresponding to N580 is mutated to A in the variant SaCas9 protein, or, wherein (i) the amino acid corresponding to D10 is mutated to A and the amino acid corresponding to H557 is mutated to A, or (ii) the amino acid corresponding to D10 is mutated to A, the amino acid corresponding to D556 is mutated to A, the amino acid corresponding to H557 is mutated to A, and the amino acid corresponding to N580 is mutated to A in the variant SaCas9 protein.
The isolated nucleic acid of claim 1 or claim 2, wherein the protein encoded by the isolated nucleic acid further comprises a heterologous functional domain fused to the variant SaCas9 protein, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein, preferably, wherein the heterologous functional domain is a transcriptional activation domain, preferably, wherein the transcriptional activation domain is from VP64 or NF-κB p65; or, wherein the heterologous functional domain is a transcriptional silencer or transcriptional repression domain, preferably, wherein the transcriptional repression domain is a Krueppel-associated box (KRAB) domain, ERF repressor domain (ERD), or mSin3A interaction domain (SID) or, wherein the transcriptional silencer is Heterochromatin Protein 1 (HP1); or, wherein the heterologous functional domain is an enzyme that modifies the methylation state of DNA, preferably, wherein the enzyme that modifies the methylation state of DNA is a DNA methyltransferase (DNMT) or a Ten-Eleven-Translocation (TET) protein, preferably, wherein the TET protein is TET1; or, wherein the heterologous functional domain is an enzyme that modifies a histone subunit, preferably, wherein the enzyme that modifies a histone subunit is a histone acetyltransferase (HAT), a histone deacetylase (HDAC), a histone methyltransferase (HMT), or a histone demethylase; or, wherein the heterologous functional domain is a biological tether, preferably, wherein the biological tether is MS2, Csy4, or lambda N protein; or, wherein the heterologous functional domain is FokI.
A vector comprising the isolated nucleic acid of any one of claims 1-3.
The vector of claim 4, wherein the isolated nucleic acid is operably linked to one or more regulatory domains for expressing the SaCas9 protein.
An isolated host cell, preferably a mammalian host cell, comprising the nucleic acid of any one of claims 1-3.
An isolated nucleic acid of any one of claims 1-3; and a nucleic acid comprising or encoding a guide RNA that directs the SaCas9 protein to a target genomic sequence.
An isolated variant Staphylococcus aureus Cas9 (SaCas9) protein comprising an amino acid sequence that has at least 80% sequence identity to the amino acid sequence of SEQ ID NO: 2, wherein the variant SaCas9 protein has mutations at one or more of the following positions: E782, N968 and/or R1015, wherein the amino acid corresponding to E782 is mutated to K or G, the amino acid corresponding to N968 is mutated to K, and/or the amino acid corresponding to R1015 is mutated to Q, H, E, L, or M, and wherein the variant SaCas9 binds to a guide RNA and a target DNA.
The Cas9 protein of claim 8, wherein the Cas9 protein comprises an amino acid sequence that has at least 85%, at least 90% or at least 95% sequence identity to the amino acid sequence of SEQ ID NO:2, and/or, wherein the SaCas9 further comprises mutations at one or more of the following positions: E735, K929, A1021 or K1044, and/or wherein the Cas9 protein comprises one or more of the following mutations: R1015Q, R1015H, E782K, N968K, E735K, K929R, A1021T, K1044N, E782K/N968K/R1015H (KKH variant); E782K/K929R/R1015H (KRH variant); or E782K/K929R/N968K/R1015H (KRKH variant), and/or, further comprising one or more mutations that decrease nuclease activity selected from the group consisting of mutations at D10, D556, H557, and/or N580, or, wherein the mutations are D10A, D556A, H557A.
A fusion protein comprising the Cas9 protein of claim 8 or claim 9, fused to a heterologous functional domain, with an optional intervening linker, wherein the linker does not interfere with the activity of the fusion protein, preferably, wherein the heterologous functional domain is a transcriptional activation domain, preferably, wherein the transcriptional activation domain is from VP64 or NF-κB p65; or, wherein the heterologous functional domain is a transcriptional silencer or transcriptional repression domain, preferably, wherein the transcriptional repression domain is a Krueppel-associated box (KRAB) domain, ERF repressor domain (ERD), or mSin3A interaction domain (SID) or, wherein the transcriptional silencer is Heterochromatin Protein 1 (HP1); or, wherein the heterologous functional domain is an enzyme that modifies the methylation state of DNA, preferably, wherein the enzyme that modifies the methylation state of DNA is a DNA methyltransferase (DNMT) or a Ten-Eleven-Translocation (TET) protein, preferably, wherein the TET protein is TET1; or, wherein the heterologous functional domain is an enzyme that modifies a histone subunit, preferably, wherein the enzyme that modifies a histone subunit is a histone acetyltransferase (HAT), a histone deacetylase (HDAC), a histone methyltransferase (HMT), or a histone demethylase; or, wherein the heterologous functional domain is a biological tether, preferably, wherein the biological tether is MS2, Csy4, or lambda N protein; or, wherein the heterologous functional domain is FokI.
An ex vivo or in vitro method of altering the genome of a cell, the method comprising expressing in the cell, or contacting the cell with, the isolated protein or fusion protein of claims 8-10, and a guide RNA having a region complementary to a selected portion of the genome of the cell.
The isolated protein or fusion protein of claims 8-10, for use in a method of altering the genome of a cell, comprising expressing in the cell, or contacting the cell with, said isolated protein or fusion protein, and a guide RNA having a region complementary to a selected portion of the genome of the cell.
The method of claim 11 or the isolated protein or fusion protein for the use of claim 12, wherein the isolated protein or fusion protein comprises one or more of a nuclear localization sequence, a cell penetrating peptide sequence, and/or an affinity tag and/or, wherein the cell is a stem cell, preferably, wherein the stem cell is an embryonic stem cell, a mesenchymal stem cell, or an induced pluripotent stem cell.
An ex vivo or in vitro method of altering a double stranded DNA (dsDNA) molecule, the method comprising contacting the dsDNA molecule with the isolated protein or fusion protein of claims 8-10, and a guide RNA having a region complementary to a selected portion of the dsDNA molecule.
The isolated protein or fusion protein of claims 8-10, for use in a method of altering a double stranded DNA (dsDNA) molecule, comprising contacting the dsDNA molecule with said isolated protein or fusion protein and a guide RNA having a region complementary to a selected portion of the dsDNA molecule.
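
The substitution notation used in the claims (for example E782K/N968K/R1015H for the KKH variant) can be read mechanically. The following is a minimal illustrative sketch, not part of the claims and not a legal analysis: it only matches mutation strings against the substitutions recited in claims 1 and 8 and does not assess sequence identity or guide RNA and DNA binding. The names CLAIMED_PAM_SUBSTITUTIONS, parse_variant, and recited_substitutions are hypothetical and chosen for illustration.

```python
import re

# Substitutions recited in claims 1 and 8:
# position -> (wild-type residue, allowed replacement residues).
CLAIMED_PAM_SUBSTITUTIONS = {
    782: ("E", {"K", "G"}),
    968: ("N", {"K"}),
    1015: ("R", {"Q", "H", "E", "L", "M"}),
}

# One point mutation: wild-type letter, position, replacement letter.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(name):
    """Split 'E782K/N968K/R1015H' into (wild_type, position, replacement) tuples."""
    mutations = []
    for token in name.split("/"):
        match = MUTATION_PATTERN.match(token.strip())
        if match is None:
            raise ValueError(f"not a point mutation: {token!r}")
        wild_type, position, replacement = match.groups()
        mutations.append((wild_type, int(position), replacement))
    return mutations


def recited_substitutions(name):
    """Return the mutations in `name` that match a substitution in the table above."""
    hits = []
    for wild_type, position, replacement in parse_variant(name):
        expected = CLAIMED_PAM_SUBSTITUTIONS.get(position)
        if expected and expected[0] == wild_type and replacement in expected[1]:
            hits.append(f"{wild_type}{position}{replacement}")
    return hits


if __name__ == "__main__":
    for variant in ("E782K/N968K/R1015H", "E782K/K929R/R1015H", "D10A/H557A"):
        print(variant, "->", recited_substitutions(variant))
```

Under these assumptions, the KKH string matches at all three positions, the KRH string matches at E782 and R1015 only (K929R is recited separately in claim 9), and the nuclease-inactivating pair D10A/H557A matches none of them.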

Description

CLAIM OF PRIORITY

This application claims the benefit of U.S. Provisional Patent Application Serial Nos. 62/127,634, filed on March 3, 2015; 62/165,517, filed on May 22, 2015; 62/239,737, filed on October 9, 2015; and 62/258,402, filed on November 20, 2015.

TECHNICAL FIELD

The invention relates, at least in part, to engineered Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs)/CRISPR-associated protein 9 (Cas9) nucleases with altered and improved Protospacer Adjacent Motif (PAM) specificities and their use in genomic engineering, epigenomic engineering, and genome targeting.

BACKGROUND

CRISPR-Cas9 nucleases enable efficient, customizable genome editing in a wide variety of organisms and cell types (Sander & Joung, Nat Biotechnol 32, 347-355 (2014); Hsu et al., Cell 157, 1262-1278 (2014); Doudna & Charpentier, Science 346, 1258096 (2014); Barrangou & May, Expert Opin Biol Ther 15, 311-314 (2015)). Target site recognition by Cas9 is directed by two short RNAs known as the crRNA and tracrRNA (Deltcheva et al., Nature 471, 602-607 (2011); Jinek et al., Science 337, 816-821 (2012)), which can be fused into a chimeric single guide RNA (sgRNA) (Jinek et al., Science 337, 816-821 (2012); Jinek et al., Elife 2, e00471 (2013); Mali et al., Science 339, 823-826 (2013); Cong et al., Science 339, 819-823 (2013)). The 5' end of the sgRNA (derived from the crRNA) can base pair with the target DNA site, thereby permitting straightforward re-programming of site-specific cleavage by the Cas9/sgRNA complex (Jinek et al., Science 337, 816-821 (2012)).
However, Cas9 must also recognize a specific protospacer adjacent motif (PAM) that lies proximal to the DNA that base pairs with the sgRNA (Mojica et al., Microbiology 155, 733-740 (2009); Shah et al., RNA Biol 10, 891-899 (2013); Jinek et al., Science 337, 816-821 (2012); Sapranauskas et al., Nucleic Acids Res 39, 9275-9282 (2011); Horvath et al., J Bacteriol 190, 1401-1412 (2008)). PAM recognition is required to initiate sequence-specific recognition (Sternberg et al., Nature 507, 62-67 (2014)), but it can also constrain the targeting range of these nucleases for genome editing. The broadly used Streptococcus pyogenes Cas9 (SpCas9) recognizes a short NGG PAM (Jinek et al., Science 337, 816-821 (2012); Jiang et al., Nat Biotechnol 31, 233-239 (2013)), which occurs once in every 8 bp of random DNA sequence. By contrast, other Cas9 orthologues characterized to date can recognize longer PAMs (Horvath et al., J Bacteriol 190, 1401-1412 (2008); Fonfara et al., Nucleic Acids Res 42, 2577-2590 (2014); Esvelt et al., Nat Methods 10, 1116-1121 (2013); Ran et al., Nature 520, 186-191 (2015); Zhang et al., Mol Cell 50, 488-503 (2013)). For example, Staphylococcus aureus Cas9 (SaCas9), one of several smaller Cas9 orthologues that are better suited for viral delivery (Horvath et al., J Bacteriol 190, 1401-1412 (2008); Ran et al., Nature 520, 186-191 (2015); Zhang et al., Mol Cell 50, 488-503 (2013)), recognizes a longer NNGRRT (SEQ ID NO:46) PAM that is expected to occur once in every 32 bp of random DNA. Broadening the targeting range of Cas9 orthologues is important for various applications, including modifying small genetic elements (e.g., transcription factor binding sites (Canver et al., Nature 527(7577):192-7 (2015); Vierstra et al., Nat Methods 12(10):927-30 (2015))) or performing allele-specific alterations by positioning sequence differences within the PAM (Courtney et al., Gene Ther. 23(1):108-12 (2015)).
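
The PAM frequencies quoted above (once in every 8 bp for NGG, once in every 32 bp for NNGRRT) can be reproduced with a short calculation. The sketch below is illustrative only and rests on assumptions that the text does not state explicitly: the four bases are equally likely and independent at every position, and a PAM on either strand counts as a site, which doubles the per-strand probability. The function names pam_probability and expected_spacing are hypothetical.

```python
from fractions import Fraction

# IUPAC nucleotide codes used in the PAMs above, mapped to the bases they match.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "N": "ACGT",  # any base
}


def pam_probability(pam):
    """Probability that a random position on one strand matches `pam`.

    Assumes uniform, independent base composition.
    """
    probability = Fraction(1)
    for code in pam:
        probability *= Fraction(len(IUPAC[code]), 4)
    return probability


def expected_spacing(pam):
    """Expected number of bp per PAM site when both strands are counted (assumption)."""
    return 1 / (2 * pam_probability(pam))


if __name__ == "__main__":
    for pam in ("NGG", "NNGRRT"):
        print(pam, "per-strand probability:", pam_probability(pam),
              "expected spacing (bp):", expected_spacing(pam))
```

Under these assumptions, NGG has a per-strand probability of 1/16 and NNGRRT of 1/64, which gives the spacings of 8 bp and 32 bp stated in the text. Real genomes deviate from uniform base composition, so observed site densities will differ.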
SUMMARY

As described herein, the commonly used Streptococcus pyogenes Cas9 (SpCas9) as well as the Staphylococcus aureus Cas9 (SaCas9) were engineered to recognize novel PAM sequences using structural information, bacterial selection-based directed evolution, and combinatorial design. These altered PAM specificity variants enable robust editing of endogenous gene sites in zebrafish and human cells that cannot be efficiently targeted by wild-type SpCas9 or SaCas9. In addition, we identified and characterized another SpCas9 variant that exhibits improved PAM specificity in human cells, possessing reduced activity on sites with non-canonical NAG and NGA PAMs. Furthermore, we found that two smaller-size Cas9 orthologues with completely different PAM specificities, Streptococcus thermophilus Cas9 (St1Cas9) and Staphylococcus aureus Cas9 (SaCas9), function efficiently in our bacterial selection system and in human cells, suggesting that our engineering strategies could be extended to Cas9s from other species. Our findings provide broadly useful SpCas9 and SaCas9 variants, referred to collectively herein as "variants" or "the variants".

The disclosure provides isolated Streptococcus pyogenes Cas9 (SpCas9) proteins with mutations at one or more of the following positions: G1104, S1109, L1111, D1135, S1136, G1218, N1317, R1335, T1337, e.g., comprising a sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO:1. In some embodiments, t