EP-3365356-B2 - NUCLEOBASE EDITORS AND USES THEREOF

Inventors

  • LIU, DAVID, R.
  • KOMOR, ALEXIS CHRISTINE
  • REES, Holly A.
  • KIM, YONGJOO

Dates

Publication Date
20260513
Application Date
20161022

Claims (17)

  1. A fusion protein comprising: (i) a Cas9 domain; (ii) a cytidine deaminase domain; and (iii) a uracil glycosylase inhibitor (UGI) domain, wherein the Cas9 domain is a Cas9 nickase (nCas9) or a dead Cas9 (dCas9) domain, wherein the Cas9 domain comprises a D10A and/or H840A mutation of the amino acid sequence provided in SEQ ID NO: 10, or a corresponding mutation in any of the amino acid sequences provided in SEQ ID NOS: 11-260.
  2. The fusion protein of claim 1, wherein the Cas9 domain is an nCas9 domain that comprises the amino acid sequence provided in SEQ ID NO: 674.
  3. The fusion protein of claim 1 or 2, wherein the Cas9 domain is an nCas9 domain that nicks a nucleotide target strand of a nucleotide duplex, wherein the nucleotide target strand is the strand that binds to a gRNA of the nCas9 domain, optionally wherein the Cas9 domain is an nCas9 domain that comprises a D10A mutation in the amino acid sequence provided in SEQ ID NO: 10, or a corresponding mutation in any of the amino acid sequences provided in SEQ ID NOs: 11-260.
  4. The fusion protein of any one of claims 1-3, wherein the cytidine deaminase domain (i) is a deaminase from the apolipoprotein B mRNA-editing complex (APOBEC) family deaminase, optionally wherein the APOBEC family deaminase is selected from the group consisting of APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3D deaminase, APOBEC3F deaminase, APOBEC3G deaminase, and APOBEC3H deaminase; (ii) comprises an amino acid sequence that is at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to an amino acid sequence of SEQ ID NO: 266-284, or comprises an amino acid sequence that is at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% similar to an amino acid sequence of SEQ ID NO: 607-610; (iii) comprises an amino acid sequence of SEQ ID NO: 266-284 or 607-610; or (iv) is an activation-induced deaminase (AID).
  5. The fusion protein of any one of claims 1-4, wherein the UGI domain comprises an amino acid sequence of SEQ ID NO: 600.
  6. The fusion protein of any one of claims 1-5, wherein the fusion protein comprises the structure: NH2-[cytidine deaminase domain]-[Cas9 domain]-[UGI domain]-COOH, wherein each instance of "]-[" comprises an optional linker.
  7. The fusion protein of any one of claims 1-6, wherein the cytidine deaminase domain and the nCas9 domain are linked via a linker comprising the amino acid sequence (GGGS)n (SEQ ID NO: 265), (GGGGS)n (SEQ ID NO: 5), (G)n, (EAAAK)n (SEQ ID NO: 6), (GGS)n, SGSETPGTSESATPES (SEQ ID NO: 7), or (XP)n motif, or a combination thereof, wherein n is independently an integer between 1 and 30, inclusive, and wherein X is any amino acid; or optionally wherein the nCas9 domain and the UGI domain are linked via a linker comprising the amino acid sequence of (GGGS)n (SEQ ID NO: 265), (GGGGS)n (SEQ ID NO: 5), (G)n, (EAAAK)n (SEQ ID NO: 6), (GGS)n, SGSETPGTSESATPES (SEQ ID NO: 7), or (XP)n motif, or a combination thereof, wherein n is independently an integer between 1 and 30, inclusive, and wherein X is any amino acid.
  8. The fusion protein of any one of claims 1-7, further comprising a nuclear localization sequence (NLS).
  9. The fusion protein of any one of claims 1-8, wherein the fusion protein comprises the amino acid sequence set forth in SEQ ID NO: 594.
  10. The fusion protein of claim 1, wherein the fusion protein comprises the amino acid sequence set forth in any one of SEQ ID NOs: 591-593, 611, 612, 615, 657 and 658.
  11. A method for editing a DNA molecule comprising contacting the DNA molecule with the fusion protein of any one of claims 1-10 and a guide RNA (gRNA), provided that the method is not a method for treatment of the human or animal body by surgery or therapy.
  12. The method of claim 11, wherein the target DNA sequence comprises a sequence associated with a disease or disorder, and the disease or disorder is cystic fibrosis, phenylketonuria, epidermolytic hyperkeratosis (EHK), Charcot-Marie-Tooth disease type 4J, neuroblastoma (NB), von Willebrand disease (vWD), myotonia congenita, hereditary renal amyloidosis, dilated cardiomyopathy (DCM), hereditary lymphedema, familial Alzheimer's disease, HIV, prion disease, chronic infantile neurologic cutaneous articular syndrome (CINCA), desmin-related myopathy (DRM), or a neoplastic disease associated with a mutant PI3KCA protein, a mutant CTNNB1 protein, a mutant HRAS protein, or a mutant p53 protein.
  13. A complex comprising the fusion protein of any one of claims 1-10, and a gRNA bound to the Cas9 domain of the fusion protein, optionally wherein the gRNA is from 15-100 nucleotides long and comprises a sequence of at least 10 contiguous nucleotides that is complementary to a target sequence.
  14. The complex of claim 13, wherein the target sequence is a DNA sequence.
  15. A vector comprising a polynucleotide encoding the fusion protein of any one of claims 1-10, optionally further comprising a heterologous promoter driving expression of the polynucleotide encoding the fusion protein.
  16. A cell comprising the fusion protein of any one of claims 1-10, the complex of claim 13 or 14, or the vector of claim 15.
  17. The fusion protein of any one of claims 1-10, or the complex of claim 13 or 14, for use as a medicament.
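
The domain order in claim 6 and the repeat linkers in claim 7 can be illustrated with a minimal Python sketch. It is an illustration only, not part of the claims: the function names `repeat_linker` and `assemble_fusion` are hypothetical, the domain strings in the usage example are short placeholders rather than real sequences, and the (XP)n motif is left out because X may be any amino acid.

```python
# Repeat units recited in claim 7; each is repeated n times, 1 <= n <= 30.
REPEAT_UNITS = ("GGGS", "GGGGS", "G", "EAAAK", "GGS")

# Fixed linker sequence recited in claim 7 (SEQ ID NO: 7).
FIXED_LINKER = "SGSETPGTSESATPES"


def repeat_linker(unit: str, n: int) -> str:
    """Expand a linker such as (GGS)n into its amino acid string."""
    if unit not in REPEAT_UNITS:
        raise ValueError(f"unsupported repeat unit: {unit!r}")
    if not 1 <= n <= 30:
        raise ValueError("n must be between 1 and 30, inclusive")
    return unit * n


def assemble_fusion(deaminase: str, cas9: str, ugi: str,
                    linker_1: str = "", linker_2: str = "") -> str:
    """Concatenate domains N-terminus to C-terminus as in claim 6:
    NH2-[cytidine deaminase]-[Cas9]-[UGI]-COOH.
    Both linkers are optional, so they default to the empty string."""
    return deaminase + linker_1 + cas9 + linker_2 + ugi


if __name__ == "__main__":
    # Placeholder strings stand in for real domain sequences (assumption).
    fusion = assemble_fusion(
        deaminase="<deaminase>",
        cas9="<nCas9>",
        ugi="<UGI>",
        linker_1=FIXED_LINKER,
        linker_2=repeat_linker("GGS", 2),
    )
    print(fusion)
```

Defaulting both linkers to the empty string mirrors the claim wording that each "]-[" comprises an optional linker.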

Description

GOVERNMENT SUPPORT

This invention was made with government support under grant number R01 EB022376 (formerly R01 GM065400) awarded by the National Institutes of Health, under training grant numbers F32 GM 112366-2 and F32 GM 106601-2 awarded by the National Institutes of Health, and Harvard Biophysics NIH training grant T32 GM008313 awarded by the National Institutes of Health. The government has certain rights in the invention.

BACKGROUND OF THE INVENTION

Targeted editing of nucleic acid sequences, for example, the targeted cleavage or the targeted introduction of a specific modification into genomic DNA, is a highly promising approach for the study of gene function and also has the potential to provide new therapies for human genetic diseases.[1] An ideal nucleic acid editing technology possesses three characteristics: (1) high efficiency of installing the desired modification; (2) minimal off-target activity; and (3) the ability to be programmed to edit precisely any site in a given nucleic acid, e.g., any site within the human genome.[2] Current genome engineering tools, including engineered zinc finger nucleases (ZFNs),[3] transcription activator like effector nucleases (TALENs),[4] and most recently, the RNA-guided DNA endonuclease Cas9,[5] effect sequence-specific DNA cleavage in a genome.
This programmable cleavage can result in mutation of the DNA at the cleavage site via non-homologous end joining (NHEJ) or replacement of the DNA surrounding the cleavage site via homology-directed repair (HDR).[6,7] One drawback to the current technologies is that both NHEJ and HDR are stochastic processes that typically result in modest gene editing efficiencies as well as unwanted gene alterations that can compete with the desired alteration.[8] Since many genetic diseases in principle can be treated by effecting a specific nucleotide change at a specific location in the genome (for example, a C to T change in a specific codon of a gene associated with a disease),[9] the development of a programmable way to achieve such precision gene editing would represent both a powerful new research tool and a potential new approach to gene editing-based human therapeutics.

SUMMARY OF THE INVENTION

In a first aspect, the invention provides a fusion protein comprising: (i) a Cas9 domain; (ii) a cytidine deaminase domain; and (iii) a uracil glycosylase inhibitor (UGI) domain, wherein the Cas9 domain is a Cas9 nickase (nCas9) domain or a dead Cas9 (dCas9) domain, wherein the Cas9 domain comprises a D10A and/or H840A mutation of the amino acid sequence provided in SEQ ID NO: 10, or a corresponding mutation in any of the amino acid sequences provided in SEQ ID NOS: 11-260.

In a second aspect, the invention provides a method for editing a DNA molecule comprising contacting the DNA molecule with the fusion protein of the first aspect of the invention and a guide RNA.

In a third aspect, the invention provides a complex comprising the fusion protein of the first aspect of the invention, and a guide RNA (gRNA) bound to the Cas9 domain of the fusion protein, optionally wherein the guide RNA is from 15-100 nucleotides long and comprises a sequence of at least 10 contiguous nucleotides that is complementary to a target sequence.

In a fourth aspect, the invention provides a vector comprising a polynucleotide encoding the fusion protein of the first aspect of the invention, optionally further comprising a heterologous promoter driving expression of the polynucleotide encoding the fusion protein.

In a fifth aspect, the invention provides a cell comprising the fusion protein of the first aspect of the invention, or the complex of the third aspect of the invention, or the vector of the fourth aspect of the invention.

In a sixth aspect, the invention provides the fusion protein of the first aspect of the invention for use as a medicament.

Certain embodiments of the invention are set out in the dependent claims.

The clustered regularly interspaced short palindromic repeat (CRISPR) system is a recently discovered prokaryotic adaptive immune system[10] that has been modified to enable robust and general genome engineering in a variety of organisms and cell lines.[11] CRISPR-Cas (CRISPR associated) systems are protein-RNA complexes that use an RNA molecule (sgRNA) as a guide to localize the complex to a target DNA sequence via base-pairing.[12] In the natural systems, a Cas protein then acts as an endonuclease to cleave the targeted DNA sequence.[13] For the system to function, the target DNA sequence must both be complementary to the sgRNA and contain a "protospacer-adjacent motif" (PAM) at the 3'-end of the complementary region.[14] Among the known Cas proteins, S. pyogenes Cas9 has been most widely used as a tool for genome engineering.[15] This Cas9 protein is a large, multi-domain protein containing two distinct nuclease domains. Point mutations can be introduced into Cas9 to abolish nuclease activity, resulting in a dead Cas9 (dCas9) that still