
US-12624361-B2 - Fusion proteins for DNA base editing


Abstract

The present invention relates to methods and compositions for modifying a target site in the genome of a cell. Fusion proteins including one or more DNA binding domains and one or more heterologous domains, such as DNA modifying domains, connected by improved linker sequences are provided. Codon optimized polynucleotides encoding fusion proteins including one or more DNA binding domains and one or more heterologous domains connected by improved linker sequences are provided.

Inventors

  • Jianping Xu
  • Jiang Li

Assignees

  • SYNGENTA CROP PROTECTION AG

Dates

Publication Date
20260512
Application Date
20200918
Priority Date
20190926

Claims (1)

  1. A fusion protein comprising from the N-terminus to the C-terminus a heterologous TadA deaminase domain, a first linker sequence, and a Type V CRISPR-Cas protein, wherein the first linker sequence comprises the sequence GGGGS at least six times (SEQ ID NO: 11), wherein the Type V CRISPR-Cas protein is a catalytically inactive Cas12a, wherein the fusion protein comprises SEQ ID NO: 81.

Description

RELATED APPLICATION INFORMATION

This application is a 371 of International Application No. PCT/US2020/051383, filed 18 Sep. 2020, which claims priority to PCT/CN2019/108026, filed 26 Sep. 2019, the contents of which are incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to methods and compositions for targeted nucleotide base editing in the genome of a cell.

STATEMENT REGARDING ELECTRONIC SUBMISSION OF A SEQUENCE LISTING

A Sequence Listing in ASCII text format, submitted under 37 C.F.R. § 1.821, entitled “81945_USNPE_ST25.txt”, created Mar. 23, 2022, approximately 702 kilobytes, is attached and filed herewith and is incorporated herein by reference.

BACKGROUND OF THE INVENTION

There is a great need in agriculture for the capability to edit plant genomes in order to create favorable alleles. Such edits could increase yields or prevent disease. Genome editing is a new field in which progress in plants is lagging behind. Further, changes to the genome other than the intended change are a problem that limits application of the desired changes.

CRISPR-Cas9 works by making a double-stranded cut in the DNA. As this break is repaired by non-homologous end joining or homology-dependent repair, DNA base insertions or deletions may occur. A strategy called base editing makes changes to the DNA without cutting and without creating insertions and deletions. In one version, an enzyme called a cytidine deaminase is targeted to a specific base by a Cas9 (Shimatani et al., 2017. Nat. Biotechnol. 35, 441-443) or a Cas12a (Li et al., 2018. Nat. Biotechnol. 36, 324-327) enzyme that has been modified so that it cannot cut DNA. The cytidine deaminase and the nuclease-deficient Cas9 or Cas12a are fused together through an amino acid linker. Improvements in the linker can improve the functionality of the fusion protein, for example by improving the precision of editing through a reduction in off-target base changes.

SUMMARY OF THE INVENTION

To meet this need for improvements, we provide an optimized and improved Cas12a enzyme and construct. In particular, we provide a fusion protein comprising a heterologous domain, a first linker sequence, and a Type V CRISPR-Cas enzyme. The first linker sequence comprises a repeated GGGGS sequence.

The heterologous domain can be a deaminase, polymerase, nuclease, relaxase, alkyltransferase, methyltransferase, adenosine deaminase, cytidine deaminase, oxidase, thymine alkyltransferase, adenine oxidase, adenosine methyltransferase, glycosylase or nuclear localization signal. For base editing, the heterologous domain is a deaminase domain, such as a cytidine deaminase or an adenine deaminase. The cytidine deaminase domain may be an activation-induced cytidine deaminase (“AID”), or an apolipoprotein B mRNA-editing complex (“APOBEC”) domain such as from the APOBEC1 family of deaminases. In some contexts, the APOBEC domain comprises a sequence at least 70% identical to SEQ ID NO: 1. Where an adenine deaminase is required, the adenine deaminase may be a TadA domain comprising an amino acid sequence at least 70% identical to SEQ ID NO: 92.

Where the type V CRISPR-Cas enzyme is a type V-A (“Cas12a”) enzyme, the Cas12a is selected from the group comprised of SEQ ID NO: 3, SEQ ID NO: 6, SEQ ID NO: 22, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, and SEQ ID NO: 48. The Cas12a domain may be catalytically inactive, but still binds to the target DNA and allows the heterologous domain to operate. Where the Cas12a is inactive, its sequence is SEQ ID NO: 3, SEQ ID NO: 6, or SEQ ID NO: 22.

The first linker sequence between the heterologous domain and the Cas12a enzyme may comprise GGGGS repeated at least three times. In other uses, the first linker sequence may comprise GGGGS repeated at least six times.
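
The linker requirement described above (GGGGS repeated at least three or at least six times) can be illustrated with a short Python sketch. This is not part of the patent disclosure: the function names are hypothetical, the flanking residues in the usage example are placeholders rather than sequences from the Sequence Listing, and the sketch assumes that the repeats are meant to be contiguous (tandem) copies.

```python
import re

# Linker unit named in the summary; everything else here is illustrative.
LINKER_UNIT = "GGGGS"


def build_linker(n_repeats: int, unit: str = LINKER_UNIT) -> str:
    """Return a linker made of n_repeats back-to-back copies of unit."""
    return unit * n_repeats


def max_tandem_repeats(sequence: str, unit: str = LINKER_UNIT) -> int:
    """Return the length, in units, of the longest contiguous run of unit.

    Assumption: 'repeated N times' is read as N tandem copies.
    """
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit)
            for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


def meets_linker_threshold(sequence: str, minimum: int) -> bool:
    """Check whether sequence contains at least `minimum` tandem repeats."""
    return max_tandem_repeats(sequence) >= minimum


if __name__ == "__main__":
    # Placeholder flanks ("MKT", "AAA") stand in for the real domains.
    candidate = "MKT" + build_linker(6) + "AAA"
    print(max_tandem_repeats(candidate))
    print(meets_linker_threshold(candidate, 3))
    print(meets_linker_threshold(candidate, 6))
```

A check of this kind only inspects the amino acid string; it says nothing about whether a given construct matches any SEQ ID NO in the Sequence Listing.
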
The fusion protein may comprise SEQ ID NO: 11, 12, 13, or 44, and it may also include a uracil DNA glycosylase inhibitor (“UGI”) domain (as represented by SEQ ID NO: 8). The UGI domain may be linked to the Cas12a enzyme by a second linker comprising the sequence SGGS. The fusion protein may comprise SEQ ID NO: 17, SEQ ID NO: 24, SEQ ID NO: 35, SEQ ID NO: 39, SEQ ID NO: 43, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 87, or SEQ ID NO: 89. These fusion proteins, when contacted with DNA, produce on-target edits at an increased frequency and off-target edits at a reduced frequency compared to prior art fusion proteins which lack a first linker sequence of a repeated GGGGS sequence.

We also provide a method of editing plant genomic DNA by contacting plant genomic DNA with: (a) a fusion protein as described by one of the above aspects and optionally comprising a UGI domain; and (b) a guide RNA (“gRNA”) targeting the fusion protein of step (a) to a target DNA sequence of the plant genomic DNA; where the edited plant genomic DNA comprises reduced off-target edits compared to plant genomic DNA edi