US-20260125708-A1 - Engineered Integration Enzymes and Uses Thereof

Abstract

The present disclosure provides compositions comprising engineered integration enzymes, such as engineered large serine integrases (eLSRs), and methods of using the same. In certain embodiments, the engineered integration enzyme comprises one or more substitutions that substantially maintain or enhance integration activity at a pair of cognate integration recognition sites, and substantially decrease off-target integration activity at a pair of off-target integration recognition sites, when compared to a corresponding large serine integrase without said one or more substitutions (cLSR). The eLSR may further comprise a stabilization domain that increases the stability of the integration enzyme as compared to integration enzymes not comprising the stabilization domain.

Inventors

Jesse Christine COCHRANE

Assignees

BASECAMP RESEARCH LTD

Dates

Publication Date: 20260507
Application Date: 20250506

Claims (20)

1. An engineered large serine integrase (eLSR) comprising one or more substitutions that substantially maintain or enhance integration activity at a pair of cognate integration recognition sites, and substantially decrease off-target integration activity at a pair of off-target integration recognition sites, when compared to a corresponding large serine integrase without said one or more substitutions (cLSR); optionally, said one or more substitutions are in a zinc ribbon domain (ZD) of the cLSR.
2. The eLSR of claim 1, wherein said cLSR comprises an amino acid sequence that is at least 80% identical to any one of: (a) SEQ ID NOs: 378-393; (b) SEQ ID NOs: 85-158 of WO2023/177424; and (c) SEQ ID NOs: 1-16 and 163-1162 and 3166-3175 of WO2023/070031.
3.-38. (canceled)
39. The eLSR of claim 1, wherein the eLSR is linked to a gene editor polypeptide.
40.-44. (canceled)
45. A polynucleotide comprising a nucleic acid sequence encoding the eLSR of claim 1.
46. The polynucleotide of claim 45, wherein the nucleic acid sequence encoding the eLSR and/or the fusion is codon optimized (e.g., codon-optimized for expression in a mammalian cell, such as a human cell).
47.-49. (canceled)
50. A vector comprising the polynucleotide of claim 45.
51. A host cell comprising the vector of claim 50.
52. A fusion protein, comprising: (a) a DNA binding domain, optionally comprising a nickase activity; (b) a reverse transcriptase; and (c) an eLSR of claim 1, wherein at least any two of elements (a), (b), or (c) are linked via at least a first C-terminal linker.
53. The fusion protein of claim 52, wherein the C-terminal linker comprises a sequence in Table 3.
54. A polynucleotide comprising a nucleic acid sequence encoding the fusion protein of claim 52.
55. A vector comprising the polynucleotide of claim 54.
56. A host cell comprising the vector of claim 55.
57. A system for site-specifically integrating a donor polynucleotide template into a mammalian cell genome at a target DNA sequence, comprising: (1) an attachment site-containing gRNA (atgRNA) comprising at least a portion of an at least first integration recognition site; (2) a gene editor polypeptide comprising a DNA binding nickase domain linked to a reverse transcriptase domain capable of incorporating the integration recognition site into the target DNA sequence; (3) an eLSR of claim 1; and (4) a donor polynucleotide template linked to a sequence that is an integration cognate of the integration recognition site present in the atgRNA, whereby the gene editor polypeptide site-specifically integrates the integration recognition site into the target DNA sequence, and whereby the eLSR integrates the donor polynucleotide template into the target DNA sequence at the integration recognition site.
58. The system of claim 57, wherein the first atgRNA comprises: (i) a domain that is capable of guiding the gene editor polypeptide to the target DNA sequence; and (ii) a reverse transcriptase (RT) template that comprises at least a portion of an at least first integration recognition site, whereby the at least portion of the at least first integration recognition site is integrated into the genome of the cell at the target sequence.
59.-61. (canceled)
62. A method for site-specifically integrating a donor polynucleotide template into a mammalian cell genome at a target DNA sequence, comprising: (1) incorporating an integration recognition site into the genome by delivering into the cell: i) an attachment site-containing guide RNA (atgRNA) comprising at least a portion of an at least first integration recognition site; and ii) a gene editor polypeptide or polynucleotide encoding the gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain, and is capable of incorporating the integration recognition site into the target DNA sequence; and iii) optionally, a nicking gRNA; and (2) integrating the donor polynucleotide template into the genome by delivering into the cell: a) an eLSR of claim 1; and b) a donor polynucleotide template, wherein the donor polynucleotide template is linked to a sequence that is an integration cognate of the integration recognition site present in the atgRNA, and wherein the donor polynucleotide template is integrated into the genome at the incorporated genomic integration recognition site by the eLSR.
63. The method of claim 62, wherein the atgRNA, the gene editor polypeptide or polynucleotide encoding the gene editor polypeptide, the optional nicking gRNA, the eLSR, and the donor polynucleotide template are introduced into the cell concurrently.

Description

1. CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 63/643,230, filed May 6, 2024, which is hereby incorporated in its entirety by reference herein.

2. SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted via Patent Center and is hereby incorporated by reference in its entirety. Said XML copy, created on Sep. 11, 2025, is named 62809US_CRF_Sequence Listing.xml and is 1,196,506 bytes in size.

3. BACKGROUND

Programmable, efficient, and multiplexed genome integration of large, diverse DNA cargo independent of DNA repair remains an unsolved challenge of genome editing. Current gene integration approaches require double-strand breaks that evoke DNA damage responses and rely on repair pathways that are inactive in terminally differentiated cells. Furthermore, CRISPR-based approaches that bypass double-strand breaks, such as prime editing, are limited to modification or insertion of short sequences. While targeted integration of large donor DNA sequences at specific target locations/genomic sites has been achieved using certain large serine integrases (LSRs), occasional integration of the donor DNA at unintended non-target locations/genomic sites can be problematic, because such integrase-mediated off-target integration may give rise to two classes of potential genotoxic events: DNA mutagenesis and DNA structural variant formation. Specifically, DNA mutagenesis could potentially arise via DNA free-end formation from LSR cleavage or abortive integration. Meanwhile, DNA structural variants could potentially arise via recombination between cryptic integrase sites in the human genome, leading to off-target cargo insertion, structural variants, or chromosomal rearrangements.
There is a need in the art for techniques which address and overcome these shortcomings and enable the co-delivery of gene editor constructs and associated donor templates for the insertion and/or deletion of large sequences into cells for therapeutic and circuit-based uses for broad purposes, across eukaryotic as well as prokaryotic systems.

4. SUMMARY

The present disclosure describes integration enzymes (e.g., engineered large serine integrases, or eLSRs) engineered such that, upon being introduced into a cell, the integration enzyme has increased fidelity/specificity towards the cognate integration recognition sequences/sites at the target integration sequence, over off-target integration recognition sequences/sites at off-target integration sequences. The engineered large serine integrase (eLSR) described herein comprises one or more substitutions (e.g., substitutions in a zinc ribbon domain (ZD)) that substantially maintain or enhance integration activity at a pair of cognate integration recognition sites, and substantially decrease off-target integration activity at a pair of off-target integration recognition sites, when compared to a corresponding large serine integrase without said one or more substitutions (cLSR). In certain embodiments, the one or more substitutions are in a zinc ribbon domain (ZD) of the cLSR. In certain embodiments, the cLSR comprises an amino acid sequence that is at least 80% identical to any one of (a) SEQ ID NOs: 378-393; (b) SEQ ID NOs: 85-158 of WO2023/177424 (incorporated herein by reference); and (c) SEQ ID NOs: 1-16 and 163-1162 and 3166-3175 of WO2023/070031 (incorporated herein by reference). In certain embodiments, the cLSR is a BxB1 polypeptide, SsuINT, SssINT, SscINT, Ssc2INT, SsdINT, SmcINT, UhmINT, SacINT, RsaINT, Rsa2INT, Tp91NT, Bt1INT, BceINT, BcyINT, SluINT, or a functional fragment/variant thereof. In certain embodiments, the cLSR is a BxB1 polypeptide or a functional fragment/variant thereof.
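By way of illustration only, the "at least 80% identical" criterion recited above may be sketched as follows. The sketch assumes that the two amino acid sequences have already been aligned (equal length, with `-` marking gaps) and that identity is counted over the aligned columns; this is one common convention and is not necessarily the definition of percent identity used in this disclosure. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned, i.e. they have equal
    length and use '-' for gap positions. Columns where both sequences
    have a gap are ignored; a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 80.0
) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences (not from the Sequence Listing).
reference = "MKTAYIAKQR"
one_substitution = "MKTAYLAKQR"     # 9 of 10 positions match
three_substitutions = "MKAAYLAKQQ"  # 7 of 10 positions match

print(meets_identity_threshold(reference, one_substitution))
print(meets_identity_threshold(reference, three_substitutions))
```

In practice the alignment itself would be produced by a dedicated alignment tool, and the choice of denominator (alignment length versus the length of the shorter or the reference sequence) changes the reported value, so the convention in use should be stated alongside any identity figure.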
In certain embodiments, the cLSR comprises an amino acid sequence having at least 80% sequence identity to SEQ ID NO: 388. In certain embodiments, the cLSR has an amino acid sequence of SEQ ID NO: 388. In certain embodiments, the pair of cognate integration recognition sites are: an attB sequence and an attP sequence; or a modified attB sequence and a modified attP sequence. In certain embodiments, the pair of off-target integration recognition sites are a CAS031 attB sequence and a CAS031 attP sequence; or a CAS421 attB sequence and a CAS421 attP sequence. See FIG. 59. In certain embodiments, at least one of the pair of cognate integration recognition sites is integrated into a mammalian cell genome at a target DNA sequence. In certain embodiments, the at least one of the pair of cognate integration recognition sites is integrated into the mammalian cell genome at the target DNA sequence by: (1) programmable addition through site-specific targeting elements (PASTE) using an attachment site-containing guide RNA (atgRNA) and a gene editor polypeptide; (2) homology directed repair (HDR), such as short-fragment homologous recombination (SFHR); (3) ligation-assisted homologous recombination (LAHR); (4) liga