CA-3164962-C - SYNTHESIS OF LACTONE DERIVATIVES AND THEIR USE IN THE MODIFICATION OF PROTEINS


Abstract

Site-specific modification of proteins is desirable in biotechnological applications such as biopharmaceuticals, immunotherapy, and vaccines, and is useful in chemical biology. Gluconoylation is a non-enzymatic, covalent, post-translational modification commonly observed on proteins bearing N-terminal His-tags. We synthesized glucono-1,5-lactone derivatives, including azido variants, for selective acylation. High-yield acylation is achieved by simply mixing the derivatives with the target protein under a range of temperatures, aqueous buffers, and excipients, or in complex cell lysate.

Inventors

Alexander Redfern MORRIS
Grigorij SUTOV
Karl Dietrich BRUNE

Assignees

GENIE BIOTECH UK LTD.

Dates

Publication Date: 20260505
Application Date: 20201218
Priority Date: 20191220

Claims (1)

1. An N-terminally acylated protein comprising formula (VIII), formula (VII) or formula (IX):

   Formula (VIII) [structure not reproduced], wherein:
   (i) R1, R2, R3 and R5 are each independently hydrogen, a hydroxyl, a methyl, or a handle,
   (ii) R4 is hydrogen, a hydroxyl, a methyl, a handle, a carboxyl, an ester, or an amide,
   (iii) at least one of R1, R2, R3 and R4 is a handle,
   (iv) one or none of R1, R2, R3 and R4 is a methyl; and
   (v) X comprises the N-terminal amino acid residue of the protein;

   Formula (VII) [structure not reproduced], wherein:
   (i) R1, R2, R3, R5 and R6 are each independently hydrogen, a hydroxyl, a methyl, or a handle,
   (ii) R4 is hydrogen, a hydroxyl, a methyl, a handle, or a carboxyl,
   (iii) at least one of R1, R2, R3 and R4 is a handle,
   (iv) one or none of R1, R2, R3 and R4 is a methyl; and
   (v) X comprises the N-terminal amino acid residue of the protein;

   Formula (IX) [structure not reproduced], wherein:
   (i) R1, R2 and R3 are each independently hydrogen, a hydroxyl, a methyl, or a handle,
   (ii) R4 is a hydroxyl,
   (iii) at least one of R1, R2, and R3 is a handle,
   (iv) one or none of R1, R2, and R3 is a methyl; and
   (v) X comprises the N-terminal amino acid residue of the protein.

2. The N-terminally acylated protein of claim 1 comprising formula (IV) [structure not reproduced], wherein:
   (i) R1, R2 and R3 are each independently hydrogen, a hydroxyl, a methyl, or an azide,
   (ii) R4 is hydrogen, a hydroxyl, a methyl, an azide, or a carboxyl,
   (iii) at least one of R1, R2, R3 and R4 is an azide,
   (iv) one or none of R1, R2, R3 and R4 is a methyl; and
   (v) X comprises the N-terminal amino acid residue of the protein.

3. The N-terminally acylated protein of claim 1 or 2 comprising formula (X) [structure not reproduced], wherein:
   (i) R4 is an azide, R3 is a hydroxyl, R2 is a hydroxyl and R1 is a hydroxyl;
   (ii) R4 is a hydroxyl, R3 is an azide, R2 is a hydroxyl and R1 is a hydroxyl;
   (iii) R4 is a hydroxyl, R3 is a hydroxyl, R2 is an azide and R1 is a hydroxyl; or
   (iv) R4 is a hydroxyl, R3 is a hydroxyl, R2 is a hydroxyl and R1 is an azide.

4. The N-terminally acylated protein of claim 2 or 3, wherein R4 is an azide, R3 is a hydroxyl, R2 is a hydroxyl and R1 is a hydroxyl.

5. The N-terminally acylated protein of any one of claims 1-4, wherein X comprises the amino acid sequence Xaa1-Xaa2-Xaa3-Xaa4, and
   (i) Xaa1 is an amino acid residue selected from: Ala, Gly, Ser, His, or Leu;
   (ii) Xaa2 is any amino acid residue or absent;
   (iii) Xaa3 is any amino acid residue or absent; and
   (iv) Xaa4 consists of 3 or more His residues.

6. The N-terminally acylated protein of any one of claims 1-4, wherein:
   (A) X comprises the amino acid sequence Gly-Xaa1-Xaa2-Xaa3, and (i) Xaa1 is any amino acid residue or absent; (ii) Xaa2 is an amino acid residue or absent; and (iii) Xaa3 consists of 3 or more His residues; or
   (B) X comprises the amino acid sequence Gly-Xaa1, and (i) Xaa1 is an amino acid residue selected from Gly, Ala, or Ser.

7. The N-terminally acylated protein of any one of claims 1-4, wherein:
   (A) X comprises the amino acid sequence Xaa1-Xaa2 and (i) Xaa1 is amino acid residue Gly or absent; and (ii) Xaa2 consists of 1 or more His residues;
   (B) X comprises the amino acid sequence Gly-Xaa1-Xaa2, and (i) Xaa1 is amino acid residue selected from Ser, Gly, Ala, Tyr, Leu, Arg, or absent; and (ii) Xaa2 consists of 2 or more His residues; or
   (C) X comprises the amino acid sequence Gly-Xaa1-Xaa2, and (i) Xaa1 is Gly, Ser or Ala; and (ii) Xaa2 is: Thr-Tyr-Ser-Asp-His, Thr-Tyr-Ser-Cys-His, Thr-Tyr-Ser-Ala-His, Lys-Trp-Ser-Lys-Arg, or Ser-Gly-Ser-Lys.

8. A composition comprising the N-terminally acylated protein of any one of claims 1-7, wherein the composition further comprises a metal cation.

9. A composition comprising the N-terminally acylated protein of any one of claims 1-7, wherein:
   (i) the composition further comprises a metal cation from the d-block elements; and/or
   (ii) the composition further comprises a diol-ester formed between a lactone-derived diol and an acid at the N-terminus of the N-terminally acylated protein.

10. A method for site-specifically modifying a protein at the N-terminus comprising contacting a protein with a handle-substituted carbohydrate lactone, wherein the handle-substituted carbohydrate lactone is a handle-substituted six-membered 1,5-carbohydrate-lactone.

11. The method of claim 10, wherein the handle-substituted carbohydrate lactone is a compound according to formula (V) [structure not reproduced], wherein:
   (i) R1, R2 and R3 are each independently hydrogen, a hydroxyl, a methyl, or an azide,
   (ii) R4 is hydrogen, a hydroxyl, a methyl, an azide, or a carboxyl,
   (iii) at least one of R1, R2, R3 and R4 is an azide, and
   (iv) one or none of R1, R2, R3 and R4 is a methyl.

12. The method of any one of claims 10-11, wherein the method further comprises contacting the resulting acylated protein with a compound comprising a phosphine group, a phosphine derivative, alkene group, alkyne group, strained alkyne group, thioalkyne group, strained olefin, or oxanorbornadiene group.

13. A protein obtained through the method of any one of claims 10-12.

14. A method for identifying an acylated protein comprising running a sample suspected of containing an acylated protein on a diol-interacting, boron-containing acrylamide gel, wherein the acylated protein is the protein according to any one of claims 1-9 and 13.

15. A method for purifying the acylated protein according to any one of claims 1-9 and 13 comprising:
   (1) binding a sample suspected of comprising the acylated protein onto a solid support comprising an immobilized diol-ester forming agent; and
   (2) eluting the protein.

16. A kit comprising:
   (i) a protein; and
   (ii) a handle-substituted six-membered 1,5-carbohydrate-lactone.

17. The kit of claim 16, wherein the kit further comprises a compound comprising a phosphine group, a phosphine derivative, alkene group, alkyne group, strained alkyne group, thioalkyne group, strained olefin, or oxanorbornadiene group.
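
Illustrative note (not part of the claims): the N-terminal sequence pattern recited in claim 5 can be read as a simple regular expression. The sketch below is only a reading aid under stated assumptions: the pattern is anchored at the first residue, "any amino acid residue" is limited to the 20 standard one-letter codes, and the names ANY, CLAIM5_MOTIF and matches_claim5_motif are hypothetical. It does not define or limit the scope of any claim.

```python
import re

# Assumption: "any amino acid residue" is restricted here to the 20 standard
# one-letter codes; the claim itself is not limited in this way.
ANY = "[ACDEFGHIKLMNPQRSTVWY]"

# Claim 5 pattern, read from the first residue (anchoring is an assumption):
#   Xaa1 = Ala, Gly, Ser, His or Leu
#   Xaa2 = any residue or absent
#   Xaa3 = any residue or absent
#   Xaa4 = 3 or more His residues
CLAIM5_MOTIF = re.compile(rf"^[AGSHL]{ANY}?{ANY}?H{{3,}}")


def matches_claim5_motif(sequence: str) -> bool:
    """Return True if the sequence starts with the claim 5 pattern."""
    return CLAIM5_MOTIF.match(sequence.upper()) is not None


if __name__ == "__main__":
    for seq in ("GSSHHHHHH", "MGSSHHHHHH", "GHHH", "GSSHH"):
        print(seq, matches_claim5_motif(seq))
```

In this sketch, GSSHHHHHH matches (Gly, Ser, Ser, then six His), whereas MGSSHHHHHH does not, because Met is not among the residues listed for Xaa1.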

Description

Synthesis of lactone derivatives and their use in the modification of proteins

Technical field

The present invention relates to methods of synthesizing lactone derivatives as well as methods for chemically modifying proteins and purifying the modified proteins. The methods of synthesizing lactone derivatives belong to the field of carbohydrate chemistry, and the methods for modifying proteins and purifying modified proteins belong to the field of protein chemistry.

Background art

Well-characterized protein bioconjugates are indispensable for chemical biology studies and biopharmaceutical applications. Traditional methods for obtaining bioconjugates typically target naturally occurring chemical groups, such as the abundant amino (Lys, N-terminus) or carboxyl groups (Asp, Glu, C-terminus), or the less common sulfhydryl groups (Cys) (Hermanson 2013). Site-selective protein conjugation is desirable but technically challenging (Rosen and Francis 2017). Unless the targeted chemical group is uniquely present in a biomolecule, heterogeneous subpopulations of bioconjugates may be obtained. Heterogeneous subpopulations are undesirable because they can display vastly different biological and pharmaceutical properties. The downstream mass spectrometric characterization of bioconjugates required by regulators is also complicated by heterogeneity, as the mass of the base protein is split into two or more values, even if the heterogeneity is biologically harmless (Kieran F. Geoghegan 2016). Hence there is a need for methods that reduce heterogeneity in biopharmaceutical formulations (PEGylation etc.), antibody-drug conjugates (Jain et al. 2015), immunomodulatory conjugates and conjugate vaccines (Kanekiyo, Ellis, and King 2019), as well as for biomaterials (Proschel et al. 2015). Methods for natural amino acid residue modification are plentiful and have been reviewed in detail (deGruyter, Malins, and Baran 2017).
Unnatural amino acids with bioorthogonal reactivities may be incorporated into recombinant proteins by feeding the expression host modified amino acids (Datta et al. 2002), by exploiting Amber host suppression (L. Wang et al. 2001), by genetic code expansion (Xie and Schultz 2006; Davis and Chin 2012; Lang and Chin 2014) or by using genetically recoded organisms (Lajoie et al. 2013). A common drawback is that these approaches require amino acid derivatives that are expensive and chemically difficult to synthesize. Not only are special strains required, but, owing to the promiscuity of tRNA synthetases, undesirable misincorporation of unnatural amino acids into non-target positions, detectable by mass spectrometry, has recently been reported (Aerni et al. 2015; Gan and Fan 2017; Kunjapur et al. 2018).

WO 2021/123229 PCT/EP2020/087108

Alternative approaches rely on the engineering of protein domains or enzymes. For example, split inteins (Hirata et al. 1990) and Catcher/Tag technology (Bijan Zakeri and Howarth 2010) have been described. Recently, site-specific transglutaminases (Steffen et al. 2017) and peptide-peptide ligases such as the Catcher/Tag-derived Spy- and SnoopLigase (Fierer, Veggiani, and Howarth 2014; Buldun et al. 2018), butelase (Nguyen et al. 2014; Cao et al. 2016), and engineered asparaginyl endopeptidases (Harris et al. 2015; R. Yang et al. 2017; Jackson et al. 2018) have been presented in addition to the traditional sortase approach (Mazmanian 1999). The common drawbacks of enzymatic and protein domain-based approaches are that high concentrations of linkage partners and/or a large excess of ligation partner are required (e.g., sortases), that ligating enzymes need to be removed after the conjugation, or that protein scars arise (e.g., SpyCatcher:SpyTag leaves a ~100 aa scar, ~11.5 kDa in Mw).

One particular strategy is the modification of proteins at their N-termini.
Modifying N-termini appears attractive as there can only be one N-terminus per linear polypeptide sequence and more than 80% of all monomeric structures in the PDB have their N-terminus exposed (Jacob and Unger 2007). Importantly, the pKa value of the N-terminal α-amino group is lower (pKa 7.6-8.0) than those of typical Lys side-chain ε-amines (pKa 10.5 ± 1.1), and the N-terminus may therefore be targeted for selective, pH-controlled acylation or alkylation (Grimsley, Scholtz, and Pace 2008).

However, most chemical N-terminal modification techniques can lead to off-site modifications, e.g., lysine acylation within the protein (see, for example, Martos-Maldonado et al. 2018). A challenge in the field is therefore to achieve N-terminal modification whilst minimizing off-site modification (Rosen and Francis 2017).

Most proteins that are fused to an N-terminal His6 tag for immobilized metal affinity chromatography (IMAC) purification actually contain the sequence MGSSHHHHHH, due to being cloned into the popular pET vector series for recombinant bacterial production in Escherichia coli. An NCBI