BR-122020022312-B1 - Library of synthetic polynucleotides encoding variable regions of the light chain, their use, polypeptides, kit and method of production of synthetic polynucleotides encoding a CDRL3 library.
Abstract
The present invention overcomes the inherent shortcomings of known methods for generating antibody-coding polynucleotide libraries by specifically designing libraries with targeted sequence and length diversity.
Inventors
- Maximiliano Vasquez
- Arvind Sivasubramanian
- Michael Feldhaus
Assignees
- ADIMAB, LLC
Dates
- Publication Date: 2026-03-10
- Application Date: 2011-07-14
- Priority Date: 2010-07-16
Claims (8)
- 1. A library of synthetic polynucleotides encoding variable regions of the light chain, characterized in that the polypeptide sequences of the variable regions of the light chain (i) vary in two or three positions between Kabat positions 89-94, and (ii) comprise polypeptide sequences given in Tables 3 and 4, such that said polypeptide sequences are produced by translation of polynucleotide sequences given in Tables 5-7.
- 2. Polypeptides, characterized in that they are as provided in Tables 3 and 4, such that they are expression products of the library as defined in claim 1.
- 3. Use of the library, as defined in claim 1 or 2, characterized in that it is for isolating an antibody that binds specifically to an antigen.
- 4. Kit, characterized in that it contains the library, as defined in claim 1.
- 5. Method for producing synthetic polynucleotides encoding a CDRL3 library, characterized in that it comprises: (i) obtaining a reference set of light chain sequences, wherein the reference set contains light chain sequences with VL segments originating from the same germline gene IGVL and/or its allelic variants; (ii) determining which amino acids occur at each of the CDRL3 positions in the reference set that are encoded by the IGVL gene; (iii) synthesizing variable domain light chain coding sequences wherein two or three positions between Kabat positions 89 and 94, inclusive, contain degenerate codons encoding two or more of the five most frequently occurring amino acid residues at the corresponding positions in the reference set; and (iv) synthesizing the polynucleotides encoding the CDRL3 library.
- 6. Method according to claim 5, characterized in that the CDRL3 library contains sequences of the kappa light chain and/or the lambda light chain.
- 7. Method according to claim 6, characterized in that (i) the kappa sequences CDRL3 comprise FT, LT, IT, RT, WT, YT, [X] T, [X] PT, [X] FT, [X] LT, [X] IT, [X] RT, [X] WT, [X] YT, [X] PFT, [X] PLT, [X] PIT, [X] PRT, [X] PWT and [X] PYT, where [X] corresponds to the amino acid residue found at position 95 (Kabat) in the respective VK germline sequence; and (ii) the lambda sequences CDRL3 comprise YV, VV, WV, AV or V.
- 8. Method according to claim 7, characterized in that [X] is P or S.
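Steps (ii) and (iii) of claim 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names (`top_residues`, `translate_degenerate`, `candidate_codons`) and the toy reference sequences are hypothetical. The sketch also assumes a stricter rule than the claim requires: a degenerate codon is accepted only if every amino acid it encodes belongs to the allowed set, which excludes stop codons.

```python
from collections import Counter
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def top_residues(reference_cdrl3, n=5):
    """Return, per position, the n most frequent amino acids.

    Assumes all reference sequences have the same length (for example,
    the six residues at Kabat positions 89-94).
    """
    length = len(reference_cdrl3[0])
    result = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in reference_cdrl3)
        result.append([aa for aa, _ in counts.most_common(n)])
    return result


def translate_degenerate(codon):
    """Return the set of amino acids encoded by a degenerate codon."""
    expanded = product(*(IUPAC[base] for base in codon))
    return {CODON_TABLE["".join(triplet)] for triplet in expanded}


def candidate_codons(allowed):
    """List degenerate codons that encode two or more residues, all in `allowed`.

    Assumption of this sketch: no off-target residues or stops are tolerated.
    """
    allowed = set(allowed)
    hits = []
    for codon in map("".join, product(IUPAC, repeat=3)):
        encoded = translate_degenerate(codon)
        if len(encoded) >= 2 and encoded <= allowed:
            hits.append((codon, encoded))
    # Codons covering more of the allowed residues come first.
    return sorted(hits, key=lambda hit: (-len(hit[1]), hit[0]))


# Hypothetical toy reference set (not data from the patent).
reference = ["QQYNSY", "QQYDSY", "QQSYST"]
per_position = top_residues(reference)
codons_for_third_position = candidate_codons(per_position[2])
```

In practice a library designer would likely also weigh how evenly a codon distributes the encoded residues; this sketch only enumerates the codons that satisfy the set constraint.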
Description
RELATED APPLICATION
[001] This application claims priority to U.S. Provisional Application Serial No. 61/365,194, filed July 16, 2010, which is incorporated herein by reference in its entirety.
BACKGROUND
[002] Antibodies have profound relevance as research tools and in diagnostic and therapeutic applications. However, identifying useful antibodies is difficult, and once identified, antibodies generally require considerable re-engineering or "humanization" before they are suitable for therapeutic applications in humans.
[003] Many methods for identifying antibodies involve displaying antibody libraries derived by amplifying nucleic acids from tissues or B cells. Some of these methods have used synthetic libraries. However, many of these methods have limitations. For example, most human antibody libraries known in the art contain only the antibody sequence diversity that can be experimentally captured or cloned from a biological source (e.g., B cells). Accordingly, these libraries may over-represent some sequences while completely lacking or under-representing others, particularly those that bind human antigens. Most synthetic libraries known in the art have other limitations, such as the occurrence of unnatural (i.e., non-human) amino acid sequence motifs that have the potential to be immunogenic.
[004] Accordingly, there is a need for diverse antibody libraries containing candidate antibodies that are non-immunogenic (i.e., human) and have desirable properties (e.g., the ability to recognize a wide variety of antigens). However, obtaining such libraries requires balancing two competing goals: generating diversity while maintaining the human character of the sequences within the library. The present invention provides antibody libraries that have these and other desirable characteristics, and methods for producing and using such libraries.
SUMMARY
[005] The present invention provides, among other things, improvements in the design and production of synthetic libraries that mimic the diversity of the natural human repertoire of CDRH3, CDRL3, heavy chain, light chain, and/or full-length (intact) antibody sequences. In some embodiments, the invention defines and provides methods for generating theoretical segment pools of TN1, DH, N2, and H3-JH segments to be considered for inclusion in a physical manifestation of a library (e.g., polynucleotide or polypeptide) comprising or encoding CDRH3 sequences (e.g., an antibody library). In certain embodiments, the present invention defines and provides methods for matching the individual elements of these theoretical segment pools to a reference set of CDRH3 sequences, in order to determine the frequency of occurrence (or segment usage weight) of each segment of the theoretical segment pool in the reference set. While any set of CDRH3 sequences can be used as a reference set, the invention also defines and provides methods for generating specific reference sets or subsets of interest. For example, among other things, the present invention provides methods for filtering an original reference set to obtain a reference set with a pre-immune character. Furthermore, methods are provided for defining and/or identifying segments that occur within CDRH3 sequences in the reference set but not in the theoretical segment pool. Such segments can be added to a theoretical segment pool, for example, to be considered for inclusion in a physical library. Although the frequency of occurrence of a given segment in a reference set is useful for selecting segments for inclusion in a physical library, the invention also provides a number of physicochemical and biological properties that can be used (alone or together with any other criteria) to select segments for inclusion in a physical library.
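The idea of a segment usage weight can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the matching procedure of the Examples: each reference CDRH3 is credited to the longest pool segment it contains as a plain substring, and weights are the resulting fractions. The function name `segment_usage_weights`, the pool segments, and the reference sequences are all hypothetical.

```python
from collections import Counter


def segment_usage_weights(segment_pool, reference_cdrh3):
    """Estimate how often each pool segment is used in a reference set.

    Simplifying assumption: a segment "matches" a reference sequence if it
    occurs as an exact substring, and only the longest matching segment is
    credited, so nested segments are not counted twice.
    """
    hits = Counter()
    for seq in reference_cdrh3:
        matches = [seg for seg in segment_pool if seg in seq]
        if matches:
            hits[max(matches, key=len)] += 1
    total = sum(hits.values())
    # Normalize over matched sequences; segments never matched get 0.0.
    return {seg: (hits[seg] / total if total else 0.0) for seg in segment_pool}


# Hypothetical toy data (not taken from the patent).
pool = ["GYSSGW", "YSS", "DYGD"]
reference = [
    "ARGYSSGWYFDY",  # contains GYSSGW (and the nested YSS)
    "AKGYSSGWFDP",   # contains GYSSGW
    "ARDYSSAFDI",    # contains YSS only
    "ARDYGDYFDY",    # contains DYGD
    "ARGGFDY",       # no pool segment matches
]
weights = segment_usage_weights(pool, reference)
```

Reference sequences with no match, such as the last one here, are the kind of case the text describes as containing segments absent from the theoretical pool; this sketch simply leaves them uncounted.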
[006] In some embodiments, the invention provides libraries that differ from certain other libraries known in the art in that they are not sitewise stochastic in composition or sequence and are therefore inherently less random than those other libraries (see, for example, Example 14 of US Pub. No. 2009/0181855, incorporated by reference in its entirety, for a discussion of information content and randomness). In some embodiments, degenerate oligonucleotides can be used to increase the diversity of elements in a library while further improving the match to a reference set of sequences (e.g., CDRH3, CDRL3, heavy chain, light chain, and/or full-length (intact) antibody sequences).
[007] The invention also provides libraries whose elements have sequences that are related to one another, in that they would be selected for inclusion in a physical library by performing the analyses described in this document, for example, generating a defined CDRH3 reference as in Example 3; generating theoretical segment pools as in Examples 5 to 7; matching the elements of a theoretical segment pool to the defined reference match as in Examp