CN-121991943-A - Method for preparing mutation library


Abstract

The present application relates to a method for synthesizing a long-chain double-stranded nucleic acid library from a short-chain single-stranded nucleic acid, said method being characterized by the presence of an intermediate, said intermediate being a long-chain double-stranded nucleic acid containing a break. The application also relates to the use of the method in synthesizing long-chain double-stranded nucleic acid libraries, to the long-chain double-stranded nucleic acid libraries prepared by the method, and to the protein libraries encoded by those long-chain double-stranded nucleic acid libraries.

Inventors

WEI DIMING
Xia Ninuo
WU ZHIGUANG

Assignees

Hefei Hexin Biotechnology Co., Ltd. (合肥核信生物科技有限责任公司)
Tsinghua University (清华大学)

Dates

Publication Date: 20260508
Application Date: 20241108

Claims (20)

1. A method of synthesizing a long-chain double-stranded nucleic acid library from a short-chain single-stranded nucleic acid, the method comprising the steps of: a) complementarily pairing the short-chain single-stranded nucleic acids to form an intermediate, the intermediate being a long-chain double-stranded nucleic acid with a break, said break being the absence of a phosphodiester bond between two adjacent nucleotides of said long-chain double-stranded nucleic acid; and b) repairing the break in the intermediate to obtain the long-chain double-stranded nucleic acid; wherein the long-chain double-stranded nucleic acids in the long-chain double-stranded nucleic acid library are all homologous to the same original long-chain double-stranded nucleic acid, preferably have 80% homology, and more preferably have 90% homology.
2. The method of claim 1, wherein the library of long-chain double-stranded nucleic acids comprises two or more long-chain double-stranded nucleic acids.
3. The method of claim 1, wherein the short-chain single-stranded nucleic acid is 10-300 nucleotides in length.
4. The method according to claim 1, wherein the length of the long-chain double-stranded nucleic acid is more than 2 times, preferably more than 5 times, more preferably more than 10 times the length of the short-chain single-stranded nucleic acid.
5. The method of claim 1, wherein the short-chain single-stranded nucleic acid is designed and synthesized by: (1) designing a nucleotide sequence comprising the long-chain double-stranded nucleic acid; (2) dividing each single strand of the double strands of the nucleotide sequence into a plurality of short-chain single-stranded nucleic acids, each single strand of said long-chain double-stranded nucleic acid being divided into n short-chain single-stranded nucleic acids; and (3) synthesizing the short-chain single-stranded nucleic acids.
6. The method of claim 5, wherein the long-chain double-stranded nucleic acid is the original long-chain double-stranded nucleic acid comprising no mutation and/or comprising a mutation.
7. The method of claim 6, wherein the mutation is a classical site mutation.
8. The method according to claim 5, wherein (1) the long-chain double-stranded nucleic acid comprises X mutation sites, X being an integer of 2 or more; (2) the numbers of mutation types at the X mutation sites are $p_1, p_2, p_3, \ldots, p_X$ respectively, wherein $p_1$ to $p_X$ are each independently selected from 2, 3 or 4; and (3) the number of types of long-chain double-stranded nucleic acid contained in the long-chain double-stranded nucleic acid library is $p_1 \times p_2 \times p_3 \times \cdots \times p_X$; wherein the X mutation sites are preferably distributed on two or more of the short-chain single-stranded nucleic acids that are not complementarily paired with one another.
9. The method of claim 5, wherein (1) the short-chain single-stranded nucleic acids on one single strand of the long-chain double-stranded nucleic acid comprise $x_1, x_2, x_3, x_4, \ldots, x_n$ mutation sites respectively, each of which is independently an integer greater than or equal to 0; (2) the mutation sites on the same (n-th) short-chain single-stranded nucleic acid comprise $p_1, p_2, p_3, \ldots, p_{x_n}$ mutation types respectively, wherein $p_1$ to $p_{x_n}$ are each independently selected from 2, 3, or 4; (3) the same short-chain single-stranded nucleic acid is capable of forming $P_n$ short-chain single-stranded nucleic acids having different sequences, $P_n = p_1 \times p_2 \times p_3 \times \cdots \times p_{x_n}$; and the number of types of long-chain double-stranded nucleic acid contained in the long-chain double-stranded nucleic acid library is $P_1 \times P_2 \times P_3 \times \cdots \times P_n$.
10. The method of claim 8 or 9, wherein the long-chain double-stranded nucleic acid library comprises $10^{10}$ or fewer types of long-chain double-stranded nucleic acid.
11. The method of claim 6, wherein the original long-chain double-stranded nucleic acid comprises one or more of a coding sequence for a protein, a regulatory sequence, and a coding region for a non-coding RNA.
12. The method of claim 11, wherein the protein is a wild-type protein or a mutant protein.
13. The method of claim 11, wherein the regulatory sequence is one or more of a promoter, an enhancer, a silencer, an insulator, and a response element.
14. The method of claim 6, wherein the original long-chain double-stranded nucleic acid comprises a nucleotide sequence encoding one or more of a reporter protein, a therapeutic protein, and a prophylactic protein.
15. The method of claim 14, wherein the original long-chain double-stranded nucleic acid comprises a nucleotide sequence encoding a reporter protein.
16. The method of claim 15, wherein the reporter protein is selected from one or more of fluorescent protein (FP), luciferase, beta-galactosidase (LacZ), chloramphenicol acetyltransferase (CAT), secreted alkaline phosphatase (SEAP), mCherry, mNeonGreen, HaloTag, SNAP-tag, and variants thereof.
17. The method of claim 16, wherein the original long-chain double-stranded nucleic acid comprises a nucleotide sequence encoding a fluorescent protein comprising the amino acid sequence set forth in SEQ ID NO. 16, and wherein the original long-chain double-stranded nucleic acid comprises the nucleotide sequence shown in SEQ ID NO. 1.
18. The method of claim 17, wherein the mutation is at one or more of F64, S65, Y66, S72, K79, Y145, N146, N149, M153, V163, I167, R168 and T203 of the amino acid sequence shown in SEQ ID NO. 16.
19. The method according to claim 17, wherein the mutation is selected from one or more of the following mutations occurring in the amino acid sequence shown in SEQ ID NO. 16: F64L, S65T, S65G, Y66H, Y66W, S72A, K79R, Y F, N146I, N149K, M153T, V163A, I167T, R H and T203Y.
20. The method of claim 5, wherein in step (2), the partitioning points in the two strands are offset from each other by at least 10 nucleotides.
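
The partitioning described in claims 5 and 20 can be illustrated with a minimal Python sketch. It assumes a fixed oligo length and a fixed stagger between the cut points of the two strands; the function names (`partition_duplex`, `min_cut_offset`), the 40-nt oligo length, the 20-nt stagger, and the toy sequence are illustrative assumptions, not values or procedures taken from the application.

```python
# Sketch only: split both strands of a duplex into short oligos whose cut
# points are staggered, so every break on one strand is bridged by an intact
# oligo on the other strand. All names and parameters here are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def cut_into_pieces(length: int, cuts: list[int]) -> list[tuple[int, int]]:
    """Turn interior cut positions into (start, end) windows covering 0..length."""
    bounds = [0, *cuts, length]
    return list(zip(bounds, bounds[1:]))


def partition_duplex(top: str, oligo_len: int = 40, offset: int = 20):
    """Split the top strand every `oligo_len` nt and the bottom strand at
    positions shifted by `offset` nt (top-strand coordinates)."""
    if not 0 < offset < oligo_len:
        raise ValueError("offset must lie strictly between 0 and oligo_len")
    n = len(top)
    top_cuts = list(range(oligo_len, n, oligo_len))
    bottom_cuts = list(range(offset, n, oligo_len))
    top_oligos = [top[a:b] for a, b in cut_into_pieces(n, top_cuts)]
    # Bottom oligos are listed in top-strand coordinate order, each written 5'->3'.
    bottom_oligos = [
        reverse_complement(top[a:b]) for a, b in cut_into_pieces(n, bottom_cuts)
    ]
    return top_cuts, bottom_cuts, top_oligos, bottom_oligos


def min_cut_offset(top_cuts: list[int], bottom_cuts: list[int]) -> int:
    """Smallest distance between any top-strand cut and any bottom-strand cut."""
    return min(abs(t - b) for t in top_cuts for b in bottom_cuts)


top = "ACGT" * 30  # toy 120-nt sequence, not a sequence from the application
top_cuts, bottom_cuts, top_oligos, bottom_oligos = partition_duplex(top)
print("top cuts:", top_cuts, "bottom cuts:", bottom_cuts)
print("minimum stagger:", min_cut_offset(top_cuts, bottom_cuts))
```

In this sketch each cut point corresponds to one break in the annealed intermediate, and the stagger check mirrors the requirement in claim 20 that partitioning points on the two strands be offset by at least 10 nucleotides.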

Description

Method for preparing mutation library

Technical Field

The application relates to the field of biomedicine, in particular to a method for preparing a mutation library.

Background

Mutant libraries, including mutant nucleic acid libraries and mutant protein libraries, play a critical role in bioscience research and applications. They not only provide powerful tools for exploring biodiversity and for functional genomics research, but also play a central role in fields such as drug discovery, protein engineering, disease mechanism research, and biotechnology applications. Using mutation libraries, scientists can identify key functional regions of genes, optimize protein performance, discover new drug targets, and develop more effective vaccines and antibodies. Mutant libraries are also the basis for high-throughput screening experiments, accelerating the identification of mutants with specific biological properties. In summary, mutation libraries are key drivers of life science research and biotechnology innovation, providing valuable information and materials for understanding the complexity of life processes and for developing new therapeutic strategies.

DNA shuffling is a commonly used method for preparing mutation libraries. It is a molecular biology method that mimics natural evolution by introducing random mutations, fragmenting the DNA, and then recombining the fragments to create new protein variants. The technique first constructs a DNA library containing a large number of mutations, then cleaves the DNA into small fragments and reassembles them by recombination, and finally selects mutants with the desired properties through a screening process.
Although DNA shuffling has significant advantages in improving protein function and stability and can accelerate the development of new proteins, it also has several drawbacks: the exact location and type of mutations are difficult to predict and control, which can produce a large number of ineffective mutations; screening is time-consuming and costly; the PCR assembly methods used in the assembly step require different primers to be designed; and the technical operations are complex, requiring specialized knowledge and equipment. These factors limit the efficiency and versatility of DNA shuffling in certain situations.

Because the outcome of DNA shuffling is uncertain, mutation libraries can instead be prepared using established classical mutations of a protein rather than random mutations. However, the more classical mutations are introduced, the larger the mutation library becomes; and although current DNA shuffling technology has made some progress in preparing multiple DNA variants, it still suffers from excessive time consumption, excessive cost, and the generation of random mutations. Therefore, finding new methods for preparing mutant libraries is of great importance for bioscience research.

Disclosure of Invention

The present application aims to obtain a large number of mutant proteins with potentially altered function by permutation and combination of existing classical mutation sites from the same protein. After the long-chain double-stranded nucleic acid library is obtained, the mutant proteins it encodes may be subjected to functional testing, thereby identifying mutant proteins with altered function.
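
The permutation-and-combination idea can be sketched in a few lines of Python. The example below takes four of the positions and substitutions listed in claim 19 (F64L, S65T, S65G, Y66H, Y66W, T203Y) and assumes that the count of "mutation types" at a site includes the wild-type residue, which is an interpretation rather than something the text states explicitly. The names `sites`, `library_size`, and `enumerate_variants` are hypothetical.

```python
# Sketch only: count and enumerate combinations of classical mutations.
# Assumption: each site's option list starts with the wild-type residue,
# so a site with two known substitutions contributes 3 options.
from itertools import product
from math import prod

# Position -> allowed residues (wild type first), drawn from claim 19.
sites = {
    64: ["F", "L"],
    65: ["S", "T", "G"],
    66: ["Y", "H", "W"],
    203: ["T", "Y"],
}


def library_size(options: dict[int, list[str]]) -> int:
    """Number of distinct variants: the product of the per-site option counts."""
    return prod(len(residues) for residues in options.values())


def enumerate_variants(options: dict[int, list[str]]):
    """Yield every combination as a {position: residue} mapping."""
    positions = sorted(options)
    for combo in product(*(options[p] for p in positions)):
        yield dict(zip(positions, combo))


variants = list(enumerate_variants(sites))
print("library size:", library_size(sites))
print("first variant (all wild type):", variants[0])
```

Under these assumptions the library size grows multiplicatively with each added site, which is why the claims express it as a product of per-site counts and cap it at $10^{10}$ types.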
The preparation method does not require random mutation to generate novel proteins, and it reduces the unpredictability of protein function, thereby reducing screening difficulty and improving the final yield of functional proteins.

The present application provides a method for synthesizing a long-chain double-stranded nucleic acid library from a short-chain single-stranded nucleic acid, said method being characterized by the presence of an intermediate during synthesis. The long-chain double-stranded nucleic acids in the long-chain double-stranded nucleic acid library are all homologous to the same original long-chain double-stranded nucleic acid, preferably having about 80% homology, more preferably about 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% homology.

In certain embodiments, the long-chain double-stranded nucleic acid library comprises two or more long-chain double-stranded nucleic acids.

In certain embodiments, the intermediate is a defective long-chain double-stranded nucleic acid, the defect being a break in the nucleic acid. The break refers to the lack of a phosphodiester bond between two adjacent nucleotides in the long-chain double-stranded nucleic acid