CN-121975924-A - Barcode sequence for nanopore sequencing, ligase and application thereof
Abstract
The invention discloses a barcode sequence for nanopore sequencing, a ligase, and applications thereof. First, a barcode sequence for nanopore sequencing is disclosed, consisting of an internal barcode sequence with complementary replaceable regions, corresponding to SEQ ID NO.1 and SEQ ID NO.2, and an external barcode sequence, corresponding to SEQ ID NO.3 and SEQ ID NO.4. The invention further discloses an engineered T4 DNA ligase with the sequence shown as SEQ ID NO.194, and the use of the engineered T4 DNA ligase and the barcode sequence in nanopore sequencing library construction. According to the invention, the barcode sequences are combined in a pre-ligation step to obtain a dual barcode product, which greatly expands the number of available barcodes; meanwhile, the engineered T4 DNA ligase improves the efficiency of ligating the dual barcode product to a substrate, thereby significantly reducing sequencing cost and improving sequencing efficiency.
Inventors
- YAN SHUPENG
- LV MENGYANG
- SONG XINWEN
- GENG LIANG
- XIN WEN
Assignees
- 北京全式金生物技术股份有限公司
- 北京全式金生物工程技术研究有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-02-11
Claims (10)
- 1. A barcode sequence for nanopore sequencing, characterized in that the barcode sequence consists of an internal barcode sequence and an external barcode sequence: the internal barcode sequence corresponds to SEQ ID NO.1 and SEQ ID NO.2, wherein positions 1 to 24 of SEQ ID NO.1 and positions 8 to 31 of SEQ ID NO.2 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.1 further include the sequences shown as positions 9 to 32 of SEQ ID NO.3 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.2 further include the sequences shown as positions 1 to 24 of SEQ ID NO.4 and SEQ ID NO.99 to 192; the external barcode sequence corresponds to SEQ ID NO.3 and SEQ ID NO.4, wherein positions 9 to 32 of SEQ ID NO.3 and positions 1 to 24 of SEQ ID NO.4 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.3 further include the sequences shown as positions 1 to 24 of SEQ ID NO.1 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.4 further include the sequences shown as positions 8 to 31 of SEQ ID NO.2 and SEQ ID NO.99 to 192.
- 2. An extended barcode for nanopore sequencing, wherein the extended barcode is composed of an internal extended barcode and an external extended barcode, wherein the internal extended barcode is formed by assembling an internal barcode sequence, and the external extended barcode is formed by assembling an external barcode sequence; the internal barcode sequence corresponds to SEQ ID NO.1 and SEQ ID NO.2, wherein positions 1 to 24 of SEQ ID NO.1 and positions 8 to 31 of SEQ ID NO.2 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.1 further include the sequences shown as positions 9 to 32 of SEQ ID NO.3 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.2 further include the sequences shown as positions 1 to 24 of SEQ ID NO.4 and SEQ ID NO.99 to 192; the external barcode sequence corresponds to SEQ ID NO.3 and SEQ ID NO.4, wherein positions 9 to 32 of SEQ ID NO.3 and positions 1 to 24 of SEQ ID NO.4 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.3 further include the sequences shown as positions 1 to 24 of SEQ ID NO.1 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.4 further include the sequences shown as positions 8 to 31 of SEQ ID NO.2 and SEQ ID NO.99 to 192.
- 3. An engineered T4 DNA ligase for nanopore sequencing, wherein the amino acid sequence of the engineered T4 DNA ligase is shown in SEQ ID NO.194.
- 4. Use of the barcode sequence of claim 1, the extended barcode of claim 2 and/or the engineered T4 DNA ligase of claim 3 in nanopore sequencing library construction and/or in the preparation of a product for nanopore sequencing library construction.
- 5. A library construction kit for nanopore sequencing, comprising the extended barcode of claim 2 and a barcode ligation reagent comprising a barcode ligase cocktail and a barcode ligation reaction buffer, wherein the barcode ligase cocktail comprises the engineered T4 DNA ligase of claim 3.
- 6. The library construction kit of claim 5, wherein the concentration of each of the internal and external extended barcodes is 1-2.5 μM; preferably, the final concentration of the engineered T4 DNA ligase in the barcode ligase cocktail is 100-200 ng/μL.
- 7. The library construction kit of claim 5, wherein the barcode ligase cocktail further comprises Tris-HCl at pH 7.5, KCl, EDTA, DTT and glycerol, and the barcode ligation reaction buffer comprises Tris-HCl at pH 7.5, MgCl2, KCl, DTT, ATP and polyethylene glycol 6000; preferably, in the barcode ligase cocktail, the final concentration of Tris-HCl at pH 7.5 is 5-20 mM, of KCl 40-60 mM, of EDTA 0.05-0.2 mM, of DTT 0.5-2 mM, and of glycerol 40-60%; more preferably, in the barcode ligation reaction buffer, the final concentration of Tris-HCl at pH 7.5 is 50-150 mM, of MgCl2 10-50 mM, of KCl 10-25 mM, of DTT 10-50 mM, of ATP 2.5-7.5 mM, and of polyethylene glycol 6000 25-50%.
- 8. The library construction kit of claim 5, further comprising an end repair reagent, an extended barcode pre-ligation reagent, and a sequencing adapter ligation reagent.
- 9. A method for nanopore sequencing library construction, the method comprising the steps of: performing end repair on the substrate DNA, adding an A tail at the 3' end and phosphorylating the 5' end to obtain an end repair product; pre-ligating the internal extended barcode and the external extended barcode to obtain a dual barcode product; ligating the end repair product to the dual barcode product to obtain a barcode ligation product; and ligating the barcode ligation product to a sequencing adapter to obtain a nanopore sequencing library ready for loading onto the sequencer.
- 10. The method of claim 9, further comprising a purification step after the ligation of the end repair product and the dual barcode product; preferably, the method further comprises a purification step after the ligation reaction of the barcode ligation product and the sequencing adapter.
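The replaceable-region scheme in claims 1 and 2 can be illustrated with a minimal Python sketch. It assumes that "complementary" means reverse-complementary when both strands are written 5' to 3'. The flanking sequences, the template lengths, and the function names are hypothetical placeholders, since the actual SEQ ID NO.1 and SEQ ID NO.2 sequences are given only in the sequence listing.

```python
# Hypothetical sketch of swapping a 24-nt barcode into the replaceable regions
# of the two internal barcode strands (claims 1 and 2).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def swap_region(template: str, start: int, end: int, new_region: str) -> str:
    """Replace 1-based inclusive positions start..end of template."""
    if len(new_region) != end - start + 1:
        raise ValueError("replacement must match the region length")
    return template[: start - 1] + new_region + template[end:]


# Placeholder templates, NOT the real sequences: N marks the replaceable
# region, and the 7-nt flanks are invented for illustration only.
TOP_TEMPLATE = "N" * 24 + "ACGTACG"      # positions 1-24 replaceable
BOTTOM_TEMPLATE = "CGTACGT" + "N" * 24   # positions 8-31 replaceable


def build_inner_pair(barcode: str) -> tuple[str, str]:
    """Place a barcode in the top strand and its reverse complement in the
    bottom strand, so the two replaceable regions stay complementary."""
    top = swap_region(TOP_TEMPLATE, 1, 24, barcode)
    bottom = swap_region(BOTTOM_TEMPLATE, 8, 31, reverse_complement(barcode))
    return top, bottom


top, bottom = build_inner_pair("AAAACCCCGGGGTTTTAAAACCCC")
print(top)
print(bottom)
```

The same two helpers would apply to the external barcode strands by changing the positions to 9-32 and 1-24, again under the assumptions stated above.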
Description
Barcode sequence for nanopore sequencing, ligase and application thereof
Technical Field
The invention relates to the technical field of nanopore sequencing, and more particularly to a barcode sequence for nanopore sequencing, a ligase, and applications thereof.
Background
Nanopore sequencing is a single-molecule, real-time sequencing technique that reads a sequence from the changes in electrical signal generated as nucleic acid molecules pass through a nanopore. It offers high throughput, long read length and low cost, and can directly detect epigenetic modifications on nucleic acids, such as 5-methylcytosine (5-mC). As with second-generation sequencing, to sequence multiple samples simultaneously in the same flow cell, a different barcode must be attached to each sample before sequencing.
The mainstream library construction kits currently on the market provide at most 96 barcodes, for example the amplification-free barcode sequencing kit (SQK-NBD114.96) from Oxford Nanopore Technologies (ONT) in the UK. If more samples need to be sequenced simultaneously, the only option is dual barcoding, which attaches two barcodes to each sample. This method first adds an inner barcode by PCR and then attaches an outer barcode by ligation. Although dual barcoding greatly expands the number of barcodes, the library construction process is cumbersome, epigenetic modification information can be lost during PCR, and the method is unsuitable for samples with long sequences such as genomic DNA. There is therefore a need for a new barcode expansion strategy and library construction method that does not depend on PCR, for simultaneous sequencing of a large number of samples.
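To put the 96-barcode limit in perspective, the sketch below counts the barcode pairs that the claimed replaceable regions could yield. Claim 1 lists 96 candidate sequences for each replaceable region (positions 1 to 24 of SEQ ID NO.1, positions 9 to 32 of SEQ ID NO.3, and SEQ ID NO.5 to 98). The count assumes that any inner barcode may be paired with any outer barcode, including the same sequence in both positions, which this section does not state explicitly. The labels and function name are illustrative only.

```python
from itertools import product

# Labels only: the actual 24-nt sequences are in the patent's sequence listing.
# Per claim 1, each replaceable region can hold one of these 96 sequences.
BARCODE_POOL = ["SEQ1:1-24", "SEQ3:9-32"] + [f"SEQ{n}" for n in range(5, 99)]


def dual_barcode_ids(inner_pool, outer_pool):
    """Return every (inner, outer) label pair.

    Assumption: all pairings are allowed, including identical labels.
    """
    return list(product(inner_pool, outer_pool))


pairs = dual_barcode_ids(BARCODE_POOL, BARCODE_POOL)
print(len(BARCODE_POOL), "single barcodes ->", len(pairs), "dual barcode pairs")
```

Under that assumption, 96 choices per position give 96 × 96 = 9216 pairs; if identical inner and outer barcodes were excluded, the count would drop to 96 × 95 = 9120.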
Disclosure of Invention
A first object of the present invention is to provide a barcode sequence for nanopore sequencing that can expand the number of barcodes without relying on PCR, enabling simultaneous sequencing of a large number of samples.
A second object of the present invention is to provide an engineered T4 DNA ligase for nanopore sequencing that improves the efficiency of ligating dual barcode products to substrates.
A third object of the present invention is to provide the use of the above barcode sequence and engineered T4 DNA ligase in nanopore sequencing library construction.
To achieve the above objects, the invention adopts the following technical solutions:
In a first aspect, the present invention provides a barcode sequence for nanopore sequencing, the barcode sequence consisting of an internal barcode sequence and an external barcode sequence, the number of barcodes being expanded by combining the replaceable regions of the internal barcode sequence and the external barcode sequence; the internal barcode sequence corresponds to SEQ ID NO.1 and SEQ ID NO.2, wherein positions 1 to 24 of SEQ ID NO.1 and positions 8 to 31 of SEQ ID NO.2 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.1 further include the sequences shown as positions 9 to 32 of SEQ ID NO.3 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.2 further include the sequences shown as positions 1 to 24 of SEQ ID NO.4 and SEQ ID NO.99 to 192; the external barcode sequence corresponds to SEQ ID NO.3 and SEQ ID NO.4, wherein positions 9 to 32 of SEQ ID NO.3 and positions 1 to 24 of SEQ ID NO.4 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.3 further include the sequences shown as positions 1 to 24 of SEQ ID NO.1 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.4 further include the sequences shown as positions 8 to 31 of SEQ ID NO.2 and SEQ ID NO.99 to 192.
In a second aspect, the invention provides an extended barcode for nanopore sequencing, the extended barcode consisting of an internal extended barcode (Inner Expansion Barcode, IEB) and an external extended barcode (Outer Expansion Barcode, OEB), wherein the internal extended barcode is formed by assembling an internal barcode sequence and the external extended barcode is formed by assembling an external barcode sequence; the internal barcode sequence corresponds to SEQ ID NO.1 and SEQ ID NO.2, wherein positions 1 to 24 of SEQ ID NO.1 and positions 8 to 31 of SEQ ID NO.2 are complementary replaceable regions, the sequences available for the replaceable region of SEQ ID NO.1 further include the sequences shown as positions 9 to 32 of SEQ ID NO.3 and SEQ ID NO.5 to 98, and the sequences available for the replaceable region of SEQ ID NO.2 further include the sequences shown as positions 1 to 24 of SEQ ID NO.4