
CN-121992075-A - AAV single-molecule sequencing library construction method, application and kit


Abstract

The invention discloses a method for constructing an AAV single-molecule sequencing library, its application, and a kit, and belongs to the technical field of biomedical detection. To address the technical problem that the secondary structure of the AAV inverted terminal repeat (ITR) sterically hinders ligase, the invention uses a Cas9-RNP complex that specifically targets the ITR stem region to directionally cleave the single-stranded genome, removing the closed-loop hairpin structure while retaining a blunt-ended double-stranded DNA handle of 10-50 bp at the terminus. The method eliminates the steric hindrance to ligation and markedly improves adapter ligation efficiency. Sequencing requires no double-ended circularization, only single-ended ligation, so the method can capture complete genomes, truncated genomes and covalent head-to-tail concatemers without bias and can be used to accurately determine the physical packaging capacity limit of AAV vectors.

Inventors

  • ZHANG XIAOBING
  • LI GUOHUA
  • ZHANG JIANPING
  • CHENG TAO

Assignees

  • 中国医学科学院血液病医院(中国医学科学院血液学研究所)

Dates

Publication Date
20260508
Application Date
20260212

Claims (10)

  1. A method of constructing an adeno-associated virus (AAV) single-molecule sequencing library, the method utilizing a CRISPR-Cas9 ribonucleoprotein (RNP) complex to specifically remodel AAV ITR secondary structures to eliminate ligase steric hindrance, the method comprising the steps of: Step S1, extracting and purifying a single-stranded DNA (ssDNA) genome from AAV virus particles; Step S2, performing directional cleavage on the ssDNA genome using a Cas9-RNP complex, wherein the Cas9-RNP complex comprises Cas9 protein and an sgRNA specifically targeting the stem region of the AAV ITR hairpin structure, the cleavage removes the ITR closed-loop hairpin structure, and a blunt-ended double-stranded DNA handle of 10-50 bp is retained at the terminus as a ligase binding site; Step S3, subjecting the product of step S2 to protease treatment and thermal denaturation so as to forcibly dissociate the bound Cas9 protein from the DNA end and expose the blunt-ended double-stranded DNA handle; Step S4, performing end repair and dA-tailing on the blunt end released in step S3, and ligating a sequencing adapter to the retained blunt-ended double-stranded DNA handle under the action of T4 DNA ligase; and Step S5, purifying the ligation product and loading it onto a nanopore sequencing chip for single-molecule sequencing.
  2. The method of claim 1, wherein in step S2, the RNP complex comprises Cas9 protein and a specific sgRNA targeting the AAV inverted terminal repeat (ITR) sequence; the sgRNA guides Cas9 to make a double-strand cut at a specific site inside the ITR hairpin structure, remodeling the T-shaped hairpin structure of the ITR into a double-stranded DNA structure with blunt ends; and the blunt-ended double-stranded DNA retains a short double-stranded region of 10-50 bp at its ends.
  3. The method of claim 1, wherein the ligating step is independent of circularization of both ends of the DNA molecule, such that both truncated DNA and full-length genomic DNA comprising at least one of the blunt-ended double-stranded DNA handles can be ligated to the sequencing adapter and sequenced.
  4. The method of claim 3, wherein the target sequence of the sgRNA is located in the stem region of the AAV ITR hairpin structure, said target sequence being selected from the group consisting of: SEQ ID NO.1: GAGCGAGCGAGCGCGCAGAG; SEQ ID NO.2: GCGCTCGCTCGCTCACTG; or variant sequences having at least 90% identity to the above sequences and capable of achieving equivalent ITR cleavage and structural opening.
  5. The method of claim 1, further comprising a step of identifying covalent head-to-tail concatemers in the AAV vector: Step A (structural remodeling sequencing), performing the steps of claim 1, using Cas9-RNP to specifically cleave the ITR junctions between AAV genomes, to obtain a first sequencing data set; Step B (conformation-preserving sequencing), taking AAV genomes of the same batch and, without Cas9-RNP treatment, performing adapter ligation and sequencing directly after only thermal denaturation and renaturation, so as to preserve intermolecular or intramolecular covalent linkages, to obtain a second sequencing data set; and Step C (computer-aided feature extraction), comparing the first and second sequencing data sets in a data processing system, and identifying as covalent head-to-tail concatemers those reads that are detected in the second data set, are significantly reduced or absent in the first data set, and have a length that is an integer multiple of the designed length of the monomer genome.
  6. The method of claim 1, wherein the method is further used to determine the physical packaging capacity limit of a particular AAV capsid, the steps comprising: constructing a series of AAV vector libraries with progressively increasing insert lengths; sequencing the libraries using the method of claim 1 and calculating a genome integrity index for each vector; and determining the vector length threshold at which a nonlinear drop in the genome integrity index occurs as the physical packaging capacity limit of that AAV capsid.
  7. The method of claim 1, wherein the method is further used for quantitative detection of residual DNA impurities, the steps comprising: constructing a composite reference comprising an AAV vector sequence, a plasmid backbone sequence and a host genome sequence; masking the ITR regions and promoter regions in the composite reference that are homologous to the AAV vector, replacing the homologous bases with degenerate bases or placeholders; and aligning the sequencing reads to the masked composite reference, and quantitatively calculating the content of residual plasmid backbone impurities based on the specifically matching reads.
  8. A kit for constructing an AAV single-molecule sequencing library, comprising: a) a structural remodeling component comprising a Cas9 nuclease and at least one sgRNA that specifically targets the stem region of the AAV ITR hairpin structure and whose cleavage site is configured to produce a blunt-ended double-stranded DNA handle of 10-50 bp at the ITR end; b) an enzymatic digestion component comprising proteinase K and a denaturation buffer for forced dissociation of the Cas9 complex from the DNA ends; and c) a sequencing adapter component comprising an adapter compatible with a nanopore sequencing platform.
  9. The kit of claim 8, wherein the sgRNA is designed against conserved sequences of the AAV2 ITR to achieve universal structural remodeling of AAV vectors of different serotypes containing AAV2 ITRs, including AAV8, AAV9, or AAV-DJ; the target sequence of the sgRNA is located in the stem region of the AAV ITR hairpin structure, said target sequence being selected from the group consisting of: SEQ ID NO.1: GAGCGAGCGAGCGCGCAGAG; SEQ ID NO.2: GCGCTCGCTCGCTCACTG; or variant sequences having at least 90% identity to the above sequences and capable of achieving equivalent ITR cleavage and structural opening.
  10. Use of the method of any one of claims 1-7 for quality control of an AAV gene therapy vector, for one or more of: unbiased assessment of AAV vector genome integrity; identifying and quantifying covalent head-to-tail concatemers; precisely determining the upper limit of the physical packaging capacity of a specific AAV capsid, namely the packaging cliff; and high-sensitivity quantitative determination of residual plasmid backbone, helper plasmid and host DNA impurities.
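
The comparison described in Step C of claim 5 can be illustrated on read lengths alone. The Python below is a minimal sketch under stated assumptions, not the analysis disclosed in this document: the function names, the 5% length tolerance and the 5-fold reduction threshold are all hypothetical, and the fold threshold is only a stand-in for whatever test of "significantly reduced" a real pipeline would apply. A real analysis would presumably also inspect read alignments rather than lengths only.

```python
from collections import Counter


def multiplicity(read_len, monomer_len, tol=0.05):
    """Return k >= 1 if read_len is within tol * k * monomer_len of
    k * monomer_len, otherwise None. tol is an assumed tolerance."""
    k = round(read_len / monomer_len)
    if k < 1:
        return None
    if abs(read_len - k * monomer_len) <= tol * k * monomer_len:
        return k
    return None


def multiplicity_fractions(read_lengths, monomer_len, tol=0.05):
    """Fraction of all reads in a data set assigned to each multiplicity k."""
    counts = Counter()
    for n in read_lengths:
        k = multiplicity(n, monomer_len, tol)
        if k is not None:
            counts[k] += 1
    total = len(read_lengths)
    return {k: c / total for k, c in counts.items()} if total else {}


def candidate_concatemers(treated, untreated, monomer_len,
                          tol=0.05, min_fold_drop=5.0):
    """Return multiplicities k >= 2 that are present in the untreated
    (second) data set and absent, or reduced by at least min_fold_drop,
    in the Cas9-treated (first) data set."""
    f_treated = multiplicity_fractions(treated, monomer_len, tol)
    f_untreated = multiplicity_fractions(untreated, monomer_len, tol)
    hits = []
    for k, frac_u in sorted(f_untreated.items()):
        if k < 2:
            continue  # monomers are not concatemers
        frac_t = f_treated.get(k, 0.0)
        if frac_t == 0.0 or frac_u / frac_t >= min_fold_drop:
            hits.append(k)
    return hits


# Toy data with a hypothetical 4700 nt monomer design length.
untreated_lengths = [4700] * 80 + [9400] * 15 + [14100] * 5
treated_lengths = [4700] * 99 + [9400] * 1
print(candidate_concatemers(treated_lengths, untreated_lengths, 4700))  # [2, 3]
```

In the toy data, dimer-length reads fall from 15% to 1% of reads after Cas9 treatment and trimer-length reads disappear, so both multiplicities are flagged as candidates.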

Description

AAV single-molecule sequencing library construction method, application and kit

Technical Field

The invention belongs to the technical field of biomedical detection, and particularly relates to a method for constructing an AAV single-molecule sequencing library, its application, and a kit.

Background

Recombinant adeno-associated virus (rAAV) vectors have become a major delivery tool for in vivo gene therapy because of their high safety, wide tissue tropism, and ability to achieve long-term gene expression. With the rapid growth in the number of rAAV gene therapies on the market and in clinical pipelines, regulatory authorities in various countries (e.g., FDA and NMPA) place more stringent demands on product quality control; in particular, the detection of genome integrity and residual impurities has become a critical quality attribute (CQA) for batch release.

However, existing analytical methods have significant limitations. Traditional short-fragment methods such as qPCR, ddPCR and next-generation sequencing (NGS) rely mainly on amplicons, which detect only local sequences and can neither span the full-length genome nor accurately resolve the complex inverted terminal repeat (ITR) structures at both ends. These ITR sequences tend to form highly stable T-shaped hairpin secondary structures, which cause steric hindrance in enzymatic reactions such as ligation, making long-read sequencing libraries inefficient to construct and requiring large amounts of sample.

Although single-molecule long-read sequencing techniques such as PacBio and Nanopore have been introduced to overcome the above problems, existing commercial schemes still have detection blind spots and selection bias. PacBio SMRTbell library construction relies on double-ended hairpin ligation to form closed-loop templates, so molecules with damaged ends or atypical structures are filtered out; this "survivorship bias" overestimates vector integrity and masks the true proportion of truncated genomes. Conventional Nanopore methods rely on natural annealing to form double strands, but complete complementary strands anneal more readily than truncated ones, resulting in enrichment bias. Existing nanopore Cas9-targeted sequencing technologies (e.g., nCATS), meanwhile, are designed primarily for enrichment of long genomic fragments and are not optimized for the extreme secondary structure of AAV ITRs.

The prior art leaves two key physical barriers unaddressed:

  1. Cleavage end dilemma: conventional designs tend to avoid the high-GC region of the ITR, which either leaves a residual ITR stem that is too long, so the secondary structure is not fully opened, or creates blunt ends that are too short (< 10 bp) to provide the footprint required by T4 DNA ligase.
  2. Enzyme retention effect: Cas9 has a very slow turnover rate after cleavage and remains tightly anchored at the DNA ends, physically blocking the ligation of sequencing adapters. The lack of a targeted forced dissociation step in existing procedures results in extremely low ligation efficiency (< 1%) in AAV samples.

Therefore, a specific structural remodeling process is needed that can accurately eliminate the ITR secondary structure, retain a ligation handle of appropriate length, and overcome the steric hindrance caused by Cas9 itself.

Disclosure of Invention

The invention aims to provide a method for constructing an AAV single-molecule sequencing library, its application, and a kit, so as to solve the technical problems in existing AAV sequencing library construction techniques that are caused by the complex secondary structure of the ITR, including low ligation efficiency, large sample demand, and the terminal selection bias produced by dependence on double-ended circularization or annealing.

The invention provides the following technical scheme to achieve the above purpose:

In a first aspect, the invention provides a method of constructing an adeno-associated virus (AAV) single-molecule sequencing library, utilizing a CRISPR-Cas9 ribonucleoprotein (RNP) complex to specifically remodel AAV ITR secondary structures to eliminate ligase steric hindrance, comprising the steps of: Step S1, extracting and purifying a single-stranded DNA (ssDNA) genome from AAV virus particles; Step S2, performing directional cleavage on the ssDNA genome using a Cas9-RNP complex, wherein the Cas9-RNP complex comprises Cas9 protein and an sgRNA specifically targeting the stem region of the AAV ITR hairpin structure, the cleavage removes the ITR closed-loop hairpin structure, and a blunt-ended double-stranded DNA handle of 10-50 bp is retained at the terminus as a ligase binding site; Step S3, subjecting the product of step S2 to protease treatment and thermal denaturation so as to forcibly dissociate the bound Cas9 protein from the DNA end and expose the blunt-end