CN-122012682-A - Method, kit and server for identifying gentian medicinal material species in mixture


Abstract

The invention discloses a method, a kit and a server for identifying gentian medicinal material species in a mixture. The optimized lysate formula significantly improves the quality of DNA extracted from complex mixtures. The increased beta-mercaptoethanol content strengthens the reducing capacity of the lysate and prevents oxidative damage to DNA during extraction. The added proteinase K improves protein digestion and reduces the entrapment and contamination of DNA by protein impurities. PVP adsorbs polyphenols in the mixture, avoiding the degradation caused when polyphenols bind to DNA. Moderately increased CTAB and salt concentrations, together with stronger metal-ion chelation, allow the lysate to cope better with the interfering components of complex matrices, while the adjusted Tris-HCl concentration stabilizes the pH of the solution so that DNA remains stable during extraction. Together, these changes markedly improve both the purity and the integrity of the extracted DNA, providing a high-quality template for subsequent species identification.

Inventors

LIU JINXIN
TIAN YU
Zang Erhuan
MA TINGYU
Ali Mu Si
XU WEIWEI
WANG LIFAN
WANG SAIYI

Assignees

Minzu University of China (中央民族大学)

Dates

Publication Date: 20260512
Application Date: 20251231

Claims (10)

1. A method for identifying gentian medicinal material species in a mixture, characterized by comprising the following steps: S1, extracting DNA from the mixture using a modified CTAB lysate, wherein the total amount of extracted DNA is more than 1 microgram; S2, constructing a PCR-free library from the DNA extracted in S1, controlling the library fragment size to 270 bp, and sequencing on a high-throughput platform; S3, checking the quality of the sequencing results obtained in S2 with FastQC, and filtering and trimming with Trimmomatic or fastp; S4, merging the quality-controlled data from S3 with FLASH or PEAR to obtain a merged.fastq file; S5, collecting ITS2 and psbA-trnH sequences of gentian plants to construct k-mer libraries for k = 31 to 77, constructing an ultra-lightweight reference database based on 95% sequence similarity, and extracting gentian sequencing reads from the merged.fastq file obtained in S4; S6, performing mixed assembly of the sequencing reads extracted in S5 with MEGAHIT or metaSPAdes; S7, annotating the assembly from S6 with an HMM-based core DNA barcode region annotation tool to obtain the core DNA barcode regions; S8, performing OTU clustering at 97%-99% similarity and removing redundant sequences to obtain unique OTUs; S9, mapping the gentian reads extracted in S5 to the unique OTUs obtained in S8 by read mapping, calculating the credibility of each OTU, and removing low-credibility OTUs based on coverage and sequencing depth; and S10, comparing the OTUs retained in S9 with standard gentian ITS2 and psbA-trnH sequences by BLAST, genetic distance and phylogenetic tree methods to determine whether each OTU belongs to a gentian species, thereby completing the final identification.
2. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S1 the lysate is formulated with 200 mM Tris-HCl, 150 mM EDTA, 2.0 M NaCl, 2.5% CTAB, 3% Tween-20, 2% SDS and 2% PVP-40, and 1.5% beta-mercaptoethanol, 0.2 mg/mL proteinase K and 10 μg/mL RNase A are added before use.
3. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S1 the DNA extraction steps are as follows: S1-1, taking 5 g of a sample of a gentian-containing Chinese patent medicine, placing it in a 50 mL sterile centrifuge tube, adding 40 mL of PA reagent, shaking in a constant-temperature water-bath shaker at 180 rpm and 65 ℃ for 10 min until the sample is completely mixed, centrifuging at 12000 rpm for 20 min, and removing the supernatant to obtain a precipitate; S1-2, adding 4 mL of PA reagent to the 50 mL centrifuge tube, vortexing for 30 s to resuspend the pellet, transferring to a 5 mL sterile centrifuge tube, centrifuging at 12000 rpm for 5 min, and discarding the supernatant; wherein the PA reagent comprises 25 mM EDTA (pH 8.0), 100 mM Tris-HCl (pH 8.0), 200 mM NaCl and 2% PVP-40; S1-3, adding 4 mL of PA reagent to the 5 mL centrifuge tube, vortexing for 30 s to resuspend the pellet, transferring to a 5 mL sterile centrifuge tube, centrifuging at 12000 rpm for 5 min, and discarding the supernatant; S1-4, preparing the lysate according to the above formula, and adding beta-mercaptoethanol (final concentration 1.5%), proteinase K (final concentration 0.2 mg/mL) and RNase (final concentration 10 μg/mL) before use; S1-5, adding 3 mL of lysate to the centrifuge tube, resuspending the pellet, adding 2 steel balls, grinding in a ball mill at 30 Hz for 2 min, removing the steel balls, and storing the mixture at -20 ℃ until DNA extraction; S1-6, incubating at 65 ℃ for 2 h, centrifuging at 12000 rpm for 5 min, and transferring the supernatant into two new 2 mL centrifuge tubes, tube A and tube B, about 800 μL per tube; S1-7, adding to each tube an equal volume (about 800 μL) of ice-cold isopropanol, 1/10 volume of CB (3 M sodium acetate) and 10 μL of magnetic beads, mixing by vortexing, and standing at -20 ℃ for 30 min; S1-8, performing magnetic separation, removing the supernatant, adding 600 μL of 80% ethanol to each tube for washing, vortexing for 30 s, and removing the supernatant by magnetic separation; S1-9, adding 600 μL of 75% ethanol to each tube for washing, vortexing for 30 s, and removing the supernatant by magnetic separation; S1-10, opening the caps and air-drying on a clean bench, adding 100 μL of double distilled water to tube A, pipetting to mix, standing at room temperature for 7 min, and performing magnetic separation; S1-11, transferring the supernatant from tube A to tube B, pipetting to mix, standing at room temperature for 7 min, and after magnetic separation transferring the supernatant to a new centrifuge tube and storing it at -20 ℃.
4. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S1 the self-developed modified CTAB lysate improves reducing capacity through an increased beta-mercaptoethanol content, includes proteinase K for enhanced protein digestion and PVP for adsorbing polyphenols, and has moderately increased CTAB and NaCl concentrations, so that interference from the complex matrix in the mixture can be effectively countered; during extraction, the sample is pretreated with PA reagent to remove impurities, cells are disrupted by ball-mill grinding, and high-quality DNA is finally obtained by magnetic bead purification.
5. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S3, quality metrics of the sequencing data such as base quality and GC content are comprehensively evaluated with FastQC, and low-quality bases and adapter sequences are removed with Trimmomatic or fastp to obtain clean sequencing data.
6. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S4, FLASH or PEAR is used to merge the paired-end sequencing data, combining the two end reads from the same DNA fragment into a single longer sequence to form the merged.fastq file.
7. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S5, ITS2 and psbA-trnH sequences of gentian are collected, k-mer libraries for k = 31 to 77 are constructed, an ultra-lightweight reference database is established based on 95% sequence similarity, and sequencing reads belonging to gentian are extracted from the merged sequences.
8. The method for identifying gentian medicinal material species in a mixture according to claim 1, wherein in step S6, MEGAHIT or metaSPAdes is used for mixed assembly of the extracted gentian sequencing reads, joining short sequences into longer contiguous fragments and providing a complete sequence basis for the subsequent annotation of core DNA barcode regions.
9. A kit for identifying gentian medicinal material species in a mixture, characterized in that the kit comprises the CTAB lysate used in step S1 of claim 1 together with other conventional reagents, including double distilled water, phenol:chloroform:isoamyl alcohol (25:24:1), chloroform:isoamyl alcohol (24:1), isopropanol, sodium acetate, nano magnetic beads and a detergent.
10. A server for identifying gentian species in a mixture, wherein the server performs all of steps S3 to S10 of claim 1.
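
Illustrative sketch (not part of the claims). Steps S5 and S9 of claim 1 can be pictured with the minimal Python sketch below: reads are kept when they share k-mers with the reference barcode sequences, and OTUs are kept when read mapping gives them enough coverage and depth. All function names, the min_hits parameter and the min_coverage and min_depth thresholds are hypothetical; the claims do not state cutoff values, and an actual server would presumably rely on dedicated tools rather than this pure-Python logic. Alignment intervals are assumed to be 0-based and half-open.

```python
from typing import Dict, Iterable, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_kmer_library(references: Iterable[str], k: int) -> Set[str]:
    """Collect k-mers from both strands of every reference sequence."""
    library: Set[str] = set()
    for ref in references:
        library |= kmers(ref, k)
        library |= kmers(reverse_complement(ref), k)
    return library


def extract_reads(reads: Iterable[str], library: Set[str], k: int,
                  min_hits: int = 1) -> List[str]:
    """Keep reads sharing at least min_hits distinct k-mers with the library.

    min_hits is a hypothetical parameter, not a value from the patent.
    """
    return [r for r in reads if len(kmers(r, k) & library) >= min_hits]


def coverage_and_depth(otu_length: int,
                       alignments: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (fraction of positions covered, mean depth) for one OTU.

    Each alignment is a 0-based, half-open (start, end) interval on the OTU.
    """
    if otu_length <= 0:
        raise ValueError("otu_length must be positive")
    depth = [0] * otu_length
    for start, end in alignments:
        for pos in range(max(start, 0), min(end, otu_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return covered / otu_length, sum(depth) / otu_length


def filter_otus(otus: Dict[str, Tuple[int, List[Tuple[int, int]]]],
                min_coverage: float, min_depth: float) -> List[str]:
    """Return names of OTUs meeting both (hypothetical) thresholds."""
    kept = []
    for name, (length, alignments) in otus.items():
        coverage, mean_depth = coverage_and_depth(length, alignments)
        if coverage >= min_coverage and mean_depth >= min_depth:
            kept.append(name)
    return kept
```

In practice the library would be built once per k in the claimed range of 31 to 77 and the merged reads streamed from merged.fastq; the short k used when experimenting with this sketch is only for readability.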

Description

Method, kit and server for identifying gentian medicinal material species in mixture

Technical Field

The invention belongs to the technical field of medicine analysis, and particularly relates to a method, a kit and a server for identifying gentian medicinal material species in a mixture.

Background

Gentianae Radix et Rhizoma is the dried root and rhizome of Gentiana manshurica Kitag., Gentiana scabra Bge., Gentiana triflora Pall. or Gentiana rigescens Franch. The first three are known as "gentian" (Longdan), and the last is known as "Jian Longdan". Currently, more than 100 Chinese patent medicines containing gentian are on the market, such as Danggui Longhui pills and Longdan Xiegan pills, and in most cases the gentiopicroside content is used as the quality control index. Taking Longdan Xiegan pills as an example: under "Identification", the product is tested by thin-layer chromatography (General Rule 0502); gentiopicroside reference substance is dissolved in methanol to prepare a reference solution containing 0.5 mg per 1 mL, and the sample chromatogram must show spots of the same color at the positions corresponding to those in the reference chromatogram. Under "Assay", determined by high-performance liquid chromatography (General Rule 0512), each 1 g of the product must contain not less than 0.80 mg of gentiopicroside (C16H20O9). However, gentiopicroside is widely distributed in gentian plants: besides the 4 source plants specified in the Chinese Pharmacopoeia, it also occurs in Gentiana veitchiorum Hemsl., Gentiana lawrencei var. farreri (Balf. f.) T. N. Ho, the confamilial plant Swertia mussotii Franch., Incarvillea arguta (Royle) Royle, and the like.
Therefore, it is difficult to accurately identify the gentian medicinal material species in a mixture on the basis of gentiopicroside. In molecular identification, when a complex mixture is treated with conventional CTAB lysate, the reducing capacity, protein digestion capacity and polyphenol adsorption capacity are insufficient; the detergent, salt concentration and metal chelating capacity are inadequate to deal with complex matrix interference; pH stability is poor; and the quality of the extracted DNA is low. The commonly used 350 bp library construction is prone to failure because of DNA degradation, and the fragments are too short after merging, which affects barcode assembly accuracy.

Disclosure of Invention

The invention aims to solve the above problems and provides a method, a kit and a server for identifying gentian medicinal material species in a mixture. The technical scheme adopted by the invention is as follows. The method for identifying the gentian medicinal material species in the mixture comprises the following steps:

S1, extracting DNA from the mixture using the self-developed modified CTAB lysate, wherein the total amount of extracted DNA is more than 1 microgram, providing a qualified template for the subsequent library construction step;

S2, constructing a PCR-free library from the DNA extracted in S1, controlling the library fragment size to 270 bp, and sequencing on a high-throughput platform, wherein the sequencing data are used in the subsequent data quality control and analysis steps;

S3, checking the quality of the sequencing results obtained in S2 with FastQC, and filtering and trimming with Trimmomatic or fastp, wherein the resulting clean data enter the merging step;

S4, merging the quality-controlled data from S3 with FLASH or PEAR to obtain a merged.fastq file, wherein the file is used in the subsequent gentian read extraction step;

S5, collecting ITS2 and psbA-trnH sequences of gentian plants to construct k-mer libraries for k = 31 to 77, constructing an ultra-lightweight reference database based on 95% sequence similarity, and extracting gentian sequencing reads from the merged.fastq file obtained in S4, which reduces the computational resources needed for subsequent analysis, wherein the extracted reads are used in the mixed assembly step;

S6, performing mixed assembly of the sequencing reads extracted in S5 with MEGAHIT or metaSPAdes, wherein the assembly is used for annotating the core DNA barcode regions;

S7, annotating the assembly from S6 with an HMM-based core DNA barcode region annotation tool to obtain the core DNA barcode regions, wherein the annotation results enter the OTU clustering step;

S8, performing OTU clustering at 97%-99% similarity and removing redundant sequences to obtain unique OTUs, wherein the OTUs are used in the credibility calculation step;

S9, mapping the gentian reads extracted in S5 to the unique OTUs obtained in S8 by using a read mapping