CN-122024811-A - Method and device for designing specific primers of pathogenic microorganisms

Abstract

The invention belongs to the technical field of bioinformatics and discloses a method and a device for designing specific primers for pathogenic microorganisms. The method takes an original design approach. It first surveys the genomes with k-mers to obtain the conserved sequence regions of a target species. From the coordinate positions of these regions on a reference genome and the intended length of the amplification product, it then derives conserved upstream and downstream primer design regions, without requiring the amplified region itself to be sequence-consistent. Next, it takes the overlap of these regions with the specific sequence regions, which yields primer design regions that satisfy both conservation and specificity. Primer design software outputs candidate primers that meet the requirements, and the candidates are evaluated for coverage and specificity to obtain the specific primers of the target species. The method offers high analysis speed and high precision, is compatible with pathogen genomes of differing characteristics, and improves the success rate of primer design.
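The region logic in the abstract can be sketched as interval arithmetic on reference-genome coordinates. The following is a minimal illustration, not the patented implementation: the function names `intersect` and `pair_regions`, the half-open interval convention, and the fixed `primer_len` of 20 bp are all assumptions made for this example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on the reference genome (assumed convention)


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Overlap of two sorted, non-overlapping interval lists."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def pair_regions(regions, min_amplicon, max_amplicon, primer_len=20):
    """Pair conserved regions that could flank a product of the wanted length.

    primer_len is a hypothetical fixed primer size used only to bound
    the shortest and longest product each pair of regions allows.
    """
    pairs = []
    for up in regions:
        for down in regions:
            if down[0] < up[0]:
                continue
            if up[1] - up[0] < primer_len or down[1] - down[0] < primer_len:
                continue
            # Shortest product: forward primer at the right edge of `up`,
            # reverse primer at the left edge of `down`.
            shortest = max(2 * primer_len,
                           (down[0] + primer_len) - (up[1] - primer_len))
            # Longest product: primers at the outer edges of both regions.
            longest = down[1] - up[0]
            if shortest <= longest and shortest <= max_amplicon and longest >= min_amplicon:
                pairs.append((up, down))
    return pairs


# Illustrative coordinates only; they do not come from the patent.
conserved = [(0, 40), (120, 160), (900, 940)]
specific = [(10, 200)]
for up, down in pair_regions(conserved, 100, 200):
    # Final design regions must be both conserved and specific.
    print(intersect([up], specific), intersect([down], specific))
```

Because the pairing only looks at coordinates, the sequence between the two regions does not need to be conserved, which matches the premise stated in the abstract.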

Inventors

  • SHU JINGCHAO
  • XUE SIMING
  • LIU SANJIANG
  • LU JIANG
  • LI LULU
  • DONG CHAO
  • ZHANG XIAOLIANG
  • ZHANG RUIFENG

Assignees

  • 郑州安图生物工程股份有限公司

Dates

Publication Date
20260512
Application Date
20260116

Claims (10)

  1. A method for designing specific primers for a pathogenic microorganism, characterized by comprising the following steps: obtaining all genome files of the target species; obtaining the conserved sequence regions of the target species; obtaining the conserved upstream and downstream primer design regions of the target species; obtaining the specific sequence regions of the target species; obtaining the upstream and downstream primer design regions of the target species; obtaining candidate primers for the target species; and obtaining the specific primers for the target species.
  2. The method according to claim 1, wherein the conserved sequence regions of the target species are obtained by: breaking the reference genome sequence of the target species into short sequences with a length of 15-25 bp and a step of at most 3 bp; aligning each short sequence to all genome sequences of the target species; retaining the short sequences that align with a sequence identity of at least 95% and align to more than 90% of the genomes of the target species; and merging the short sequences with contiguous coordinate positions to obtain the conserved sequence regions of the target species.
  3. The method according to claim 1, wherein the conserved upstream and downstream primer design regions of the target species are obtained by: analyzing all conserved sequence regions of the target species according to their coordinate positions on the reference genome and the designed length of the amplification product, to obtain upstream and downstream primer design regions that meet the amplification product length requirement, namely the conserved upstream primer design region and the conserved downstream primer design region of the target species.
  4. The method according to claim 1, wherein the specific sequence regions of the target species are obtained by: breaking the reference genome sequence of the target species into short sequences with a length of 50-150 bp and a step of at most 3 bp; aligning each short sequence to all genome sequences of non-target species; retaining the short sequences that align with a sequence identity of at most 90%; and merging the short sequences with contiguous coordinate positions to obtain the specific sequence regions of the target species.
  5. The method according to claim 1, wherein the upstream and downstream primer design regions of the target species are obtained by: taking the overlap between the conserved upstream primer design region and the specific sequence region of the target species to obtain the upstream primer design region, and combining it with the conserved downstream primer design region to obtain the upstream and downstream primer design regions of the target species; or taking the overlap between the conserved upstream and downstream primer design regions and the specific sequence regions of the target species to obtain the upstream and downstream primer design regions of the target species.
  6. The method according to claim 1, wherein the candidate primers of the target species are obtained by: using the reference genome sequence of the target species as the template, together with the upstream and downstream primer design regions of the target species, and designing primers with primer design software under the following parameters: primer length 15-30 bp, Tm 50-70 °C, GC content 40-60%, and amplification product length 100-200 bp, to obtain the candidate primers of the target species.
  7. The method according to claim 1, wherein the specific primers of the target species are obtained by: evaluating the candidate primers of the target species, the evaluation comprising a species coverage evaluation and a primer specificity evaluation, and obtaining the specific primers of the target species according to requirements.
  8. The method of claim 7, wherein the species coverage evaluation is performed by: using all genomes of the target species as templates, performing simulated PCR with the candidate primers, aligning the candidate primer sequences to all genome sequences of the target species, and retaining, as primers that pass the species coverage evaluation, the candidate primers that align with a sequence identity of at least 95%, have no mismatch within at least the 5 consecutive bases at the 3' end of the primer, yield an amplification product of the required length, and align to more than 90% of the genomes of the target species; and/or the primer specificity evaluation is performed by: using the reference genome of the target species as the template, performing simulated PCR with the candidate primers, aligning the simulated PCR product sequences to all genome sequences of non-target species, and retaining, as primers that pass the primer specificity evaluation, the candidate primers whose products align with a sequence identity of at most 90%.
  9. An apparatus dedicated to carrying out the method for designing specific primers for a pathogenic microorganism according to any one of claims 1 to 8, comprising the following modules: a database building module for obtaining all genome files of the target species; a sequence analysis module for obtaining the following regions: (a) the conserved sequence regions of the target species; (b) the conserved upstream and downstream primer design regions of the target species; (c) the specific sequence regions of the target species; (d) the upstream and downstream primer design regions of the target species; a primer design module for obtaining candidate primers for the target species; and a primer evaluation module for obtaining the specific primers of the target species.
  10. Primers designed using the method for designing specific primers for a pathogenic microorganism according to any one of claims 1 to 8 or the apparatus according to claim 9, characterized by being selected from one or more pairs of primers designed for one or more of the following pathogenic microorganisms: (1) Staphylococcus aureus, with specific primer nucleotide sequences shown in SEQ ID NO. 1-2, 3-4, 5-6 and 7-8; (2) human coronavirus 229E, with specific primer nucleotide sequences shown in SEQ ID NO. 9-10, 11-12 and 13-14; (3) measles virus, with specific primer nucleotide sequences shown in SEQ ID NO. 15-16, 17-18 and 19-20; (4) Micromonospora parvula, with specific primer nucleotide sequences shown in SEQ ID NO. 21-22, 23-24 and 25-26; (5) human herpesvirus type 2, with specific primer nucleotide sequences shown in SEQ ID NO. 27-28, 29-30 and 31-32; (6) Citrobacter freundii, with specific primer nucleotide sequences shown in SEQ ID NO. 33-34, 35-36 and 37-38; (7) Streptococcus suis, with specific primer nucleotide sequences shown in SEQ ID NO. 39-40, 41-42 and 43-44; (8) illicina meningitidis, with specific primer nucleotide sequences shown in SEQ ID NO. 45-46, 47-48 and 49-50.
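
The tiling, filtering, and merging steps of claim 2, and the 3'-end rule of claim 8, might look roughly like the sketch below. It rests on loud simplifications: exact substring matching stands in for the alignment with at least 95% identity that the claims describe, reverse-complement strands are ignored, and `primer_binds` assumes an ungapped primer and binding site of equal length. All function names are hypothetical.

```python
def tile(reference, k=20, step=1):
    """Yield (start, window) for windows of length k taken every `step` bases."""
    for start in range(0, len(reference) - k + 1, step):
        yield start, reference[start:start + k]


def conserved_windows(reference, genomes, k=20, step=1, min_fraction=0.9):
    """Keep windows present in more than min_fraction of the genomes.

    Assumption: presence means an exact substring hit, a stand-in for
    the identity-thresholded alignment described in claim 2.
    """
    if not genomes:
        return []
    kept = []
    for start, window in tile(reference, k, step):
        hits = sum(1 for genome in genomes if window in genome)
        if hits / len(genomes) > min_fraction:
            kept.append((start, start + k))
    return kept


def merge(windows):
    """Merge windows whose coordinates touch or overlap into regions."""
    regions = []
    for start, end in sorted(windows):
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def primer_binds(primer, site, min_identity_pct=95, three_prime_len=5):
    """Claim 8 style check on one ungapped primer/site pair of equal length."""
    if len(primer) != len(site) or len(primer) < three_prime_len:
        return False
    matches = sum(1 for p, s in zip(primer, site) if p == s)
    identity_ok = matches * 100 >= min_identity_pct * len(primer)
    # No mismatch is allowed in the last bases at the 3' end.
    three_prime_ok = primer[-three_prime_len:] == site[-three_prime_len:]
    return identity_ok and three_prime_ok


# Toy sequences for illustration only; k=5 keeps the example readable.
reference = "AAACCCGGGTTTACGT"
genomes = [reference, "AAACCCGGGTTTTTTT"]
print(merge(conserved_windows(reference, genomes, k=5)))
```

The specific sequence regions of claim 4 could reuse `tile` and `merge` with longer windows and a filter against non-target genomes; a real pipeline would replace the substring test with an aligner that reports percent identity.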

Description

Method and device for designing specific primers of pathogenic microorganisms

Technical Field

The invention belongs to the technical field of bioinformatics, and particularly relates to a method and a device for designing specific primers of pathogenic microorganisms.

Background

Pathogenic microorganisms, also known as pathogens, are microorganisms that can invade the human body and cause infection or even infectious disease. Traditional pathogen identification methods include smear microscopy, isolation and culture with biochemical reactions, and microbial mass spectrometry, but they suffer from long detection periods, complex operation, and low sensitivity. In particular, some fastidious bacteria and non-culturable pathogens cannot be detected by traditional microscopy, culture, or biochemical identification. Clinically, more than half of patients with infectious diseases cannot be treated effectively in time because traditional detection methods fail to determine the pathogen.

In recent years, metagenomic sequencing (mNGS) and targeted sequencing (tNGS) have been increasingly used to detect pathogenic microorganisms. mNGS detects all microorganisms in a sample, including bacteria, fungi, viruses, and parasites, without bias; its coverage is wide, and a single test can cover tens of thousands of pathogens.
tNGS sequences only specific gene sequences. During library preparation, a large number of primers targeting those sequences are used for ultra-multiplex PCR amplification of the nucleic acid extracted from the sample under test, producing a large number of target nucleic acid fragments. These fragments are sequenced at high throughput, and the resulting sequences undergo bioinformatics analysis, which achieves high-resolution identification of specific genes of pathogenic microorganisms. Compared with mNGS, tNGS detects only hundreds of pathogens per test, generates less sequencing data, and costs less, but it places high demands on the amplification specificity of the ultra-multiplex primers.

To avoid interference from nonspecific amplification, Chinese patent CN117604079B (key medicine) discloses a method for designing ultra-multiplex PCR primers suitable for targeted sequencing of infection metagenomes, comprising: (1) preliminary design of the ultra-multiplex PCR primers, with the following parameters: amplicon length 150-300 bp; primer GC content 57-60%; primer length 18-21 bp; no more than 5 strictly complementary bases between primers; ΔG > -9000 cal/mol; and no multimeric sequence longer than 5 bases within a primer; (2) primer rescreening based on biological analysis; (3) wet-experiment screening for primers with high amplification efficiency; and (4) primer evaluation based on the ratio of the gray value of the target amplification product to the reference gray value in the electrophoresis result, followed by determination of the final primer combination.
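
Part of the preliminary design step (1) of CN117604079B can be read as a per-primer filter. The sketch below covers only the length, GC content, and repeat constraints; the inter-primer complementarity and ΔG conditions are left out. Reading "multimeric sequence" as a run of one repeated base is an assumption of this example, and the function names are hypothetical.

```python
from itertools import groupby


def gc_percent(primer):
    """GC content of a non-empty primer, as a percentage."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(primer):
    """Length of the longest run of one repeated base in a non-empty primer."""
    return max(len(list(group)) for _, group in groupby(primer.upper()))


def passes_initial_filter(primer):
    """Check length 18-21 bp, GC 57-60%, and no single-base run above 5.

    The length test comes first so that empty input is rejected before
    the helpers that require a non-empty sequence are called.
    """
    return (
        18 <= len(primer) <= 21
        and 57.0 <= gc_percent(primer) <= 60.0
        and longest_run(primer) <= 5
    )


# Illustrative sequence only; it is not a primer from either patent.
print(passes_initial_filter("GCATGCATGCCAGTCAGTCG"))
```

The narrow 57-60% GC window rejects most arbitrary sequences, which illustrates how restrictive these parameters are compared with the 40-60% range in claim 6.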
The CN117604079B method combines dry and wet experiments to screen and evaluate primers, with the dry experiments as the main part and the wet experiments as auxiliary verification, so the evaluation is comprehensive and objective, but the procedure is complex and time-consuming. Chinese patent CN112687337B (Jinrui) also discloses a method for designing ultra-multiplex primers, comprising: obtaining the target sequences to be detected, designing primers, and creating a primer pool; and scoring each single primer, in which a) primers scoring below a threshold are evaluated by a penalty mechanism and replaced, where fewer than 2 consecutive complementary base pairs at the head and tail of each primer scores +2, 2-3 consecutive complementary base pairs scores +1, and more than 3 consecutive complementary base pairs scores +0; a primer whose Tm deviates from the mean Tm by no more than 2 °C scores +1, and one that deviates by more than 2 °C scores +0; a3) the GC content of each primer is within 5% of the average value of GC content, the GC content of each primer is 5% of the average value, the condition that the GC content of each primer is 5% of the average value, the complementary pair of each primer is 5% of the primers is the average value, the complementary pair of the primers is 5% of th