CN-116206685-B - Primer design method, system and computer storage medium based on nucleic acid consistency
Abstract
The application discloses a primer design method, system and computer storage medium based on nucleic acid consistency. The method comprises: obtaining target nucleic acid sequences; classifying the target nucleic acid sequences to obtain a plurality of classification units; taking at least one target nucleic acid sequence in each classification unit as a representative sequence; designing candidate primers for the representative sequence; sequentially selecting at most one pair of candidate primers from each classification unit to establish a candidate primer pool; and removing candidate primers with nonspecific amplification from the candidate primer pool to obtain a final primer pool. The method is simple to operate, can process large nucleic acid libraries, and is especially suitable for designing universal primers against pathogen libraries. Compared with similar software, the method is more systematic and comprehensive, and the designed primer pool has high coverage and a low host contamination rate.
Inventors
- XIA HAN
- YANG JUNBO
- GUAN YUANLIN
- WEI KANGFEI
- DUAN MEILIN
- NIU YULONG
- HU LONG
- MU XIYU
- LUO CHEN
Assignees
- 予果生物科技(北京)有限公司
- 西咸新区予果微码生物科技有限公司
- 予果智造科技(北京)有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20230228
Claims (6)
- 1. A method for designing a primer based on nucleic acid identity, comprising: obtaining a target nucleic acid sequence; classifying the target nucleic acid sequences to obtain a plurality of classification units; taking at least one target nucleic acid sequence in each of the classification units as a representative sequence; designing candidate primers of the representative sequence; sequentially selecting at most one pair of candidate primers of each classification unit, and establishing a candidate primer pool; removing the candidate primers with nonspecific amplification in the candidate primer pool to obtain a final primer pool; after the candidate primers are designed, sorting the candidate primers by classification unit from high primer efficiency to low primer efficiency, and sequentially selecting at most one pair of candidate primers of each classification unit according to the sorted order of the candidate primers to establish the candidate primer pool; when selecting a candidate primer of a classification unit to be added to the candidate primer pool, determining whether there is interaction between the candidate primer to be added and a candidate primer already added to the candidate primer pool, adding it to the candidate primer pool if there is no interaction, and not adding it if there is interaction; filtering out a classification unit when no candidate primer in that classification unit is added to the candidate primer pool, and after the establishment of the candidate primer pool is completed, re-establishing the candidate primer pool according to the candidate primers of the filtered classification unit; if no candidate primer in the current classification unit is added to the candidate primer pool, returning to the previous classification unit, re-judging whether a candidate primer in the previous classification unit can be added to the candidate primer pool, and if so, adding a pair of candidate primers in the previous classification unit to the candidate primer pool and jumping to the next classification unit.
- 2. The method for designing a primer based on nucleic acid identity according to claim 1, wherein said taking at least one target nucleic acid sequence in each of said classification units as a representative sequence comprises: taking a target nucleic acid sequence whose length meets the requirement in the classification unit as a representative sequence; and determining the similarity between the other target nucleic acid sequences in the classification unit and the representative sequence, and taking each target nucleic acid sequence whose similarity reaches the similarity threshold as a representative sequence.
- 3. The method for designing a primer based on nucleic acid identity according to claim 2, further comprising: when the number of the representative sequences exceeds a number threshold, randomly selecting a certain number of the representative sequences as the final representative sequences.
- 4. The method for designing a primer based on nucleic acid identity according to claim 1, further comprising: checking whether the primers in the primer pool have a hairpin structure and a dimer, and if so, removing the primers having the hairpin structure and the dimer.
- 5. A primer design system based on nucleic acid identity, the system employing the method of claim 1, comprising: a sequence acquisition module for acquiring a target nucleic acid sequence; a sequence classification module for classifying the target nucleic acid sequences to obtain a plurality of classification units; a representative sequence query module for taking at least one target nucleic acid sequence in each of the classification units as a representative sequence; a candidate primer design module for designing candidate primers of the representative sequence; a candidate primer pool establishing module for sequentially selecting at most one pair of candidate primers of each classification unit and establishing a candidate primer pool; and a primer pool filtering module for removing the candidate primers with nonspecific amplification in the candidate primer pool to obtain a final primer pool.
- 6. A computer storage medium having stored therein a plurality of computer instructions for causing a computer to perform the method of any of claims 1-4.
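The pool-building steps recited in claim 1 (rank by efficiency, add at most one pair per classification unit, reject pairs that interact with the pool, step back one unit when the current unit yields nothing) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `PrimerPair`, `interacts`, `first_compatible` and `build_pool` are hypothetical names, the `efficiency` score is assumed to be supplied by the design step, and the interaction test is a toy 3'-end complementarity check standing in for whatever dimer analysis is actually used. Units that still receive no pair are simply reported, so the caller can decide how to rebuild the pool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass(frozen=True)
class PrimerPair:
    unit: str          # classification unit the pair was designed for
    forward: str
    reverse: str
    efficiency: float  # hypothetical score; higher is better


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def interacts(a: PrimerPair, b: PrimerPair, k: int = 5) -> bool:
    """Toy cross-dimer test (assumption): the 3' k-mer of one primer
    is reverse-complementary to some stretch of a primer in the other pair."""
    for p in (a.forward, a.reverse):
        for q in (b.forward, b.reverse):
            if revcomp(p[-k:]) in q or revcomp(q[-k:]) in p:
                return True
    return False


Clash = Callable[[PrimerPair, PrimerPair], bool]


def first_compatible(ranked: List[PrimerPair], pool: Iterable[PrimerPair],
                     clash: Clash) -> Optional[PrimerPair]:
    """Best-ranked candidate that interacts with nothing already in the pool."""
    for cand in ranked:
        if not any(clash(cand, kept) for kept in pool):
            return cand
    return None


def build_pool(candidates: Dict[str, List[PrimerPair]],
               clash: Clash = interacts) -> Tuple[List[PrimerPair], List[str]]:
    """Return (pool, skipped_units), visiting units in dictionary order."""
    ranked = {u: sorted(c, key=lambda p: p.efficiency, reverse=True)
              for u, c in candidates.items()}
    chosen: Dict[str, PrimerPair] = {}
    skipped: List[str] = []
    prev: Optional[str] = None  # last unit that contributed a pair
    for unit in ranked:
        pick = first_compatible(ranked[unit], chosen.values(), clash)
        if pick is None and prev is not None:
            # Step back one unit: try another pair there that unblocks this unit.
            held = chosen.pop(prev)
            rest = list(chosen.values())
            for alt in ranked[prev]:
                if alt == held or any(clash(alt, kept) for kept in rest):
                    continue
                pick = first_compatible(ranked[unit], rest + [alt], clash)
                if pick is not None:
                    chosen[prev] = alt
                    break
            else:
                chosen[prev] = held  # no alternative helped; restore
        if pick is None:
            skipped.append(unit)
        else:
            chosen[unit] = pick
            prev = unit
    return list(chosen.values()), skipped
```

Calling `build_pool({"A": [...], "B": [...]})` with lists of `PrimerPair` objects returns the selected pairs together with the names of the units for which no compatible pair was found. The single-step backtrack keeps the search cheap; a full backtracking search would find more pools at a higher cost.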
Description
Primer design method, system and computer storage medium based on nucleic acid consistency
Technical Field
The application relates to the technical field of biomedicine, in particular to a primer design method, a primer design system and a computer storage medium based on nucleic acid consistency.
Background
For conventional PCR (Polymerase Chain Reaction), a good pair of primers is critical to the success of the reaction, so primer design is the basis of PCR. Conventional PCR primer design must take many factors into account, and many software tools can now perform it. Multiplex PCR was developed on the basis of conventional PCR: multiple PCR reactions are performed in one reaction system, so that disease typing and pathogen detection can be carried out efficiently in a single system. However, primer design for multiplex PCR is more cumbersome than for conventional PCR because it must handle a whole series of target nucleic acid sequences. Two approaches are commonly used at present. The first selects candidate primers by multiple sequence alignment followed by software such as DEGEPRIME or prider, and then tests the candidate primers for suitability using conventional PCR primer design software. The computational cost of this approach grows exponentially, and the coverage of the resulting primers is low. The second approach designs primers one by one for each target nucleic acid and then mixes the designed primers; MPprimer is a typical example. This approach can lead to primer dimer formation.
Clinically, pathogen detection generally falls into two cases: in the first, the pathogen to be detected is a known pathogen; in the second, it cannot be determined whether the pathogen to be detected is part of a mixed infection.
In the first case, only the virus itself is detected, but if the virus is highly mutated, existing software is often not suitable for designing universal primers when the genome is of a large variety and the nucleic acid sequence is uniform. The second case is more complex and requires detection against a library of potentially infectious pathogens, which makes primer design more difficult. The difficulty of primer design in clinical pathogen detection is that the set of design targets is huge, and many factors must be balanced while excluding other contamination. Therefore, how to design a primer pool with the minimum number of primers, no special primer structures and no off-target amplification has become the key to applying multiplex PCR in clinical detection; such research has extremely high application value but has not yet been reported.
Disclosure of Invention
The embodiments of the application provide a primer design method, a primer design system and a computer storage medium based on nucleic acid consistency, which are used for solving the problems of low coverage and susceptibility to host contamination in multiplex PCR primer design in the prior art.
In one aspect, embodiments of the present application provide a primer design method based on nucleic acid identity, comprising: obtaining a target nucleic acid sequence; classifying the target nucleic acid sequences to obtain a plurality of classification units; taking at least one target nucleic acid sequence in each classification unit as a representative sequence; designing candidate primers of the representative sequence; sequentially selecting at most one pair of candidate primers of each classification unit, and establishing a candidate primer pool; and removing the candidate primers with nonspecific amplification in the candidate primer pool to obtain a final primer pool.
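As an illustration of the representative-sequence step, the following Python sketch follows the outline given in claims 2 and 3: pick a sequence whose length meets a requirement, add the other sequences in the unit whose similarity to it reaches a threshold, and randomly subsample when too many remain. Everything concrete here is an assumption made for the example: the function name `pick_representatives`, the parameter defaults, the choice of the longest eligible sequence as the anchor, and the use of `difflib.SequenceMatcher` as a stand-in similarity measure (a real pipeline would more likely use alignment-based identity).

```python
import random
from difflib import SequenceMatcher
from typing import Dict, List


def pick_representatives(unit_seqs: Dict[str, str],
                         min_length: int = 20,
                         similarity: float = 0.9,
                         max_reps: int = 3,
                         seed: int = 0) -> List[str]:
    """Return the IDs of the representative sequences of one classification unit.

    unit_seqs maps sequence ID -> nucleotide string. All thresholds are
    illustrative defaults, not values taken from the patent.
    """
    # Step 1: a sequence whose length meets the requirement becomes the anchor.
    eligible = [sid for sid, seq in unit_seqs.items() if len(seq) >= min_length]
    if not eligible:
        return []
    anchor = max(eligible, key=lambda sid: len(unit_seqs[sid]))  # assumption: longest wins
    reps = [anchor]

    # Step 2: other sequences similar enough to the anchor are representatives too.
    for sid, seq in unit_seqs.items():
        if sid == anchor:
            continue
        score = SequenceMatcher(None, unit_seqs[anchor], seq, autojunk=False).ratio()
        if score >= similarity:
            reps.append(sid)

    # Step 3: cap the number of representatives by random subsampling.
    if len(reps) > max_reps:
        reps = random.Random(seed).sample(reps, max_reps)
    return reps
```

The fixed `seed` only makes the subsampling reproducible in this sketch; whether the original method seeds its random selection is not stated in the source.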
In another aspect, embodiments of the present application also provide a primer design system based on nucleic acid identity, comprising: a sequence acquisition module for acquiring a target nucleic acid sequence; a sequence classification module for classifying the target nucleic acid sequences to obtain a plurality of classification units; a representative sequence query module for taking at least one target nucleic acid sequence in each classification unit as a representative sequence; a candidate primer design module for designing candidate primers of the representative sequence; a candidate primer pool establishing module for sequentially selecting at most one pair of candidate primers of each classification unit and establishing a candidate primer pool; and a primer pool filtering module for removing the candidate primers with nonspecific amplification in the candidate primer pool to obtain a final primer pool.
In another aspect, an embodiment of the present application further provides a computer storage medium in which a plurality of computer instructions are stored, the computer instructions being configured to cause a computer to perform the method described above.
The primer design method, the primer design system and the computer storage medium based on nucle