CN-115658052-B - On-line prediction website construction method based on biological sequence cis-acting regulatory element
Abstract
The invention discloses a method for constructing an online prediction website based on biological sequence cis-acting regulatory elements, suitable for turning a model obtained by machine learning training on biological sequences into an online prediction website. With the continuous development of artificial intelligence technology, machine learning algorithms have advanced rapidly and have been applied successfully in many fields. In biological and medical research in particular, researchers have constructed many predictive models for tasks such as predicting cis-acting regulatory elements, biomedical pathogenic elements, and proteins. Although a large number of predictive models have been constructed, how to apply them in actual work remains a major concern for practitioners who rely on computers in their work. Building the prediction models described in papers into user-friendly, easy-to-use online websites can therefore greatly improve their practical application value.
Inventors
- ZHANG YANJU
- WANG RONGXING
Assignees
- Guilin University of Electronic Technology (桂林电子科技大学)
Dates
- Publication Date
- 20260505
- Application Date
- 20220930
Claims (2)
- 1. A method for constructing an online prediction website based on biological sequence cis-acting regulatory elements, characterized by comprising the following steps: (1) data preprocessing, namely preprocessing an input data file according to the requirements of the model, wherein the data file is in one of two formats, fasta and fastq, used for storing DNA, RNA and protein sequences; (2) constructing the display layer of the website main body, namely performing the UI design of each front-end page according to the specific application scenario and user prototype of the website; (3) constructing the service interaction layer of the website main body, namely exchanging data with the front-end page through a Java data end and transmitting the request task submitted by a user to a Perl computing end, wherein the front-end page constructed in step (2) is connected to the Java data end using the Struts2 framework, and the preprocessed data is packaged in JSON format and transmitted to the Perl computing end; (4) constructing the business logic layer of the website main body, namely performing task scheduling with the distributed task distribution framework Gearman, wherein, after a request task transmitted by the Java data end in step (3) is received, its data is added to a task pool for execution in order, and, when the number of submitted request tasks exceeds the worker value, the excess tasks are placed in a waiting area and begin computing only after a running task finishes; (5) constructing the persistence layer of the website main body, namely writing the prediction result into a MySQL database when the business logic layer in step (4) executes a computing task, in preparation for result output; (6) outputting the prediction result, namely the Java data end acquiring the data from the database and displaying it on the front-end page.
- 2. The method of claim 1, wherein the UI design of the front-end page in step (2) comprises the overall style of the page, page elements, operation controls and user guidance.
Description
On-line prediction website construction method based on biological sequence cis-acting regulatory element
Technical Field
The invention relates to the field of data processing and analysis of biological information, and in particular to a general method for constructing an online prediction website for biological sequence cis-acting regulatory elements.
Background
With the continued development of artificial intelligence technology, more and more machine learning algorithms and models have been proposed and successfully applied in a variety of fields. In particular, as disciplines increasingly intersect and merge, researchers, especially in bioinformatics, have constructed a large number of prediction models to advance various research problems. For example, in biological research, scientists have refined the structural information of genes by annotating genomic data, or predicted specific proteins to explore the pathogenic mechanisms of bacteria, viruses and the like. Although a large number of predictive models have been constructed, they are rarely put to practical use, because models published in papers are difficult for the average practitioner to apply. At the same time, as computers become ever more embedded in people's lives and work, a large number of websites have been built to meet all kinds of needs. Constructing a user-friendly, easy-to-use online prediction website therefore lets a published model deliver as much practical value as possible.
Disclosure of Invention
In view of the problems identified in the prior art, the invention provides a method for constructing an online prediction website based on biological sequence cis-acting regulatory elements.
The technical solution that achieves the aim of the invention is as follows: the method for constructing an online prediction website based on biological sequence cis-acting regulatory elements comprises the following steps:
(1) Data preprocessing, namely preprocessing the input data file according to the requirements of the model;
(2) Constructing the display layer of the website main body, namely performing the UI design of each front-end page according to the specific application scenario and user prototype of the website;
(3) Constructing the service interaction layer of the website main body, namely exchanging data with the front-end page through a Java data end and transmitting the request task submitted by a user to a Perl computing end; the front-end page constructed in step (2) is connected to the Java data end using the Struts2 framework, and the preprocessed data is packaged in JSON format and transmitted to the Perl computing end;
(4) Constructing the business logic layer of the website main body, namely performing task scheduling with the distributed task distribution framework Gearman; after a request task transmitted by the Java data end in step (3) is received, its data is added to a task pool for execution in order, and, when the number of submitted request tasks exceeds the worker value, the excess tasks are placed in a waiting area and begin computing only after a running task finishes;
(5) Constructing the persistence layer of the website main body, namely writing the prediction result into a MySQL database when the business logic layer in step (4) executes a computing task, in preparation for result output;
(6) Outputting the prediction result, namely the Java data end acquiring the data from the database and displaying it on the front-end page.
Further, the data file of step (1) is in one of two formats, fasta and fastq, used for storing DNA, RNA and protein sequences.
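For illustration only, the flow of steps (1), (3), (4) and (5) can be sketched in a single Python process. This is not the Java, Struts2, Perl, Gearman and MySQL stack the method names: a thread pool stands in for Gearman workers, a dictionary stands in for the MySQL table, and the function names `parse_sequences`, `gc_fraction` and `run_tasks` are hypothetical. The `gc_fraction` scorer is only a placeholder for a trained model, and the FASTQ branch assumes plain four-line records.

```python
import json
import queue
import threading


def parse_sequences(text):
    """Parse FASTA ('>' headers) or FASTQ ('@' headers) text into records.

    Assumption: FASTQ records are exactly four lines
    (header, sequence, '+', quality).
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    if lines and lines[0].startswith("@"):
        # FASTQ: fixed four-line records; the fourth line holds quality scores.
        for i in range(0, len(lines) - 3, 4):
            records.append(
                {"id": lines[i][1:], "seq": lines[i + 1], "qual": lines[i + 3]}
            )
    else:
        # FASTA: a sequence may wrap over several lines after its header.
        for ln in lines:
            if ln.startswith(">"):
                records.append({"id": ln[1:], "seq": ""})
            elif records:
                records[-1]["seq"] += ln
    return records


def gc_fraction(record):
    """Placeholder 'model' (hypothetical): fraction of G and C in the sequence."""
    seq = record["seq"].upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def run_tasks(payloads, predict, max_workers=2):
    """Run predict on each JSON payload with at most max_workers threads.

    Payloads beyond max_workers wait in the queue until a worker is free.
    The returned dict stands in for the database table of results.
    """
    tasks = queue.Queue()
    for payload in payloads:
        tasks.put(payload)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                payload = tasks.get_nowait()
            except queue.Empty:
                return  # no waiting tasks left
            record = json.loads(payload)
            score = predict(record)
            with lock:
                results[record["id"]] = score

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    fasta = ">seq1\nACGT\nGG\n>seq2\nAAAA\n"
    payloads = [json.dumps(r) for r in parse_sequences(fasta)]
    print(run_tasks(payloads, gc_fraction, max_workers=1))
```

With `max_workers=1` the second payload stays queued until the first finishes, which mirrors the waiting-area behaviour described in step (4). A real deployment would replace the queue with Gearman job submission and the dictionary with database writes.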
Further, the UI design of the front-end page in step (2) includes the overall style of the page, page elements, operation controls, user guidance and the like.
The advantage of the method is that it handles common sequencing data files and, by means of mainstream B/S (browser/server) architecture development technology, builds a model trained by traditional machine learning or deep learning into a user-friendly, easy-to-use online prediction website, thereby improving the usability of models published in papers and serving the research work of practitioners in related fields.
Drawings
FIG. 1 is a diagram of the online prediction website construction model according to an embodiment of the present invention.
Detailed Description
The invention is further illustrated, but not limited, by the following figures and examples.
Examples: a general method for constructing an online prediction website for biological sequence cis-acting regulatory elements comprises the following three major parts:
1) Data preprocessing
Common sequencing file formats include fasta and fastq; they differ in that the latter additionally stores sequencing quality information. In this section, necessary preprocessing work