CN-122024931-A - TMB color development liquid formula optimization method, system, computer equipment and storage medium

Abstract

The invention discloses a TMB color development liquid formula optimization method, system, computer equipment and storage medium. The method comprises: obtaining an experimental data set of formula samples to obtain an N-dimensional formula space; constructing a forward mapping relation from the N-dimensional formula space to an M-dimensional hidden space; constructing a Gaussian process regression proxy model and training it on an initial experimental data set; calculating EI values of candidate points in the M-dimensional hidden space with the trained Gaussian process regression proxy model, using an expected improvement point adding strategy, to obtain an optimal hidden space point; and reversely mapping the optimal hidden space point through the inverse of the PCA forward mapping relation to obtain a new N-dimensional suggested formula. The new N-dimensional suggested formula is then verified experimentally, the verification result is added to the experimental data set as a new experimental data point, and the process returns to model training for iterative optimization, after which the optimized suggested formula is output.
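
For reference, the expected improvement (EI) value mentioned in the abstract is commonly written in the closed form below when the target, such as absorbance, is to be maximized. The source does not state its own formula, so this is the standard textbook expression rather than the applicant's exact definition. The notation is assumed here: z is a candidate point in the hidden space, μ(z) and σ(z) are the predicted mean and standard deviation of the Gaussian process, f⁺ is the best value observed so far, and Φ and φ are the standard normal CDF and PDF.

```latex
\mathrm{EI}(z) = \bigl(\mu(z) - f^{+}\bigr)\,\Phi(u) + \sigma(z)\,\varphi(u),
\qquad
u = \frac{\mu(z) - f^{+}}{\sigma(z)}, \quad \sigma(z) > 0
```

The first term rewards candidates whose predicted mean exceeds the current best, and the second rewards candidates whose prediction is still uncertain.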

Inventors

ZHAO XIAOPING
ZHENG HAIXUE
WU HAO
ZHU ZIXIANG
TIAN HONG
HUANG YUE
WU HAOHUI
WANG MINGZHU

Assignees

Lanzhou University (兰州大学)

Dates

Publication Date: 20260512
Application Date: 20260203

Claims (9)

1. A TMB color development liquid formula optimization method, characterized by comprising the following steps: Step 1, acquiring an experimental data set of a formula sample to obtain an N-dimensional formula space; Step 2, constructing a forward mapping relation from the N-dimensional formula space to an M-dimensional hidden space; Step 3, constructing a Gaussian process regression proxy model, and training the Gaussian process regression proxy model based on an initial experimental data set; Step 4, calculating EI values of candidate points in the M-dimensional hidden space by using an expected improvement point adding strategy based on the trained Gaussian process regression proxy model to obtain optimal hidden space points; Step 5, reversely mapping the optimal hidden space point through the forward mapping relation of PCA to obtain a new N-dimensional suggested formula; and Step 6, performing experimental verification on the new N-dimensional suggested formula, adding the verification result as a new experimental data point into the experimental data set, returning to the model training process for iterative optimization until a preset convergence condition is met, and outputting the optimized suggested formula.
2. The method for optimizing a formula of a TMB color development liquid according to claim 1, wherein the experimental data set comprises a plurality of initial formula samples and training data sets of corresponding performance indexes, each initial formula sample is defined by concentration values of N chemical components, the performance indexes are absorbance maximum values of TMB color development reaction under specific wavelengths, and an N-dimensional formula space is formed by the concentration values of the N chemical components.
3. The method according to claim 2, wherein in the step 2, the N-dimensional formula space is processed by using a principal component analysis method to construct a forward mapping relationship from the N-dimensional formula space to an M-dimensional hidden space, wherein M < N.
4. The method for optimizing the formula of the TMB color development liquid according to claim 2, wherein in step 4, the EI values of the candidate points are calculated using the expected improvement point adding strategy, and k-means clustering is carried out on a preset number of candidate points with high EI values to obtain the optimal hidden space points.
5. The method for optimizing the formula of the TMB color development liquid according to claim 1, further comprising, before the experimental data set of the formula sample is obtained in step 1, an experiment constraint defining step of identifying and fixing, from the N chemical components, components which cause adverse precipitation reactions, according to chemical prior knowledge or preliminary experimental observation, so as to reduce the dimension N that is actually optimized.
6. The method for optimizing the formula of the TMB color development liquid according to any one of claims 1-5, characterized in that the N chemical components comprise at least 8 of TMB, hydrogen peroxide, pH regulator, sodium acetate, polyvinylpyrrolidone, EDTA, dimethyl sulfoxide, trehalose, glycerol, citric acid, sodium thiosulfate and 1,2,6-hexanetriol, wherein N≥10.
7. A TMB color development liquid formula optimization system, comprising: a data acquisition module, used for acquiring an experimental data set of the formula sample to obtain an N-dimensional formula space; a mapping module, used for constructing a forward mapping relation from the N-dimensional formula space to the M-dimensional hidden space; a model training module, used for constructing a Gaussian process regression proxy model and training the Gaussian process regression proxy model based on an initial experimental data set; an optimal hidden space calculation module, used for calculating EI values of candidate points in the M-dimensional hidden space by using an expected improvement point adding strategy based on the trained Gaussian process regression proxy model to obtain optimal hidden space points; a reverse mapping module, used for carrying out reverse mapping on the optimal hidden space point through the forward mapping relation of PCA to obtain a new N-dimensional suggested formula; and an optimization module, used for carrying out experimental verification on the new N-dimensional suggested formula, adding the verification result as a new experimental data point into the experimental data set, returning to the model training process for iterative optimization until a preset convergence condition is met, and outputting the optimized suggested formula.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 5 when the computer program is executed by the processor.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
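
The computational part of the claimed steps (PCA forward mapping, Gaussian process regression, EI scoring, k-means selection among high-EI candidates, and reverse mapping) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the use of scikit-learn and SciPy, the Matern kernel, uniform candidate sampling inside the bounding box of the observed hidden-space points, the clipping of negative concentrations, and all function names and default parameter values are hypothetical choices made here. The experimental verification of step 6 is laboratory work and is not modeled; the demo data are synthetic.

```python
import numpy as np
from scipy.stats import norm
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def expected_improvement(mu, sigma, best):
    """Standard closed-form EI for maximization (textbook form, assumed here)."""
    sigma = np.maximum(sigma, 1e-12)  # guard against division by zero
    u = (mu - best) / sigma
    return (mu - best) * norm.cdf(u) + sigma * norm.pdf(u)


def suggest_formulas(X, y, m=3, n_candidates=2000, top_k=50, batch=3, seed=0):
    """One optimization round: return up to `batch` suggested N-dimensional formulas.

    X: (samples, N) component concentrations; y: (samples,) measured absorbance.
    """
    rng = np.random.default_rng(seed)

    # Step 2: forward mapping from the N-dim formula space to the M-dim hidden space.
    pca = PCA(n_components=m).fit(X)
    Z = pca.transform(X)

    # Step 3: Gaussian process regression proxy model trained in the hidden space.
    gpr = GaussianProcessRegressor(
        kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True, random_state=seed
    ).fit(Z, y)

    # Step 4: score random candidates by EI (sampling scheme is an assumption).
    cand = rng.uniform(Z.min(axis=0), Z.max(axis=0), size=(n_candidates, m))
    mu, sigma = gpr.predict(cand, return_std=True)
    ei = expected_improvement(mu, sigma, y.max())

    # Claim 4: cluster the top-EI candidates and keep the best one per cluster.
    top = np.argsort(ei)[-top_k:]
    labels = KMeans(n_clusters=batch, n_init=10, random_state=seed).fit_predict(cand[top])
    picks = []
    for c in np.unique(labels):
        members = top[labels == c]
        picks.append(members[np.argmax(ei[members])])

    # Step 5: reverse mapping back to the N-dim formula space.
    X_new = pca.inverse_transform(cand[picks])
    return np.clip(X_new, 0.0, None)  # assumption: concentrations are non-negative


# Demo on synthetic data only (not experimental results).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(0.0, 1.0, size=(12, 10))  # 12 formulas, N = 10
y_demo = -((X_demo - 0.5) ** 2).sum(axis=1)         # made-up response surface
suggestions = suggest_formulas(X_demo, y_demo)
print(suggestions.shape)
```

In a full loop, each suggested formula would be measured in the laboratory, the new (formula, absorbance) pairs appended to `X` and `y`, and `suggest_formulas` called again until the preset convergence condition is met.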

Description

TMB color development liquid formula optimization method, system, computer equipment and storage medium

Technical Field

The invention relates to the technical field at the intersection of analytical chemistry and artificial intelligence, in particular to a TMB color development liquid formula optimization method, system, computer equipment and storage medium.

Background

3,3',5,5'-Tetramethylbenzidine (TMB), an efficient and safe chromogenic substrate of horseradish peroxidase (HRP), has been widely used in biological analysis and clinical diagnosis, including enzyme-linked immunosorbent assay (ELISA), immunoblotting (Western blot) and immunohistochemistry (IHC). The performance of a TMB color development liquid, especially its detection sensitivity (usually expressed as the absorbance OD value) and long-term storage stability, directly determines the reliability, accuracy and lower detection limit of the whole analytical method. Developing TMB color development liquid formulas with higher sensitivity and better stability is therefore important for improving the quality and market competitiveness of in vitro diagnostic (IVD) reagents.

The traditional TMB color development liquid formula optimization process depends heavily on the experience of experimental scientists and on labor-intensive trial-and-error methods. One typical method is One-Variable-at-a-Time (OVAT): the concentrations of all other components are fixed, and only the concentration of a single component is changed to find its optimum. This approach, while intuitive, has the fatal disadvantage of completely ignoring complex synergistic or antagonistic effects between chemical components. For example, changes in the concentration of the stabilizing agent PVP may affect the optimum pH, while the chelating agent EDTA may interact with the metal ions in the buffer system.
The OVAT method cannot capture these interaction effects, is very prone to being trapped in a locally optimal solution, and has difficulty finding high-performance formulas that truly arise from multicomponent synergy.

To overcome the limitations of OVAT, those skilled in the art began to employ more systematic Design of Experiments (DoE) methods, such as Response Surface Methodology (RSM) and the Taguchi method. These methods allow the interactions of multiple variables to be studied to some extent by designing a smaller, orthogonal experimental matrix. However, as the number of variable components in the formula increases further and forms a high-dimensional formula space (e.g., more than 8-10 variables), these conventional DoE methods also require a large number of experimental samples.

Therefore, to reduce the trial-and-error cost, machine learning aids have been introduced into analytical chemistry, biological reagent research and development and related fields to achieve low-cost, rapid and efficient proxy-model-driven (surrogate-assisted) prediction of experimental results. Bayesian optimization (BO) is a representative global optimization method that can build a proxy model for expensive chemical experiments from a small number of experimental samples, and it has great application potential. The essence of Bayesian optimization is its "proxy model driven intelligent exploration" paradigm. The proxy model, as the core driving algorithm of the method, directly determines the global optimization accuracy of BO. Owing to its theoretical advantages, Gaussian process regression (GPR) is recognized as the most natural and powerful partner for the Bayesian optimization framework.
This is mainly because the GPR model not only fits the relationships between existing experimental data points, as other models (such as neural networks or random forests) do, and gives a predicted mean (μ) for unknown points, but also provides the uncertainty, or confidence interval, of each predicted value (i.e., the predicted variance σ²). It is this ability to provide both a prediction and its uncertainty that makes GPR combined with the Expected Improvement (EI) approach the most representative combination in Bayesian optimization.

However, although the combination of GPR and BO is theoretically mature, two major technical challenges, arising from the inherent properties of the GPR model itself, remain when it is applied directly to real, high-dimensional chemical formula optimization:

First, model degradation under high-dimensional, sparse data. For the reagent optimization problem driven by data, under the condition of as few experimental samples as possible, the establishment of the proxy model with as high accuracy as p