CN-122024969-A - Zinc salt modified ruthenium catalyst performance optimization system and method based on active learning and uncertainty evaluation
Abstract
The invention relates to a system and method for optimizing the performance of zinc salt modified ruthenium catalysts based on active learning and uncertainty assessment. The method addresses the shortcomings of traditional trial-and-error screening of zinc salt modifiers: low efficiency, dependence on experience, and no ability to predict the performance of untested schemes. Candidate zinc salt schemes are digitally characterized to build a standardized feature library, which is combined with historical data to train a Gaussian process model that quantifies prediction uncertainty. Using this model and an active learning strategy, the method evaluates the potential and uncertainty of untested schemes and automatically recommends the most valuable experiments. A closed loop of experimental verification, data feedback and iterative model updating then finds the optimal combination of zinc salt type and addition amount with the fewest experiments. The method turns traditional blind screening into data-driven optimization and significantly improves the efficiency and systematic rigor of catalyst performance optimization.
Inventors
- SUN HAIJIE
- ZHAI DAJU
- CUI ZHICHENG
- ZHAO HAITAO
- HU MENGQI
- CHANG YUE
Assignees
- Zhengzhou Normal University (郑州师范学院)
- Jiangxi Xinlianxin Chemical Industry Co., Ltd. (江西心连心化学工业有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-11
Claims (9)
- 1. A zinc salt modified ruthenium catalyst performance optimization method based on active learning and uncertainty evaluation, characterized by comprising the following steps: S1, digitally characterizing candidate zinc salt modification schemes, converting the zinc salt type and Zn/Ru molar ratio of each scheme into a numerical vector of uniform dimension, and generating a standardized feature library containing the standard feature vectors of all candidate schemes; S2, matching and aligning a historical data list obtained from historical experiment records with the standardized feature library, and pairing the standard feature vector of each historical scheme with its measured cyclohexene selectivity to generate an initial training set and a current optimal performance benchmark; S3, performing Gaussian process regression training on the initial training set, fitting the mapping between the standard feature vectors and cyclohexene selectivity and quantifying the uncertainty of the model predictions, to generate an uncertainty prediction model that outputs a prediction mean and a prediction variance; S4, inputting the untested candidate schemes in the standardized feature library and the current optimal performance benchmark into the uncertainty prediction model for active learning scoring, calculating a comprehensive score of expected improvement and uncertainty for each candidate scheme, and ranking the schemes to generate an experiment recommendation list of specific schemes; and S5, performing experimental verification and model iteration based on the experiment recommendation list: executing the experiments in the list, adding the newly measured data to the training set, retraining the uncertainty prediction model, and updating the historical experiment records, the uncertainty prediction model and the current optimal performance benchmark.
- 2. The method according to claim 1, wherein S1 comprises: S11, feature-encoding the zinc salt types and Zn/Ru molar ratios in the candidate zinc salt modification schemes, converting the SMILES strings of organic acid zinc salts into Morgan fingerprint vectors and converting the chemical formulas of inorganic zinc salts into anion descriptor vectors containing electronegativity, ionic radius and pKa values, to generate basic feature data for each scheme; S12, concatenating and standardizing the basic feature data with the corresponding Zn/Ru molar ratio, joining the Morgan fingerprint vector and the anion descriptor vector with the molar ratio value along the feature dimension and applying scale normalization to generate a standard feature vector; and S13, storing and indexing the standard feature vectors of all candidate schemes, establishing a mapping between scheme identifiers and feature vectors, and generating a standardized feature library for unified querying.
- 3. The method according to claim 1, wherein S2 comprises: S21, parsing the historical experiment records and extracting, for each completed experiment, the recorded zinc salt type, Zn/Ru molar ratio and corresponding measured cyclohexene selectivity, to generate a structured historical data list; S22, querying the standardized feature library for each scheme in the historical data list by zinc salt type and Zn/Ru molar ratio, matching the corresponding standard feature vector and associating it with the measured selectivity value, to generate a feature-performance paired data set; and S23, integrating and summarizing the feature-performance paired data set, collecting all data pairs to form the initial training set, and selecting the maximum of all measured selectivities as the current optimal performance benchmark.
- 4. The method according to claim 1, wherein S3 comprises: S31, separating the initial training set into inputs and outputs, extracting all the standard feature vectors as a training input matrix and the corresponding measured cyclohexene selectivity values as a training target vector, to generate model training data; S32, performing Gaussian process modeling on the model training data based on a Matérn kernel function, configuring the form of the covariance function and initializing the hyperparameters, and optimizing the length scale and noise variance parameters by maximizing the marginal likelihood function to generate a trained regression model; and S33, wrapping the trained regression model in an interface so that it outputs a predicted mean and a predicted variance of cyclohexene selectivity for an input feature vector, to generate the uncertainty prediction model.
- 5. The method according to claim 1, wherein S4 comprises: S41, computing the set difference between the standardized feature library and the initial training set, selecting the candidate schemes that do not appear in the initial training set together with their standard feature vectors, to generate a set of schemes to be evaluated; S42, inputting each scheme in the set of schemes to be evaluated into the uncertainty prediction model for Gaussian process posterior inference, calculating the parameters of the predictive distribution of cyclohexene selectivity from the covariance relation learned in training, to generate candidate scheme prediction results comprising a prediction mean and a prediction variance; and S43, performing fused scoring of expected improvement and uncertainty on the candidate scheme prediction results and the current optimal performance benchmark, calculating a comprehensive score for each scheme according to a preset exploration factor and a balance coefficient, and ranking the schemes to generate the experiment recommendation list.
- 6. The method according to any one of claims 1-5, wherein S5 comprises: S51, performing catalytic experiments and data acquisition for each scheme in the experiment recommendation list, carrying out the benzene hydrogenation reaction with the zinc salt type and Zn/Ru molar ratio specified in the scheme, and measuring and recording the cyclohexene selectivity to generate newly measured data; S52, integrating the newly measured data with the initial training set and updating the performance benchmark, pairing the standard feature vector of each new experimental scheme with its measured selectivity and adding the pair to the training set, and taking the highest value from the updated set of measured values, to generate an updated training set and an updated current optimal performance benchmark; and S53, inputting the updated training set as new training data into the Gaussian process regression framework for model parameter optimization, to generate an updated uncertainty prediction model.
- 7. A zinc salt modified ruthenium catalyst performance optimization system based on active learning and uncertainty evaluation, the system comprising: a zinc salt scheme characterization module, used for digitally characterizing candidate zinc salt modification schemes, converting the zinc salt type and Zn/Ru molar ratio of each scheme into a numerical vector of uniform dimension, and generating a standardized feature library containing the standard feature vectors of all candidate schemes; a data matching and alignment module, used for matching and aligning a historical data list obtained from historical experiment records with the standardized feature library, and pairing the standard feature vector of each historical scheme with its measured cyclohexene selectivity to generate an initial training set and a current optimal performance benchmark; a Gaussian regression modeling module, used for performing Gaussian process regression training on the initial training set, fitting the mapping between the standard feature vectors and cyclohexene selectivity and quantifying the uncertainty of the model predictions, to generate an uncertainty prediction model that outputs a prediction mean and a prediction variance; an active learning recommendation module, used for inputting the untested candidate schemes in the standardized feature library and the current optimal performance benchmark into the uncertainty prediction model for active learning scoring, calculating a comprehensive score of expected improvement and uncertainty for each candidate scheme, and ranking the schemes to generate an experiment recommendation list of specific schemes; and an experiment iteration optimization module, used for performing experimental verification and model iteration based on the experiment recommendation list: executing the experiments in the list, adding the newly measured data to the training set, retraining the uncertainty prediction model, and updating the historical experiment records, the uncertainty prediction model and the current optimal performance benchmark.
- 8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any one of claims 1 to 6 when executing the computer program.
- 9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the method of any one of claims 1 to 6.
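The modelling and scoring steps of claims 4 and 5 (S3 and S4) can be illustrated with a minimal, standard-library-only sketch. It is not the patented implementation, and several points are assumptions made for illustration: the Matérn smoothness is fixed at 5/2, the length scale and noise variance are fixed rather than optimized by maximizing the marginal likelihood, the comprehensive score is taken to be `EI(xi) + beta * std` (the claims name an exploration factor and a balance coefficient but this section does not give the formula), and the scheme identifiers, feature vectors and selectivity values are made-up placeholders rather than experimental data.

```python
import math

def matern52(x, y, length_scale=1.0):
    """Matern 5/2 covariance (unit signal variance) between two feature vectors."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))) / length_scale
    s = math.sqrt(5.0) * r
    return (1.0 + s + s * s / 3.0) * math.exp(-s)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low

def forward_solve(low, b):
    """Solve low @ z = b for lower-triangular low."""
    z = []
    for i, row in enumerate(low):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z

def backward_solve(low, z):
    """Solve low.T @ x = z for lower-triangular low."""
    n = len(low)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_posterior(train_x, train_y, query, noise_var=1e-4, length_scale=1.0):
    """Posterior mean and variance at one query point (hyperparameters held fixed)."""
    n = len(train_x)
    mean_y = sum(train_y) / n  # centre the targets around their mean
    k = [[matern52(train_x[i], train_x[j], length_scale) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    alpha = backward_solve(low, forward_solve(low, [y - mean_y for y in train_y]))
    k_star = [matern52(x, query, length_scale) for x in train_x]
    mean = mean_y + sum(ks * a for ks, a in zip(k_star, alpha))
    v = forward_solve(low, k_star)
    var = max(matern52(query, query, length_scale) - sum(t * t for t in v), 0.0)
    return mean, var

def expected_improvement(mean, var, best, xi=0.0):
    """Expected improvement over the current best value; xi is an exploration margin."""
    std = math.sqrt(var)
    if std == 0.0:
        return max(mean - best - xi, 0.0)
    z = (mean - best - xi) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best - xi) * cdf + std * pdf

def recommend(library, tested, top_k=3, xi=0.01, beta=0.5):
    """Rank untested schemes by the assumed score EI(xi) + beta * std."""
    train_x = [library[s] for s in tested]
    train_y = [tested[s] for s in tested]
    best = max(train_y)  # current optimal performance benchmark
    scored = []
    for scheme_id, vec in library.items():
        if scheme_id in tested:  # set difference: skip schemes already measured
            continue
        mean, var = gp_posterior(train_x, train_y, vec)
        score = expected_improvement(mean, var, best, xi) + beta * math.sqrt(var)
        scored.append((score, scheme_id, mean, var))
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical normalized feature library: [anion descriptor 1, anion descriptor 2, Zn/Ru ratio]
library = {
    "saltA_r0.2": [0.0, 0.1, 0.0], "saltA_r0.6": [0.0, 0.1, 0.5], "saltA_r1.0": [0.0, 0.1, 1.0],
    "saltB_r0.2": [1.0, 0.9, 0.0], "saltB_r0.6": [1.0, 0.9, 0.5], "saltB_r1.0": [1.0, 0.9, 1.0],
}
# Hypothetical measured cyclohexene selectivities (as fractions) for already-tested schemes
tested = {"saltA_r0.2": 0.55, "saltA_r1.0": 0.70, "saltB_r0.2": 0.60}

ranked = recommend(library, tested, top_k=3)
for score, scheme_id, mean, var in ranked:
    print(f"{scheme_id}: score={score:.4f} mean={mean:.3f} std={math.sqrt(var):.3f}")
```

In the closed loop of claim 6, the top-ranked schemes would be run experimentally, their measured selectivities added to `tested`, and `recommend` called again. A production version would more likely rely on an established Gaussian process library and fit the kernel hyperparameters instead of fixing them.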
Description
Zinc salt modified ruthenium catalyst performance optimization system and method based on active learning and uncertainty evaluation
Technical Field
The invention relates to the intersection of industrial catalysis and artificial intelligence, and in particular to a zinc salt modified ruthenium catalyst performance optimization system and method based on active learning and uncertainty evaluation.
Background
The selective hydrogenation of benzene to cyclohexene is a key reaction in the chemical industry, especially in the nylon industry chain. Ruthenium catalysts are widely used in this reaction because of their high activity, but their inherent reaction characteristics tend to give unsatisfactory selectivity toward the target product, cyclohexene. To improve selectivity, industrial practice and research commonly add zinc salts to the reaction system as modifiers, using the chemical characteristics of different zinc salts to steer the reaction path. However, screening zinc salts is a complex optimization problem. On the one hand, there are many candidate zinc salts whose chemical properties differ markedly from one another; on the other hand, the addition amount of the zinc salt (that is, the Zn/Ru molar ratio) is also a key variable, and the salt type and the addition amount interact in complex ways. How to find the optimal combination quickly and economically in the high-dimensional, mixed discrete-continuous parameter space formed by "zinc salt species" and "addition amount" is a core challenge in improving catalyst performance.
Currently, this optimization relies mainly on conventional trial and error or on grid searches guided by limited experience. Researchers must pre-select a few zinc salts and a limited set of addition amounts by chemical intuition and then carry out a large number of experiments, a process that is time-consuming, costly, and heavily dependent on the personal experience of specialists. More importantly, existing methods are essentially "passive" screening: they cannot predict the performance of untried schemes, cannot evaluate the exploration value of different regions (such as high-potential or high-uncertainty regions) during experimentation, and cannot actively plan the most efficient experimental sequence. As a result, development proceeds with considerable blindness: substantial experimental resources may be spent in parameter regions of mediocre performance, while truly superior "dark horse" schemes may never be considered.
Disclosure of Invention
In view of the above, the invention aims to provide a zinc salt modified ruthenium catalyst performance optimization system and method based on active learning and uncertainty evaluation that learns actively, makes decisions intelligently and explores the chemical space efficiently in a data-driven manner, so that the optimal zinc salt modification scheme is discovered systematically.
The invention adopts the following scheme:
In a first aspect, the invention provides a zinc salt modified ruthenium catalyst performance optimization method based on active learning and uncertainty evaluation, which comprises the following steps:
S1, digitally characterizing candidate zinc salt modification schemes, converting the zinc salt type and Zn/Ru molar ratio of each scheme into a numerical vector of uniform dimension, and generating a standardized feature library containing the standard feature vectors of all candidate schemes;
S2, matching and aligning a historical data list obtained from historical experiment records with the standardized feature library, and pairing the standard feature vector of each historical scheme with its measured cyclohexene selectivity to generate an initial training set and a current optimal performance benchmark;
S3, performing Gaussian process regression training on the initial training set, fitting the mapping between the standard feature vectors and cyclohexene selectivity and quantifying the uncertainty of the model predictions, to generate an uncertainty prediction model that outputs a prediction mean and a prediction variance;
S4, inputting the untested candidate schemes in the standardized feature library and the current optimal performance benchmark into the uncertainty prediction model for active learning scoring, calculating a comprehensive score of expected improvement and uncertainty for each candidate scheme, and ranking the schemes to generate an experiment recommendation list of specific schemes;
and S5, performing experimental verification and model iteration based on the experiment recommendation list, executing the experiments in the list, adding the newly measured data to the training set, and iterating to tra