CN-121983174-A - Multi-index weighted scoring screening method and system for AIE fluorescent probes

Abstract

The invention provides a multi-index weighted scoring screening method and system for AIE fluorescent probes. The method comprises: acquiring data such as probe molecular structure and experimental environment, and preprocessing it to obtain a feature data set; selecting the optimal molecular descriptor and binary classification prediction model for each of thermal stability, photostability, Stokes shift and AIE enhancement factor; inputting a probe to be evaluated into the target models to obtain prediction results; calculating a synthetic accessibility score; combining the prediction results and the standardized synthetic accessibility score to form index data; determining the weight of each index through mixed data factor analysis; and calculating a comprehensive score to rank and screen the probes. The invention enables efficient screening of AIE probes, reduces trial-and-error cost, and improves the stability and applicability of the scoring.
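
The final scoring step summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the 1 to 10 range assumed for the synthetic accessibility score, the way loadings and explained-variance ratios are combined into weights (absolute loading times ratio, then normalized), and all function names and numbers are assumptions made for illustration. The source only states that weights come from the loadings and explained ratios obtained by mixed data factor analysis.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed SA scale for this sketch: 1 = easy to synthesize, 10 = hard.
SA_MIN, SA_MAX = 1.0, 10.0


def normalize_sa(sa_score: float) -> float:
    """Rescale an SA score to [0, 1] and flip it so that higher is better."""
    clipped = min(max(sa_score, SA_MIN), SA_MAX)
    return (SA_MAX - clipped) / (SA_MAX - SA_MIN)


def weights_from_loadings(
    loadings: Sequence[Sequence[float]], explained: Sequence[float]
) -> List[float]:
    """Turn component loadings into index weights that sum to 1.

    loadings[k][j] is the loading of index j on component k, and
    explained[k] is the explained-variance ratio of component k.
    The |loading| * ratio aggregation is an assumption of this sketch.
    """
    n_indices = len(loadings[0])
    raw = [
        sum(abs(row[j]) * ratio for row, ratio in zip(loadings, explained))
        for j in range(n_indices)
    ]
    total = sum(raw)
    return [value / total for value in raw]


def composite_score(indices: Sequence[float], weights: Sequence[float]) -> float:
    """Linear weighted sum of the index values."""
    return sum(w * x for w, x in zip(weights, indices))


def rank_probes(
    probes: Dict[str, Tuple[Sequence[float], float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Score each probe from (four predicted probabilities, raw SA score)."""
    scored = []
    for name, (probabilities, sa_score) in probes.items():
        indices = list(probabilities) + [normalize_sa(sa_score)]
        scored.append((name, composite_score(indices, weights)))
    # Highest composite score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up loadings, ratios, probabilities and SA scores, for illustration only.
    loadings = [[0.8, 0.7, 0.6, 0.5, 0.4], [0.1, 0.2, 0.3, 0.4, 0.5]]
    explained = [0.6, 0.2]
    weights = weights_from_loadings(loadings, explained)
    probes = {
        "probe_A": ((0.9, 0.8, 0.7, 0.6), 3.0),
        "probe_B": ((0.4, 0.5, 0.6, 0.7), 6.0),
    }
    for name, score in rank_probes(probes, weights):
        print(f"{name}: {score:.3f}")
```

Flipping the synthetic accessibility score before weighting keeps all five indexes pointing the same way, so a larger composite score always means a more attractive candidate.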

Inventors

  • WANG HAN
  • LIN ZHANG
  • XIAO YANG
  • LIN LE
  • REN CHANGHAI

Assignees

  • Central South University (中南大学)

Dates

Publication Date
20260505
Application Date
20260327

Claims (10)

  1. A multi-index weighted scoring screening method for AIE fluorescent probes, characterized by comprising the following steps: S1, acquiring raw data comprising molecular structure information, experimental environment information and AIE property information of candidate AIE probes, and performing data preprocessing and feature engineering on the raw data to obtain a feature data set for modeling; S2, constructing a plurality of molecular descriptor characterization schemes for each of four key properties, namely thermal stability, photostability, Stokes shift and AIE enhancement factor, and constructing candidate binary classification prediction models based on the different molecular descriptor characterization results; S3, inputting the candidate AIE probes to be evaluated into each target property prediction model to obtain prediction results for the four key properties, and calculating synthetic accessibility scores of the candidate AIE probes to be evaluated; S4, constructing comprehensive scoring index data from the prediction results for the four key properties and the synthetic accessibility scores, and standardizing the synthetic accessibility scores so that they have the same scoring direction as the prediction results for the four key properties; S5, performing mixed data factor analysis on the comprehensive scoring index data to determine the weight of each scoring index, and computing a comprehensive score for each candidate AIE probe to be evaluated from these weights, so that the candidate AIE probes can be ranked and screened by comprehensive score.
  2. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein the molecular structure information comprises at least a SMILES representation of the molecule, the experimental environment information comprises at least solvent information, and the AIE property information comprises at least thermal stability, photostability, and property data related to an absorption spectrum or an emission spectrum.
  3. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S1 the feature engineering comprises at least molecular structural feature construction, numerical encoding of solvent information, and key property label construction; the numerical encoding of solvent information maps solvent names to preset solvent parameters to form numerical features for machine learning modeling, the preset solvent parameters comprising one or more of ET(30), SP, SdP, SA and SB; and the key property label construction comprises: labeling thermal stability by class according to the decomposition temperature; labeling photostability by class according to whether decomposition, spectral change or molecular structure change occurs under illumination; calculating the Stokes shift from the absorption wavelength and the emission wavelength and labeling it by class according to a preset threshold; and calculating the AIE enhancement factor from the luminescence intensities in the aggregated state and the solution state and labeling it by class according to a preset threshold.
  4. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S2 the plurality of molecular descriptor characterization schemes comprise two or more of a Morgan fingerprint, a Daylight fingerprint, an atom-pair fingerprint, a MACCS fingerprint, and a Description molecular descriptor set; and the performance of the different molecular descriptor characterization schemes is compared under a plurality of random data partitioning conditions, and the target random data partitioning condition used for modeling each key property is determined according to a comprehensive performance index.
  5. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S2 the candidate binary classification prediction models comprise two or more of a support vector machine classification model, a logistic regression model, a gradient boosting tree classification model, a multi-layer perceptron classification model, a random forest classification model, and an extreme gradient boosting (XGBoost) classification model.
  6. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S2 the four key properties each correspond to a different optimal molecular descriptor and/or a different target property prediction model.
  7. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S3 the prediction results for the four key properties are the prediction probabilities output by the respective target property prediction models, namely the predicted probability of meeting the thermal stability standard, the predicted probability of meeting the photostability standard, the predicted probability of meeting the Stokes shift standard, and the predicted probability of meeting the AIE enhancement factor standard.
  8. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S3 the synthetic accessibility score is calculated for the candidate AIE probes by a cheminformatics tool and takes part in the subsequent weighted scoring as one of the comprehensive scoring indexes; and in step S4 the standardization of the synthetic accessibility score comprises normalization and direction unification.
  9. The multi-index weighted scoring screening method for AIE fluorescent probes according to claim 1, wherein in step S5 the comprehensive scoring index data comprise both continuous scoring variables and discrete scoring variables, and the mixed data factor analysis is used to determine the weight of each scoring index under this mixture of continuous and discrete variables; and the weight of each scoring index is determined from the loading of that index on the principal components and the explained variance ratio of the corresponding principal component, and the prediction results for the four key properties and the standardized synthetic accessibility score are linearly weighted by these weights to obtain the comprehensive score of each candidate AIE probe.
  10. A multi-index weighted scoring system for AIE fluorescent probes, comprising a memory, a processor and a program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the multi-index weighted scoring screening method for AIE fluorescent probes according to any one of claims 1-9.
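
The key property label construction described in claim 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicants' code: the claims only mention "preset" thresholds, so the three threshold values, the use of a simple threshold on decomposition temperature, the wavelength-difference (nm) form of the Stokes shift, and the names `ProbeRecord` and `build_labels` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

# Placeholder thresholds: the source only says "preset threshold".
DECOMPOSITION_TEMP_THRESHOLD_C = 300.0
STOKES_SHIFT_THRESHOLD_NM = 100.0
AIE_FACTOR_THRESHOLD = 10.0


@dataclass
class ProbeRecord:
    """One experimental record for a candidate probe (hypothetical schema)."""
    decomposition_temp_c: float
    changed_under_illumination: bool  # decomposition, spectral or structural change
    absorption_nm: float
    emission_nm: float
    aggregate_intensity: float
    solution_intensity: float


def stokes_shift_nm(absorption_nm: float, emission_nm: float) -> float:
    """Stokes shift as the emission/absorption wavelength difference in nm."""
    return emission_nm - absorption_nm


def aie_enhancement_factor(aggregate_intensity: float, solution_intensity: float) -> float:
    """Ratio of aggregated-state to solution-state luminescence intensity."""
    if solution_intensity <= 0:
        raise ValueError("solution-state intensity must be positive")
    return aggregate_intensity / solution_intensity


def build_labels(record: ProbeRecord) -> Dict[str, int]:
    """Return one binary label (1 = meets the standard) per key property."""
    shift = stokes_shift_nm(record.absorption_nm, record.emission_nm)
    factor = aie_enhancement_factor(record.aggregate_intensity, record.solution_intensity)
    return {
        "thermal_stability": int(record.decomposition_temp_c >= DECOMPOSITION_TEMP_THRESHOLD_C),
        "photostability": int(not record.changed_under_illumination),
        "stokes_shift": int(shift >= STOKES_SHIFT_THRESHOLD_NM),
        "aie_factor": int(factor >= AIE_FACTOR_THRESHOLD),
    }


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    record = ProbeRecord(320.0, False, 350.0, 480.0, 500.0, 20.0)
    print(build_labels(record))
```

Each of the four labels would then serve as the target of its own binary classifier, which is why the sketch returns them as separate entries.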

Description

Multi-index weighted scoring screening method and system for AIE fluorescent probes

Technical Field

The invention relates to the technical field of machine-learning-based material design and screening, and in particular to a multi-index weighted scoring screening method and system for AIE fluorescent probes.

Background

AIE (aggregation-induced emission) fluorescent probes feature enhanced emission in the aggregated state, low background signal and high detection sensitivity, and have good application prospects in chemical sensing, biological imaging, environmental detection and other fields. As research on AIE molecule design, synthesis and performance advances, the number of candidate probe molecules keeps growing, and rapidly screening target probes with good overall performance and practical value from a large pool of candidates has become a pressing problem in the art.

Existing screening of AIE fluorescent probes relies mainly on experimental testing and manual, experience-based judgment, and usually requires item-by-item experimental verification and condition optimization for indexes such as the luminescence performance and stability of the target molecules. Although this approach yields direct experimental results, when there are many candidate molecules, many evaluation indexes and complex experimental conditions it generally suffers from long screening cycles, high experimental cost and high trial-and-error cost, and struggles to meet the practical need for rapid design and iterative optimization of AIE fluorescent probes.

In practical applications in particular, the selection of an AIE fluorescent probe generally does not depend on a single performance index; it requires comprehensive consideration of multiple factors such as thermal stability, photostability, Stokes shift, AIE enhancement factor and synthetic accessibility. Traditional screening based on a single test or on empirical rules can hardly evaluate multiple key indexes uniformly, objectively and efficiently. With the gradual accumulation of AIE-related experimental data and published literature data, machine-learning-based structure-property prediction offers a new technical path for AIE fluorescent probe screening. However, the prior art focuses on single-property prediction or local performance analysis, and a multi-index comprehensive scoring and screening mechanism oriented to AIE fluorescent probe selection has yet to be established. Moreover, different property tasks suit different molecular characterization modes and modeling strategies; if a single descriptor or a single modeling approach is used for all of them, the prediction performance for some properties is easily limited, which in turn affects the reliability of the comprehensive screening result.

In addition, when constructing a comprehensive evaluation system for AIE fluorescent probes, the indexes involved in scoring typically include both continuous and discrete data. For example, the synthetic accessibility score is typically a continuous variable, while whether a key property meets the standard, or the corresponding prediction result, may appear as discrete data or a classification result. For such mixed-type index data, directly applying a traditional weight analysis method that suits only a single data type easily leaves some index information underused, which affects the rationality of weight allocation and the stability and interpretability of the comprehensive scoring result.

Therefore, there is a need for a multi-index weighted scoring method and system for AIE probe selection that can efficiently rank and screen candidate AIE probes, reduce trial-and-error cost, and improve the stability and applicability of the scoring results.

Disclosure of Invention

The invention aims to provide a multi-index weighted scoring method and system for AIE probe selection, so as to solve the problems in existing AIE fluorescent probe screening that multiple indexes are difficult to evaluate uniformly, weight allocation is unreasonable and screening efficiency is low.

To achieve the above object, in a first aspect, the present invention provides a multi-index weighted scoring screening method for AIE fluorescent probes, comprising the following steps: S1, acquiring raw data comprising molecular structure information, experimental environment information and AIE property information of candidate AIE probes, and performing data preprocessing and feature engineering on the raw data to obtain a feature data set for modeling; S2, constructing a plurality of molecular descriptor characterization schemes for four key properties of thermal stability,