KR-20260066243-A - EVALUATION METHOD FOR MODELS BASED ON GAUSSIAN PROCESS REGRESSION AND STRUCTURAL EQUATION MODELING
Abstract
The present invention relates to the field of machine learning model evaluation, and more specifically to a method for evaluating prediction models based on small datasets. In particular, the present invention proposes a comprehensive evaluation method that reflects the uncertainty of a Gaussian process regression-based structural equation model (GPR-SEM). A method for evaluating a structural equation model based on Gaussian process regression according to the present invention is characterized by comprising: a first step of generating predicted values and corresponding uncertainty values using Gaussian process regression (GPR); a second step of setting a confidence interval for each predicted value based on the generated uncertainty value; a third step of checking whether verification data falls within the set confidence interval; and a fourth step of assigning a score to the verification data and calculating a final score (GS-Score) based on the number of data points that fall within the confidence interval.
Inventors
- 문경렬
- 박건욱
Assignees
- 한국소재융합연구원
Dates
- Publication Date
- 20260512
- Application Date
- 20241104
Claims (5)
- A Gaussian process regression-based structural equation model evaluation method for evaluating the performance of a prediction model, the method comprising: a first step of generating predicted values and corresponding uncertainty values using Gaussian Process Regression (GPR); a second step of setting a confidence interval for each predicted value based on the generated uncertainty values; a third step of checking whether verification data falls within the set confidence interval; and a fourth step of assigning a score to the verification data and calculating a final score (GS-Score) based on the number of data points that fall within the confidence interval.
- The Gaussian process regression-based structural equation model evaluation method of claim 1, wherein the prediction model is a GPR-SEM model that combines Gaussian Process Regression (GPR) and Structural Equation Modeling (SEM) to model non-linear relationships in a small dataset environment.
- The Gaussian process regression-based structural equation model evaluation method of claim 1, wherein the GS-Score is calculated by comparing the number of verification data points that fall within the confidence interval against a predefined target score.
- The Gaussian process regression-based structural equation model evaluation method of claim 1, wherein Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²) are used as additional performance evaluation metrics.
- The Gaussian process regression-based structural equation model evaluation method of claim 1, wherein a confidence interval is set for each predicted value to reflect the uncertainty of the prediction model, and prediction performance is evaluated based on the proportion of verification data that falls within the confidence interval.
Description
Evaluation Method for Models Based on Gaussian Process Regression and Structural Equation Modeling
The present invention relates to the field of machine learning model evaluation, and more specifically to a method for evaluating prediction models based on small datasets. In particular, the present invention proposes a comprehensive evaluation method that reflects the uncertainty of a Gaussian process regression-based structural equation model (GPR-SEM).
Recently, the development of accurate machine learning models has come to play an essential role in quality control and product life cycle management across various industries, including manufacturing. While traditional evaluation metrics such as Mean Squared Error (MSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²) are widely used, they have limitations when evaluating models that involve uncertainty, such as Gaussian Process Regression-based Structural Equation Models (GPR-SEM). In particular, in fields that rely on small datasets, such evaluations may be incomplete because they fail to adequately reflect uncertainty information.
GPR-SEM is a machine learning modeling technique that, unlike conventional linear regression models, can effectively capture non-linear relationships. It is a useful tool for modeling the uncertainty and complexity that arise during polymer development and mass production, and it is particularly valued for enabling precise predictions even with small amounts of data. Previous research applied the GPR-SEM technique to predict the physical properties of polymer materials such as polyacetal resins and confirmed that it can achieve higher prediction accuracy than conventional linear modeling methods.
Therefore, the present invention proposes GS-Score, a comprehensive evaluation method that reflects the uncertainty of the GPR-SEM model, thereby enabling more objective comparison between models and performance evaluation.
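The four steps named in the abstract and claim 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the scoring rule of the invention: the predicted means and standard deviations stand in for the output of an already fitted GPR model (step 1), the interval is assumed to be a symmetric Gaussian interval at a 95% level, and the final score is assumed to be the covered fraction scaled to a target of 100. The function names `confidence_interval` and `gs_score` and all numeric values are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(mean, std, level=0.95):
    """Step 2: symmetric Gaussian interval around one predicted value."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * std, mean + z * std


def gs_score(y_val, means, stds, level=0.95, target=100.0):
    """Steps 3-4: count covered verification points and scale to a target.

    The scaling rule (target * covered / total) is an assumption; the text
    only states that the score is based on the number of covered points.
    """
    if not (len(y_val) == len(means) == len(stds)) or not y_val:
        raise ValueError("inputs must be non-empty and of equal length")
    covered = []
    for y, m, s in zip(y_val, means, stds):
        lower, upper = confidence_interval(m, s, level)
        covered.append(lower <= y <= upper)  # step 3: coverage check
    n_covered = sum(covered)
    return target * n_covered / len(y_val), n_covered, covered


# Step 1 outputs would come from a fitted GPR model; these values are made up.
means = [10.2, 11.8, 13.1, 15.0, 16.4]
stds = [0.30, 0.25, 0.40, 0.35, 0.20]
y_val = [10.0, 12.1, 14.5, 15.3, 16.5]

score, n_covered, covered = gs_score(y_val, means, stds)
print(f"{n_covered}/{len(y_val)} points covered, GS-Score = {score:.1f}")
```

In this made-up example only the third verification point falls outside its interval, so four of five points are covered. Any GPR implementation that returns a predictive mean and standard deviation per point could supply the step 1 inputs.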
The terms used in this specification will be briefly explained, and the invention will be described in detail. The terms used in this invention have been selected from currently widely used general terms in consideration of their functions within the invention; however, these terms may vary depending on the intent of those skilled in the art, precedent, the emergence of new technologies, and so on. Therefore, the terms used in this invention should be defined not merely by their names, but based on their meanings and the overall context of the invention.
When a part is described in this specification as “comprising” a certain component, this means that, unless specifically stated otherwise, it does not exclude other components but may include additional components.
Embodiments of the present invention are described below with reference to the attached drawings so that those skilled in the art can easily implement them. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. Specific details regarding the problem to be solved by the present invention, the means for solving the problem, and the effects of the invention are included in the embodiments and drawings described below. The advantages and features of the present invention, and the methods for achieving them, will become clear by referring to the embodiments described below in detail together with the accompanying drawings.
Hereinafter, the present invention will be described in more detail with reference to the attached drawings. The GPR-SEM (Gaussian Process Regression - Structural Equation Modeling) method, which uses Bayesian methodology and was presented in previous studies, was proposed to effectively model nonlinear relationships even in environments with small amounts of data. However, existing evaluation metrics are insufficient for assessing the performance of GPR-SEM models.
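One way to see why point-error metrics alone can be insufficient for a model that reports uncertainty is the small sketch below. It uses made-up data in which two hypothetical models share identical point predictions but report different standard deviations: RMSE, MAE, and R² cannot tell them apart, while the fraction of true values inside the interval can. The helper names `point_metrics` and `coverage` and the 1.96 multiplier (an approximate 95% Gaussian interval) are assumptions for illustration only.

```python
import math


def point_metrics(y_true, y_pred):
    """RMSE, MAE and R^2 computed from point predictions only."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


def coverage(y_true, y_pred, stds, z=1.96):
    """Fraction of true values inside mean +/- z * std."""
    hits = [abs(t - p) <= z * s for t, p, s in zip(y_true, y_pred, stds)]
    return sum(hits) / len(hits)


# Made-up data: both models share the same point predictions.
y_true = [2.0, 3.5, 5.1, 6.8, 8.0]
y_pred = [2.3, 3.2, 5.6, 6.5, 8.4]
stds_overconfident = [0.1] * 5  # intervals too narrow for the actual errors
stds_wide = [0.3] * 5           # intervals wide enough to contain the errors

print(point_metrics(y_true, y_pred))  # identical for both models
print(coverage(y_true, y_pred, stds_overconfident))
print(coverage(y_true, y_pred, stds_wide))
```

Because the point metrics never look at the reported standard deviations, they give the same values for both models; only the coverage check separates the overconfident model from the one with wider intervals.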
Existing predictive model fit metrics, such as RMSE (Root Mean Square Error), MAE (Mean Absolute Error), R² (Coefficient of Determination), adjusted R², AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and k-fold cross-validation, are useful for evaluating specific aspects of a model, but they do not sufficiently reflect the overall characteristics of a model that includes uncertainty information, such as a GPR-SEM model. Accordingly, the present invention proposes a new comprehensive evaluation index called GS-Score (GPR-SEM Score). GS-Score aims to facilitate performance comparison and selection between models by expressing the overall performance of a model as a single score. This new evaluation method is designed to incorporate uncertainty information, a characteristic of GPR-SEM models, into the evaluation.
1) GPR-SEM
Structural Equation Modeling (SEM) is a multivariate modeling technique widely used in the social sciences that combines confirmatory factor analysis and path analysis. GPR-SEM is a modeling method that combines SEM with Ga