CN-122025134-A - Fatty liver risk prediction method, system and storage medium based on multi-model integration
Abstract
The invention relates to a fatty liver risk prediction method, system and storage medium based on multi-model integration, and belongs to the technical field of artificial intelligence and medical information. The method comprises: obtaining blood biochemical parameters of a target object; performing feature selection on the blood biochemical parameters based on clinical prior knowledge to obtain a key parameter set; arranging the key parameter set into a two-dimensional parameter matrix centered on glycosylated hemoglobin; inputting the two-dimensional parameter matrix into a pre-trained integrated prediction model, wherein the model comprises heterogeneous sub-models such as a special convolutional neural network, an artificial neural network, a support vector machine, XGBoost and LightGBM; and finally determining the risk level of the target object having progressive non-alcoholic steatohepatitis through a hard voting integration strategy with a preset threshold K, based on the output of each sub-model. The method achieves high-precision risk prediction using only conventional blood parameters, avoids reliance on expensive or invasive examinations, and has important application value in primary medical institutions.
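As a minimal illustration of the matrix arrangement step, the Python sketch below places glycosylated hemoglobin (HbA1c) in the center cell of a small grid and fills the remaining key parameters outward from it. The source states only that the matrix is centered on glycosylated hemoglobin. The 5x5 grid, the ring-by-ring fill order, the zero padding, the short parameter names and the function name `build_parameter_matrix` are all assumptions made for illustration and should not be read as the patented layout.

```python
# Hypothetical sketch: grid size, fill order, padding and names are assumptions.
# Only "HbA1c sits at the center of a 2D matrix" comes from the source text.

# Short names for the sixteen key parameters listed in claim 3 (names assumed).
KEY_PARAMETERS = [
    "age", "bmi", "ast", "alt", "hba1c", "insulin", "ferritin",
    "fasting_glucose", "hdl", "total_cholesterol", "triglycerides",
    "hs_crp", "albumin", "platelets", "ggt", "tsh",
]


def build_parameter_matrix(values, center="hba1c", size=5, pad=0.0):
    """Put `center` in the middle cell and the other parameters around it."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that a single center cell exists")
    others = [name for name in KEY_PARAMETERS if name != center]
    if len(others) > size * size - 1:
        raise ValueError("grid too small for the key parameter set")
    mid = size // 2
    # Non-center cells ordered by Chebyshev distance from the center,
    # ties broken by row then column so the layout is deterministic.
    cells = sorted(
        ((r, c) for r in range(size) for c in range(size) if (r, c) != (mid, mid)),
        key=lambda rc: (max(abs(rc[0] - mid), abs(rc[1] - mid)), rc[0], rc[1]),
    )
    matrix = [[pad] * size for _ in range(size)]
    matrix[mid][mid] = float(values[center])
    for name, (r, c) in zip(others, cells):
        matrix[r][c] = float(values[name])
    return matrix


# Placeholder inputs (position indices, not clinical measurements).
placeholder = {name: float(i + 1) for i, name in enumerate(KEY_PARAMETERS)}
grid = build_parameter_matrix(placeholder)
```

In this assumed layout the eight cells adjacent to the center hold the first eight remaining parameters, so a small convolution kernel centered on the grid sees HbA1c together with its nearest neighbors. A real implementation would presumably also normalize each parameter before placement, which the sketch omits.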
Inventors
- LIAN ZHENYU
- LI JIA
Assignees
- 宁波绮色佳金属制品有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260130
Claims (10)
- 1. A fatty liver risk prediction method based on multi-model integration, characterized by comprising the following steps: acquiring a plurality of blood biochemical parameters of a target object; performing feature selection based on clinical prior knowledge on the plurality of blood biochemical parameters to obtain a key parameter set; arranging the key parameter set into a two-dimensional parameter matrix with a preset structure; inputting the two-dimensional parameter matrix into a pre-trained integrated prediction model, wherein the integrated prediction model comprises at least two heterogeneous sub-models, and the at least two heterogeneous sub-models comprise a special convolutional neural network model used for processing the two-dimensional parameter matrix; and determining the risk level of the target object suffering from progressive non-alcoholic steatohepatitis through a hard voting integration strategy and a preset threshold K, based on the output of each sub-model in the integrated prediction model.
- 2. The fatty liver risk prediction method based on multi-model integration according to claim 1, wherein performing feature selection based on clinical prior knowledge on the plurality of blood biochemical parameters specifically comprises: selecting a key parameter set from the plurality of blood biochemical parameters based on a preset parameter selection rule related to the pathological mechanisms of fatty liver and type 2 diabetes, wherein the key parameter set contains fewer blood biochemical parameters than the plurality of blood biochemical parameters initially obtained.
- 3. The fatty liver risk prediction method based on multi-model integration according to claim 2, wherein the key parameter set comprises age, body mass index, aspartate aminotransferase, alanine aminotransferase, glycosylated hemoglobin, insulin, ferritin, fasting blood glucose, high-density lipoprotein, total cholesterol, triglycerides, high-sensitivity C-reactive protein, albumin, platelet count, gamma-glutamyl transpeptidase and thyroid-stimulating hormone.
- 4. The fatty liver risk prediction method based on multi-model integration according to claim 1, wherein the special convolutional neural network model comprises: a self-attention module, used for performing global feature association analysis on the two-dimensional parameter matrix to generate a first feature; a parallel convolution module, used for performing local feature extraction on the two-dimensional parameter matrix through convolution kernels of at least two different scales to generate a second feature; and a feature fusion module, used for fusing the first feature and the second feature and performing risk prediction based on the fused features.
- 5. The fatty liver risk prediction method based on multi-model integration according to claim 1, wherein the at least two heterogeneous sub-models further comprise at least three of an artificial neural network model, a support vector machine model, an XGBoost model and a LightGBM model.
- 6. The fatty liver risk prediction method based on multi-model integration according to claim 1, wherein determining the risk level based on the hard voting integration strategy and the preset threshold K specifically comprises the following steps: obtaining a classification result output by each sub-model, wherein the classification result is either high risk or low risk; counting the number of sub-models that output high risk; and determining that the target object is at a high risk level if the number reaches or exceeds the preset threshold K, wherein K is an integer greater than or equal to 3 and less than or equal to the total number of sub-models.
- 7. The fatty liver risk prediction method based on multi-model integration according to claim 6, wherein the integrated prediction model comprises five sub-models, and the preset threshold K is three or four.
- 8. The fatty liver risk prediction method based on multi-model integration according to claim 1, wherein an online hard example mining strategy and/or a weighted cross-entropy loss function is employed during training of the integrated prediction model to address the imbalance between positive and negative samples in the training data.
- 9. A fatty liver risk prediction system based on multi-model integration, comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the fatty liver risk prediction method based on multi-model integration according to any one of claims 1 to 8.
- 10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the fatty liver risk prediction method based on multi-model integration according to any one of claims 1 to 8.
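The voting rule of claims 6 and 7 can be sketched in a few lines of Python: count the sub-models that output high risk and compare the count with the threshold K. The label strings "high" and "low", the sub-model names used as dictionary keys and the function name `hard_vote` are illustrative assumptions. The claims specify only the counting rule and the bounds on K.

```python
# Hypothetical sketch of the hard voting rule with a preset threshold K.
# Label strings and sub-model names are assumptions, not taken from the source.

def hard_vote(predictions, k):
    """Return "high" if at least k sub-models predict high risk, else "low".

    predictions: dict mapping a sub-model name to "high" or "low".
    k: integer threshold with 3 <= k <= number of sub-models (claim 6).
    """
    total = len(predictions)
    if not 3 <= k <= total:
        raise ValueError("K must satisfy 3 <= K <= number of sub-models")
    high_votes = sum(1 for label in predictions.values() if label == "high")
    return "high" if high_votes >= k else "low"


# Five sub-models as in claim 7; three of them vote "high".
votes = {
    "cnn": "high",
    "ann": "high",
    "svm": "low",
    "xgboost": "high",
    "lightgbm": "low",
}
print(hard_vote(votes, k=3))  # prints "high": 3 high-risk votes reach K = 3
print(hard_vote(votes, k=4))  # prints "low": 3 high-risk votes fall short of K = 4
```

With five sub-models, K = 3 corresponds to a simple majority, while K = 4 demands stronger agreement before a high-risk label is issued, which would be expected to trade sensitivity for specificity.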
Description
Fatty liver risk prediction method, system and storage medium based on multi-model integration
Technical Field
The invention relates to the technical field of artificial intelligence and medical information, in particular to a fatty liver risk prediction method, system and storage medium based on multi-model integration, and especially to a computer-aided decision-making technology that performs noninvasive, early-stage auxiliary screening and evaluation of progressive non-alcoholic steatohepatitis (Progressive NASH) risk by integrating multiple heterogeneous artificial intelligence models to analyze conventional blood biochemical parameters.
Background
Non-alcoholic fatty liver disease (NAFLD) has become the most common chronic liver disease worldwide, with a spectrum ranging from simple fatty liver to non-alcoholic steatohepatitis (NASH), liver fibrosis, cirrhosis, and even liver cancer. Within this spectrum, Progressive NASH is the key turning point at which the disease transitions to irreversible liver injury, characterized by significant hepatocyte injury (ballooning degeneration) and liver fibrosis (stage F2 or above). Accurate identification of Progressive NASH patients is critical for early intervention and prevention of disease progression. Currently, the main methods used in clinical practice to assess NAFLD severity and Progressive NASH risk have significant limitations:
Dilemma of the invasive gold standard: Liver biopsy is the current gold standard for histological diagnosis and staging. However, it is an invasive procedure that carries risks of bleeding, pain and infection, has low patient acceptance and high cost, and depends on specialist operators, so it is difficult to use for large-scale population screening or routine follow-up.
Noninvasive evaluation depends on expensive equipment and complex parameters: Liver fibrosis scanning based on vibration-controlled transient elastography is currently an important noninvasive evaluation tool. The FAST Score derived from it, which combines the liver stiffness value, the controlled attenuation parameter and aspartate aminotransferase, has been internationally validated for effectively identifying Progressive NASH. However, calculating the FAST Score relies on FibroScan equipment, which is costly to purchase and maintain and requires specially trained operators. Its low availability in primary medical institutions limits its use as a general screening tool.
Insufficient efficacy of traditional blood indices: To overcome the limitations described above, a variety of noninvasive indices based on conventional blood biochemical parameters, such as the FIB-4 index and the NAFLD fibrosis score, have been developed clinically. Although these indices are easy to obtain, they were designed mainly to evaluate advanced liver fibrosis. For patients at the specific high-risk stage of Progressive NASH, their positive predictive value is low (about 65%) and their sensitivity and specificity are limited, which easily leads to missed diagnosis or misdiagnosis and cannot meet the needs of precise clinical decision-making.
Lack of risk management for high-risk groups: A large number of studies have shown that type 2 diabetes mellitus is an independent risk factor for the progression of NAFLD and NASH, and that the liver fibrosis risk of patients with both conditions is significantly increased. However, current clinical management pathways and care systems for diabetes focus on complications of the heart, kidneys, eyes and so on, and lack standardized tools for assessing and tracking liver disease-related risk, so the liver disease risk of this high-risk population is not systematically managed.
Singleness and limitations of existing AI models: In recent years, although studies have attempted to apply machine learning or deep learning models (such as support vector machines, random forests and neural networks) to liver disease prediction, these methods adopt a single algorithm model. Medical data, especially discrete blood biochemical parameters, are high-dimensional, nonlinear, subject to large individual differences, and imbalanced between positive and negative samples. A single model is susceptible to its inherent bias: for example, decision-tree models may overfit specific features, while neural networks generalize poorly on small amounts of data, resulting in poor prediction stability, robustness and overall performance in complex real-world scenarios.
In view of the foregoing, there is a need for a Progressive NASH risk prediction tool that achieves high accuracy, low cost and wide availability. The ideal tool should be capable of (1) avoiding dependence on expensive equipment by using only conventional blood biochemical parameters that are easy to obtain, (2) being optimized specifically for high-risk groups such as patients with type 2 diabetes, and (3) overcoming t