CN-121981989-A - Medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion

CN 121981989 A

Abstract

A medical image analysis method based on ResNet depth feature extraction and Stacking ensemble fusion comprises the steps of: 1) performing standardized preprocessing on medical image data to construct a training set; 2) extracting high-dimensional depth features of the images with a pre-trained ResNet network to capture pathological information; 3) adopting a two-layer Stacking framework, whose first layer learns the depth features with several heterogeneous base learners and outputs prediction probabilities as meta-features; 4) concatenating and standardizing the meta-features with the original depth features to form fusion features; 5) using logistic regression in the second layer as the meta-learner to obtain the final classification model; and 6) evaluating performance through cross-validation and an external test set, and analyzing key image regions and their pathological associations by combining feature importance ranking and attention visualization. The invention combines the feature representation capability of convolutional neural networks with the advantages of an ensemble learning framework, improving the accuracy, stability and interpretability of medical image classification.

Inventors

  • Jia Dayu
  • Niu Jingyu
  • Song Beihan
  • Huang Jin
  • Wang Yuzi
  • Zhou Jianghao

Assignees

  • Liaoning University (辽宁大学)

Dates

Publication Date
2026-05-05
Application Date
2026-01-22

Claims (7)

  1. A medical image analysis method based on ResNet depth feature extraction and Stacking ensemble fusion, characterized by comprising the following steps:
     Step 1) Perform standardized preprocessing on the medical image dataset, including unifying image size, contrast enhancement and data augmentation, and construct training samples;
     Step 2) Extract high-dimensional depth features of the images using a ResNet deep residual network pre-trained on a large-scale natural image dataset, capturing pathological information from local textures to global semantics through its multi-level convolutional structure;
     Step 3) Construct a two-layer Stacking ensemble fusion framework, in which the first layer introduces several heterogeneous base learners such as a support vector machine, a random forest and a gradient boosting tree, performs preliminary learning on the features extracted by ResNet, and outputs prediction probabilities as meta-features;
     Step 4) Concatenate and standardize the original depth features extracted by ResNet and the meta-features generated by the first Stacking layer to form an enhanced fusion feature vector;
     Step 5) Adopt logistic regression as the meta-learner of the second Stacking layer, train and optimize on the fusion features, and generate the final classification decision model;
     Step 6) Evaluate model performance through cross-validation and an external test set, and analyze the association between key image regions and pathological characterization by combining feature importance ranking and attention visualization.
  2. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 1) specifically comprises:
     Step 1.1) Inspect the images in the dataset, check for and delete DICOM files corrupted by missing key information or metadata, quantify image quality by computing the entropy and signal-to-noise ratio of each image, automatically remove images in which key structures cannot be identified, and remove scan slices that do not contain the target region according to the task objective;
     Step 1.2) Unify image size and format by resampling all images to a fixed size with bilinear interpolation, adjust the window width and window level of CT images so that the HU value range of the tissue of interest is mapped to the display range, and normalize all images by applying Z-Score normalization to the pixel values of the entire dataset, with the formula

        z = (x − μ) / σ

     where x is a pixel value and μ and σ are the mean and standard deviation of the pixel values over the dataset; convert single-channel grayscale images into three-channel images by channel replication to match the input requirements of a ResNet model pre-trained on ImageNet;
     Step 1.3) Image enhancement and contrast optimization: enhance the overall contrast of the image with contrast-limited adaptive histogram equalization, which partitions the image into blocks and performs local histogram equalization while limiting contrast amplification, and apply Gaussian filtering for smoothing and denoising;
     Step 1.4) Data augmentation: apply image augmentation to the training-set images.
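The normalization and channel-replication steps of claim 2 can be sketched in NumPy as follows; the function names and the random toy batch are illustrative, not part of the patent:

```python
import numpy as np

def zscore_normalize(images):
    """Z-Score normalize pixel values over the whole dataset (claim 2, step 1.2)."""
    mu = images.mean()
    sigma = images.std()
    return (images - mu) / sigma

def gray_to_three_channel(image):
    """Replicate a single-channel image into 3 channels for ResNet input."""
    return np.repeat(image[np.newaxis, :, :], 3, axis=0)  # shape (3, H, W)

# toy example: a batch of 4 single-channel 224x224 "images"
rng = np.random.default_rng(0)
batch = rng.normal(loc=100.0, scale=20.0, size=(4, 224, 224))
norm = zscore_normalize(batch)          # dataset-wide mean 0, std 1
rgb = gray_to_three_channel(norm[0])    # first image, replicated to 3 channels
```

In practice the resampling, windowing and CLAHE steps would precede this, typically via an imaging library; only the arithmetic core is shown here.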
  3. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 2) specifically comprises:
     Step 2.1) Load and configure the pre-trained ResNet model: load ResNet model weights pre-trained on the ImageNet dataset, remove the global pooling layer and the fully connected classification layer at the top of the model, retain the convolutional layers as the feature extractor, and selectively freeze shallow convolutional parameters according to task requirements so as to preserve general visual features while allowing the deeper layers to be fine-tuned to the characteristics of medical images;
     Step 2.2) Build the feature extraction pipeline: feed the preprocessed, standardized images in batches into the adjusted ResNet network, where the images pass in turn through five stages of residual convolution modules, each composed of convolutional layers, batch normalization layers, activation functions and residual connections; multi-scale features are extracted and fused stage by stage, and the last convolutional layer finally yields a high-dimensional feature map rich in spatial and semantic information;
     Step 2.3) Feature map aggregation and vectorization: apply global average pooling to the high-dimensional feature map of size [batch_size, 2048, 7, 7] output by the last convolution module, compute the mean of all pixel values over the two-dimensional space of each channel, and compress the feature map of each sample into a 2048-dimensional feature vector that compactly encodes hierarchical information from micro textures to macro semantics in the image;
     Step 2.4) Feature post-processing and dataset construction: apply L2 normalization to the extracted 2048-dimensional feature vectors so that each vector has unit norm, eliminating the possible influence of feature scale differences on subsequent models, with the normalization formula

        v̂ = v / ‖v‖₂

     where ‖v‖₂ is the L2 norm of the vector v; associate the normalized feature vectors with the corresponding image labels and related clinical metadata to construct a structured feature dataset for training and validating the subsequent Stacking ensemble model.
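The pooling and normalization of steps 2.3–2.4 reduce to a few lines of NumPy; in this sketch a random array stands in for the [batch_size, 2048, 7, 7] output of a pre-trained ResNet50:

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse a [batch, channels, H, W] feature map to [batch, channels]
    by averaging over the two spatial dimensions (claim 3, step 2.3)."""
    return feature_maps.mean(axis=(2, 3))

def l2_normalize(vectors, eps=1e-12):
    """Scale each row vector to unit L2 norm (claim 3, step 2.4)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)

# stand-in for the last convolution module's output
rng = np.random.default_rng(1)
fmap = rng.normal(size=(4, 2048, 7, 7))
vecs = l2_normalize(global_average_pool(fmap))  # one 2048-d unit vector per image
```

Each resulting row is a 2048-dimensional unit-norm feature vector, ready to be associated with labels and clinical metadata as described in step 2.4.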
  4. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 3) specifically comprises:
     Step 3.1) Set up the first-layer pool of heterogeneous base learners: select and initialize several machine learning models with different structural principles as first-layer base learners, exploiting their diversity to capture different patterns in the data;
     Step 3.2) Generate meta-features by K-fold cross-validation: randomly partition the ResNet-extracted and normalized feature dataset into K disjoint subsets, and perform K rounds of training and prediction for each base learner; in each round, train the model on K−1 subsets and use the trained model to generate prediction probabilities for the remaining, unseen subset; concatenate the prediction probabilities of all folds in the original order to form meta-feature columns that correspond one-to-one both to the base learner and to the original training samples, and take the average of the predictions of the K trained models as the meta-features of the test set;
     Step 3.3) Concatenate to construct the meta-feature dataset: horizontally concatenate the meta-feature columns produced by all base learners through the cross-validation process, optionally concatenating the original ResNet depth features as well, to form an enhanced fusion feature dataset; this dataset combines the original depth features with the "views" obtained by several models that interpret the data from different angles, and has dimension (number of samples) × (number of base learners × prediction probability dimension + 2048);
     Step 3.4) Train the second-layer meta-learner for final fusion: take the fusion feature dataset constructed in step 3.3) as new training data and the corresponding original labels as the target variable, and train a second-layer meta-learner that learns how to optimally weigh and combine the "views" provided by each first-layer base learner with the original features to make the final classification decision; after training, to predict a new sample, all first-layer base learners of the two-layer Stacking model generate its meta-features, which are concatenated with the original features and fed into the second-layer meta-learner to obtain the final result.
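Steps 3.1–3.4 can be sketched compactly with scikit-learn; a small synthetic dataset replaces the ResNet features, the gradient boosting learner is omitted for brevity, and all model choices here are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

# stand-in for the L2-normalized ResNet feature matrix
X, y = make_classification(n_samples=200, n_features=32, random_state=0)

# step 3.1: heterogeneous first-layer base learners
base_learners = [
    SVC(probability=True, random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
]

# step 3.2: out-of-fold prediction probabilities — each sample is predicted
# by a model that never saw it during training
meta_features = np.hstack([
    cross_val_predict(clf, X, y, cv=5, method="predict_proba")
    for clf in base_learners
])

# step 3.3: optionally concatenate the original depth features
fused = np.hstack([meta_features, X])

# step 3.4: second-layer meta-learner trained on the fused representation
meta_learner = LogisticRegression(max_iter=1000).fit(fused, y)
```

With two base learners and two classes, `meta_features` has 2 × 2 = 4 columns per sample; the K-fold scheme is what keeps these meta-features unbiased, since no base learner predicts a sample it was trained on.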
  5. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 4) specifically comprises:
     Step 4.1) Feature concatenation: let the original depth feature matrix extracted by ResNet be F, with N the number of samples, and let the meta-feature matrix generated by the M base learners of the first Stacking layer through K-fold cross-validation be D, whose width is determined by the number of classification categories and the number of base models; concatenate the original depth feature vector and the meta-feature vector of each sample into one longer fusion feature vector, forming a preliminary fusion feature matrix;
     Step 4.2) Feature standardization: since the concatenated features have different scales and distributions, which could bias the subsequent model, the fused features must be re-standardized; Z-Score standardization is applied independently to each feature dimension so that it has mean 0 and standard deviation 1, computed as

        x̂ᵢⱼ = (xᵢⱼ − μⱼ) / σⱼ

     where xᵢⱼ is the original value of the i-th sample on the j-th feature dimension, and μⱼ and σⱼ are respectively the mean and standard deviation of the j-th feature dimension on the training set, yielding the standardized fusion feature matrix;
     Step 4.3) Construct the final fusion feature dataset: store the standardized fusion feature matrix in association with the sample labels y and the necessary identification information to build the enhanced fusion feature dataset; this dataset integrates low-level deep visual features with upper-level multi-model prediction semantics, offering both expressive richness and discriminative diversity, and provides the input for training the second-layer meta-learner of the Stacking model.
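The concatenation and per-dimension Z-Score standardization of claim 5 can be sketched as follows; note that, as step 4.2 requires, the mean and standard deviation would be estimated on the training set only (the helper names and toy matrices are illustrative):

```python
import numpy as np

def fit_standardizer(train_features):
    """Per-dimension mean/std estimated on the training set (claim 5, step 4.2)."""
    mu = train_features.mean(axis=0)
    sigma = train_features.std(axis=0)
    sigma[sigma == 0.0] = 1.0  # guard against constant columns
    return mu, sigma

def standardize(features, mu, sigma):
    """Apply (x - mu) / sigma independently to each feature dimension."""
    return (features - mu) / sigma

rng = np.random.default_rng(2)
depth = rng.normal(loc=5.0, scale=3.0, size=(100, 8))  # stand-in depth features
meta = rng.uniform(size=(100, 4))                      # stand-in meta-features
fused = np.hstack([depth, meta])                       # step 4.1 concatenation
mu, sigma = fit_standardizer(fused)
fused_std = standardize(fused, mu, sigma)              # mean 0, std 1 per column
```

At inference time the same `mu` and `sigma` fitted on the training set are reused, so new samples are mapped into the same standardized space.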
  6. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 5) specifically comprises:
     Step 5.1) Select logistic regression as the second-layer meta-learner, outputting class probabilities through the Sigmoid function; for multi-class tasks, use multinomial logistic regression in its extended form, outputting class probabilities through the Softmax function; train the model with the enhanced fusion feature dataset output in step 4), which was generated by the first-layer base learners through K-fold cross-validation, thereby ensuring that the meta-features are unbiased;
     Step 5.2) Hyperparameter optimization and model training: search a preset hyperparameter space with a grid search or random search strategy, the space including the regularization strength C, the regularization type L1 or L2, and key parameters of the optimization solver; evaluate and select model performance through nested cross-validation, using AUC or the F1 score as the core optimization metric, to finally determine the optimal hyperparameter combination and complete model training;
     Step 5.3) Generate and evaluate the final model: retrain the logistic regression model on the entire fusion feature training set with the determined optimal hyperparameters to obtain the final Stacking classification decision model; the model takes the fusion feature vector x as input and outputs the probability that the sample belongs to the positive class, and for a binary classification task the decision function is

        P(y = 1 | x) = 1 / (1 + e^(−(wᵀx + b)))

     where w is the weight vector and b is the bias; the performance of the final model is then evaluated comprehensively on the test set of step 6).
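The binary decision function of step 5.3 is a few lines of NumPy; the weights below are made-up illustrative values, not fitted parameters from the patent:

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_positive_prob(x, w, b):
    """Decision function of claim 6, step 5.3: P(y=1|x) = sigmoid(w.T x + b)."""
    return sigmoid(np.dot(w, x) + b)

# illustrative (not fitted) weight vector, bias and fusion feature vector
w = np.array([0.8, -0.5, 0.3])
b = -0.1
x = np.array([1.0, 2.0, 0.5])
p = predict_positive_prob(x, w, b)  # probability that the sample is positive
```

For the multi-class variant mentioned in step 5.1, the dot product becomes a matrix of per-class scores and the Sigmoid is replaced by a Softmax over those scores.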
  7. The medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion according to claim 1, characterized in that step 6) specifically comprises:
     Step 6.1) Internal robustness assessment: assess the robustness of the complete training pipeline, from ResNet feature extraction to the Stacking model, with stratified K-fold cross-validation; partition the original training set into K folds, and in each fold use the remaining K−1 folds for the complete model training of steps 1) to 5) while the held-out fold serves as an internal validation set; summarize the mean and standard deviation of the evaluation metrics over all folds to assess the stability and generalization of model performance and avoid accidental results caused by a single data split;
     Step 6.2) Verify external generalization: test the performance of the finally trained Stacking model on an external test set that is completely independent and took no part in any training or tuning; compute and report comprehensive performance metrics on this test set, plot ROC and PR curves, calculate the AUC, and comprehensively evaluate the discriminative power and generalization of the model in a realistic application scenario;
     Step 6.3) Model interpretability analysis and clinical relevance verification: rank the global importance of the fusion features based on the weight coefficients of the second-layer logistic regression model, quantifying the contributions of each base learner's output and of the original depth features; supplement this with sample-level local contribution decomposition using SHAP values to identify the key feature patterns that distinguish pathological states; generate attention heatmaps with the gradient-weighted class activation mapping (Grad-CAM) technique and overlay them on the original images to intuitively locate the key regions on which the model's decisions depend; correlate the feature importance ranking with the highly activated regions revealed by the attention heatmaps, cross-check them against physician annotations or clinical pathology reports, explain the consistency between the image patterns captured by the model and known pathological characterization, and verify the decision rationality and clinical reliability of the model.
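The fold-wise mean-and-standard-deviation summary of step 6.1 can be sketched with scikit-learn; here a toy dataset and a bare logistic regression stand in for the full Stacking pipeline of steps 1) to 5):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# toy stand-in for the fused feature dataset of claim 5
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

aucs = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in skf.split(X, y):
    # in the real pipeline, this fit would be the complete steps 1)-5)
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    probs = model.predict_proba(X[val_idx])[:, 1]
    aucs.append(roc_auc_score(y[val_idx], probs))

# step 6.1: report stability as mean ± std over the folds
mean_auc, std_auc = np.mean(aucs), np.std(aucs)
```

Stratification keeps the class ratio of each fold close to that of the whole set, which matters for imbalanced medical datasets; the external-test evaluation of step 6.2 would reuse `roc_auc_score` (plus PR curves) on a held-out cohort instead of cross-validation folds.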

Description

Medical image analysis method based on ResNet50 depth feature extraction and Stacking ensemble fusion

Technical Field

The invention relates to the field of intelligent analysis of medical images. It addresses the problems that current deep-learning-based medical image diagnosis methods mostly rely on feature extraction and classification by a single model and lack a systematic mechanism for fusing and cooperatively exploiting the multi-level, multi-dimensional features of images, so that models generalize insufficiently and are poorly interpretable. By fusing the strong feature representation capability of a pre-trained deep convolutional network with the robust decision mechanism of a multi-layer ensemble learning framework, the invention constructs a layered, interpretable medical image analysis method, solves the problem of cooperatively exploiting local detail and global semantic information in medical images, and provides a medical image analysis method based on ResNet feature extraction and Stacking ensemble fusion.

Background

In recent years, the volume of medical image data has grown explosively, and modalities such as CT have become important bases for early disease screening, diagnosis, staging and treatment evaluation. However, traditional medical image analysis depends mainly on the visual interpretation and experience of radiologists, and suffers from low diagnostic efficiency, poor consistency across readers, and fatigue-induced missed diagnoses and misdiagnoses during long working hours; particularly in high-load clinical environments, these problems directly affect the timeliness of diagnosis and patient prognosis.
At present, artificial-intelligence-based medical image analysis has made remarkable progress and falls mainly into two categories: end-to-end deep learning models, which directly adopt convolutional neural networks pre-trained on natural images, such as ResNet, DenseNet or EfficientNet, to extract features from medical images and classify them; and traditional machine-learning methods, which make decisions by feeding manually designed or shallow-network-extracted radiomic features into classifiers such as support vector machines and random forests. Although these methods perform well in some specific tasks, they have the following shortcomings. First, a single deep learning model often focuses on features at one scale or abstraction level and can hardly capture all the pathological information in a medical image simultaneously and effectively, while traditional handcrafted features are limited by the designers' prior knowledge and cannot adapt to the potentially complex patterns in the data. Second, traditional methods cannot effectively integrate the abstract semantic features extracted by deep models with the robust statistical discriminative features provided by traditional models, so the complementary advantages of different feature sources are not fully exploited. Third, the models are poorly interpretable: a "black box" decision is hard to trace back to a specific image region and cannot provide a diagnostic basis that clinicians can trust. Fourth, the models lack generalization and robustness, with obvious performance degradation on cross-center and cross-device data.
These drawbacks together limit the translation of the prior art into highly reliable and widely applicable clinical diagnostic aids.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a medical image analysis method based on ResNet depth feature extraction and Stacking ensemble fusion. First, a standardized medical image preprocessing pipeline is constructed, forming high-quality input samples through unified image size, window width and window level adjustment, Z-Score standardization, adaptive histogram equalization and data augmentation. Second, high-dimensional depth features are extracted with a ResNet network pre-trained on ImageNet, capturing multi-level information in the image from local texture to global semantics. On this basis, a two-layer Stacking ensemble fusion framework is designed: the first layer integrates several heterogeneous base learners, performs preliminary learning on the depth features and generates meta-features; the original depth features and the meta-features are concatenated and standardized into an enhanced fusion feature vector; and logistic regression is adopted as the second-layer meta-learner to train and optimize on the fusion features, yielding the final classification model. The invention is realized by the following technical scheme.