CN-116959083-B - Feature fusion-based tuberculous retina OCT image classification method

Abstract

The invention discloses a feature fusion-based method for classifying tuberculous retina OCT images, comprising the following steps: step one, acquiring tuberculous retina OCT images; step two, preprocessing the images; step three, constructing a plurality of baseline neural networks; step four, adding a multi-scale gated channel attention mechanism to the baseline networks, and using the learning of the neural network to obtain the importance of each feature channel, so as to strengthen features that benefit the classification task and suppress those that do not; step five, training and testing, namely finding the optimal model under the optimal parameters of each network structure through multiple comparison experiments; and step six, model fusion, namely using a multi-layer convolutional neural network and a gated attention mechanism to construct base learners that serve as feature extractors, fusing the image features extracted by the different models, and inputting the fused features into a support vector machine for further classification and prediction, thereby further improving the classification accuracy for tuberculous retina OCT images.

Inventors

WANG LIPING
CHEN TAO

Assignees

Zhejiang University of Technology (浙江工业大学)

Dates

Publication Date: 20260505
Application Date: 20230505

Claims (4)

1. A tuberculous retina OCT image classification method based on model fusion, characterized by comprising the following steps:
Step 1, acquiring OCT images of the tuberculous retina, and dividing them proportionally into a training set, a validation set and a test set;
Step 2, image preprocessing, namely performing noise reduction on the original images, and performing data enhancement operations on the noise-reduced images, the operations comprising scaling, random cropping, rotation, vertical flipping, horizontal flipping and batch normalization, so as to ensure the generalization capability of the model;
Step 3, constructing a basic network model;
Step 4, adding a multi-scale gated channel attention mechanism to the basic network model, and using the learning of the neural network to obtain the importance of each feature channel, so as to strengthen features that benefit the classification task and suppress those that do not;
the specific process of step 4 is as follows: the gated attention mechanism is implemented by a multi-scale gated channel attention module GCT-A, which comprises three parts: a global context embedding module, a channel normalization module and a gating adaptation module;
the global context embedding module uses three convolution kernels of different sizes, namely 1×1, 3×3 and 5×5, wherein the 1×1 kernel adjusts the number of channels to suit the subsequent convolution operations, and the 3×3 and 5×5 kernels capture receptive fields of different sizes so as to cover a wider range of global context information and avoid local semantic ambiguity; the global context information in each channel is aggregated, as specifically defined below: (1)
where the parameter $\alpha$ is defined as $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_c]$; when $\alpha_n$ approaches 0, channel $n$ does not participate in channel normalization; $X_c$ denotes channel $c$ of the feature map, $H$ the height of the feature map and $W$ its width; the term indexed by $i$ and $j$ refers to all channels at row $i$ and column $j$ of the feature map; $p$ denotes the p-norm, with the 1-norm and the 2-norm used together to form two scales; and a very small constant $\varepsilon$ is introduced to avoid the differentiation problem at zero;
in order to establish competition between neurons and channels, channel normalization is performed using BatchNorm normalization, as specifically defined below: (2)
for each channel, each pixel in the feature map is regarded as a $C$-dimensional vector $S_C$, which is normalized by BatchNorm, where $C$ denotes the number of channels in the feature map; the BatchNorm-normalized feature vectors are divided by a scaling factor related to the channel count $C$ so that their magnitude remains unchanged, finally yielding the channel importance after the 1-norm and the 2-norm have each aggregated the global information;
the channel importances calculated in the two ways are fused to obtain the final normalized vector, giving better performance and stability, the fusion formula being as follows: (3)
when the gating mechanism of a channel is positively activated, it encourages the channel to compete with other channels, whereas when it is negatively activated, it encourages the channel to cooperate with other channels, as specifically defined below: (4)
where $X_C$ is the feature map of original channel $C$, relu is the rectified linear unit, the weight $\gamma$ is defined as $[\gamma_1, \gamma_2, \ldots, \gamma_c]$ and the bias $\beta$ is defined as $[\beta_1, \beta_2, \ldots, \beta_c]$; when the gating weight $\gamma$ and the gating bias $\beta$ are both 0, the original features are passed losslessly to the next layer, so that the GCT can effectively address the degradation problem caused by deep networks;
finally, the activated feature map of the GCT-A module is expressed as: (5)
where $\alpha$, $\gamma$ and $\beta$ are trainable parameters: $\alpha$ is responsible for adaptively weighting the embedding output, while the gating weight $\gamma$ and bias $\beta$ control the activation of the gate; together the three parameters determine the behavior of GCT-A in each channel; under this formulation the computational complexity of GCT-A is only $O(C)$, lower than the $O(C^2)$ of SENet, because SENet uses two fully connected layers, $C$ being the number of channels;
Step 5, training and testing, namely finding the optimal model under the optimal parameters of each network structure through multiple comparison experiments;
Step 6, model fusion, namely inputting the original image dataset into the trained optimal models to obtain image features under the different models, cascading them along the feature dimension to complete feature fusion, constructing a feature dataset, and finally inputting the feature dataset into a support vector machine for further classification training.
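The fusion in step 6 can be illustrated with a minimal sketch. It assumes NumPy and scikit-learn are available and replaces the features of the trained networks with synthetic arrays; the helper `fuse_features`, the feature widths (16 and 8), the RBF kernel and the even/odd split are all hypothetical choices for illustration, not values taken from the patent.

```python
import numpy as np
from sklearn.svm import SVC


def fuse_features(*feature_sets):
    """Cascade per-model feature matrices along the feature dimension."""
    n_images = feature_sets[0].shape[0]
    if any(f.shape[0] != n_images for f in feature_sets):
        raise ValueError("all feature sets must describe the same images")
    return np.concatenate(feature_sets, axis=1)


# Synthetic stand-ins for features extracted by two trained networks
# (assumption: real features would come from the optimal models of step 5).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 40)                      # 0 = normal, 1 = tuberculous
shift = labels[:, None].astype(float)               # separates the two classes
feats_a = rng.normal(size=(80, 16)) + 2.0 * shift   # hypothetical model A
feats_b = rng.normal(size=(80, 8)) - 2.0 * shift    # hypothetical model B

fused = fuse_features(feats_a, feats_b)             # one row per image
classifier = SVC(kernel="rbf").fit(fused[::2], labels[::2])   # even rows train
accuracy = classifier.score(fused[1::2], labels[1::2])        # odd rows test
```

Concatenating along axis 1 keeps one row per image, so the support vector machine sees the features of every model side by side for the same sample.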
2. The tuberculous retina OCT image classification method based on model fusion according to claim 1, wherein the noise reduction in step 2 uses the Perona-Malik (PM) nonlinear diffusion filtering method, which is described by a nonlinear partial differential equation, as shown in formula (6):
$$\frac{\partial L}{\partial t} = \operatorname{div}\big(c(x, y, t)\,\nabla L\big) \tag{6}$$
where $L$ is the gray value of the image, $t$ is time, $x$ and $y$ are the spatial coordinates of the image, $\nabla L$ is the gradient of $L$, and $\operatorname{div}$ is the divergence operator;
the conduction function $c(x, y, t)$ controls the speed of the nonlinear diffusion filtering and is constructed as shown in formula (7):
$$c(x, y, t) = g\big(\lvert \nabla L_\sigma(x, y, t) \rvert\big) \tag{7}$$
where $\nabla L_\sigma$ is the gradient of the Gaussian-smoothed image $L_\sigma$, and the function $g(\cdot)$ is monotonically decreasing and takes one of the following two forms:
$$g_1 = \exp\!\left(-\frac{\lvert \nabla L_\sigma \rvert^2}{k^2}\right) \tag{8}$$
$$g_2 = \frac{1}{1 + \dfrac{\lvert \nabla L_\sigma \rvert^2}{k^2}} \tag{9}$$
where $k$ is a constant that controls the gradient response.
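The PM noise reduction of claim 2 can be sketched as an explicit 4-neighbour update in NumPy. This is an illustrative simplification, not the patented procedure: it applies the conduction function to raw neighbour differences and omits the Gaussian pre-smoothing, and the function names and the defaults `n_iter`, `k` and `step` are assumptions made for the example. The two conduction functions are the exponential and rational forms commonly used with Perona-Malik diffusion.

```python
import numpy as np


def g_exp(grad, k):
    """Exponential conduction function: exp(-|grad|^2 / k^2)."""
    return np.exp(-(grad / k) ** 2)


def g_rational(grad, k):
    """Rational conduction function: 1 / (1 + |grad|^2 / k^2)."""
    return 1.0 / (1.0 + (grad / k) ** 2)


def pm_diffuse(image, n_iter=20, k=0.1, step=0.2, g=g_exp):
    """Explicit 4-neighbour Perona-Malik diffusion (illustrative only).

    Assumption: no Gaussian pre-smoothing of the gradient is applied.
    """
    L = np.asarray(image, dtype=float).copy()
    for _ in range(n_iter):
        padded = np.pad(L, 1, mode="edge")   # replicate border pixels
        # Differences to the north, south, east and west neighbours.
        diffs = [
            padded[:-2, 1:-1] - L,
            padded[2:, 1:-1] - L,
            padded[1:-1, 2:] - L,
            padded[1:-1, :-2] - L,
        ]
        # Each flux is weighted by g, so diffusion slows across strong edges.
        L += step * sum(g(np.abs(d), k) * d for d in diffs)
    return L
```

With `step` at or below 0.25 each update is a convex combination of a pixel and its neighbours, so the output stays within the input's intensity range while small fluctuations are smoothed and large jumps are left nearly untouched.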
3. The tuberculous retina OCT image classification method based on model fusion according to claim 1, wherein in step 4 the GCT-A module is deployed before the convolution layer, so that background and irrelevant noise are better suppressed and the learning focus of the network is concentrated on the beneficial features.
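The kind of channel gating that claim 3 places ahead of a convolution layer can be sketched as follows. This sketch is not the GCT-A module of the claims: it follows the single-scale gated channel transformation (GCT) from the published literature, with a 2-norm embedding, a channel normalization across all channels and a tanh gate, and it leaves out the multi-scale kernels, the BatchNorm normalization, the 1-norm/2-norm fusion and the relu described here. The function name `gct_gate` and the default `eps` are assumptions.

```python
import numpy as np


def gct_gate(x, alpha, gamma, beta, eps=1e-5):
    """Apply GCT-style channel gating to a feature map of shape (C, H, W)."""
    n_channels = x.shape[0]
    # Global context embedding: alpha-weighted 2-norm of each channel.
    s = alpha * np.sqrt((x ** 2).sum(axis=(1, 2)) + eps)
    # Channel normalization: channels compete through the shared denominator.
    s_hat = s * np.sqrt(n_channels) / np.sqrt((s ** 2).sum() + eps)
    # Gating adaptation: a per-channel scale centred on the identity mapping.
    gate = 1.0 + np.tanh(gamma * s_hat + beta)
    return x * gate[:, None, None]


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 3))
ones, zeros = np.ones(4), np.zeros(4)
unchanged = gct_gate(features, ones, zeros, zeros)   # gate equals 1 everywhere
```

Only three vectors of length C are learned, which is where the O(C) cost comes from, and setting the gating weight and bias to zero reduces the module to an identity mapping.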
4. The tuberculous retina OCT image classification method based on model fusion according to claim 1, wherein the specific procedure of step 5 is as follows: the preprocessed training set is input into the base learner, and the trained model is saved after the network parameters have been repeatedly adjusted; the preprocessed test set is then input into the saved model to further test the network performance; because different evaluation metrics capture different aspects of the experimental results, the performance of the network model is evaluated with five metrics, namely Precision, Recall, Specificity, F-Score and Accuracy, whose formulas are as follows:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{10}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{11}$$
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{12}$$
$$\text{F-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{13}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{14}$$
where TN is the number of normal images that the model predicts as normal, FP is the number of normal images that the model predicts as tuberculous, TP is the number of tuberculous images that the model predicts as tuberculous, and FN is the number of tuberculous images that the model predicts as normal.
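The five metrics of claim 4 follow directly from the four confusion-matrix counts. The sketch below uses their standard definitions and assumes the F-Score is the balanced F1 form; the function name and the example counts are invented for illustration and are not results from the patent.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five evaluation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Assumption: F-Score is the harmonic mean of precision and recall (F1).
    f_score = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Made-up counts: 50 tuberculous and 50 normal test images.
metrics = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

Reporting specificity alongside recall matters for screening, because a model can reach high recall simply by labelling most images as tuberculous.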

Description

Feature fusion-based tuberculous retina OCT image classification method

Technical Field

The invention relates to the technical field of medical image processing and analysis within computer vision, and in particular to a tuberculous retina OCT (optical coherence tomography) image classification method based on model fusion.

Background

Tuberculosis (TB) is a chronic infectious disease caused by infection with Mycobacterium tuberculosis (Mtb), and it continues to pose a serious threat to human health. Rapid, efficient and accurate tuberculosis screening is therefore necessary. All ocular tissues except the crystalline lens can be infected by Mycobacterium tuberculosis, and the incidence of intraocular tuberculosis is 1.4-18%. The initial symptoms in the affected eye are usually mild and resemble keratitis, nodular scleritis, trachoma, retinal periphlebitis and similar conditions, so they are often overlooked by patients and doctors and are easily misdiagnosed or missed. When patients present with eye pain, redness or reduced vision, most non-ophthalmic physicians cannot make a reliable diagnosis and do not know when to seek help from an ophthalmologist. If only treatment for common eye diseases is given, without anti-tuberculosis treatment, the condition often fails to improve and may ultimately lead to blindness. At present, tuberculous retinopathy is generally diagnosed by two fundus specialists who independently read the results of wide-angle fundus photography, wide-angle fluorescein angiography, indocyanine green angiography, optical coherence tomography and similar examinations.
The eye is a microvascular system that can be observed non-invasively: its exposed nerves and blood vessels can be viewed directly with simple instruments, without surgical incision, and ocular features can also provide a channel for diagnosing latent systemic tuberculosis infection. Optical coherence tomography (OCT) is an optical tomography technique and one of the most promising tomographic imaging techniques of recent years. It has attractive prospects in biopsy and imaging of biological tissue, and it is applied to clinical diagnosis in ophthalmology as well as in interventional cardiology, where it helps diagnose coronary artery disease. At present, how to use artificial intelligence to extract tuberculous ocular lesion features from OCT images and automatically identify tuberculous lesions is the key to improving the diagnostic efficiency of tuberculosis doctors and relieving diagnostic pressure.

Disclosure of Invention

The invention provides a feature fusion-based tuberculous retina OCT image classification method, which uses a multi-layer convolutional neural network and a gated attention mechanism to construct base learners that serve as feature extractors, fuses the image features extracted by the different models, and inputs the fused features into a support vector machine for further classification and prediction, thereby further improving the accuracy of tuberculous retina OCT image classification.
The invention is realized by the following technical scheme:

A tuberculous retina OCT image classification method based on model fusion comprises the following steps:
Step 1, acquiring OCT images of the tuberculous retina, and dividing them proportionally into a training set, a validation set and a test set;
Step 2, image preprocessing, namely performing noise reduction on the original images, and performing data enhancement operations on the noise-reduced images, the operations comprising scaling, random cropping, rotation, vertical flipping, horizontal flipping and batch normalization, so as to ensure the generalization capability of the model;
Step 3, constructing ResNet and GoogLeNet network models as the basic network models;
Step 4, adding a multi-scale gated channel attention mechanism to the ResNet network, and using the learning of the neural network to obtain the importance of each feature channel, so as to strengthen features that benefit the classification task and suppress those that do not, wherein the non-beneficial features are irrelevant information such as noise in the image, and the beneficial features are the important information in the image;
Step 5, training and testing, namely finding the optimal model under the optimal parameters of each network structure through multiple comparison experiments;
Step 6, model fusion, namely inputting the original image dataset into the trained optimal models to obtain image features under the different models, cascading them along the feature dimension to complete feature fusion, constructing a feature dataset, and finally inputting the feature datas