
CN-117593317-B - Retina blood vessel image segmentation method based on multi-scale dilation convolution residual error network


Abstract

The invention discloses a retinal blood vessel image segmentation method based on a multi-scale dilated convolution residual network. It addresses the difficulty of accurately segmenting retinal blood vessels in fundus images, in particular thin vessels, under limited labeled data, variation among vessels, and interference from lesion areas. The method combines dilated convolution with DropBlock to mitigate network overfitting and to reduce the influence of lesion areas on vessel feature extraction. A multi-scale mean pooling module is introduced to acquire high-level features while retaining context information. Finally, the skip connections are improved so that dilated convolution strengthens the information transfer across them. Compared with other algorithms, the method segments tiny vessels in retinal images more accurately under complex conditions and has better robustness.

Inventors

  • SHANG ZHENHONG
  • HUANG HUA

Assignees

  • Kunming University of Science and Technology (昆明理工大学)

Dates

Publication Date
2026-05-08
Application Date
2023-12-04

Claims (8)

  1. A retinal blood vessel image segmentation method based on a multi-scale dilated convolution residual network, characterized by comprising the following steps: Step 1, preprocess the retinal images. Step 2, enlarge the set of training images with data augmentation: random horizontal flips, vertical flips, random rotation with angles in the range [0, 360], and random cropping, obtaining the model training set. Step 3, improve the conventional U-Net network, in light of the characteristics of retinal images, in three respects: the encoder and decoder, the skip connections, and the loss function. A multi-scale residual input (MRI) module, a multi-scale residual output (MRO) module, and a multi-scale average pooling module are added. In each encoder layer, the MRI module processes the input image at a different scale and passes its output to a combined dilated convolution (DC) module, built from dilated convolutions with different dilation rates, DropBlock, batch normalization (BN), and ReLU activations, to obtain that layer's output features. Each encoder layer is connected to the DC module of the corresponding decoder layer through a DRes Path module, which replaces the plain skip connection of U-Net and transforms the features in transit so as to better fuse them and retain important detail information. In every decoder layer except the first, the MRO module receives the features output by that layer's DC module and generates output feature maps at different scales, which are weighted and fused to obtain the retinal segmentation result image. Step 4, form a mixed loss function from the binary cross-entropy loss function and the Dice loss function, and train the network model designed in step 3 on the training set obtained in step 2 until the mixed loss function converges optimally.
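The augmentation in step 2 can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the 48x48 patch size is a hypothetical choice, and rotation is restricted to 90-degree multiples to stay dependency-free (the claim allows any angle in [0, 360], which would need e.g. scipy.ndimage.rotate).

```python
import numpy as np

rng = np.random.default_rng(3)

def augment(img, rng):
    """One random augmentation pass: horizontal/vertical flips,
    a random 90-degree-multiple rotation, and a random crop."""
    if rng.random() < 0.5:
        img = np.fliplr(img)          # random horizontal flip
    if rng.random() < 0.5:
        img = np.flipud(img)          # random vertical flip
    img = np.rot90(img, k=rng.integers(0, 4))  # random rotation
    # Random crop to a hypothetical 48 x 48 patch.
    size = 48
    H, W = img.shape[:2]
    top = rng.integers(0, H - size + 1)
    left = rng.integers(0, W - size + 1)
    return img[top:top + size, left:left + size]

img = rng.standard_normal((64, 64))
patch = augment(img, rng)
```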
  2. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network of claim 1, wherein the preprocessing in step 1 is as follows: after grayscale conversion, contrast-limited adaptive histogram equalization and gamma correction are applied to enhance image contrast, highlighting the retinal vascular structures.
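The gamma-correction half of the preprocessing can be sketched as below; the gamma value 1.2 is an assumption, not a value from the patent. CLAHE would typically be applied first via OpenCV's cv2.createCLAHE and is omitted here to keep the sketch dependency-free.

```python
import numpy as np

def gamma_correct(gray, gamma=1.2):
    """Gamma-correct a grayscale image with values in [0, 255].

    Normalizes to [0, 1], raises to the power 1/gamma, and rescales.
    A gamma > 1 brightens mid-tones, helping expose darker vessel pixels.
    """
    norm = gray.astype(np.float64) / 255.0
    corrected = np.power(norm, 1.0 / gamma)
    return (corrected * 255.0).astype(np.uint8)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
out = gamma_correct(img, gamma=1.2)
```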
  3. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network according to claim 1, wherein step 3 introduces multi-scale information in the encoder part: input images downsampled to different scales are each passed to a multi-scale residual input (MRI) module that extracts features at the corresponding scale. The MRI module consists of two convolution layers, each followed by DropBlock, batch normalization, and a ReLU activation. The second layer is a 1×1 convolution that reduces the channel count and compresses information, lowering network complexity, improving computational efficiency, and avoiding blow-up as the network deepens. This two-layer sequence is executed three times on the input image, each time with a different dilation rate; the three output feature maps are concatenated, and the concatenated map is added pixel-wise to the original input through a residual connection, so that multi-scale features are fully exploited and feature transfer is strengthened, yielding the MRI module's output feature map. To keep the MRI module's output matched to the output of the combined dilated convolution module, the MRI channel count increases with network depth.
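The concatenate-compress-add pattern of the MRI module can be illustrated with random feature maps. All shapes here are hypothetical, the three branch outputs stand in for the three dilation-rate passes, and the 1×1 convolution is realized as a channel-mixing tensordot; the convolutions, DropBlock, BN, and ReLU themselves are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing tensordot.
    x: (C_in, H, W), w: (C_out, C_in) -> (C_out, H, W)."""
    return np.tensordot(w, x, axes=([1], [0]))

# Hypothetical shapes: an 8-channel input, and three branches (standing in
# for the dilation rates 1, 2, 3 in the claim), each producing 8 channels.
x = rng.standard_normal((8, 16, 16))
branches = [rng.standard_normal((8, 16, 16)) for _ in range(3)]

# Concatenate the three branch outputs along the channel axis ...
merged = np.concatenate(branches, axis=0)       # (24, 16, 16)
# ... compress back to the input channel count with a 1x1 convolution ...
w = rng.standard_normal((8, 24)) * 0.1
compressed = conv1x1(merged, w)                 # (8, 16, 16)
# ... then add the residual connection (pixel-wise addition with the input).
out = compressed + x
```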
  4. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network according to claim 1, wherein step 3 adds a multi-scale residual output (MRO) module after the combined dilated convolution module of every decoder layer except the first, for extracting features at different scales. The MRO module adds an upsampling step to the MRI module so that each MRO output feature map matches the size of the decoder's 3DC-module output feature map. The outputs of the three MRO modules and the 3DC-module output are all single-channel feature maps; they are weighted by coefficients w1 to w4, each taking a value between 0 and 1, and concatenated into a 4-channel feature map. Finally, a 1×1 convolution and a Sigmoid activation function are applied to produce a probability image with values in the range 0 to 1.
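The weighted fusion at the output can be sketched as below. The four maps stand in for the three MRO outputs plus the decoder's 3DC-module output; the weights and map resolution are hypothetical, and the 1×1 convolution over the 4 channels is realized as a tensordot.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
# Four single-channel maps at the output resolution (shape hypothetical).
maps = [rng.standard_normal((1, 32, 32)) for _ in range(4)]
w = np.array([0.4, 0.3, 0.2, 0.1])   # weighting coefficients w1..w4 in (0, 1)

# Weight each map and stack into a 4-channel feature map.
fused = np.concatenate([wi * m for wi, m in zip(w, maps)], axis=0)
# Collapse to one channel with a 1x1 convolution, then squash with Sigmoid
# to obtain a probability map with values in (0, 1).
kernel = rng.standard_normal(4)
prob = sigmoid(np.tensordot(kernel, fused, axes=([0], [0])))
```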
  5. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network as set forth in claim 1, wherein the step 3 network introduces a combined dilated convolution (DC) module built from dilated convolutions with different dilation rates, DropBlock, batch normalization (BN), and ReLU activation functions; the combined structures with dilation rates of 1, 2, and 3 are named the 1DC, 2DC, and 3DC modules. The DC modules accelerate convergence of network training and effectively mitigate overfitting of the convolutional network. In the improved U-Net model of step 3, the dilation rate of the DC module gradually increases with depth in the encoder on the left and gradually decreases with depth in the decoder on the right.
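The effect of the dilation rate can be seen in a minimal single-channel, 'valid'-padding sketch: a 3x3 kernel at rate r covers a (2r+1) x (2r+1) receptive field. DropBlock, BN, and ReLU around the convolution are omitted here.

```python
import numpy as np

def dilated_conv2d(x, k, rate):
    """'Valid' 2D convolution with a dilated kernel (illustrative sketch).
    Dilation inserts rate-1 gaps between kernel taps, so a 3x3 kernel
    covers a (2*rate+1) x (2*rate+1) receptive field."""
    kh, kw = k.shape
    eff_h = (kh - 1) * rate + 1   # effective kernel height
    eff_w = (kw - 1) * rate + 1   # effective kernel width
    H, W = x.shape
    out = np.zeros((H - eff_h + 1, W - eff_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Sample the input at strided (dilated) tap positions.
            patch = x[i:i + eff_h:rate, j:j + eff_w:rate]
            out[i, j] = np.sum(patch * k)
    return out

x = np.ones((9, 9))
k = np.ones((3, 3))
# Rates 1 and 3 mirror the 1DC and 3DC modules in the claim.
o1 = dilated_conv2d(x, k, 1)   # receptive field 3x3
o3 = dilated_conv2d(x, k, 3)   # receptive field 7x7
```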
  6. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network of claim 1, wherein step 3 adds a multi-scale mean pooling MAP module between the convolution layers of each encoder and decoder of the U-Net; the MAP module addresses the variation in the size of segmented objects. For an input feature map S of size c×h×w, where c is the number of image channels, h is the image height, and w is the image width, the module output Z is calculated as follows: Z_k = U(f(P_k(S))), for k = 1, ..., N; S' = [S, Z_1, ..., Z_N]; Z = f(S'); wherein S represents the features input to the MAP module, U represents the upsampling operation, P_k is the mean pooling operation with pooling kernel size n_k, f represents a standard 1×1 convolution, and [ ] represents the join (concatenation) operation. The MAP module's pyramid structure encodes the input feature map S to capture global context information: the pyramid pools with four receptive fields of different sizes; to balance the parameters, the pooled feature map of each pyramid level is reduced in dimension by a 1×1 convolution that cuts the channel count to 1/N, where N is the number of pyramid levels; the 1×1 convolution result is then upsampled by bilinear interpolation; finally, the input features and all upsampled feature maps are concatenated to obtain S'. However, a semantic incompatibility problem exists between the different feature maps within S', and direct fusion would lead to inaccurate segmentation; to fuse the features effectively, the MAP module applies a further 1×1 convolution layer to S', which reduces the feature dimension and realizes effective feature fusion.
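The pyramid pool-reduce-upsample-concatenate-fuse pipeline of the MAP module can be sketched as follows. The channel count, spatial size, and the four pooling kernel sizes are hypothetical; nearest-neighbour upsampling stands in for the bilinear interpolation of the claim, and 1×1 convolutions are realized as tensordots.

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_pool(x, n):
    """Non-overlapping n x n mean pooling over each channel.
    x: (C, H, W) with H, W divisible by n."""
    C, H, W = x.shape
    return x.reshape(C, H // n, n, W // n, n).mean(axis=(2, 4))

def upsample_nearest(x, n):
    """Nearest-neighbour upsampling by factor n (the claim uses
    bilinear interpolation; nearest keeps the sketch short)."""
    return x.repeat(n, axis=1).repeat(n, axis=2)

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing tensordot."""
    return np.tensordot(w, x, axes=([1], [0]))

C, H, W = 8, 16, 16
S = rng.standard_normal((C, H, W))
pool_sizes = [1, 2, 4, 8]          # four receptive fields (N = 4 levels)
N = len(pool_sizes)

levels = []
for n in pool_sizes:
    p = mean_pool(S, n)                         # pool at this level
    w = rng.standard_normal((C // N, C)) * 0.1
    p = conv1x1(p, w)                           # reduce channels to C/N
    levels.append(upsample_nearest(p, n))       # back to H x W

Sp = np.concatenate([S] + levels, axis=0)       # S' : (C + N*(C/N), H, W)
w_fuse = rng.standard_normal((C, 2 * C)) * 0.1
Z = conv1x1(Sp, w_fuse)                         # final 1x1 fusion
```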
  7. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network according to claim 1, wherein in step 3 the U-Net network combines a binary cross-entropy loss function L_bce and a Dice loss function L_dice into a mixed loss function, calculated respectively as shown in the following formulas: L_bce = -(1/n) Σ_i [ y_i · log(p_i) + (1 - y_i) · log(1 - p_i) ]; L_dice = 1 - ( 2 Σ_i p_i · y_i ) / ( Σ_i p_i + Σ_i y_i ); wherein p_i is the model's predicted value for pixel i, whose magnitude reflects the likelihood that the pixel is a vessel pixel (the greater the value, the greater the likelihood); y_i is the label, with value 0 or 1; and n is the number of pixels in the image. The total loss function is expressed as: L = L_bce + L_dice.
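The standard binary cross-entropy and Dice losses named in the claim can be written directly in NumPy; the unweighted sum for the mixed loss is an assumption, as the claim does not state the combination weights.

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy averaged over n pixels.
    p: predicted probabilities, y: binary labels."""
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def dice_loss(p, y, eps=1e-7):
    """Dice loss = 1 - Dice coefficient."""
    return 1 - (2 * np.sum(p * y) + eps) / (np.sum(p) + np.sum(y) + eps)

def mixed_loss(p, y):
    # An unweighted sum of the two terms is assumed here.
    return bce_loss(p, y) + dice_loss(p, y)

y = np.array([1.0, 0.0, 1.0, 0.0])
perfect = np.array([1.0, 0.0, 1.0, 0.0])
wrong = np.array([0.0, 1.0, 0.0, 1.0])
```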
  8. The retinal blood vessel image segmentation method based on the multi-scale dilated convolution residual network according to claim 1, wherein an Adam optimizer is used during model training in step 3; the learning rate is set between 0.000001 and 0.001, the number of network training iterations between 150 and 250, and the batch size between 4 and 8; the DropBlock dropped-block size for each dataset is set between 4 and 7, and the keep probability of each neuron's output is between 0.1 and 0.9.
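The training hyperparameters of claim 8 can be gathered into a configuration sketch. The concrete values chosen below are hypothetical picks from within the claimed ranges; the claim does not fix per-dataset values.

```python
# Hypothetical training configuration drawn from the ranges in claim 8.
train_config = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,        # claimed range: 1e-6 to 1e-3
    "epochs": 200,                # claimed range: 150 to 250 iterations
    "batch_size": 4,              # claimed range: 4 to 8
    "dropblock_block_size": 7,    # claimed range: 4 to 7
    "dropblock_keep_prob": 0.85,  # claimed range: 0.1 to 0.9
}
```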

Description

Retina blood vessel image segmentation method based on multi-scale dilation convolution residual error network Technical Field The invention relates to the field of medical image processing, in particular to a retinal vessel image segmentation method based on a multi-scale dilated convolution residual network. Background Retinal diseases such as diabetic retinopathy, glaucoma, and age-related maculopathy are the main causes of blindness in the elderly. Clinically, observing changes in the retinal blood vessels in fundus images helps to diagnose related diseases and make subsequent treatment decisions. This procedure requires highly trained specialist doctors to manually mark the retinal blood vessels; it is time-consuming and labor-intensive, and the marking results are affected by subjective human factors. Automatic computer-aided segmentation of retinal blood vessels is therefore of great importance in the diagnosis of related diseases. Retinal vessel segmentation has long been a challenging task in medical image segmentation, and existing methods still leave room for improvement in accuracy, especially for tiny retinal vessels. Currently, retinal vessel segmentation methods fall roughly into two categories: supervised learning methods and unsupervised methods. Unsupervised methods attempt to achieve segmentation by exploiting the fixed structure of retinal blood vessels; they mainly comprise mathematical morphology methods, matched filtering methods, multi-scale-based methods, region-growing-based methods, and the like. Their advantage is that no manually labeled dataset is needed. However, such methods have two major drawbacks. First, their segmentation accuracy is relatively low owing to the lack of global context information.
Second, such methods rely on manually designed feature extractors, and retinal images contain complex and diverse backgrounds, such as infection lesions, which often lead to false positives or false negatives in the segmentation results. Compared with supervised methods, unsupervised methods have poor robustness and limited performance. In recent years, supervised deep learning has become the mainstream approach to retinal vessel segmentation. In particular, given the outstanding performance of the U-Net deep learning network in medical image segmentation tasks, it is widely applied to retinal vessel segmentation. Compared with traditional unsupervised methods, U-Net-based methods automatically learn complex features and improve segmentation accuracy. The network fuses low-level and high-level features through a skip-connection mechanism to counter the spatial information loss caused by downsampling in a deep convolutional neural network; however, owing to the semantic gap between features, some spatial information from the shallow encoder stages is difficult to recover in the decoder, and U-Net also has limitations when handling small and irregular retinal vascular structures, limited labeled data, and similar problems. In recent years, researchers have further improved retinal vessel segmentation performance on the basis of U-Net. For example, the SD-UNet network model alleviates network overfitting by introducing DropBlock structures into the U-Net architecture. SA-UNet adds a batch normalization (Batch Normalization, BN) layer to the convolution block of SD-UNet and improves feature extraction through a spatial attention mechanism.
DRNet uses dense connections to improve the skip connections of U-Net; by combining a residual structure with DropBlock, it reduces the semantic gap between encoder and decoder, increases network depth, and alleviates overfitting. Although these methods further improve retinal vessel segmentation on a U-Net basis, they are mainly directed at the overfitting problem, ignoring the structural features of retinal vessels, especially tiny vessels. For the curved structure of retinal blood vessels, CS2-Net uses 1×3 and 3×1 convolutions to extract vessel morphology features in both directions, highlighting the region of interest through channel and spatial attention mechanisms. To address discontinuities in vessel segmentation results, FR-UNet expands in the horizontal and vertical directions through a multi-resolution convolution interaction mechanism while maintaining full image resolution, and then extracts thin-vessel pixels with a double-threshold iterative algorithm to improve vessel connectivity. To address the spatial information loss caused by successive convolution and pooling operations, the CE-Net utilizes dens