CN-120598971-B - Retina blood vessel segmentation method and system based on automatic coding-decoding network optimized by discrete binary particle swarm
Abstract
The invention discloses a retinal blood vessel segmentation method and system based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization. The segmentation method comprises: constructing a retinal blood vessel data set; obtaining a lightweight U-shaped neural network model; obtaining an optimal U-shaped neural network model; and obtaining a segmentation result of a retinal blood vessel image through the optimal U-shaped neural network model. The neural network structure is automatically searched and optimized by the discrete binary particle swarm optimization algorithm, so that the network architecture can be adjusted automatically according to task requirements. In addition, by introducing an FPN Attention Block and combining it with attention mechanisms such as ECA-Net and CBAM, the attention mechanism is configured flexibly according to a selection factor; the encoder output is weighted and then fused with the decoder, so that important local features are effectively focused on and the segmentation accuracy for small blood vessels and complex structures is improved.
Inventors
- FAN XIAO
- YAO JUNJIE
- ZHU GUIJIE
- ZENG XITAO
- YIN YUNZHE
- DU YUFENG
- ZHUANG JIAFAN
- LI WENJI
- LI YUN
Assignees
- 电子科技大学(深圳)高等研究院
Dates
- Publication Date
- 20260512
- Application Date
- 20250606
Claims (8)
- 1. A retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization, characterized by comprising the following steps: Step S11, constructing a retinal blood vessel data set and dividing it into a training set, a validation set and a test set; Step S12, after designing the architecture search space of a U-shaped neural network, searching the internal structures and spatial architectures of the different modules of the U-shaped neural network by a discrete binary particle swarm algorithm, and training and validating with the training set and the validation set to obtain a lightweight U-shaped neural network model; wherein the architecture search space comprises a first coding region, a second coding region and a third coding region; the first coding region comprises a two-bit binary number representing the layer-number gene; the second coding region comprises a seven-bit binary number, in which bits 1 to 3 represent the operation gene, bits 4 and 5 represent the position of the attention factor, bit 6 represents the selected attention factor, and bit 7 represents the residual connection gene; the third coding region comprises a six-bit binary number, in which bit 1 represents the connection relationship between intermediate nodes (n1, n2), bits 2 and 3 represent the connection relationships between intermediate nodes (n1, n3) and (n2, n3), and bits 4, 5 and 6 represent the connection relationships between intermediate nodes (n1, n4), (n2, n4) and (n3, n4); Step S13, testing the performance of the lightweight U-shaped neural network model on the test set to obtain an optimal U-shaped neural network model; and Step S14, segmenting an input retinal blood vessel image with the optimal U-shaped neural network model and displaying the segmentation result of the retinal blood vessel image.
- 2. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 1, wherein in step S12, in the first coding region, "00" represents two layers, "01" represents three layers, "10" represents four layers, and "11" represents five layers.
- 3. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 1, wherein in step S12, for bits 1 to 3 of the second coding region, "000" represents the basic operation sequence "3x3 convolution + ReLU activation function", "001" represents the basic operation sequence "3x3 convolution + batch normalization (BN) + ReLU activation function", "010" represents the basic operation sequence "ReLU activation function + 3x3 convolution", "011" represents the basic operation sequence "batch normalization (BN) + ReLU activation function + 3x3 convolution", "100" represents the basic operation sequence "3x3 convolution + instance normalization (IN) + ReLU activation function", "101" represents the basic operation sequence "3x3 convolution + instance normalization (IN) + Mish activation function", "110" represents the basic operation sequence "instance normalization (IN) + ReLU activation function + 3x3 convolution", and "111" represents the basic operation sequence "instance normalization (IN) + Mish activation function + 3x3 convolution", wherein the number of convolutions is 10; for bits 4 and 5, "00" represents intermediate node n1, "01" represents intermediate node n2, "10" represents intermediate node n3, and "11" represents intermediate node n4; for bit 6, "0" indicates that ECA-Net is used as the attention mechanism and "1" indicates that CBAM is used as the attention mechanism; and for bit 7, "0" indicates that no residual connection is employed and "1" indicates that a residual connection is employed.
- 4. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 1, wherein in step S12, for each bit of the third coding region, "0" indicates that the two corresponding intermediate nodes are not connected and "1" indicates that they are connected.
- 5. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 1, wherein a feature pyramid attention module is employed between the encoder output and the decoder input of the U-shaped neural network to enhance the ability of the U-shaped neural network to fuse global features and local features, comprising the steps of: extracting feature maps of different scales from the input feature map with a feature pyramid network, and performing a first upsampling on the feature maps of different scales to bring them to the same spatial size, so that they are concatenated along the channel dimension to form a multi-scale feature map; performing a second upsampling on the multi-scale feature map to align it with the input feature map in the spatial dimensions, and fusing the multi-scale feature map with the input feature map to obtain a fused feature map; capturing the global semantic information of the global features in the fused feature map with a convolution layer, and at the same time weighting the local features in the fused feature map with an adaptive pooling layer, so as to calculate the attention weights corresponding to the global features and the local features respectively; and weighting the feature value at each position of the input feature map according to the attention weights corresponding to the global features and the local features, so as to output a weighted feature map.
- 6. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 5, wherein the features of the encoder and the decoder are fused by element-wise addition.
- 7. The retinal blood vessel segmentation method based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization according to claim 1, wherein, when optimizing with the discrete binary particle swarm algorithm, the velocity $V_i$ of the i-th particle represents the speed and direction of movement of the particle in each dimension and is defined as: $V_i^{t+1} = w V_i^{t} + c_1 r_1 (P_i^{*} - X_i^{t}) + c_2 r_2 (G^{*} - X_i^{t})$, wherein $t$ is the iteration number, $w$ is the inertia weight, $X_i^{t}$ is the position of the i-th particle, $P_i^{*}$ is the locally optimal position of the i-th particle, $G^{*}$ is the optimal position of the whole swarm, $r_1$ and $r_2$ are random numbers drawn from the uniform distribution U(0, 1), and $c_1$ and $c_2$ are learning factors.
- 8. A retinal blood vessel segmentation system based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization, comprising: a construction and division unit for constructing a retinal blood vessel data set and dividing it into a training set, a validation set and a test set; a training and validation unit for, after the architecture search space of a U-shaped neural network is designed, searching the internal structures and spatial architectures of the different modules of the U-shaped neural network by a discrete binary particle swarm algorithm, and training and validating with the training set and the validation set to obtain a lightweight U-shaped neural network model; wherein the architecture search space comprises a first coding region, a second coding region and a third coding region; the first coding region comprises a two-bit binary number representing the layer-number gene; the second coding region comprises a seven-bit binary number, in which bits 1 to 3 represent the operation gene, bits 4 and 5 represent the position of the attention factor, bit 6 represents the selected attention factor, and bit 7 represents the residual connection gene; the third coding region comprises a six-bit binary number, in which bit 1 represents the connection relationship between intermediate nodes (n1, n2), bits 2 and 3 represent the connection relationships between intermediate nodes (n1, n3) and (n2, n3), and bits 4, 5 and 6 represent the connection relationships between intermediate nodes (n1, n4), (n2, n4) and (n3, n4); a test unit for testing the performance of the lightweight U-shaped neural network model on the test set to obtain an optimal U-shaped neural network model; and a segmentation unit for segmenting an input retinal blood vessel image with the optimal U-shaped neural network model and displaying the segmentation result of the retinal blood vessel image.
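The coding scheme of claims 1 to 4 and the particle velocity of claim 7 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several points are assumptions made only for illustration: that a particle is the 15-bit concatenation of the first (2 bits), second (7 bits) and third (6 bits) coding regions in that order; that the first region encodes a layer count of 2 to 5; that the operation codes "001" and "110" take the second and seventh listed operation sequences; that the velocity follows the standard particle swarm form; that bits are resampled through a sigmoid transfer function with a velocity clamp, as in the common binary PSO formulation; and the values chosen for w, c1, c2 and V_MAX. The names `decode` and `bpso_step` are hypothetical.

```python
import math
import random

# Lookup tables following claims 2 to 4 (codes "001" and "110" are inferred).
LAYER_COUNTS = {"00": 2, "01": 3, "10": 4, "11": 5}
OPERATIONS = {
    "000": "conv3x3 -> ReLU",
    "001": "conv3x3 -> BN -> ReLU",
    "010": "ReLU -> conv3x3",
    "011": "BN -> ReLU -> conv3x3",
    "100": "conv3x3 -> IN -> ReLU",
    "101": "conv3x3 -> IN -> Mish",
    "110": "IN -> ReLU -> conv3x3",
    "111": "IN -> Mish -> conv3x3",
}
ATTENTION_NODES = {"00": "n1", "01": "n2", "10": "n3", "11": "n4"}
# Bit order of the third coding region as listed in claim 1.
EDGES = [("n1", "n2"), ("n1", "n3"), ("n2", "n3"),
         ("n1", "n4"), ("n2", "n4"), ("n3", "n4")]
V_MAX = 6.0  # assumed velocity clamp, not stated in the source


def decode(bits):
    """Decode an assumed 15-bit particle (2 + 7 + 6 bits) into a description."""
    if len(bits) != 15 or set(bits) - {"0", "1"}:
        raise ValueError("expected a 15-character string of 0s and 1s")
    first, second, third = bits[:2], bits[2:9], bits[9:]
    return {
        "layers": LAYER_COUNTS[first],
        "operation": OPERATIONS[second[:3]],
        "attention_node": ATTENTION_NODES[second[3:5]],
        "attention": "CBAM" if second[5] == "1" else "ECA-Net",
        "residual": second[6] == "1",
        "edges": [edge for edge, bit in zip(EDGES, third) if bit == "1"],
    }


def bpso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One binary-PSO update of position x (list of 0/1) and velocity v."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Standard PSO velocity update, applied per dimension.
        vi = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        vi = max(-V_MAX, min(V_MAX, vi))
        # Sigmoid transfer: the velocity sets the probability of the bit being 1.
        prob_one = 1.0 / (1.0 + math.exp(-vi))
        new_v.append(vi)
        new_x.append(1 if rng.random() < prob_one else 0)
    return new_x, new_v


# Example: decode one particle, then move an all-zero particle one step
# towards an all-one personal and global best.
print(decode("101011011110101"))
position, velocity = bpso_step([0] * 15, [0.0] * 15, [1] * 15, [1] * 15,
                               rng=random.Random(0))
print("".join(str(bit) for bit in position))
```

In a full search loop, each decoded description would be turned into a U-shaped network, trained on the training set and scored on the validation set, and that score would serve as the particle's fitness when updating the personal and global best positions.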
Description
Retina blood vessel segmentation method and system based on automatic coding-decoding network optimized by discrete binary particle swarm
Technical Field
The invention relates to the technical field of image processing, and in particular to a retinal blood vessel segmentation method and system based on an automatic encoding-decoding network optimized by discrete binary particle swarm optimization.
Background
NSMD-NAS (non-downsampling module and neural architecture search) is an automated retinal vessel segmentation model that aims to improve the efficiency and accuracy of diabetic retinopathy (DR) screening. The method combines multi-scale, multi-frequency and multi-directional feature extraction, builds a feature extraction layer from a non-downsampling filter (NSMD) together with convolution and transform modules, and automatically optimizes the network structure to adapt to the complex structure of retinal images. Specifically, under a fixed-length genotype coding strategy, the model optimizes 14 key parameters such as the convolution layer type, the number of filters and the pooling mode, and uses a genetic algorithm (GA) to search for the network structure with the highest fitness. The resulting network structure gives more accurate vessel segmentation, in particular for tiny vessels and complex backgrounds. In the model structure, the NSMD module is fused with a CNN (convolutional neural network) to extract boundary information at different scales, frequencies and directions, and normalization keeps the features stable before they are input into the network. Combining the TUnet base network with a multi-head attention mechanism further improves the modeling of local and global information.
Experimental results show that the segmentation performance of the NSMD-NAS model on the DRIVE, STARE and CHASE_DB1 data sets is superior to that of other currently popular retinal blood vessel segmentation models, that its segmentation accuracy on small blood vessels and complex regions is particularly good, and that it has strong robustness and generalization capability. However, the existing NSMD-NAS model suffers from the following disadvantages:
1. Stability of the optimization procedure
The genetic algorithm (GA) is an optimization method based on natural selection, and its search process is highly random. The GA generates multiple candidate solutions through selection, crossover and mutation operations and evaluates them according to fitness. However, the optimization result of a GA often depends on the diversity of the initial population and on the setting of the search parameters. Different initializations may drive the optimization in different directions and may even trap it in a locally optimal solution. The model may therefore behave differently from one experiment to another, which makes the resulting network architecture uncertain to some degree and poses a challenge to the stability and consistency required in clinical applications.
2. Adaptability for real-time applications
The NSMD-NAS model requires substantial computational resources because of its complex architecture and parameter optimization process. Rapid feedback and real-time processing are critical in practical clinical diagnosis. Although the model reaches high accuracy in the training phase, the inference phase (i.e., image segmentation in actual use) may still take a long time to compute, especially on devices with constrained hardware resources. If model inference is slow, results cannot be provided in real time or quickly in clinical practice, which reduces the practical value of the model as an auxiliary tool.
3. Model complexity
The NSMD-NAS model incorporates multiple complex modules, such as non-downsampling modules (NSMD), convolutional layers and transform modules, which are nested and work in concert with each other, increasing the number of parameters and layers of the network. Furthermore, the use of a genetic algorithm (GA) for the network architecture search further increases the complexity of the model. The design and tuning of each module must take various factors into account, such as multi-scale, multi-frequency and multi-directional information fusion, which makes the overall network architecture very complex. This highly complex model structure not only increases training time but also makes the inference process slower and harder to optimize. The complexity also makes parameter tuning more difficult, requiring extensive experimentation and adjustment to find the optimal network structure. Furthermore, complex architectures consume more computing resources, which may limit their application in resource constrained environments, particularly in clinical real-time diagnostics, where a relationship between accuracy and computational efficiency is requ