CN-122023128-A - OCT retina image super-resolution reconstruction method and system
Abstract
The invention belongs to the technical field of medical image processing and relates in particular to an OCT retinal image super-resolution reconstruction method and system. The method comprises: acquiring an original OCT retinal image; inputting it into a network model comprising a multi-layer encoder and a multi-layer decoder; in each encoder layer, performing global context modeling of features with a multi-head self-attention block, then performing depthwise separable convolution and feature fusion with a parallel-activation feed-forward network to obtain the output features of that encoder layer; and concatenating the output features of the corresponding encoder layer with the output features of the decoder to complete reconstruction of the original OCT retinal image. By jointly exploiting the ability of the attention mechanism to model global context and the ability of the feed-forward network to strengthen local nonlinear transformation, the invention achieves high-quality, high-fidelity super-resolution reconstruction of OCT images.
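The parallel-activation feed-forward step named in the abstract can be pictured with a small pure-Python sketch. It follows the order of operations given in claim 7 (channel expansion by a 1x1 convolution, GELU, an even channel split, one convolution per branch, GELU again, element-wise sum). Everything else is an assumption made for illustration: the function names, the 3x3 kernel size, the zero padding, and the use of a depthwise convolution only (the pointwise half of a depthwise separable convolution is left out for brevity).

```python
import math

def gelu(x):
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def apply_elementwise(feat, fn):
    """Apply fn to every value of a [C][H][W] nested-list feature map."""
    return [[[fn(v) for v in row] for row in ch] for ch in feat]

def pointwise_conv(feat, weight):
    """1x1 convolution: mixes channels at each pixel. weight is [C_out][C_in]."""
    h, w = len(feat[0]), len(feat[0][0])
    return [[[sum(weight[o][c] * feat[c][i][j] for c in range(len(feat)))
              for j in range(w)] for i in range(h)]
            for o in range(len(weight))]

def depthwise_conv3x3(feat, kernels):
    """3x3 depthwise convolution with zero padding: one kernel per channel."""
    h, w = len(feat[0]), len(feat[0][0])
    out = []
    for ch, k in zip(feat, kernels):
        plane = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        y, x = i + di, j + dj
                        if 0 <= y < h and 0 <= x < w:
                            acc += k[di + 1][dj + 1] * ch[y][x]
                plane[i][j] = acc
        out.append(plane)
    return out

def parallel_activation_ffn(feat, w_expand, kernels_a, kernels_b):
    """Expand, activate, split into two branches, convolve and activate each, then add."""
    expanded = apply_elementwise(pointwise_conv(feat, w_expand), gelu)
    half = len(expanded) // 2
    branch_a = apply_elementwise(depthwise_conv3x3(expanded[:half], kernels_a), gelu)
    branch_b = apply_elementwise(depthwise_conv3x3(expanded[half:], kernels_b), gelu)
    return [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(branch_a, branch_b)]

# Hypothetical toy input: 2 channels of 2x2 pixels, expanded to 4 channels.
feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, -1.0], [0.0, 2.0]]]
w_expand = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
ident = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # identity kernel
out = parallel_activation_ffn(feat, w_expand, [ident, ident], [ident, ident])
```

With an expansion factor of 2, the summed output has the same number of channels as the input, which is what allows the residual connection described in claim 8 to be added directly.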
Inventors
- SONG WEIYE
- LIAO QI
- WANG YUKUAN
- WEI HUA
- CUI YUAN
Assignees
- Shandong University (山东大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260413
Claims (15)
- 1. An OCT retinal image super-resolution reconstruction method, characterized by comprising the following steps: acquiring an original OCT retinal image; inputting the original OCT retinal image into a network model comprising a multi-layer encoder and a multi-layer decoder, wherein, in a plurality of cascaded feature enhancement modules in each encoder layer, global context modeling of features is performed using an enhanced multi-head self-attention block, and depthwise separable convolution and feature fusion are then performed using a parallel-activation feed-forward network to obtain the output features of each encoder layer; and concatenating the output features of the corresponding encoder layer with the output features of the decoder to complete reconstruction of the original OCT retinal image and obtain a reconstructed retinal image.
- 2. The OCT retinal image super-resolution reconstruction method of claim 1, wherein performing global context modeling of features using the enhanced multi-head self-attention block specifically comprises: performing a convolution operation on the input of the enhanced multi-head self-attention block to extract local features and generate a joint feature tensor; dividing the joint feature tensor evenly into three parts along the channel dimension to obtain an initial query feature, an initial key feature and an initial value feature; applying two depthwise separable convolutions to each of the initial query feature, the initial key feature and the initial value feature to obtain an enhanced query feature, an enhanced key feature and an enhanced value feature; and performing self-attention computation on the enhanced query feature, the enhanced key feature and the enhanced value feature to obtain a global enhanced feature.
- 3. The OCT retinal image super-resolution reconstruction method of claim 2, wherein performing self-attention computation on the enhanced query feature, the enhanced key feature and the enhanced value feature to obtain the global enhanced feature comprises: computing an attention distribution by applying a dot-product operation and a Softmax operation to the enhanced query feature and the enhanced key feature; and computing the global enhanced feature by combining the enhanced value feature with the attention distribution.
- 4. The OCT retinal image super-resolution reconstruction method of claim 2, wherein, in applying the two depthwise separable convolutions to each of the initial query feature, the initial key feature and the initial value feature: channel mixing is performed by the first depthwise separable convolution; and spatial features are further enhanced by the second depthwise separable convolution.
- 5. The OCT retinal image super-resolution reconstruction method of claim 2, further comprising, after obtaining the global enhanced feature: applying a simple gating mechanism in which the global enhanced feature is divided equally into two parts along the channel dimension and the two parts are multiplied element by element to obtain the output of the simple gating mechanism; further enhancing the feature representation of the output of the simple gating mechanism by channel attention weighting to obtain an enhanced feature; restoring the enhanced feature to the original dimensions to obtain the output of the enhanced multi-head self-attention block; and adding the output of the enhanced multi-head self-attention block to the input of the enhanced multi-head self-attention block through a residual connection and applying layer normalization to obtain the input of the parallel-activation feed-forward network.
- 6. The OCT retinal image super-resolution reconstruction method of claim 5, wherein further enhancing the feature representation of the output of the simple gating mechanism by channel attention weighting to obtain the enhanced feature comprises: in the channel attention weighting, acquiring channel statistics through global average pooling and generating channel weights through a convolution operation; and multiplying each channel weight by the features of the corresponding channel to obtain the enhanced feature.
- 7. The OCT retinal image super-resolution reconstruction method of claim 5, wherein performing depthwise separable convolution and feature fusion using the parallel-activation feed-forward network specifically comprises: performing a convolution operation on the input of the parallel-activation feed-forward network to expand the channels and obtain expanded features; applying a GELU activation function to the expanded features to obtain activated features; dividing the activated features evenly into two parts along the channel dimension and applying an independent depthwise separable convolution to each of the two corresponding branches to obtain the branch features of the two branches; and applying a GELU activation function to the branch features of each branch and adding the two branch features element by element to obtain the output of the parallel-activation feed-forward network.
- 8. The OCT retinal image super-resolution reconstruction method of claim 7, further comprising: after the output of the parallel-activation feed-forward network is obtained, combining it with the input of the parallel-activation feed-forward network through a residual connection to complete the computation of one feature enhancement module; and, for the plurality of cascaded feature enhancement modules in each encoder layer, repeating this computation for each feature enhancement module in turn to obtain the features of each encoder layer.
- 9. The OCT retinal image super-resolution reconstruction method of claim 1, further comprising: inputting the output features of the last encoder layer into the last decoder layer, and performing a decoding operation using a plurality of cascaded feature enhancement modules in the decoder to obtain the output features of the last decoder layer; concatenating the output features of the last decoder layer with the output features of the last encoder layer to obtain the input features of the last decoder layer; and finally outputting the reconstructed retinal image via the first decoder layer.
- 10. The OCT retinal image super-resolution reconstruction method according to claim 1, wherein the network model comprising the multi-layer encoder and the multi-layer decoder is trained using a composite loss function, specifically: $L_{total} = \lambda_1 L_1 + \lambda_2 L_{SSIM}$; wherein $L_{total}$ is the total loss function; $L_1$ is the L1 loss; $L_{SSIM}$ is the structural similarity loss; and $\lambda_1$ and $\lambda_2$ are loss weight coefficients satisfying [constraint not legible in the source].
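- 11. The OCT retinal image super-resolution reconstruction method of claim 10, wherein the structural similarity loss is calculated as: $L_{SSIM} = 1 - \mathrm{SSIM}(\hat{y}, y)$; wherein $\mathrm{SSIM}$ is the structural similarity index; $\hat{y}$ denotes the predicted image; $y$ denotes the real image; and the structural similarity index is calculated as: $\mathrm{SSIM}(\hat{y}, y) = \frac{(2\mu_{\hat{y}}\mu_{y} + C_1)(2\sigma_{\hat{y}y} + C_2)}{(\mu_{\hat{y}}^2 + \mu_{y}^2 + C_1)(\sigma_{\hat{y}}^2 + \sigma_{y}^2 + C_2)}$; wherein $\mu_{\hat{y}}$ and $\mu_{y}$ are the means of the predicted image and the real image respectively, $\sigma_{\hat{y}}^2$ and $\sigma_{y}^2$ are the variances of the predicted image and the real image respectively, $\sigma_{\hat{y}y}$ is the covariance of the predicted image and the real image, and $C_1$ and $C_2$ are stability constants.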
- 12. The OCT retinal image super-resolution reconstruction method of claim 1, wherein training the network model comprising the multi-layer encoder and the multi-layer decoder comprises: constructing corresponding low-resolution/high-resolution OCT retinal image pairs; preprocessing the OCT retinal image pairs, including cropping, data augmentation and normalization; and using the preprocessed low-resolution/high-resolution OCT retinal image pairs as the input of network training and as the training supervision target.
- 13. An OCT retinal image super-resolution reconstruction system, characterized by comprising: an original image acquisition module configured to acquire an original OCT retinal image; a feature encoding module configured to input the original OCT retinal image into a network model comprising a multi-layer encoder and a multi-layer decoder and, in a plurality of cascaded feature enhancement modules in each encoder layer, to perform global context modeling of features using an enhanced multi-head self-attention block and then perform depthwise separable convolution and feature fusion using a parallel-activation feed-forward network to obtain the output features of each encoder layer; and a decoding reconstruction module configured to concatenate the output features of the corresponding encoder layer with the output features of the decoder to complete reconstruction of the original OCT retinal image and obtain a reconstructed retinal image.
- 14. A computer readable storage medium, having stored thereon a program, which when executed by a processor, implements the steps in the OCT retinal image super-resolution reconstruction method according to any one of claims 1 to 12.
- 15. An electronic device comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor performs the steps in the OCT retinal image super-resolution reconstruction method according to any one of claims 1-12 when the program is executed.
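The training loss of claims 10 and 11 can be tried out numerically with the short sketch below. It is an illustration under stated assumptions, not the claimed implementation: the loss is taken to be a weighted sum of an L1 term and a term equal to one minus SSIM, SSIM is computed from whole-image statistics rather than over sliding windows, the stability constants use the conventional values (0.01)^2 and (0.03)^2 for pixel values in [0, 1], and the equal weights of 0.5 are hypothetical placeholders because the claimed constraint on the weights is not available here.

```python
def l1_loss(pred, target):
    """Mean absolute error over flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def ssim_global(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM from global means, variances and covariance (single window)."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    numerator = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator

def total_loss(pred, target, w_l1=0.5, w_ssim=0.5):
    """Weighted sum of the L1 loss and (1 - SSIM); weights are assumed values."""
    return w_l1 * l1_loss(pred, target) + w_ssim * (1.0 - ssim_global(pred, target))

# Hypothetical 4-pixel "images" with values in [0, 1].
clean = [0.1, 0.4, 0.6, 0.9]
noisy = [0.15, 0.35, 0.65, 0.85]
print(total_loss(clean, clean))  # loss of a perfect reconstruction
print(total_loss(noisy, clean))  # loss of a slightly perturbed reconstruction
```

The L1 term penalizes per-pixel intensity error, while the SSIM term reacts to changes in contrast and structure, which is the usual motivation for combining the two in image restoration.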
Description
OCT retinal image super-resolution reconstruction method and system
Technical Field
The invention belongs to the technical field of medical image processing and relates in particular to an OCT retinal image super-resolution reconstruction method and system.
Background
Optical coherence tomography (OCT) is a non-invasive, high-resolution imaging technology based on the principle of optical coherence. It acquires high-resolution cross-sectional images of tissue without touching the sample, reveals the internal structure of biological tissue at micron-level resolution, and is of great significance for diagnosing a variety of retinal diseases. Thanks to its high resolution, real-time imaging and non-invasive detection, OCT has become an indispensable imaging tool in ophthalmic diagnosis, treatment evaluation and research. In the clinical detection of retinal diseases such as diabetic retinopathy, age-related macular degeneration and glaucoma in particular, OCT images clearly show changes in each retinal layer and provide a key basis for early diagnosis and for monitoring the course of disease.
Conventional high-resolution OCT imaging systems typically rely on expensive hardware and complex optical paths, which limits their widespread use. In recent years, deep-learning-based image super-resolution has offered a new way to improve the imaging quality of low-cost, low-resolution OCT devices. Existing methods are mainly based on convolutional neural networks (CNNs) or generative adversarial networks (GANs) and improve the visual quality of images to some extent.
However, for medical images with complex layered structures and fine textures, such as OCT retinal images, existing methods still have significant limitations: (1) conventional convolutional neural networks have difficulty modeling long-range dependencies in an image, so important global structural information is easily lost during reconstruction; (2) most methods rely on a single mechanism for feature extraction and fusion, so they recover high-frequency details poorly. The problem is particularly pronounced in images acquired with low-cost OCT equipment, for example equipment with narrow-bandwidth light sources, which limits their application in accurate clinical diagnosis.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides an OCT retinal image super-resolution reconstruction method and system. It jointly exploits the ability of an attention mechanism to model global context and the ability of a feed-forward network to strengthen local nonlinear transformation, and constructs a novel network architecture that deeply fuses attention-driven global relationship modeling with an efficient parallel mechanism for local feature enhancement. This achieves high-quality, high-fidelity super-resolution reconstruction of OCT retinal images and significantly improves their resolution and diagnostic value.
To achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
The first aspect of the invention provides an OCT retinal image super-resolution reconstruction method.
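As a rough illustration of the attention half of the design described in the disclosure above, the sketch below shows how dot-product attention followed by Softmax lets every position draw on every other position, which is the sense in which it models global context. It is plain single-head attention over a short list of token vectors, not the invention's enhanced multi-head block: the function names are hypothetical, the convolutions applied to the query, key and value features are omitted, and the 1/sqrt(d) scaling is an assumption borrowed from standard practice.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(q, k, v):
    """q, k, v: lists of N token vectors. Returns N attended value vectors."""
    scale = 1.0 / math.sqrt(len(q[0]))  # assumed scaling factor
    out = []
    for qi in q:
        # Dot product of this query with every key, then Softmax.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        # Weighted sum of all value vectors: each output sees every position.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

# Zero queries and keys give uniform attention, so each output is the mean of v.
q = [[0.0, 0.0], [0.0, 0.0]]
k = [[0.0, 0.0], [0.0, 0.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(self_attention(q, k, v))  # [[2.0, 3.0], [2.0, 3.0]]
```

Because each output row mixes all value vectors, a single attention step already has an image-wide receptive field, whereas a convolution needs many stacked layers to reach the same span.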
The OCT retinal image super-resolution reconstruction method comprises the following steps:
acquiring an original OCT retinal image;
inputting the original OCT retinal image into a network model comprising a multi-layer encoder and a multi-layer decoder, wherein, in a plurality of cascaded feature enhancement modules in each encoder layer, global context modeling of features is performed using an enhanced multi-head self-attention block, and depthwise separable convolution and feature fusion are then performed using a parallel-activation feed-forward network to obtain the output features of each encoder layer; and
concatenating the output features of the corresponding encoder layer with the output features of the decoder to complete reconstruction of the original OCT retinal image and obtain a reconstructed retinal image.
The second aspect of the invention provides an OCT retinal image super-resolution reconstruction system.
The OCT retinal image super-resolution reconstruction system comprises:
an original image acquisition module configured to acquire an original OCT retinal image;
a feature encoding module configured to input the original OCT retinal image into a network model comprising a multi-layer encoder and a multi-layer decoder and, in a plurality of cascaded feature enhancement modules in each encoder layer, to perform global context modeling of features using an enhanced multi-head self-attention block and then perform depthwise separable convolution and feature fusion using a parallel-activation feed-forward network to obtain o