CN-121999005-A - Endoscope imaging optimization method
Abstract
The application discloses an endoscope imaging optimization method, belonging to the technical field of medical optical imaging and image processing. The method first obtains a super-structured lens distorted image, a high-definition reference image, and a semantic segmentation label; extracts multi-scale initial features through a multi-branch encoder; performs feature enhancement and adaptive downsampling with a dynamic super-surface feature extraction module; constructs robust features through a dynamic parameter generation module in combination with the high-definition image; extracts contour guidance features from the high-definition image and fuses them with the robust features; and finally executes image segmentation and restoration tasks in parallel, outputting a segmentation mask and an undistorted image. By fusing physical priors with data-driven learning, the method effectively alleviates chromatic aberration, information dispersion, and related defects of super-structured lens imaging, and markedly improves image sharpness, small-lesion recognition accuracy, and model generalization.
Inventors
- Ju Fayin
- Li Ning
Assignees
- 浙江优众新材料科技有限公司
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-04-07
Claims (10)
- 1. An endoscope imaging optimization method, comprising: obtaining a distorted image generated by the super-structured lens and a corresponding high-definition reference image; extracting features of the distorted image to obtain initial image features; performing adaptive downsampling on the initial image features to obtain downsampled features; generating robust image features based on the high-definition reference image and the downsampled features; extracting contour guidance features from the high-definition reference image, and fusing the contour guidance features with the robust image features to obtain correction features; and, based on the correction features, executing an image segmentation task and an image restoration task in parallel to output a segmentation result and an undistorted restored image, respectively.
- 2. The method of claim 1, wherein the step of extracting features of the distorted image is performed by an encoder, and wherein an initial layer of the encoder comprises a multi-branch convolution structure including at least two branches having convolution kernels of different sizes for extracting image features of different scales.
- 3. The method of claim 2, wherein the multi-branch convolution structure comprises: a first branch having a first-size convolution kernel for extracting detail features; a second branch having a second-size convolution kernel, larger than the first size, for extracting contour features; and a third branch having a third-size convolution kernel, larger than the second size, for extracting contextual features.
- 4. The method of claim 1, wherein the step of adaptively downsampling the initial image features is performed by a dynamic feature extraction and downsampling module comprising: a dynamic feature extraction sub-module for enhancing the input image features; and a dynamic downsampling sub-module for downsampling the enhanced image features based on importance weights.
- 5. The method of claim 4, wherein the dynamic feature extraction sub-module comprises: a feature splitting unit for splitting the input features into first-path features and second-path features; a feature enhancement unit for applying an enhancement transform, including dilated (hole) convolution and channel attention processing, to the second-path features; and a feature fusion unit for fusing the enhanced second-path features with the first-path features.
- 6. The method of claim 5, wherein the dilated convolution in the feature enhancement unit employs a re-parameterizable structure, wherein multiple convolution branches with different dilation rates operate in parallel during a training phase, and wherein the parameters of the multiple convolution branches are merged into a single convolution during an inference phase.
- 7. The method of claim 4, wherein the dynamic downsampling sub-module comprises: a weight prediction unit for predicting an importance weight for each region of pixels in the input feature map; a downsampling unit for downsampling the input feature map; and a weighted fusion unit for weighting the downsampled features by the importance weights and outputting the adaptively downsampled features.
- 8. The method of claim 1, wherein generating robust image features based on the high-definition reference image and the downsampled features comprises: constructing a pixel-level importance mask from the similarity between the high-definition reference image and the downsampled features; aggregating the downsampled features weighted by the pixel-level importance mask; and mapping the aggregated features to dynamic convolution parameters that are applied to the downsampled features to generate the robust image features.
- 9. The method of claim 1, wherein extracting contour guidance features from the high-definition reference image comprises: extracting multi-scale context features from the high-definition reference image; and generating the contour guidance features based on the multi-scale context features.
- 10. The method of claim 1, wherein performing the image segmentation task and the image restoration task in parallel comprises: inputting the correction features into a segmentation decoder, which outputs the segmentation result by upsampling and combining with intermediate features of the encoder; and inputting the correction features into a restoration decoder, which refines them through a network structure comprising convolution layers, residual blocks, and an attention mechanism, and outputs the restored image.
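Claims 4 and 7 describe a downsampling step driven by predicted importance weights. As an illustration only, here is a minimal numpy sketch of such importance-weighted pooling; the function name, the stride-2 window, and the plain weighted-average reduction are assumptions for illustration, not details from the patent:

```python
import numpy as np

def adaptive_downsample(feat, weights, stride=2):
    """Importance-weighted downsampling in the spirit of claims 4 and 7.

    feat    : (H, W, C) feature map
    weights : (H, W) non-negative importance weights, assumed to come
              from a separate weight-prediction unit
    Each stride x stride window is reduced to a weighted average of its
    pixels, so regions the predictor marks as important dominate the
    pooled feature.
    """
    H, W, C = feat.shape
    h, w = H // stride, W // stride
    out = np.zeros((h, w, C))
    for i in range(h):
        for j in range(w):
            win = feat[i * stride:(i + 1) * stride,
                       j * stride:(j + 1) * stride]      # (s, s, C)
            wgt = weights[i * stride:(i + 1) * stride,
                          j * stride:(j + 1) * stride]
            wgt = wgt / (wgt.sum() + 1e-8)               # normalize per window
            out[i, j] = (win * wgt[..., None]).sum(axis=(0, 1))
    return out
```

With uniform weights this reduces to ordinary average pooling; concentrating all weight on one pixel makes each window collapse onto that pixel's feature.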
Description
Endoscope imaging optimization method
Technical Field
The application relates to the technical field of medical optical imaging and image processing, and in particular to an endoscope imaging optimization method.
Background
In recent years, with the development of super-structured lens technology, such lenses have shown great potential in miniature imaging systems such as medical endoscopes. However, because a super-structured lens regulates the optical wavefront with a nanoscale meta-atom array, its phase distribution is wavelength-dependent, so optical defects such as uneven intensity, chromatic misalignment, and information dispersion often arise during imaging, seriously degrading image quality and the accuracy of subsequent clinical analysis. In the prior art, studies have attempted to correct super-structured lens imaging with convolutional neural networks, but the following problems are prevalent: 1. the receptive field of the feature extraction module is fixed, making it difficult to adapt to the dispersed, multi-scale distribution of super-surface information; 2. the correction process is usually purely data-driven and lacks fusion with an optical physical model, so the results lack physical consistency; 3. multiple tasks (such as image restoration and semantic segmentation) are usually performed independently, without feature sharing or collaborative optimization, which is inefficient and prone to losing associated information.
Disclosure of Invention
In order to solve the above problems, the application provides an endoscope imaging optimization method, aiming to resolve the mismatched features, poor physical consistency of correction results, and insufficient multi-task cooperation of the prior art, and to achieve high-quality restoration and accurate semantic segmentation of super-structured lens distorted images.
The first technical scheme adopted by the application provides an endoscope imaging optimization method comprising the following steps: obtaining a distorted image generated by the super-structured lens and a corresponding high-definition reference image; extracting features of the distorted image to obtain initial image features; performing adaptive downsampling on the initial image features to obtain downsampled features; generating robust image features based on the high-definition reference image and the downsampled features; extracting contour guidance features from the high-definition reference image and fusing them with the robust image features to obtain correction features; and, based on the correction features, executing an image segmentation task and an image restoration task in parallel to output a segmentation result and an undistorted restored image, respectively.
In an alternative embodiment, the step of feature extraction of the distorted image is implemented by an encoder whose initial layer comprises a multi-branch convolution structure with at least two branches having convolution kernels of different sizes for extracting image features of different scales.
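The multi-branch initial layer described above runs parallel convolutions with increasing kernel sizes and stacks the results channel-wise. The following is a minimal single-channel numpy sketch, using fixed box filters of sizes 3, 5, and 7 as stand-ins for the learned kernels; all names and kernel sizes are illustrative assumptions, not values from the patent:

```python
import numpy as np

def conv2d(img, kernel):
    """'Same'-padded 2-D correlation on a single-channel image."""
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(img, p)
    H, W = img.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (padded[i:i + k, j:j + k] * kernel).sum()
    return out

def multi_branch_features(img):
    """Three parallel branches with growing receptive fields
    (3x3 details, 5x5 contours, 7x7 context), stacked channel-wise.
    Box filters stand in for the encoder's learned kernels."""
    feats = []
    for k in (3, 5, 7):
        kernel = np.full((k, k), 1.0 / (k * k))
        feats.append(conv2d(img, kernel))
    return np.stack(feats, axis=-1)  # (H, W, 3)
```

The larger kernels blur over wider neighborhoods, which is the box-filter analogue of the coarser context branch described in the text.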
In an alternative embodiment, the multi-branch convolution structure includes: a first branch having a first-size convolution kernel for extracting detail features; a second branch having a second-size convolution kernel, larger than the first size, for extracting contour features; and a third branch having a third-size convolution kernel, larger than the second size, for extracting contextual features.
In an alternative embodiment, the step of adaptively downsampling the initial image features is implemented by a dynamic feature extraction and downsampling module, which includes: a dynamic feature extraction sub-module for enhancing the input image features; and a dynamic downsampling sub-module for downsampling the enhanced image features based on importance weights.
In an alternative embodiment, the dynamic feature extraction sub-module includes: a feature splitting unit for splitting the input features into first-path features and second-path features; a feature enhancement unit for applying an enhancement transform, including dilated (hole) convolution and channel attention processing, to the second-path features; and a feature fusion unit for fusing the enhanced second-path features with the first-path features.
In an alternative embodiment, the dilated convolution in the feature enhancement unit adopts a re-parameterizable structure: multiple convolution branches with different dilation rates operate in parallel during the training stage, and their parameters are merged into a single convolution during the inference stage.
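The re-parameterizable dilated convolution of the last embodiment (and claim 6) trains several branches with different dilation rates and folds them into one kernel for inference. Below is a minimal numpy sketch of the folding step under the assumption that the branch outputs are simply summed; bias terms and batch-norm folding, which a real re-parameterization would also handle, are omitted:

```python
import numpy as np

def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between taps: a 3x3 kernel at dilation
    rate 2 becomes an equivalent dense 5x5 kernel."""
    k = kernel.shape[0]
    size = k + (k - 1) * (rate - 1)
    dense = np.zeros((size, size))
    dense[::rate, ::rate] = kernel
    return dense

def merge_branches(kernels, rates):
    """Merge parallel dilated-convolution branches whose outputs are
    summed into one equivalent kernel: convolution is linear, so summing
    the branch outputs equals convolving once with the sum of the
    zero-expanded, center-aligned kernels."""
    size = max(k.shape[0] + (k.shape[0] - 1) * (r - 1)
               for k, r in zip(kernels, rates))
    merged = np.zeros((size, size))
    for k, r in zip(kernels, rates):
        dense = dilate_kernel(k, r)
        p = (size - dense.shape[0]) // 2
        merged[p:p + dense.shape[0], p:p + dense.shape[0]] += dense
    return merged
```

For example, merging 3x3 branches at dilation rates 1 and 2 yields a single 5x5 kernel whose center tap carries contributions from both branches, so inference needs only one convolution.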