CN-121639715-B - Intestinal polyp segmentation method and system based on spiral scanning state space model

Abstract

The invention relates to the technical field of medical image processing, and in particular to an intestinal polyp segmentation method and system based on a spiral scanning state space model. The method comprises the following steps: S1, image acquisition and preprocessing; S2, multi-scale context feature encoding; S3, adaptive frequency domain feature enhancement; S4, decoding and segmentation prediction; S5, thresholding to generate a binary segmentation mask image. The invention preserves the boundary integrity of irregular targets through a dynamic multi-focus spiral scanning strategy and introduces an adaptive frequency domain filtering module to enhance the discriminability of key features, thereby improving segmentation accuracy and robustness in complex medical image tasks such as intestinal polyp segmentation.

Inventors

WU JIAHUA
LIU XIAO
LIN WEITAO
WANG DAHAN
LIN YINGQUAN
HUANG RONGXIANG

Assignees

Xiamen University of Technology (厦门理工学院)

Dates

Publication Date: 2026-05-12
Application Date: 2026-02-04

Claims (9)

1. An intestinal polyp segmentation method based on a spiral scanning state space model, characterized by comprising the following steps:
S1, image acquisition and preprocessing: acquiring an original image to be segmented and applying standardized preprocessing to it to obtain an input feature map, wherein the preprocessing comprises resizing to a preset resolution and normalization, so as to eliminate differences caused by imaging equipment and acquisition conditions and to ensure that the input data meets the model requirements;
S2, multi-scale context feature encoding: inputting the input feature map to an encoder of a U-Net model to generate a multi-center attention map, wherein the encoder comprises a plurality of cascaded multi-center spiral scanning Mamba modules, and each module dynamically generates a plurality of foci based on the input feature map and generates a spiral scanning sequence for each focus, so as to capture and fuse local details and global context information of the image and output multi-level feature maps; the multi-center attention map is generated according to the following formula: [formula not preserved in this text], where the terms denote, in order: the input feature map; the linear rectification (ReLU) activation function; a 3 × 3 convolution operation; and attn_map, the multi-center attention map;
S3, adaptive frequency domain feature enhancement: inputting the multi-level feature maps output by the encoder into a bottleneck layer of the U-Net model, wherein the bottleneck layer comprises an adaptive frequency domain filtering module that transforms spatial domain features into the frequency domain and filters them adaptively according to their energy distribution, so as to enhance the modeling of key tissue structures and obtain an enhanced context feature map;
S4, decoding and segmentation prediction: inputting the enhanced context feature map to a decoder of the U-Net model, gradually recovering the spatial resolution through up-sampling and convolution operations, applying skip connections to the features of the corresponding encoder level, and finally generating a segmentation probability map;
S5, result generation: thresholding the segmentation probability map to output a binary segmentation mask image corresponding to the original image.
2. The method of claim 1, wherein each focus in S2 generates a spiral scanning sequence, and the spiral radius of the spiral scan is given by: [formula not preserved in this text], where the terms denote, in order: a resolution-dependent reference radius constant; a scaling factor; the variance operation; and weighted_coords, the attention-weighted centroid, which is defined by: [formula not preserved in this text], in which the terms denote, in order: the segmentation-guided attention value at the i-th position; the spatial coordinates of the i-th position; and a constant that ensures numerical stability.
3. The method of claim 1, wherein the multi-center spiral scanning module in S2 further performs the steps of: dividing the feature map into a plurality of horizontal slices and vertical slices, scanning the horizontal and vertical slices bidirectionally, and generating a flipped spiral scanning sequence, so as to enhance the robustness of the model.
4. The intestinal polyp segmentation method based on the spiral scanning state space model according to claim 1, wherein the adaptive frequency domain filtering module processes its input as follows: the input features are transformed into the frequency domain; routing weights are dynamically calculated by a multi-layer perceptron; a plurality of frequency filters are adaptively guided by the routing weights to filter the frequency domain features; and the filtered frequency domain features are transformed back into the spatial domain. The specific formula is: [formula not preserved in this text], where the terms denote, in order: the input feature map; the adaptive frequency domain filter; the Hadamard product, i.e., element-wise multiplication; the two-dimensional fast Fourier transform; function composition, meaning that the preprocessing operation is first applied to the input feature map and the two-dimensional fast Fourier transform is then applied to the result; the preprocessing operation on the input feature map, which comprises point-wise convolution and identity mapping and optimizes the input feature distribution to counteract acquisition fluctuations of the medical image; the two-dimensional inverse fast Fourier transform, which transforms the filtered frequency domain features back into the spatial domain; and the output of the adaptive frequency domain filtering operation, which is the enhanced context feature map.
5. The intestinal polyp segmentation method based on the spiral scanning state space model according to claim 1, wherein the U-Net model in S4 includes skip connections in which a channel attention bridge and a spatial attention bridge are integrated to achieve multi-scale feature fusion between the encoder and the decoder.
6. The intestinal polyp segmentation method based on the spiral scanning state space model according to claim 1, characterized in that in S2 the encoder is built on the Mamba architecture, and specifically uses a selective state space model to model the spiral scanning sequences and to handle long-range dependencies with linear time complexity.
7. The intestinal polyp segmentation method based on the spiral scanning state space model according to claim 1, wherein in step S4 bilinear interpolation is used for up-sampling, and multi-scale features are enhanced by the channel attention bridge and spatial attention bridge modules, thereby improving the sensitivity of the model to lesions of different scales, according to the following formula: [formula not preserved in this text], where the terms denote, in order: the input feature map; the computation of a one-dimensional channel attention map; the computation of a two-dimensional spatial attention map; and element-wise multiplication; F' is the feature map enhanced along the channel dimension, and F″ is the final output feature map enhanced along the spatial dimension.
8. The intestinal polyp segmentation method based on the spiral scanning state space model according to claim 7, wherein the fusion of the multi-scale features is completed by the channel attention bridge and spatial attention bridge modules, which are calculated as follows: [formula not preserved in this text], where F represents the input feature map, and the remaining terms denote, in order: the global average pooling operation and the global max pooling operation; a multi-layer perceptron comprising two weight matrices; the sigmoid function; the channel-dimension feature vector obtained by global average pooling; the channel-dimension maximum-value feature vector obtained by global max pooling; a convolution operation with a convolution kernel of the stated size (the size is not preserved in this text); the spatial-dimension feature vector obtained by global average pooling; the spatial-dimension maximum-value feature vector obtained by global max pooling; and the concatenation operation.
9. An intestinal polyp segmentation system based on a spiral scanning state space model, characterized in that the system performs segmentation using the intestinal polyp segmentation method according to any one of claims 1-8.
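
The focus-centered spiral scanning described in claims 1 to 3 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the patented implementation: the function names (`weighted_centroid`, `spiral_order`, `scan_sequences`) are hypothetical, a single focus is taken as the attention-weighted centroid (attention-weighted coordinates divided by the attention sum plus a small stability constant), the spiral is a clockwise square spiral on the pixel grid, and the "flipped" sequence is simply the reversed order. The spiral radius formula of claim 2 is not modeled.

```python
def weighted_centroid(attn, eps=1e-6):
    """Attention-weighted centroid (row, col) of a 2D attention map.

    Assumed form: sum(a_i * p_i) / (sum(a_i) + eps).
    """
    total = eps
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(attn):
        for c, a in enumerate(row):
            total += a
            row_sum += a * r
            col_sum += a * c
    return row_sum / total, col_sum / total


def spiral_order(height, width, focus):
    """Visit every cell of a height x width grid exactly once, following a
    clockwise square spiral that starts at `focus` (row, col)."""
    r, c = focus
    order = [(r, c)] if 0 <= r < height and 0 <= c < width else []
    dr, dc = 0, 1  # start by moving right
    step = 1       # current arm length
    while len(order) < height * width:
        for _ in range(2):  # two consecutive arms share each length
            for _ in range(step):
                r, c = r + dr, c + dc
                if 0 <= r < height and 0 <= c < width:
                    order.append((r, c))  # cells outside the grid are skipped
            dr, dc = dc, -dr  # turn clockwise (rows grow downward)
        step += 1
    return order


def scan_sequences(features, attn):
    """Flatten a 2D feature grid into a spiral sequence around the
    attention-weighted focus, plus its reversed ("flipped") counterpart."""
    height, width = len(features), len(features[0])
    focus_r, focus_c = weighted_centroid(attn)
    order = spiral_order(height, width, (round(focus_r), round(focus_c)))
    forward = [features[r][c] for r, c in order]
    return forward, forward[::-1]
```

For a 3 × 3 grid numbered 0 to 8 in row-major order with all attention on the center cell, `scan_sequences` returns the forward sequence `[4, 5, 8, 7, 6, 3, 0, 1, 2]`. Neighboring pixels around the focus stay adjacent in the sequence, which is the property the claims contrast with row-by-row scanning.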

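The adaptive frequency domain filtering of claim 4 can likewise be sketched with NumPy. This is a hedged illustration under stated assumptions, not the patent's formula (which is not preserved in this text): the function name `adaptive_frequency_filter` and all parameter names are hypothetical, the routing descriptor is assumed to be the per-channel mean spectral energy, the routing weights are assumed to be a softmax over the output of a two-layer perceptron, and the filter bank is blended into a single filter shared by all channels.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def adaptive_frequency_filter(x, w_point, w1, w2, filters):
    """Sketch of frequency-domain filtering with routed filters.

    x:       (C, H, W) real feature map
    w_point: (C, C) point-wise (1x1) convolution weights
    w1:      (hidden, C) first MLP layer
    w2:      (K, hidden) second MLP layer, one logit per filter
    filters: (K, H, W) frequency-domain filter bank
    Returns a (C, H, W) real feature map.
    """
    # Preprocessing: point-wise convolution plus identity mapping.
    pre = x + np.einsum("oc,chw->ohw", w_point, x)
    # Spatial domain -> frequency domain.
    spectrum = np.fft.fft2(pre, axes=(-2, -1))
    # Assumed routing descriptor: log of mean spectral energy per channel.
    energy = np.log1p((np.abs(spectrum) ** 2).mean(axis=(1, 2)))
    hidden = np.maximum(w1 @ energy, 0.0)  # ReLU
    route = softmax(w2 @ hidden)           # (K,), sums to 1
    # Blend the filter bank into one (H, W) filter using the routing weights.
    blended = np.tensordot(route, filters, axes=1)
    # Hadamard product, broadcast over channels, then back to the spatial domain.
    return np.fft.ifft2(spectrum * blended, axes=(-2, -1)).real
```

With an all-ones filter bank and zero point-wise weights the function returns its input unchanged, since the routing weights sum to one. That gives a quick sanity check before learned filters are substituted.
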
Description

Intestinal polyp segmentation method and system based on spiral scanning state space model

Technical Field

The invention relates to the technical field of medical image processing, and in particular to an intestinal polyp segmentation method and system based on a spiral scanning state space model.

Background

Medical image segmentation is a key step in modern clinical diagnosis, pathological typing, treatment planning, and surgical robotics. With the development of deep learning, methods based on convolutional neural networks (CNNs), such as U-Net and its variants, have become the benchmark in this field owing to their strong feature extraction capability. However, CNNs are limited to local receptive fields and have difficulty capturing long-range dependencies, which results in insufficient segmentation accuracy for complex anatomy and variable lesion morphology. To address the long-range dependency problem, Transformer-based approaches (e.g., TransUNet, Swin-UNet) were introduced into medical image segmentation. Although the Transformer performs well at global modeling, its computational complexity grows quadratically with sequence length, and it often ignores fine-grained local details, which reduces the spatial accuracy of the segmentation results. Recently, the Mamba architecture, which is based on state space models, has attracted attention because of its linear time complexity and efficient long-range dependency modeling.
However, existing Mamba-based methods face serious challenges when applied to medical images, and their limitations are particularly pronounced in intestinal polyp segmentation. First, a fixed unidirectional or row-by-row and column-by-column scanning strategy is badly mismatched with the irregular, roughly circular shape of medical targets such as polyps; it destroys spatial adjacency and semantic continuity and causes blurred and distorted boundaries. Second, a single-center scanning mechanism struggles to cover the multiple, scattered lesions that frequently occur in colonoscopy images, which easily leads to missed detections. Third, existing methods model features only in the spatial domain and cannot exploit the frequency domain information that is critical for distinguishing tissue textures and edges, so model robustness is insufficient when the intestinal image contains complex interference such as feces and bubbles. A new medical image segmentation architecture is therefore needed that can dynamically adapt to lesion morphology, maintain spatial continuity, effectively fuse local and global features, and make full use of frequency domain information.

Disclosure of Invention

To address the deficiencies of the prior art, the invention provides an intestinal polyp segmentation method based on a spiral scanning state space model. The method preserves the boundary integrity of irregular targets through a dynamic multi-focus spiral scanning strategy and introduces an adaptive frequency domain filtering module (Adaptive Frequency Domain Filtering Module, AFDFM) to enhance the discriminability of key features, thereby improving segmentation accuracy and robustness in complex medical image tasks such as intestinal polyp segmentation. The method specifically comprises the following steps:

S1, image acquisition and preprocessing: acquiring an original image to be segmented and applying standardized preprocessing to it to obtain an input feature map, wherein the preprocessing comprises resizing to a preset resolution and normalization, so as to eliminate differences caused by imaging equipment and acquisition conditions and to ensure that the input data meets the model requirements;
S2, multi-scale context feature encoding: inputting the input feature map to an encoder of a U-Net model to generate a multi-center attention map, wherein the encoder comprises a plurality of cascaded multi-center spiral scanning Mamba modules, and each module dynamically generates a plurality of foci based on the input feature map and generates a spiral scanning sequence for each focus, so as to capture and fuse local details and global context information of the image and output multi-level feature maps;
S3, adaptive frequency domain feature enhancement: inputting the multi-level feature maps output by the encoder into a bottleneck layer of the U-Net model, wherein the bottleneck layer comprises an adaptive frequency domain filtering module that transforms spatial domain features into the frequency domain and filters them adaptively according to their energy distribution, so as to enhance the modeling of key tissue structures and obtain an enhanced context feature map;
S4, decoding and segmentation prediction: inputting the enhanced context fe