CN-121999217-A - Dynamic cross-branch guided pneumonia CT image segmentation method
Abstract
The invention discloses a dynamic cross-branch guided pneumonia CT image segmentation method, comprising the following steps: S1, inputting a pneumonia CT image; S2, extracting global features from the pneumonia CT image through a frozen branch; S3, extracting multi-scale local features from the pneumonia CT image through a learnable Mamba branch; S4, dynamically calibrating and enhancing the global features using the multi-scale local features through a multi-scale dynamic cross-branch feature guiding mechanism; S5, aligning the multi-scale local features with the enhanced global features in the spatial dimension to obtain fused features; S6, a prompt encoder receiving annotation prompts and mapping them to embedding vectors; S7, inputting the fused features and the prompt embedding vectors into a mask decoder simultaneously, and outputting a pixel-level binary segmentation mask. The invention markedly improves the accuracy and edge fidelity of pneumonia lesion segmentation while preserving the semantic prior of the large model.
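Step S4 can be pictured with a minimal sketch that follows the wording of claim 1 (S41 to S43): a gate assigns one weight per scale, the global tokens attend to the local tokens of each scale, and the attended results are summed with those weights. The exact formulas are not reproduced in this text, so several details are assumptions made only for illustration: the gate is a single linear scoring of each scale's mean token followed by a softmax over scales, the projection matrices are identity matrices, the relative position bias P is zero, and the names `gate_weights`, `guide` and `gate_vec` are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def cross_attention(global_feat, local_feat, w_q, w_k, w_v, pos_bias):
    """Global tokens query one scale of local tokens: softmax(QK^T / sqrt(d) + P) V."""
    q = matmul(global_feat, w_q)
    k = matmul(local_feat, w_k)
    v = matmul(local_feat, w_v)
    d = len(q[0])
    scores = matmul(q, transpose(k))
    attn = [softmax([s / math.sqrt(d) + p for s, p in zip(srow, prow)])
            for srow, prow in zip(scores, pos_bias)]
    return matmul(attn, v)

def gate_weights(local_feats, gate_vec):
    """Assumed gate: score each scale's mean token, then softmax over scales."""
    scores = []
    for feat in local_feats:
        mean_tok = [sum(col) / len(feat) for col in zip(*feat)]
        scores.append(sum(m * g for m, g in zip(mean_tok, gate_vec)))
    return softmax(scores)

def guide(global_feat, local_feats, w_q, w_k, w_v, gate_vec):
    """Weighted sum over scales of the cross-attended features."""
    weights = gate_weights(local_feats, gate_vec)
    out = [[0.0] * len(w_v[0]) for _ in global_feat]
    for w, feat in zip(weights, local_feats):
        bias = [[0.0] * len(feat) for _ in global_feat]  # P fixed to zero in this sketch
        attended = cross_attention(global_feat, feat, w_q, w_k, w_v, bias)
        for i, row in enumerate(attended):
            for j, val in enumerate(row):
                out[i][j] += w * val
    return weights, out

identity = [[1.0, 0.0], [0.0, 1.0]]
global_feat = [[1.0, 0.0], [0.0, 1.0]]                  # 2 global tokens, dimension 2
local_feats = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],  # finer scale, 4 tokens
    [[2.0, 0.0], [0.0, 2.0]],                           # coarser scale, 2 tokens
]
weights, enhanced = guide(global_feat, local_feats, identity, identity, identity, [1.0, -1.0])
print(weights)
print(enhanced)
```

Because each scale is attended separately, the scales may have different token counts; only the feature dimension has to match the global tokens. Claim 3 additionally applies normalization, an MLP and a residual connection to the result, which this sketch leaves out.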
Inventors
- WANG DAIWEI
- ZOU LANLAN
- KANG BIN
- YU XIAOFAN
Assignees
- Nanjing University of Posts and Telecommunications (南京邮电大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20260120
Claims (6)
- 1. A dynamic cross-branch guided pneumonia CT image segmentation method, characterized by comprising the following steps:
  S1, inputting a pneumonia CT image;
  S2, extracting global features from the pneumonia CT image through a frozen branch, wherein the frozen branch adopts the ViT encoder of the pre-trained SAM general-purpose segmentation large model, and the pre-trained ViT encoder weights are frozen;
  S3, extracting multi-scale local features from the pneumonia CT image through a learnable Mamba branch, specifically comprising:
  S31, downsampling the input features, wherein the input of the first layer is the pneumonia CT image and the input features of each subsequent layer are the output features of the previous layer;
  S32, the LocalMamba module processing the input features with neighborhood attention based on a sliding-window mechanism, as follows: [formula not reproduced in source], wherein Query is the query vector of the current pixel, Key and Value are the keys and values in the neighboring region, B is the relative position bias, and a scaling factor is applied according to the feature dimension;
  S33, inputting the features output by the neighborhood attention into a two-dimensional selective scan, in which a direction-aware gating mechanism is introduced to obtain the scanning features, as follows: [formula not reproduced in source], wherein d is the scan direction, each direction has a direction-gating factor, the scan operator spatially rearranges the feature map along scan direction d to capture salient patterns on a particular path, the gate uses the Sigmoid activation function, and a 1 x 1 convolution operation is applied;
  S34, performing a residual connection through the boundary attention map to obtain the local features of the current level, as follows: [formula not reproduced in source], wherein the product is an element-wise multiplication, the activation is a Sigmoid function, and the squared modulus of the spatial gradient of the feature is calculated;
  S4, dynamically calibrating and enhancing the global features using the multi-scale local features through a multi-scale dynamic cross-branch feature guiding mechanism, specifically comprising:
  S41, adaptively assigning weights to the multi-scale local features of the different downsampling levels s through a dynamic gating network MLP, as follows: [formula not reproduced in source];
  S42, performing cross-attention calculation between the global features of the frozen branch and the local features of each scale, as follows: [formula not reproduced in source], wherein the three projection matrices are learnable, P denotes a learnable relative position bias, and a scaling factor is applied according to the feature dimension;
  S43, linearly weighting the features obtained by the cross-attention calculation to obtain the enhanced global features, as follows: [formula not reproduced in source];
  S5, aligning the multi-scale local features with the enhanced global features in the spatial dimension to obtain the fused features, as follows: [formula not reproduced in source], wherein a fully connected layer is used for dimensional alignment and the features are joined by a concatenation operator;
  S6, a prompt encoder receiving annotation prompts and mapping them to embedding vectors;
  S7, inputting the fused features and the prompt embedding vectors into a mask decoder simultaneously, and outputting a pixel-level binary segmentation mask.
- 2. The dynamic cross-branch guided pneumonia CT image segmentation method according to claim 1, wherein in step S33 the activation degree of each scan path is dynamically adjusted through parallel paths in four directions, and context integration is performed using a discretized state equation, as follows: [formula not reproduced in source], wherein the state transition matrices change dynamically with the input; the hidden state is indexed by the t-th time step and the d-th scan direction; t denotes the time step or position index during sequence scanning; the input is the feature representation of the pixel at the current position t; and the final output is the weighted sum of the hidden states in the four scan directions, mapped through an output matrix.
- 3. The dynamic cross-branch guided pneumonia CT image segmentation method according to claim 1, wherein in step S43 the enhanced global features are normalized by a normalization layer, and a nonlinear transformation and a residual connection are performed using an MLP.
- 4. A dynamic cross-branch guided pneumonia CT image segmentation system, characterized by comprising: a pneumonia CT image input module for acquiring an original pneumonia CT image; a dual-branch feature extraction network comprising a frozen ViT branch and a learnable Mamba branch, wherein the learnable Mamba branch comprises multiple layers of local feature extraction modules, each layer comprising a downsampling module, a neighborhood attention operator, a direction-gated two-dimensional selective scanning unit, and a boundary attention map β; a feature guiding module comprising an adaptive weight allocation module that adaptively assigns weights to local features of different scales through a dynamic gating network MLP, a plurality of parallel cross-branch cross-attention modules, and a linear weighting module that weights the features after the cross-attention calculation by the assigned weights to obtain the enhanced global features and inputs them into the frozen branch; a fusion module that aligns the enhanced global features with the multi-scale local features in the spatial dimension through a fully connected layer and performs multi-scale feature fusion through a concatenation operator; a prompt encoder module that maps received preset annotation prompts for pneumonia lesions to embedding vectors; and a mask decoder module that takes the fused features and the prompt embedding vectors as input and outputs a pixel-level binary segmentation mask.
- 5. A dynamic cross-branch guided pneumonia CT image segmentation device, characterized by comprising a memory, a processor, and a dynamic cross-branch guided pneumonia CT image segmentation program that is stored in the memory and executable on the processor, wherein the program is configured to implement the dynamic cross-branch guided pneumonia CT image segmentation method according to claim 1.
- 6. A storage medium, wherein the storage medium stores a dynamic cross-branch guided pneumonia CT image segmentation program, and wherein the dynamic cross-branch guided pneumonia CT image segmentation program, when executed, implements the dynamic cross-branch guided pneumonia CT image segmentation method according to claim 1.
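The local feature extraction of steps S32 to S34 and the four-direction scan of claim 2 can be sketched on a toy single-channel feature map. This is a sketch under stated assumptions, not the claimed implementation: the formulas are not reproduced in this text, so the code assumes identity Query/Key/Value projections with zero position bias, fixed scalar state parameters `a`, `b`, `c` in place of the input-dependent matrices described in claim 2, a gate of the form Sigmoid(1 x 1 convolution of the scanned input), forward differences for the spatial gradient, and a residual of the form input + β ⊙ scanned. All function and parameter names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neighborhood_attention(fmap, radius=1):
    """S32 sketch: each pixel attends only to its (2r+1) x (2r+1) window.
    Assumes one channel, identity Q/K/V projections, d = 1 and B = 0."""
    rows, cols = len(fmap), len(fmap[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neigh = [fmap[i][j]
                     for i in range(max(0, r - radius), min(rows, r + radius + 1))
                     for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            scores = [fmap[r][c] * k for k in neigh]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            out[r][c] = sum(e * v for e, v in zip(exps, neigh)) / sum(exps)
    return out

def scan_1d(seq, a, b, c):
    """Discretized state equation h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
    Fixed scalars here; the claims describe input-dependent matrices."""
    h, out = 0.0, []
    for x in seq:
        h = a * h + b * x
        out.append(c * h)
    return out

def scan_order(rows, cols, direction):
    """Pixel visiting order for one of four scan paths."""
    if direction.startswith("row"):
        order = [(r, c) for r in range(rows) for c in range(cols)]
    else:
        order = [(r, c) for c in range(cols) for r in range(rows)]
    return order[::-1] if direction.endswith("bwd") else order

def gated_scan(fmap, a=0.5, b=1.0, c=1.0, gate_w=1.0, gate_b=0.0):
    """S33 sketch: four directional scans, each weighted by a sigmoid gate.
    With one channel, the 1 x 1 convolution reduces to gate_w * x + gate_b."""
    rows, cols = len(fmap), len(fmap[0])
    out = [[0.0] * cols for _ in range(rows)]
    for direction in ("row_fwd", "row_bwd", "col_fwd", "col_bwd"):
        order = scan_order(rows, cols, direction)
        seq = [fmap[r][col] for r, col in order]
        for x, y, (r, col) in zip(seq, scan_1d(seq, a, b, c), order):
            out[r][col] += sigmoid(gate_w * x + gate_b) * y
    return out

def boundary_attention(fmap):
    """S34 sketch: beta = sigmoid(|grad F|^2), using forward differences."""
    rows, cols = len(fmap), len(fmap[0])
    beta = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = fmap[r][c + 1] - fmap[r][c] if c + 1 < cols else 0.0
            gy = fmap[r + 1][c] - fmap[r][c] if r + 1 < rows else 0.0
            beta[r][c] = sigmoid(gx * gx + gy * gy)
    return beta

def local_block(fmap):
    """Assumed residual form: output = input + beta * scanned."""
    scanned = gated_scan(neighborhood_attention(fmap))
    beta = boundary_attention(scanned)
    return [[x + bt * s for x, bt, s in zip(xr, br, sr)]
            for xr, br, sr in zip(fmap, beta, scanned)]

# Toy 3 x 4 map with a vertical edge between columns 1 and 2.
fmap = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
local = local_block(fmap)
for row in local:
    print([round(v, 3) for v in row])
```

In this sketch the boundary attention map takes values between 0.5 and 1 and is largest where the scanned features change most sharply, so the residual path contributes more strongly near edges. Stacking `local_block` after successive downsampling steps (S31) would yield one local feature map per scale.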
Description
Dynamic cross-branch guided pneumonia CT image segmentation method
Technical Field
The invention relates to the technical field of computer vision and medical image processing.
Background
Pneumonia is a common pulmonary infectious disease, and its early, accurate diagnosis and assessment are of great importance for formulating clinical treatment plans and improving prognosis. With the rapid development of medical imaging technology, computed tomography (CT) has become a core tool for the clinical diagnosis of pneumonia, the assessment of disease severity, and the monitoring of therapeutic effect. On CT images, pneumonia lesions often appear as ground-glass opacities, pulmonary consolidation, interstitial changes and the like, and accurate lesion segmentation can provide doctors with key quantitative indicators such as lesion volume, density, and spatial distribution. However, because pneumonia lesions generally have blurred boundary texture, highly irregular shapes, strong heterogeneity in spatial distribution, and extremely low contrast with surrounding blood vessels or healthy tissue, traditional image analysis methods struggle to achieve robust and accurate automatic segmentation, while manual annotation in clinical practice is time-consuming, labor-intensive, and subject to large subjective deviation.
In recent years, deep learning methods typified by the convolutional neural network (CNN) have made remarkable progress in medical image segmentation. However, traditional CNN methods have obvious limitations on pneumonia segmentation tasks: CNNs are limited by local receptive fields and have difficulty capturing long-range dependencies, so they lack global context awareness when facing large, diffuse lesions and have difficulty distinguishing lesion boundaries from background artifacts.
To introduce global modeling capability, foundation models represented by SAM (Segment Anything Model) have begun to be applied in the medical field. SAM has excellent zero-shot generalization ability and a flexible interactive segmentation framework, and can respond to the user's segmentation intent through its prompt encoder. However, directly applying SAM to pneumonia CT segmentation still faces serious challenges. On the one hand, the pre-trained large model suffers from loss of spatial detail and domain mismatch. The SAM image encoder is typically based on high-ratio token downsampling, which is advantageous for extracting macroscopic semantics but, for the fine textures of pneumonia lesions, can cause critical edge and spatial information to be lost. Even if the weights are frozen to preserve prior knowledge, the pre-training data are mostly natural images, so the model can hardly produce on its own representations specific to the low-contrast environment of CT images and to the direction-specific extension of lesions, and the segmentation contours become over-smoothed. On the other hand, auxiliary feature extraction branches face a conflict between receptive field and computational efficiency. To make up for the large model's weak perception of detail, the prior art often introduces lightweight CNN or Transformer branches as a supplement. However, CNNs are limited by their receptive fields and have difficulty capturing long-range features spanning multiple lung segments, while Transformer branches suffer from computational complexity that grows quadratically with resolution.
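The efficiency argument can be made concrete with a short calculation. The sketch below counts pairwise token interactions for global self-attention against the number of steps taken by a sequential scan over the same tokens. The patch size of 16 and the function names are assumptions chosen for illustration; the four scan directions follow claim 2.

```python
def token_count(side, patch=16):
    """Number of patch tokens for a square image of the given side length."""
    return (side // patch) ** 2

def attention_pairs(side, patch=16):
    """Pairwise interactions in global self-attention: quadratic in token count."""
    n = token_count(side, patch)
    return n * n

def scan_steps(side, patch=16, directions=4):
    """Steps of a sequential scan over all tokens per direction: linear in token count."""
    return directions * token_count(side, patch)

for side in (256, 512, 1024):
    print(side, token_count(side), attention_pairs(side), scan_steps(side))
```

Under these assumptions, doubling the image side length quadruples the token count, so the attention cost grows sixteenfold while the scan cost grows only fourfold.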
More importantly, pneumonia lesions exhibit extremely strong spatial heterogeneity at different scales, yet existing methods often adopt a fixed feature fusion scheme and lack an adaptive weight allocation mechanism, so the model tends to introduce local noise when handling large lesions and to lose edge details through receptive-field mismatch when handling small lesions. In addition, pneumonia lesions often extend along lung markings or specific anatomical directions, whereas existing state space model algorithms usually adopt a fixed scanning path and ignore the directional characteristics of lesions, making it difficult to suppress artifacts and enhance lesion edge fidelity against complex backgrounds by dynamically adjusting the scanning strategy. How to construct a large medical image segmentation model that both retains the strong semantic prior of SAM and calibrates local detail features to the CT characteristics of pneumonia has become a key technical need for improving pneumonia lesion segmentation accuracy and for bringing foundation models into clinical practice. The foregoing is provided merely to facilitate understanding of the technical solutions of the present invention and is not intended to represent an admission that