CN-122023795-A - Conditional diffusion medical image segmentation method integrating local and global features

CN122023795A

Abstract

The invention discloses a conditional diffusion medical image segmentation method that fuses local and global features. The method comprises the steps of (1) acquiring and preprocessing a medical image sample; (2) extracting local texture and edge features of the image with a convolutional neural network; (3) extracting global semantic and spatial-dependency features of the image with a network based on a self-attention mechanism; (4) fusing the local and global features to generate a conditional feature representation for guiding the segmentation process; (5) inputting the conditional features into a conditional diffusion model, which generates a high-precision segmentation result by step-by-step denoising; and (6) outputting the final medical image segmentation map. By deeply fusing local detail and global semantic information through a diffusion model guided by the conditional features, the invention effectively addresses noise, blurring, and complex structures in medical images, markedly improves segmentation accuracy and robustness, and is suitable for automatic focus segmentation, quantitative analysis of anatomical structures, and clinical auxiliary diagnosis.

Inventors

  • WANG DANDAN
  • CHEN RUIYUAN
  • Yan Yilian
  • ZHANG SHIQING

Assignees

  • Taizhou University (台州学院)

Dates

Publication Date
2026-05-12
Application Date
2026-01-15

Claims (8)

  1. A conditional diffusion medical image segmentation method fusing local and global features, carried out according to the following steps: step 1, acquiring a medical image sample and preprocessing it; step 2, inputting the preprocessed medical image into a local feature extraction network to extract a local feature representation of the medical image; step 3, inputting the preprocessed medical image into a global feature extraction network to extract a global feature representation of the medical image; step 4, fusing the local feature representation and the global feature representation to generate a conditional feature representation for guiding the segmentation process; step 5, inputting the conditional feature representation into a conditional diffusion model and generating a segmentation result of the medical image by step-by-step denoising; and step 6, outputting the final medical image segmentation result.
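The six claimed steps can be sketched end to end as a minimal skeleton. Every stage below is a toy stand-in chosen only to make the claimed data flow concrete (image gradients for the CNN branch, a broadcast mean for the attention branch, a fixed-point update for the diffusion sampler); none of these stand-ins is the patented network.

```python
import numpy as np

def extract_local(x):
    # Step 2 stand-in: image gradients as "local texture and edge features".
    gx = np.diff(x, axis=0, prepend=x[:1])
    gy = np.diff(x, axis=1, prepend=x[:, :1])
    return np.stack([gx, gy], axis=-1)

def extract_global(x):
    # Step 3 stand-in: a broadcast global statistic as "global semantics".
    return np.full(x.shape + (1,), x.mean())

def segment(image, steps=4, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: normalize pixel values to [0, 1].
    x = (image - image.min()) / (image.max() - image.min() + 1e-8)
    # Step 4: fuse local and global features into the conditional representation.
    cond = np.concatenate([extract_local(x), extract_global(x)], axis=-1)
    # Step 5: start from noise and refine step by step under the condition.
    seg = rng.standard_normal(x.shape)
    for _ in range(steps):
        seg = 0.5 * seg + 0.5 * cond.mean(axis=-1)
    # Step 6: threshold into a binary segmentation map.
    return (seg > seg.mean()).astype(np.uint8)
```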
  2. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 1 the preprocessing of the medical image sample comprises: unifying the size of the acquired medical images and normalizing their pixel values, so as to eliminate scale differences caused by different image acquisition conditions and obtain preprocessed medical image samples meeting the network input requirements.
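A minimal sketch of the preprocessing in claim 2. The 256x256 target size, nearest-neighbor resampling, and min-max normalization are illustrative assumptions; the claim only requires size unification and pixel-value normalization.

```python
import numpy as np

def preprocess(image, target_size=(256, 256)):
    """Unify image size (nearest-neighbor resampling) and min-max
    normalize pixel values to [0, 1]."""
    h, w = image.shape
    th, tw = target_size
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    resized = image[np.ix_(rows, cols)]
    lo, hi = resized.min(), resized.max()
    return (resized - lo) / (hi - lo + 1e-8)
```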
  3. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 2 the local feature extraction network is constructed on a convolutional neural network structure and is used to extract local texture features and edge structure features from the preprocessed medical image, the resulting local feature representation characterizing the detail information of the medical image.
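The core operation behind the local branch in claim 3 is 2-D convolution. The sketch below is not the claimed network, only an illustration of how a convolution kernel responds to local edge structure; the fixed Sobel kernel stands in for the learned filters of a CNN's early layers.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (stride 1, no padding)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A Sobel kernel responds to vertical edges -- the same kind of local
# edge-structure feature the early layers of a trained CNN pick up.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
```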
  4. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 3 the global feature extraction network is constructed on a self-attention mechanism and performs global modeling of the medical image features, obtaining a global feature representation that contains long-range spatial dependencies and global semantic information.
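Claim 4's self-attention mechanism can be shown in its basic single-head, scaled dot-product form. The patch-embedding input shape and the single-head setup are assumptions for illustration; the point is that every output position attends over all input positions, which is what yields the long-range dependency modeling.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence of
    patch embeddings x of shape (n_patches, d_model)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over positions
    return weights @ v                             # mix of *all* positions
```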
  5. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 4 the local feature representation and the global feature representation are fused through a feature fusion operation comprising at least one of feature stitching, weighted fusion, or attention-guided fusion, so as to generate a conditional feature representation that simultaneously contains local detail information and global semantic information.
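The three fusion variants named in claim 5 can be sketched over per-pixel feature maps. The sigmoid-gate form of the "gated" branch is an assumption standing in for attention-guided fusion, not the patented design.

```python
import numpy as np

def fuse(local_feat, global_feat, mode="concat", alpha=0.5):
    """Fuse per-pixel local and global feature maps of shape (H, W, C).
    'concat' stitches channels, 'weighted' blends with a scalar weight,
    and 'gated' derives a sigmoid gate from the global branch."""
    if mode == "concat":
        return np.concatenate([local_feat, global_feat], axis=-1)
    if mode == "weighted":
        return alpha * local_feat + (1.0 - alpha) * global_feat
    if mode == "gated":
        gate = 1.0 / (1.0 + np.exp(-global_feat))
        return gate * local_feat + (1.0 - gate) * global_feat
    raise ValueError(f"unknown fusion mode: {mode}")
```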
  6. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 5 the conditional diffusion model comprises a forward diffusion process and a reverse denoising process, wherein: in the forward diffusion process, random noise is gradually introduced into the segmentation representation; in the reverse denoising process, under the guidance of the conditional feature representation, the noise is gradually removed by a parameterized denoising network so as to reconstruct the segmentation result of the medical image.
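The two processes in claim 6 can be sketched in standard DDPM form (an assumption; the claim does not fix the noise schedule or parameterization). `eps_pred` stands for the denoising network's noise estimate, which in the claimed method is conditioned on the fused local/global feature representation.

```python
import numpy as np

def forward_diffuse(x0, t, betas, seed=0):
    """Forward process: sample x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) I),
    i.e. progressively mix the clean segmentation map with Gaussian noise."""
    abar = np.cumprod(1.0 - betas)
    noise = np.random.default_rng(seed).standard_normal(x0.shape)
    xt = np.sqrt(abar[t]) * x0 + np.sqrt(1.0 - abar[t]) * noise
    return xt, noise

def reverse_step(xt, t, betas, eps_pred, seed=0):
    """One reverse denoising step given the network's noise estimate eps_pred."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    mean = (xt - betas[t] / np.sqrt(1.0 - abar[t]) * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean                      # final step is deterministic
    z = np.random.default_rng(seed).standard_normal(xt.shape)
    return mean + np.sqrt(betas[t]) * z
```

With a perfect noise estimate, the last reverse step exactly recovers the clean map, which is the sense in which stepwise denoising "reconstructs" the segmentation.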
  7. The conditional diffusion medical image segmentation method fusing local and global features according to claim 6, wherein the reverse denoising process is expressed as p_θ(x_{t-1} | x_t, c), wherein x_t represents the segmentation state at step t, c represents the conditional feature representation, and p_θ represents the denoising network with parameters θ.
  8. The conditional diffusion medical image segmentation method fusing local and global features according to claim 1, wherein in step 6 the output medical image segmentation result characterizes the spatial distribution of the target focus area or anatomical structure in the medical image, and can be used to assist clinical diagnosis or medical image analysis.

Description

Conditional diffusion medical image segmentation method integrating local and global features

Technical Field

The invention relates to the technical field of medical image processing and artificial intelligence, and in particular to a deep-learning-based medical image segmentation method and system that can be used for automatic segmentation of focus areas and anatomical structures and for auxiliary clinical diagnosis.

Background

Medical image segmentation is one of the important research topics in medical image processing and computer-aided diagnosis. Its aim is to automatically and accurately separate focus areas or anatomical structures from medical images, thereby providing technical support for disease diagnosis, treatment planning, and prognosis evaluation. The technology is widely applicable to clinical scenarios such as skin lesion analysis, tumor delineation, and organ structure segmentation, and has been a research hotspot in intelligent medical image analysis for many years.

The design of the model framework is a core link in medical image segmentation technology; its structural rationality and feature expression capability directly affect the accuracy and stability of the segmentation result. Currently, typical methods widely used for medical image segmentation are mainly based on deep learning frameworks, such as convolutional neural networks (Convolutional Neural Network, CNN) and their derived encoder-decoder structures. Related research shows that CNN-based segmentation models can effectively extract local textures and edge features in medical images and have achieved good results in various medical imaging tasks (see Kayalibay, Baris, Grady Jensen, and Patrick van der Smagt, "CNN-based segmentation of medical imaging data," arXiv preprint arXiv:1701.03056, 2017). However, due to the limited local receptive field of the convolution operation, such methods fall short in modeling long-range spatial dependencies and global semantic information.

To overcome this problem, Transformer models based on the self-attention mechanism have in recent years been gradually introduced into medical image segmentation. Their global attention mechanism strengthens the modeling of long-range dependencies and shows clear advantages in complex structure segmentation tasks (see Xiao, Hanguang, et al., "Transformers in medical image segmentation: A review," Biomedical Signal Processing and Control 84 (2023): 104791). However, Transformer models are usually limited in describing local detail features and carry high computational complexity, making it difficult to achieve both segmentation precision and computational efficiency in high-resolution medical image scenarios.

In recent years, the diffusion model (Diffusion Model) has received attention as a new generative deep learning framework for tasks such as image generation and reconstruction. Some research has attempted to introduce diffusion models into medical image segmentation, modeling the segmentation result through a gradual denoising process to improve its overall consistency and robustness (see Wu, Junde, et al., "MedSegDiff: Medical image segmentation with diffusion probabilistic model," Medical Imaging with Deep Learning, PMLR, 2024). In addition, related patents have disclosed technical schemes for medical image segmentation with deep learning models (see patent: Xu Qian et al., nnUNet-based whole cerebral vascular system segmentation method for juvenile zebrafish trained on an autonomous dataset, application number 202510985795.4).

Although these methods improve medical image segmentation performance to some extent, the prior art still has shortcomings. On one hand, existing deep learning models usually focus on a single type of feature modeling and find it difficult to account for local detail features and global semantic information at the same time; on the other hand, the feature fusion between different model frameworks is relatively simple, which limits the overall expressive capacity of the model. Moreover, existing diffusion-model-based segmentation methods are mostly used as independent generation or post-processing modules and have not been effectively and collaboratively fused with deep feature representations at the model framework level; their conditional constraint mechanism and local-global feature modeling capability still need further improvement. In summary, the existing medical image segmentation technique still has room for improvement at the model framework level, and a new deep learning model framework is needed which, on the basis of merging