CN-120689619-B - Three-dimensional brain network dynamic segmentation method based on deep learning
Abstract
The invention discloses a three-dimensional brain network dynamic segmentation method based on deep learning, comprising the steps of: S1, constructing fused image data with consistent time and consistent space; S2, generating a multi-scale sparse Transformer coding feature pyramid; S3, obtaining a same-scale brain region graph structure; S4, taking a cross-scale node alignment fusion graph structure as the initial output of a cross-scale node alignment fusion mechanism; S5, obtaining first-round fused brain region graph node embedding features; S6, obtaining an updated multi-scale sparse Transformer coding feature pyramid, and repeating steps S3 to S5 until the multi-scale sparse Transformer-GNN interactive updating is completed; and S7, obtaining a three-dimensional brain region dynamic segmentation result. The invention realizes the collaborative extraction of local fine-grained and global coarse-grained features; it not only effectively suppresses redundant information, but also enhances the modeling of spatial structure and functional connectivity features in the cross-modal fused image.
Inventors
- GAO ZHAO
- LIU JIAYU
Assignees
- First Medical Center of the Chinese PLA General Hospital (中国人民解放军总医院第一医学中心)
Dates
- Publication Date: 2026-05-05
- Application Date: 2025-06-14
Claims (9)
- 1. A three-dimensional brain network dynamic segmentation method based on deep learning, characterized by comprising the following steps: S1, constructing fused image volume data with consistent time and consistent space; S2, segmenting the fused image volume data into cubes of different scales according to a multi-level cube partitioning strategy, performing position coding on the cubes at each scale, inputting them into a multi-scale sparse Transformer encoder, and generating a multi-scale sparse Transformer coding feature pyramid; S3, establishing an initial brain region graph node set at each scale of the multi-scale sparse Transformer coding feature pyramid based on a predefined medical partition template and diffusion tensor imaging connectivity information, to obtain a same-scale brain region graph structure; S4, executing cross-scale node alignment on nodes of homonymous brain regions at different scales, and taking the cross-scale node alignment fusion graph structure as the initial output of the cross-scale node alignment fusion mechanism; S5, performing inter-node message passing in the cross-scale node alignment fusion graph structure through graph convolution and multi-head graph attention to obtain first-round fused brain region graph node embedding features; S6, feeding the first-round fused brain region graph node embedding features back as residual information to the input of the next layer of the multi-scale sparse Transformer encoder to obtain an updated multi-scale sparse Transformer coding feature pyramid, and repeating steps S3 to S5 until the multi-scale sparse Transformer-GNN interactive updating is completed; S7, performing skip-connection fusion and hierarchical upsampling reconstruction on the final multi-scale sparse Transformer coding feature pyramid to generate voxel-level segmentation probability volume data; S8, performing post-processing on the voxel-level segmentation probability volume data in the training stage and the inference stage respectively, to obtain a three-dimensional brain region dynamic segmentation result.
- 2. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 1, wherein said S1 comprises the steps of: S11, acquiring structural magnetic resonance image data, functional magnetic resonance image data and diffusion tensor imaging data; S12, respectively performing initial resampling on the structural magnetic resonance image data, the functional magnetic resonance image data and the diffusion tensor imaging data; S13, performing voxel-level rigid registration of the functional magnetic resonance image data and the diffusion tensor imaging data relative to the structural magnetic resonance image data; S14, further performing non-rigid registration on the functional magnetic resonance image data and the diffusion tensor imaging data, and establishing a deformation field that describes the small displacement of each voxel position (x, y, z) in the three spatial dimensions; S15, respectively performing intensity normalization on the registered structural magnetic resonance image data, functional magnetic resonance image data and diffusion tensor imaging data to obtain normalized structural magnetic resonance image data, functional magnetic resonance image data and diffusion tensor imaging data; S16, performing artifact removal on the normalized structural magnetic resonance image data, functional magnetic resonance image data and diffusion tensor imaging data; S17, performing channel-level splicing and fusion of the structural magnetic resonance image data, the functional magnetic resonance image data and the diffusion tensor imaging data after registration, normalization and artifact removal along the channel dimension, to form the final fused image volume data D_fusion.
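The preprocessing in claim 2 ends with intensity normalization (S15) and channel-level fusion (S17). A minimal numpy sketch of those two steps, assuming z-score normalization and three co-registered volumes on the same voxel grid (the function names and the choice of z-scoring are illustrative, not the patent's exact procedure):

```python
import numpy as np

def zscore(vol):
    """Intensity-normalize a volume to zero mean, unit variance (cf. step S15)."""
    return (vol - vol.mean()) / (vol.std() + 1e-8)

def fuse_modalities(smri, fmri_mean, dti_fa):
    """Channel-level splicing of co-registered modalities (cf. step S17).

    Inputs are assumed to already share one voxel grid after the
    registration steps S12-S14, so their shapes must match.
    """
    assert smri.shape == fmri_mean.shape == dti_fa.shape
    channels = [zscore(v) for v in (smri, fmri_mean, dti_fa)]
    return np.stack(channels, axis=0)  # (3, X, Y, Z) fused volume

rng = np.random.default_rng(0)
vol = rng.normal(size=(8, 8, 8))
fused = fuse_modalities(vol, vol * 2 + 1, np.abs(vol))
print(fused.shape)  # (3, 8, 8, 8)
```

Stacking along a new leading axis gives the channel dimension that the later cube partitioning (claim 3) slices through.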
- 3. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 2, wherein said S2 comprises the steps of: S21, segmenting the fused image volume data D_fusion according to the multi-level cube partitioning strategy, dividing the original three-dimensional voxel grid into non-overlapping cube sets at the corresponding scales by adopting different scale division factors s_l, where each scale level l corresponds to a cube set at one resolution, N_l denotes the total number of cubic blocks at layer l, and each of the N_l cubes at scale level l is a small cubic region segmented from the three-dimensional space; S22, performing position coding for each cube at each scale level to obtain a position coding vector; S23, splicing the position coding vectors at each scale level with the feature vectors of the corresponding cubes to form a scale feature representation set F^(l); S24, for the shallow scale levels l_high, inputting the scale feature representation set into a local-window self-attention module, setting the window size, executing a dense self-attention mechanism within each window, and outputting a shallow local fine-grained representation; S25, for the deep scale levels l_low, inputting the scale feature representation set into a cross-window sparse self-attention module, which establishes connections only between a preset subset of Queries and their related Keys, and outputting a deep global coarse-grained representation; S26, performing element-level fusion of the shallow local fine-grained representation and the matched deep sparse global representation for each shallow scale level to form a fused coding representation Z^(l); S27, organizing the fused coding representations of all scale levels in a unified manner to construct a multi-scale sparse Transformer coding feature pyramid with cross-hierarchy sensing capability.
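Steps S21 and S22 above can be sketched in numpy: non-overlapping cube partition at one scale factor s_l, plus a sinusoidal 3-D position code (the sinusoidal form is one common choice, assumed here; the patent does not fix the coding formula):

```python
import numpy as np

def partition_cubes(volume, s):
    """Split a (C, X, Y, Z) volume into non-overlapping s^3 cubes (cf. S21).

    Returns an (N_l, C*s^3) array of flattened cube features plus the
    integer grid coordinate of each cube, which feeds the position code.
    """
    C, X, Y, Z = volume.shape
    assert X % s == 0 and Y % s == 0 and Z % s == 0
    cubes, coords = [], []
    for i in range(0, X, s):
        for j in range(0, Y, s):
            for k in range(0, Z, s):
                cubes.append(volume[:, i:i+s, j:j+s, k:k+s].ravel())
                coords.append((i // s, j // s, k // s))
    return np.stack(cubes), np.array(coords)

def position_code(coords, dim=12):
    """Sinusoidal 3-D position encoding (one possible choice for S22)."""
    freqs = 1.0 / (100.0 ** (np.arange(dim // 6) / (dim // 6)))
    parts = []
    for axis in range(3):
        ang = coords[:, axis:axis + 1] * freqs
        parts += [np.sin(ang), np.cos(ang)]
    return np.concatenate(parts, axis=1)  # (N_l, dim)

vol = np.zeros((3, 8, 8, 8))
cubes, coords = partition_cubes(vol, 4)   # 2*2*2 = 8 cubes, N_l = 8
pe = position_code(coords)
print(cubes.shape, pe.shape)
```

Concatenating `pe` to each cube's feature vector then yields the scale feature representation set F^(l) of step S23.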
- 4. A three-dimensional brain network dynamic segmentation method based on deep learning according to claim 3, wherein said S3 comprises the steps of: S31, constructing a predefined medical partition template to form a brain region template set R consisting of a plurality of brain region template areas; at each scale level l, performing spatial coincidence judgment between the spatial position of each cube in the fused coding representation set Z^(l) and the brain region template areas, and mapping the cubes to graph nodes in the brain region graph structure through a mapping function to form an initial node mapping relation; S32, at scale level l, generating an initial brain region graph node set V^(l) based on the node mapping relation, where each graph node represents the cube set of a brain region template area at the current scale, and for each graph node computing the mean of the coding representations of all cubes contained in that node to obtain its initial feature vector; S33, computing the anatomical similarity between each pair of graph nodes using the initial feature vectors of all graph nodes at scale level l, forming an anatomical similarity matrix; S34, extracting the white-matter fiber tract count between each pair of brain region template areas based on the normalized diffusion tensor imaging data, computing normalized connectivity based on the fiber distribution density inside and outside the areas, and organizing the functional connectivity results among all areas into a functional similarity matrix; S35, constructing the graph adjacency matrix at scale level l by linearly weighted fusion of the anatomical similarity matrix and the functional similarity matrix; S36, combining the brain region graph node set at scale level l with the graph adjacency matrix to form a same-scale brain region graph structure G^(l) at the current scale level.
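Steps S33-S35 combine two similarity matrices into one adjacency. A sketch, assuming cosine similarity for the anatomical term and row-normalized fiber counts for the functional term (both are plausible instantiations; the patent does not pin down the similarity functions, and `alpha` is an assumed hyperparameter):

```python
import numpy as np

def build_adjacency(node_feats, fiber_counts, alpha=0.5):
    """Fuse anatomical and functional similarity into one adjacency (cf. S33-S35).

    node_feats  : (N, d) initial node feature vectors (mean cube codes, S32)
    fiber_counts: (N, N) white-matter fiber-tract counts between regions (S34)
    alpha       : linear fusion weight (assumed hyperparameter, S35)
    """
    # anatomical similarity: cosine similarity of node features (S33)
    f = node_feats / (np.linalg.norm(node_feats, axis=1, keepdims=True) + 1e-8)
    anat = f @ f.T
    # functional similarity: row-normalized, symmetrized connectivity (S34)
    func = fiber_counts / (fiber_counts.sum(axis=1, keepdims=True) + 1e-8)
    func = 0.5 * (func + func.T)
    # linear weighted fusion (S35)
    return alpha * anat + (1 - alpha) * func

rng = np.random.default_rng(1)
A = build_adjacency(rng.normal(size=(5, 16)), np.abs(rng.normal(size=(5, 5))))
print(A.shape)  # (5, 5), symmetric
```

Symmetrizing the functional term keeps the fused adjacency symmetric, which the later graph convolution (claim 6) implicitly assumes.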
- 5. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 4, wherein said S4 comprises the steps of: S41, setting a cross-scale alignment threshold, selecting all scale levels with resolution lower than the threshold to form a low-resolution set, and all scale levels with resolution higher than or equal to the threshold to form a high-resolution set; S42, for the brain region graph structure at each low-resolution scale level, extracting the spatial center coordinates and the initial feature vector of each graph node; S43, at the high-resolution scale level, for all candidate graph nodes carrying the same medical template name as a given low-resolution graph node, computing the spatial Euclidean distance between each candidate and the low-resolution node, and selecting the candidate with the minimum Euclidean distance as the high-resolution alignment target node; S44, forming a matching pair from each low-resolution graph node and its nearest high-resolution alignment target node, yielding a cross-scale alignment mapping set; S45, for each matched pair of graph nodes in the cross-scale alignment mapping set, computing the feature similarity from the initial feature vectors of the two nodes; S46, combining all cross-scale matched graph node pairs with their corresponding alignment weights to form a cross-scale edge set; S47, merging the low-resolution brain region graph structure, the high-resolution brain region graph structure and the cross-scale edge set to obtain the final cross-scale node alignment fusion brain region graph structure.
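The matching in S42-S44 is nearest-centroid search restricted to same-name template regions. A minimal sketch (the region names and coordinates are illustrative test data, not from the patent):

```python
import numpy as np

def align_cross_scale(low_nodes, high_nodes):
    """Match each low-resolution node to its nearest same-name
    high-resolution node by Euclidean centroid distance (cf. S42-S44).

    Each node is a (template_name, centre_xyz) pair; returns a list of
    (low_idx, high_idx) matching pairs (the cross-scale alignment mapping set).
    """
    pairs = []
    for li, (name, c_lo) in enumerate(low_nodes):
        candidates = [(hi, np.linalg.norm(np.array(c_lo) - np.array(c_hi)))
                      for hi, (n, c_hi) in enumerate(high_nodes) if n == name]
        if candidates:
            pairs.append((li, min(candidates, key=lambda t: t[1])[0]))
    return pairs

low = [("hippocampus", (1.0, 0.0, 0.0)), ("thalamus", (5.0, 5.0, 5.0))]
high = [("hippocampus", (0.9, 0.1, 0.0)), ("hippocampus", (3.0, 3.0, 3.0)),
        ("thalamus", (5.2, 4.9, 5.1))]
print(align_cross_scale(low, high))  # [(0, 0), (1, 2)]
```

Each returned pair then gets a feature-similarity weight (S45) and becomes a cross-scale edge (S46).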
- 6. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 5, wherein said S5 comprises the steps of: S51, according to the mapping relation between graph nodes and cubes in the cross-scale node alignment fusion brain region graph structure, mapping the cube coding features at each scale level of the multi-scale sparse Transformer coding feature pyramid into the initial attribute features of the corresponding graph nodes; S52, splicing and integrating the initial attribute features of the graph nodes at all scale levels by node number to form an initial feature matrix of the cross-scale node set; S53, inputting the initial feature matrix into a graph neural network message passing module and, in combination with the adjacency matrix of the cross-scale node alignment fusion brain region graph structure, executing a first round of graph convolution to obtain a first-round node embedding representation H^(1); S54, inputting the first-round graph convolution representation into a multi-head graph attention module, integrating the outputs of all attention heads of each graph node by a splicing operation, and generating the first-round fused brain region graph node embedding feature h_i^(1).
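One round of the graph convolution in S53 can be sketched with the common symmetrically normalized operator; this is a standard GCN form assumed for illustration, not necessarily the patent's exact message-passing rule:

```python
import numpy as np

def gcn_round(H, A, W):
    """One round of graph-convolution message passing (cf. S53).

    Computes ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W) — the widely used
    symmetric normalization with self-loops.
    """
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv[:, None] * d_inv[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)       # H^(1)

rng = np.random.default_rng(2)
H0 = rng.normal(size=(4, 8))        # initial feature matrix of the node set (S52)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)  # toy chain-graph adjacency
H1 = gcn_round(H0, A, rng.normal(size=(8, 8)))
print(H1.shape)  # (4, 8)
```

The resulting H^(1) would then feed the multi-head graph attention module of S54, whose per-head outputs are spliced per node.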
- 7. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 1, wherein said S6 comprises the steps of: S61, according to the mapping relation between graph nodes and original cubes, projecting and broadcasting the first-round fused brain region graph node embedding features back to the original cubes at each scale level of the multi-scale sparse Transformer coding feature pyramid, to obtain graph feedback residual information corresponding to each cube; S62, for each scale level l, jointly updating the cube features output by the previous Transformer coding layer with the corresponding residual vectors in the graph residual information R^(1), to form a new cube input representation; S63, inputting the new cube representation into the next layer of the multi-scale sparse Transformer encoder, executing the local-window dense self-attention mechanism and the cross-window sparse self-attention mechanism, extracting multi-scale features from the new input, and outputting an updated multi-scale sparse Transformer coding feature pyramid; S64, after each round of updating, repeating the construction of the same-scale brain region graph structure in S3, to obtain a new brain region graph structure; S65, repeating the cross-scale node alignment of S4 on the new brain region graph structure, to generate a new cross-scale node alignment fusion brain region graph structure; S66, performing graph neural network message passing and the multi-head graph attention mechanism on the new cross-scale node alignment fusion brain region graph structure to generate second-round fused brain region graph node embedding features, and feeding them back to the encoder again to update the residual; S67, repeating steps S61 to S66 to perform multi-round Transformer-GNN bidirectional interactive updating until the set number of iteration rounds or the residual convergence threshold is reached, finally obtaining a stable multi-scale sparse Transformer coding feature pyramid and the node embedding representation of the fused brain region graph.
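The feedback in S61-S62 amounts to gathering each cube's owning node embedding and adding it as a residual. A sketch, assuming node and cube features share one dimension d (the patent would need a projection if they differ):

```python
import numpy as np

def feed_back_residual(cube_feats, node_embed, cube_to_node):
    """Broadcast fused graph-node embeddings back onto their member cubes
    and add them as residual information (cf. S61-S62).

    cube_feats  : (N_cubes, d) features from the previous Transformer layer
    node_embed  : (N_nodes, d) first-round fused node embeddings h_i^(1)
    cube_to_node: (N_cubes,) index of the graph node owning each cube
    """
    residual = node_embed[cube_to_node]   # projection/broadcast (S61)
    return cube_feats + residual          # joint residual update (S62)

cube_feats = np.ones((6, 4))
node_embed = np.arange(8, dtype=float).reshape(2, 4)
new_in = feed_back_residual(cube_feats, node_embed, np.array([0, 0, 0, 1, 1, 1]))
print(new_in[0], new_in[3])  # [1. 2. 3. 4.] [5. 6. 7. 8.]
```

The updated cube representations then re-enter the next encoder layer (S63), closing one round of the Transformer-GNN loop.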
- 8. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 7, wherein said S7 comprises the steps of: S71, according to the stable multi-scale sparse Transformer coding feature pyramid, integrating information across the coded features of all scale levels with a skip-connection fusion strategy, performing channel splicing and convolutional fusion with the coding features of the current level, and generating a fused cross-scale voxel feature representation; S72, according to the mapping relation between graph nodes and cubes, projecting the fused brain region graph node embedding representations back to the corresponding voxel block areas via nearest-neighbor assignment, establishing a graph feature field in voxel space, and splicing the cross-scale voxel feature representation with the graph feature field along the channel dimension to form the final fused voxel classification features; S73, inputting the final fused voxel-level segmentation features into a segmentation head network composed of convolution layers, performing a group of 1×1 convolution operations in the segmentation head network, outputting voxel-level multi-class probability predictions through a softmax function, and generating voxel-level segmentation probability volume data.
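The segmentation head in S73 reduces, per voxel, to a linear map over channels followed by a softmax over classes, since a 1×1 convolution touches no spatial neighbors. A sketch of that reduction (the shapes are illustrative):

```python
import numpy as np

def segmentation_head(feats, W):
    """1x1 convolution + softmax over classes (cf. S73).

    A 1x1 convolution on a (C, X, Y, Z) feature volume is a per-voxel
    linear map, implemented here with einsum; W has shape (num_classes, C).
    """
    logits = np.einsum('kc,cxyz->kxyz', W, feats)
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=0, keepdims=True)       # segmentation probabilities

rng = np.random.default_rng(3)
prob = segmentation_head(rng.normal(size=(16, 4, 4, 4)),
                         rng.normal(size=(5, 16)))
print(prob.shape)  # (5, 4, 4, 4); each voxel's 5 class probabilities sum to 1
```

The output is the voxel-level segmentation probability volume data that claim 9's post-processing consumes.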
- 9. The method for dynamic segmentation of a three-dimensional brain network based on deep learning according to claim 8, wherein S8 specifically comprises: in the training stage, introducing an adjacent-time-frame probability consistency constraint and deformation field smoothing regularization on the voxel-level segmentation probability volume data of consecutive time frames; in the inference stage, applying recursive Bayesian filtering and B-spline-based deformation compensation to the voxel-level segmentation probability volume data; and performing conditional random field refinement, void filling based on time-of-flight mapping, and confidence assessment on the temporally smoothly evolving voxel-level segmentation probability volume data, to obtain the three-dimensional brain region dynamic segmentation result.
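The recursive Bayesian filtering mentioned for the inference stage can be illustrated as a normalized element-wise product of the carried-over belief and the per-frame probabilities, which damps frame-to-frame flicker; this is a generic sketch of the idea, not the patent's exact filter (which also involves deformation compensation):

```python
import numpy as np

def bayes_filter_step(prior, observed, eps=1e-8):
    """One step of recursive Bayesian filtering over time frames (cf. claim 9).

    prior   : (K, ...) class probabilities carried from frame t-1
    observed: (K, ...) per-frame segmentation probabilities at frame t
    Returns the normalized posterior, smoothing the class assignment in time.
    """
    post = prior * observed
    return post / (post.sum(axis=0, keepdims=True) + eps)

# a single voxel, two classes, three consecutive frames (toy data)
frames = [np.array([[0.6], [0.4]]),
          np.array([[0.7], [0.3]]),
          np.array([[0.2], [0.8]])]
belief = frames[0]
for obs in frames[1:]:
    belief = bayes_filter_step(belief, obs)
print(belief.ravel())
```

Note how the final frame's class flip (0.2 vs 0.8) is tempered by the accumulated evidence of the earlier frames instead of being adopted outright.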
Description
Three-dimensional brain network dynamic segmentation method based on deep learning

Technical Field

The invention relates to the technical field of image segmentation, in particular to a three-dimensional brain network dynamic segmentation method based on deep learning.

Background

With the rapid development of neuroimaging technology, multi-modal brain imaging means such as functional magnetic resonance imaging, structural magnetic resonance imaging and diffusion tensor imaging are widely applied in human brain cognitive mechanism research and early diagnosis of brain diseases. In recent years, researchers have paid increasing attention to the dynamic evolution relationship between brain functional connectivity and brain region atlases, and three-dimensional brain network segmentation, as a basic link in brain image analysis, has gradually developed from traditional static brain region segmentation to brain network segmentation tasks with temporal dynamic characteristics.

The current mainstream three-dimensional brain network segmentation methods mainly perform voxel-level semantic segmentation on single-modality image data based on convolutional neural networks. Although certain progress has been made, several key problems remain. On one hand, traditional convolutional neural network models can hardly capture the strongly non-Euclidean spatial relationships in brain image data, and in particular lack global modeling capability when processing cross-scale unstructured brain region connection relationships. On the other hand, multi-modal data differ markedly in spatial resolution, image registration and signal-to-noise ratio; traditional fusion strategies often simply adopt concatenation or averaging, cannot effectively extract deep cooperative features among modalities, and limit further improvement of fusion segmentation precision.
In addition, although some existing brain region segmentation methods introduce a graph neural network to model the topological structure of brain regions, the graph structure is generally constructed statically, cannot be dynamically adjusted for different scales or time periods, and can hardly support the cross-scale evolution and information feedback of brain region connection maps in real neural activity. In particular, when a complex brain network is processed, an effective coupling mechanism is lacking between graph modeling and the segmentation model, so that the segmentation result suffers in boundary consistency, local detail and dynamic consistency.

Moreover, existing brain parcellation methods generally neglect consistency modeling across time frames and cannot keep brain region identification consistent and stable over consecutive time points, which is particularly critical for capturing the evolution of brain region functional states. Existing methods generally process single-frame data in the training and inference stages and lack modeling capability in the time dimension, causing delay and ambiguity in the detection of dynamic brain region activity, and can hardly meet the actual requirements of neuroscience and clinical diagnosis for high spatio-temporal resolution segmentation technology.

In summary, the existing three-dimensional brain network segmentation methods still have great technical deficiencies in multi-modal data fusion precision, spatial structure expression capability, cross-scale structural consistency modeling and dynamic segmentation continuity. A novel segmentation strategy is needed that fully fuses multi-modal features, joint spatial topological structure and temporal dynamic information, to realize more accurate and stable three-dimensional brain region dynamic segmentation with temporal evolution characteristics.
Disclosure of Invention

The invention aims to provide a three-dimensional brain network dynamic segmentation method based on deep learning, which realizes the collaborative extraction of local fine-grained and global coarse-grained features, effectively suppresses redundant information, and enhances the modeling of spatial structure and functional connectivity features in a cross-modal fused image. According to an embodiment of the invention, the three-dimensional brain network dynamic segmentation method based on deep learning comprises the following steps: S1, constructing fused image volume data with consistent time and consistent space; S2, segmenting the fused image volume data into cubes of different scales according to a multi-level cube partitioning strategy, performing position coding on the cubes at each scale, inputting them into a multi-scale sparse Transformer encoder, and generating a multi-scale sparse Transformer coding feature pyramid; S3, establishing an initial brain region graph node set at each scale of the multi-scale sparse Transformer coding feature p