CN-121527083-B - Pathological image diagnosis method based on GDKAN and multi-order context interaction gating
Abstract
The invention provides a pathological image diagnosis method based on GDKAN and multi-order context interaction gating, belonging to the field at the intersection of medical image processing and artificial intelligence. The method comprises the following steps: step 1, collecting whole-slide pathological images (WSIs); step 2, constructing an adaptive grouping dynamic Kolmogorov-Arnold network; step 3, constructing an adaptive grouping GDKansformer encoding module; step 4, establishing a multi-order context interaction gating mechanism; step 5, training to obtain a GDKansformer model; and step 6, outputting a pathological image lesion recognition result. The invention achieves automatic diagnosis of pathological images through a full pipeline of preprocessing, dynamic feature optimization, multi-order context interaction gated fusion and classification optimization.
Inventors
- GONG XUN
- LU XIAOYAN
Assignees
- Southwest Jiaotong University (西南交通大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260114
Claims (4)
- 1. A pathological image diagnosis method based on GDKAN and multi-order context interaction gating, characterized by comprising the following steps:
  Step 1: collecting whole-slide pathological images (WSIs); sequentially performing color normalization, denoising and tissue region segmentation on the WSIs; cutting the processed images into image patches of a preset size with a fixed-stride sliding window at a preset magnification; and applying linear projection and convolutional downsampling by a preset factor to the patches to generate patch embeddings, thereby obtaining sequence features;
  Step 2: constructing an adaptive grouping dynamic Kolmogorov-Arnold network, in which a grouping strategy network adaptively determines the number of groups and the channel division based on statistics of the sequence features, the pixel-level or channel-level information entropy of each group feature is calculated and a density coefficient is obtained from it, and the number of basis spline functions is dynamically allocated to each group within preset upper and lower bounds on the spline count;
  consider an input tensor $X \in \mathbb{R}^{B \times N \times C}$, where $B$ is the batch size, $N$ is the number of tokens and $C$ is the feature dimension; the grouping strategy adaptively determines the number of groups $G$ and the channels of each group, and the feature dimension $C$ is divided equally into $G$ groups: $X = [X_1, X_2, \dots, X_G]$, where $X_g$ is the $g$-th sub-feature block; the blocks are then spliced along the token dimension and flattened along the batch dimension: $\tilde{X} = \mathrm{Flatten}(\mathrm{Concat}_{\mathrm{token}}(X_1, \dots, X_G))$, where $\mathrm{Concat}_{\mathrm{token}}$ denotes splicing $X_1, \dots, X_G$ along the token dimension to obtain the new feature; grouping, splicing and flattening yield the input features of the dynamic KAN;
  Step 3: constructing an adaptive grouping GDKansformer encoding module, which takes a Transformer encoder module as its backbone and uses the output features of the adaptive grouping dynamic Kolmogorov-Arnold network as a nonlinear mapping unit, in place of the conventional linear mapping, to generate the self-attention query matrix, key matrix and value matrix; computing scaled dot-product attention from the query matrix and the key matrix; splitting the value matrix by attention head; performing nonlinear interpolation on each head value matrix through the grouping dynamic Kolmogorov-Arnold network; and splicing each interpolated head value matrix with its corresponding attention result to obtain the adaptive grouping GDKansformer encoding output features;
  Step 4: establishing a multi-order context interaction gating mechanism, performing a spatial feature enhancement operation on the GDKansformer encoding output features, suppressing redundant channel features through convolution and batch normalization, and combining global average pooling with GELU activation to obtain enhanced features;
  Step 5: performing global average pooling on the output features of the multi-order context interaction gating mechanism, feeding the pooled result into a fully connected layer and obtaining the pathological image class prediction probabilities through Softmax; in the training stage, constructing a total loss function from cross-entropy loss combined with L2 weight decay regularization, and optimizing all learnable parameters of the model end to end to obtain the GDKansformer model;
  Step 6: inputting the WSI to be diagnosed into the GDKansformer model and outputting the pathological image lesion recognition result;
  wherein cutting the processed images into image patches of a preset size with a fixed-stride sliding window specifically comprises cutting only the tissue regions obtained by tissue region segmentation and removing the background regions of the WSI, so that the patches of preset size contain tissue only;
  in step 2, calculating the pixel-level or channel-level information entropy of each group feature specifically comprises applying a softmax operation to each group feature determined by the grouping strategy to obtain the activation intensity distribution of that group feature;
  in step 2, obtaining the density coefficient based on the information entropy specifically comprises calculating the maximum and the minimum of the information entropy over all group features;
  combining the linear transformation of the operation result with the linear transformation of the GELU-activated group feature to obtain the grouping dynamic Kolmogorov-Arnold network output features specifically comprises applying a linear transformation to the result of the dynamic Kolmogorov-Arnold network operation to obtain a first linear output, and applying the GELU function to the original group feature followed by a linear transformation to obtain a second linear output;
  in step 4, performing the spatial feature enhancement operation on the encoding output features of step 3 specifically comprises applying batch normalization to the GDKansformer encoding output features and then compressing the channel dimension through convolution to obtain compressed features;
  in step 4, aggregating the pixel-level non-local context and the adaptive local context specifically comprises obtaining the adaptive local context by processing the enhanced features through depthwise separable convolution and group convolution, obtaining the pixel-level non-local context by processing the enhanced features through global average pooling and convolution, and computing a weighted sum of the two context features based on gating weights to obtain the aggregated feature.
- 2. The pathological image diagnosis method based on GDKAN and multi-order context interaction gating according to claim 1, wherein in step 3, performing nonlinear interpolation on each head value matrix through the adaptive grouping dynamic Kolmogorov-Arnold network specifically comprises inputting each head value matrix into the grouping dynamic Kolmogorov-Arnold network constructed in step 2 and performing nonlinear fitting on the elements of the head value matrix with the basis spline functions dynamically allocated in the network; the fitted head value matrix is the interpolated head value matrix.
- 3. The pathological image diagnosis method based on GDKAN and multi-order context interaction gating according to claim 1, wherein in step 5, constructing the total loss function in combination with L2 weight decay regularization specifically comprises gathering all learnable parameters of the model into a parameter set, computing and summing the squares of all parameters in the set, and multiplying the sum by a preset regularization coefficient to obtain the L2 weight decay regularization term; the total loss function is the sum of the cross-entropy loss and the L2 weight decay regularization term.
- 4. The pathological image diagnosis method based on GDKAN and multi-order context interaction gating according to claim 1, wherein the whole-slide pathological image (WSI) to be diagnosed in step 6 is a WSI of at least one of lung cancer, breast cancer and colorectal cancer, and the output lesion recognition result includes whether a tumor lesion is present and the corresponding tumor pathological subtype.
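Step 2 of claim 1 derives a per-group spline count from the information entropy of softmax-normalized group features. The following is a minimal sketch of that idea in plain Python, not the patented implementation. The min-max normalization used for the density coefficient, the linear mapping with rounding onto the spline bounds, the convention that higher entropy receives more splines, and the names `allocate_splines`, `k_min`, `k_max` and `eps` are all assumptions made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of activations."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def allocate_splines(groups, k_min=3, k_max=8, eps=1e-8):
    """Map each group's entropy to a spline count in [k_min, k_max].

    Assumption: the density coefficient is the min-max normalized entropy,
    and the spline count grows linearly with it.
    """
    entropies = [entropy(softmax(g)) for g in groups]
    h_min, h_max = min(entropies), max(entropies)
    counts = []
    for h in entropies:
        density = (h - h_min) / (h_max - h_min + eps)  # value in [0, 1)
        counts.append(k_min + round(density * (k_max - k_min)))
    return counts


# Toy group features (hypothetical values, one flat list per group).
groups = [
    [0.0, 0.0, 0.0, 0.0],   # uniform activations: highest entropy
    [10.0, 0.0, 0.0, 0.0],  # one dominant activation: lowest entropy
    [2.0, 1.0, 0.0, 0.0],   # in between
]
spline_counts = allocate_splines(groups)
print(spline_counts)
```

Under these assumptions the uniform group receives the upper bound, the peaked group receives the lower bound, and the remaining group falls between them.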
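The grouping, splicing and flattening described in step 2 of claim 1 can be illustrated on nested lists. This is only a sketch under stated assumptions: the claim does not fix the final layout, so the code assumes the feature dimension is split into equal contiguous channel blocks, that the blocks are concatenated along the token axis per sample, and that "flattening in the batch dimension" merges the batch and token axes into one. The function name `group_splice_flatten` and the toy tensor are hypothetical.

```python
def group_splice_flatten(x, num_groups):
    """Split channels into equal groups, splice along tokens, merge batch.

    x is a nested list indexed as x[batch][token][channel].
    Returns a list of rows, each of width channels // num_groups.
    """
    channels = len(x[0][0])
    if channels % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    width = channels // num_groups

    rows = []
    for sample in x:                      # batch axis
        for g in range(num_groups):       # one channel block per group
            start = g * width
            for token in sample:          # tokens of this block, in order
                rows.append(token[start:start + width])
    return rows


# Toy tensor with 2 samples, 3 tokens and 4 channels; each value encodes
# its own position as 100 * batch + 10 * token + channel.
x = [[[100 * b + 10 * n + c for c in range(4)] for n in range(3)]
     for b in range(2)]
rows = group_splice_flatten(x, num_groups=2)
print(len(rows), rows[0], rows[3], rows[6])
```

With 2 groups, each sample contributes 2 × 3 rows of width 2, so the result has 12 rows. A tensor library would express the same thing with a channel split, a concatenation and a reshape.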
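Claim 3 defines the total loss as the cross-entropy loss plus a preset coefficient times the sum of squared learnable parameters. A minimal sketch of that arithmetic follows. The function names, the toy probabilities and parameter values, and the coefficient 0.01 are assumptions chosen for illustration, and the loss is shown for a single sample only.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one sample given Softmax probabilities."""
    return -math.log(probs[label])


def l2_penalty(parameter_groups, coeff):
    """coeff times the sum of squares over every learnable parameter."""
    return coeff * sum(w * w for group in parameter_groups for w in group)


def total_loss(probs, label, parameter_groups, coeff):
    """Cross-entropy plus the L2 weight decay regularization term."""
    return cross_entropy(probs, label) + l2_penalty(parameter_groups, coeff)


# Hypothetical values: three classes, true class 0, three parameters.
probs = [0.7, 0.2, 0.1]
params = [[1.0, -2.0], [0.5]]
loss = total_loss(probs, label=0, parameter_groups=params, coeff=0.01)
print(round(loss, 6))
```

Here the penalty is 0.01 × (1 + 4 + 0.25) = 0.0525, which is added to the cross-entropy term of −ln 0.7.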
Description
Pathological image diagnosis method based on GDKAN and multi-order context interaction gating

Technical Field

The invention relates to the field at the intersection of medical image processing and artificial intelligence, and in particular to a pathological image diagnosis method based on GDKAN (grouping dynamic Kolmogorov-Arnold network) and multi-order context interaction gating.

Background

Pathological image diagnosis is the "gold standard" for the definitive diagnosis of clinical diseases, especially tumors. Traditional diagnosis, however, relies on pathologists manually examining whole-slide pathological images (WSIs), and has the following core pain points:

Difficulty of processing WSI data: WSIs have extremely high resolution and a very large pixel scale, and the images suffer from uneven staining and noise interference. Meanwhile, pathological tissue shows small distribution differences and high heterogeneity (for example, normal cells and diseased cells mixed within the same slide), so traditional image processing methods struggle to extract effective features efficiently.

Limitations of existing deep learning models: the conventional Transformer model relies on linear mappings to generate the self-attention (Q, K, V) weights, has insufficient capacity to represent the complex nonlinear features of pathological images, and has a large parameter count and high computational complexity, which makes it hard to adapt to large-scale WSI processing. The classical Kolmogorov-Arnold network (KAN) is prone to overfitting on pathological images because its structure cannot be adjusted dynamically according to feature complexity, and a single-view attention mechanism struggles to fuse "local lesion details" with "global tissue associations".

Insufficient diagnostic reliability and interpretability: existing models attend only weakly to key lesion regions, are easily disturbed by redundant information, and have opaque decision processes, so they struggle to meet the clinical requirements for traceability and interpretability.

In summary, there is a need in the art for an automated pathological image diagnosis solution that balances feature representation capability, computational efficiency and diagnostic interpretability.

Disclosure of Invention

The invention provides a pathological image diagnosis method based on GDKAN and multi-order context interaction gating, which achieves automatic diagnosis of pathological images through a full pipeline of preprocessing, dynamic feature optimization, multi-order context interaction gated fusion and classification optimization.

To achieve the above purpose, the invention adopts the following technical scheme:

A pathological image diagnosis method based on GDKAN and multi-order context interaction gating comprises the following steps:

Step 1: collecting WSIs; sequentially performing color normalization, denoising and tissue region segmentation on the WSIs; cutting the processed images into image patches of a preset size with a fixed-stride sliding window at a preset magnification; and applying linear projection and convolutional downsampling by a preset factor to the patches to generate patch embeddings, thereby obtaining sequence features;

Step 2: constructing an adaptive grouping dynamic Kolmogorov-Arnold network, in which a grouping strategy network adaptively determines the number of groups and the channel division based on statistics of the sequence features, the pixel-level or channel-level information entropy of each group feature is calculated and a density coefficient is obtained from it, and the number of basis spline functions is dynamically allocated to each group within preset upper and lower bounds on the spline count;

Step 3: constructing an adaptive grouping GDKansformer encoding module, which takes a Transformer encoder block as its backbone and uses the output features of the grouping dynamic Kolmogorov-Arnold network as a nonlinear mapping unit, in place of the conventional linear mapping, to generate the self-attention query matrix, key matrix and value matrix;

Step 4: establishing a multi-order context interaction gating mechanism, performing a spatial feature enhancement operation on the GDKansformer encoding output features, suppressing redundant channel features through convolution and batch normalization, and combining global average pooling with GELU activation to obtain enhanced features;

Step 5: performing global average pooling on the output features of the multi-order context interaction gating mechanism, feeding the pooled result into a fully connected layer and obtaining the pathological image class prediction probabilities through Softmax; the training stage is based on cross-entropy loss and combines L2 w