CN-122025110-A - ATC tissue pathology molecular typing method and computing equipment based on knowledge distillation


Abstract

The invention relates to a knowledge-distillation-based method and computing device for histopathological molecular typing of anaplastic thyroid carcinoma (ATC). To reproduce the representation capability of a large vision foundation model under routine clinical computing power, a teacher-student distillation framework is constructed in which a foundation model with frozen parameters guides a lightweight model to capture complex pathological morphological features, which addresses the difficulty of deploying a high-precision model locally. Furthermore, in view of the strong nonlinear coupling between macroscopic morphology and microscopic molecular features, the scheme introduces a deep fusion mechanism based on cooperative gating and bilinear interaction, overcoming the inability of traditional linear fusion to capture co-occurrence enhancement effects between modalities. The mechanism dynamically mines high-order interaction features, so that high-precision, highly interpretable molecular subtype prediction is achieved without relying on continued, expensive omics testing.

Inventors

GE MINGHUA
YAN HUIHUI
HUANG PING
TAN ZHUO
PAN ZONGFU

Assignees

Zhejiang Provincial People's Hospital (浙江省人民医院)

Dates

Publication Date: 20260512
Application Date: 20260413

Claims (8)

1. An ATC histopathological molecular typing method based on knowledge distillation, comprising: S1, performing standardized preprocessing on an acquired pathological whole-slide image of an anaplastic thyroid carcinoma (ATC) patient and performing feature screening on acquired proteomic data to obtain preprocessed image blocks and a protein feature vector; S2, invoking a pre-trained vision foundation model as a teacher model, and performing global feature extraction on the preprocessed image blocks with the teacher model to obtain global features of the teacher; S3, transferring the representation capability of the global features of the teacher to a lightweight student model by calculating a feature distillation loss and an output distillation loss, and performing morphological feature extraction on the preprocessed image blocks with the lightweight student model to obtain a pathological form characterization vector; S4, performing attention-based multi-omics feature fusion on the pathological form characterization vector and the protein feature vector to obtain a fused patient-level feature vector; S5, inputting the fused patient-level feature vector into a pre-trained classifier to obtain a molecular typing prediction result.
2. The ATC histopathological molecular typing method based on knowledge distillation according to claim 1, wherein step S1 comprises: performing background removal and sliding-window tiling on the pathological whole-slide image to obtain an original pathological image block set, and performing color normalization on the original pathological image block set to obtain the preprocessed image blocks; performing missing-value imputation and normalization on the proteomic data to obtain a standardized proteomics matrix; and performing sparse feature selection and dimensionality reduction on the standardized proteomics matrix to obtain the protein feature vector.
3. The ATC histopathological molecular typing method based on knowledge distillation according to claim 1, wherein step S2 comprises: loading a vision foundation model with frozen parameters as the teacher model, and performing forward propagation and deep feature mapping on the preprocessed image blocks with an image encoder of the teacher model to obtain a teacher model image block feature set; and performing average-pooling aggregation on the teacher model image block feature set into slide-level global features to obtain the global features of the teacher.
4. The ATC histopathological molecular typing method based on knowledge distillation according to claim 1, wherein step S3 comprises: performing feature encoding on the preprocessed image blocks with the lightweight student model to obtain an original feature vector of the student model; calculating the output distillation loss and the feature distillation loss between the original feature vector of the student model and the global features of the teacher, and updating parameters of the lightweight student model by back-propagation to obtain a trained student model; and performing inference on the preprocessed image blocks with the trained student model to obtain the pathological form characterization vector containing deep morphological information.
5. The ATC histopathological molecular typing method based on knowledge distillation according to claim 1, wherein step S4 comprises: based on a pre-constructed modal feature projector, performing multi-modal feature projection, alignment, and unification on the pathological form characterization vector and the protein feature vector to obtain an aligned pathology embedding vector and an aligned protein embedding vector; calculating a modal attention weight vector between the aligned pathology embedding vector and the aligned protein embedding vector; and, based on the modal attention weight vector, performing cross-modal feature fusion on the aligned pathology embedding vector and the aligned protein embedding vector to obtain the fused patient-level feature vector.
6. The ATC histopathological molecular typing method based on knowledge distillation according to claim 1, wherein step S5 comprises: based on a fully connected classification layer, performing inference of the molecular subtype probability distribution on the fused patient-level feature vector to obtain raw logits; performing probability normalization on the raw logits to obtain a subtype probability distribution vector; and performing maximum-index retrieval (argmax) and label mapping on the subtype probability distribution vector to obtain the molecular typing prediction result.
7. The ATC histopathological molecular typing method based on knowledge distillation according to claim 5, wherein performing cross-modal feature fusion on the aligned pathology embedding vector and the aligned protein embedding vector based on the modal attention weight vector to obtain the fused patient-level feature vector comprises: extracting cross-modal bilinear interaction features of the aligned pathology embedding vector and the aligned protein embedding vector to obtain a bilinear interaction feature vector; determining a cooperative gating vector based on the bilinear interaction feature vector; and, based on the modal attention weight vector and the cooperative gating vector, performing cooperative modulation and residual fusion on the aligned pathology embedding vector and the aligned protein embedding vector to obtain the fused patient-level feature vector.
8. An ATC histopathological molecular typing computing device based on knowledge distillation, comprising: an image standardization processing module, configured to execute S1, performing standardized preprocessing on an acquired pathological whole-slide image of an anaplastic thyroid carcinoma (ATC) patient and performing feature screening on acquired proteomic data to obtain preprocessed image blocks and a protein feature vector; a global feature extraction module, configured to execute S2, invoking a pre-trained vision foundation model as a teacher model, and performing global feature extraction on the preprocessed image blocks with the teacher model to obtain global features of the teacher; a morphological feature extraction module, configured to execute S3, transferring the representation capability of the global features of the teacher to a lightweight student model by calculating a feature distillation loss and an output distillation loss, and performing morphological feature extraction on the preprocessed image blocks with the lightweight student model to obtain a pathological form characterization vector; a multi-omics feature fusion module, configured to execute S4, performing attention-based multi-omics feature fusion on the pathological form characterization vector and the protein feature vector to obtain a fused patient-level feature vector; and a molecular typing prediction module, configured to execute S5, inputting the fused patient-level feature vector into a pre-trained classifier to obtain a molecular typing prediction result.
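
The distillation and classification steps recited in claims 3, 4, and 6 can be illustrated with a minimal sketch. The claims name a feature distillation loss and an output distillation loss but do not give their formulas, so the forms below are assumptions: mean squared error for the feature term, and a temperature-softened KL divergence for the output term. The sketch also assumes that student and teacher features already share one dimension and that teacher logits are available. The function names, the weighting factor `alpha`, and the subtype labels are hypothetical.

```python
import math

def mean_pool(patch_features):
    """Average patch-level feature vectors into one slide-level vector."""
    count = len(patch_features)
    return [sum(column) / count for column in zip(*patch_features)]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def feature_distillation_loss(student_feat, teacher_feat):
    """Assumed form: mean squared error between the two feature vectors."""
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(teacher_feat)

def output_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed form: KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl

def total_distillation_loss(student_feat, teacher_feat,
                            student_logits, teacher_logits, alpha=0.5):
    """Weighted sum of both terms; alpha is a hypothetical hyperparameter."""
    feat = feature_distillation_loss(student_feat, teacher_feat)
    out = output_distillation_loss(student_logits, teacher_logits)
    return alpha * feat + (1.0 - alpha) * out

def predict_subtype(logits, labels):
    """Softmax normalization, argmax retrieval, and label mapping."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Toy values for illustration only; they are not data from the application.
teacher_patches = [[0.2, 0.8, 0.4], [0.6, 0.4, 0.0]]
teacher_global = mean_pool(teacher_patches)
student_feat = [0.3, 0.5, 0.2]
loss = total_distillation_loss(student_feat, teacher_global,
                               [1.0, 0.2], [1.5, -0.1])
label, probs = predict_subtype([0.1, 2.0, -1.0],
                               ["subtype_A", "subtype_B", "subtype_C"])
```

In a real training loop the loss would be minimized by back-propagation through the student only, since the teacher's parameters are frozen; the sketch computes the loss value and leaves the optimizer out.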

Description

ATC tissue pathology molecular typing method and computing equipment based on knowledge distillation

Technical Field

The application relates to the technical field of medical artificial intelligence, and in particular to an ATC histopathological molecular typing method and computing device based on knowledge distillation.

Background

Anaplastic thyroid carcinoma (ATC) is a highly malignant subtype of thyroid carcinoma with a poor prognosis, and its clinical diagnosis and treatment depend heavily on accurate molecular typing to guide the selection of targeted and immunotherapeutic regimens. However, current gold-standard typing methods typically rely on high-throughput proteomic or genomic assays, which are costly, time-consuming, and demanding in terms of sample quality, and which struggle to meet the urgent clinical need for immediate diagnosis and rapid decision-making. Pathological whole-slide images, as routine clinical diagnostic data, are quick and inexpensive to acquire and contain abundant phenotypic information reflecting the tumor microenvironment. Nevertheless, because the histopathological morphology of ATC is extremely complex and heterogeneous, morphological representations that correlate strongly with deep molecular features are difficult to extract from such images by visual inspection or traditional image analysis alone. In recent years, deep-learning-based vision foundation models have shown strong generalization in feature extraction, but their large parameter counts and computational cost make them difficult to deploy directly on conventional medical computing terminals, which limits their adoption in real clinical settings.
Existing multi-modal auxiliary diagnostic schemes attempt to combine pathology images with molecular data, but they have significant limitations in feature processing and fusion mechanisms. To adapt to clinical hardware environments, some schemes sacrifice model performance and adopt a lightweight network, so that the extracted pathological features cannot fully represent complex tumor morphology. At the multi-modal fusion level, the prior art mostly generates patient-level features through simple adaptive weighting or linear superposition. This approach implies the mathematical assumption that pathological morphological features and microscopic molecular features contribute independently in the feature space, so that a weighted sum suffices to aggregate the information. However, in the complex pathological diagnosis scenario of ATC, there are often strong nonlinear couplings and conditional dependencies between macroscopic tissue morphology and microscopic protein expression. For example, high expression of certain cancer stem cell markers may indicate a very poor prognosis only when accompanied by a particular arrangement pattern of tumor cell clusters or particular necrotic-region characteristics. Simple linear fusion cannot capture the synergistic or mutually exclusive effects by which one modality activates or suppresses the response of the other, so high-order interaction information is lost; when the model faces difficult subtypes near the classification boundary, it struggles to mine deeply discriminative features, which limits further improvement in classification accuracy. Thus, an optimized ATC histopathological molecular typing scheme based on knowledge distillation is desired.
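
The limitation of linear fusion described above can be made concrete with a small sketch that contrasts a fixed weighted sum with the kind of bilinear interaction, cooperative gating, and residual fusion named in claim 7. The claims do not specify the exact formulas, so everything below is an assumption for illustration: the interaction is a low-rank bilinear form (an element-wise product of two linear projections), the gate is a sigmoid of a linear map of the interaction, and fixed weights stand in for the modal attention weights. The names `U`, `V`, `W_gate`, and `mixed_difference` are hypothetical.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear_fusion(h_path, h_prot, w_path=0.5, w_prot=0.5):
    """Baseline: weighted sum of the aligned pathology and protein embeddings."""
    return [w_path * a + w_prot * b for a, b in zip(h_path, h_prot)]

def gated_bilinear_fusion(h_path, h_prot, U, V, W_gate):
    """Assumed form: linear part plus a gated low-rank bilinear interaction."""
    # Bilinear interaction: nonzero only where both modalities respond.
    interaction = [a * b for a, b in zip(matvec(U, h_path), matvec(V, h_prot))]
    # Cooperative gate in (0, 1), derived from the interaction features.
    gate = [sigmoid(z) for z in matvec(W_gate, interaction)]
    base = linear_fusion(h_path, h_prot)
    # Residual fusion: keep the linear part and add the gated interaction.
    return [b + g * z for b, g, z in zip(base, gate, interaction)]

def mixed_difference(fuse, p1, p2, m1, m2):
    """Zero for any purely additive fusion; nonzero when modalities interact."""
    a, b, c, d = fuse(p1, m1), fuse(p1, m2), fuse(p2, m1), fuse(p2, m2)
    return [w - x - y + z for w, x, y, z in zip(a, b, c, d)]

# Toy 2-dimensional embeddings and identity projections, for illustration only.
identity = [[1.0, 0.0], [0.0, 1.0]]
p_on, p_off = [1.0, 0.0], [0.0, 0.0]   # morphology pattern present / absent
m_on, m_off = [1.0, 0.0], [0.0, 0.0]   # protein marker high / low

def gated(h_path, h_prot):
    return gated_bilinear_fusion(h_path, h_prot, identity, identity, identity)

linear_gap = mixed_difference(linear_fusion, p_on, p_off, m_on, m_off)
gated_gap = mixed_difference(gated, p_on, p_off, m_on, m_off)
```

Here `linear_gap` is zero in every component, because a weighted sum treats the two modalities as independent contributions, while `gated_gap` has a nonzero first component: the fused response to the protein marker changes depending on whether the morphological pattern is present, which is the conditional dependency the paragraph describes.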
Disclosure of Invention

The present application has been made to solve the above technical problems. Embodiments of the application provide an ATC histopathological molecular typing method based on knowledge distillation and a computing device. According to one aspect of the present application, there is provided an ATC histopathological molecular typing method based on knowledge distillation, comprising: S1, performing standardized preprocessing on an acquired pathological whole-slide image of an anaplastic thyroid carcinoma (ATC) patient and performing feature screening on acquired proteomic data to obtain preprocessed image blocks and a protein feature vector; S2, invoking a pre-trained vision foundation model as a teacher model, and performing global feature extraction on the preprocessed image blocks with the teacher model to obtain global features of the teacher; S3, transferring the representation capability of the global features of the teacher to a lightweight student model by calculating a feature distillation loss and an output distillation loss, and performing morphological feature extraction on the preprocessed image blocks with the lightweight student model to obtain a pathological form ch