CN-121998908-A - Pathological prediction method based on pathological image-text large model

Abstract

The invention provides a pathology prediction method based on a pathology image-text large model. The method comprises: obtaining whole-slide images (WSIs) and corresponding diagnosis text data; reading multi-magnification WSI images and extracting image features through a vim4path model; using a GPT model to screen the information in the diagnosis text that can be represented by the images, and constructing a custom vocabulary library; inputting the image features and the screened text into the pathology image-text large model, completing training based on tokenizer word segmentation with the custom vocabulary library, and synchronously generating a heat map corresponding to each token; and taking the tokens that the pathology image-text large model outputs for a pathology image to be examined as a retrieval key, and matching the report with the highest similarity in a pathology report RAG retrieval library as the final output. The invention can complete training with only a small, limited number of WSI images, has a low GPU memory requirement, high efficiency, low resource consumption and strong interpretability, and addresses the strong data dependence and high GPU memory requirement of traditional pathology image-text models.

Inventors

ZHENG ZHONGXI
PAN YUNFENG
Zhang Chukang
ZHANG JINSHENG
HUANG JIANBO

Assignees

武汉现代病理工程研究院有限公司

Dates

Publication Date: 20260508
Application Date: 20251224

Claims (10)

1. A pathology prediction method based on a pathology image-text large model, characterized by comprising the following steps: acquiring a plurality of pathological section full-scan images and corresponding diagnosis text data; reading a first multi-magnification image of each pathological section full-scan image, and extracting a first image feature vector of the first multi-magnification image based on a vim4path model, wherein the vim4path model is a feature extraction network pre-trained on a pathological image data set; screening, based on a GPT model, the information in each piece of diagnosis text data that can be represented by an image, to obtain screened diagnosis text data; constructing a training set based on the first image feature vector corresponding to each pathological section full-scan image and the screened diagnosis text data, and training a pathology image-text large model based on the training set; reading a second multi-magnification image of a full-scan image of a pathological section to be examined, extracting a second image feature vector of the second multi-magnification image, inputting the second image feature vector into the trained pathology image-text large model, and outputting a predicted text sequence and a corresponding image region heat map; matching the predicted text sequence against each pathology report in a pathology report RAG retrieval library, and outputting the pathology report with the highest similarity to the predicted text sequence; and returning the pathology report corresponding to the full-scan image of the pathological section to be examined, together with the corresponding image region heat map.
2. The pathology prediction method based on a pathology image-text large model according to claim 1, wherein acquiring a plurality of pathological section full-scan images and corresponding diagnosis text data comprises: acquiring a pathological section full-scan image data set $I = \{I_1, \dots, I_M\}$ and a corresponding diagnosis text data set $T = \{T_1, \dots, T_M\}$, wherein $I_i$ represents the $i$-th pathological section full-scan image, $T_i$ represents the diagnosis text data corresponding to the $i$-th pathological section full-scan image, and $M \le 10000$; and wherein reading a first multi-magnification image of each pathological section full-scan image and extracting a first image feature vector of the first multi-magnification image based on a vim4path model comprises: reading the 10× magnification image $I_i^{10\times}$ of each pathological section full-scan image $I_i$, and extracting a first image feature vector $F_i$ of the 10× magnification image through the vim4path model, wherein $d$ is the feature dimension of $F_i$.
3. The pathology prediction method based on a pathology image-text large model according to claim 1, wherein screening, based on a GPT model, the information in each piece of diagnosis text data that can be represented by an image, to obtain screened diagnosis text data, comprises: constructing a screening function $f_{\mathrm{GPT}}$ based on the GPT model; screening each piece of diagnosis text data $T_i$ with the screening function to obtain screened diagnosis text data $T_i' = f_{\mathrm{GPT}}(T_i)$; and constructing a custom vocabulary library $V$ based on all the screened diagnosis text data $\{T_i'\}$.
4. The pathology prediction method based on a pathology image-text large model according to claim 1, characterized in that the constructed training set is represented as $D = \{(F_i, T_i')\}_{i=1}^{M}$, wherein $F_i$ represents the image feature vector of the $i$-th pathological section full-scan image, $T_i'$ is the $i$-th piece of screened diagnosis text data, and $M$ represents the number of training samples in the training set; and training the pathology image-text large model based on the training set comprises: inputting $D$ into the pathology image-text large model, and segmenting the text $T_i'$ with a tokenizer tool to obtain segmented token sequences $S = \{S_i\}$, wherein $S_i = (s_{i,1}, \dots, s_{i,L})$ represents the $i$-th token sequence, $s_{i,t}$ represents the $t$-th token of $S_i$, and $L$ is the sequence length; outputting, for each token $s_{i,t}$ of $S_i$, a prediction probability $p_{i,t}$ and a corresponding image region heat map $H_{i,t}$; calculating a loss function value based on the prediction probability $p_{i,t}$ and the corresponding image region heat map $H_{i,t}$ of each token $s_{i,t}$ of $S_i$; and adjusting the model parameters of the pathology image-text large model based on the loss function value until the loss function value is minimized, to obtain the trained pathology image-text large model.
5. The pathology prediction method based on a pathology image-text large model according to claim 4, wherein the loss function of the pathology image-text large model is a weighted sum of a masked cross-entropy loss and a heat map constraint term, expressed as: $\mathcal{L} = -\frac{1}{N}\sum_{t=1}^{L} m_t \log p_t + \lambda\,\mathcal{L}_{H}$, wherein $p_t$ represents the probability that the pathology image-text large model predicts for the true label $y_t$ at the $t$-th time step; $m_t$ is the mask entry, with $m_t = 1$ at a valid token position and $m_t = 0$ at a padding position; $N$ is the total number of valid positions, satisfying $N = \sum_{t=1}^{L} m_t$; $\mathcal{L}_{H}$ is the heat map constraint term, computed from the image region heat map $H_t$ corresponding to the $t$-th token; and $\lambda$ is a balance coefficient.
6. The pathology prediction method based on a pathology image-text large model according to claim 1, wherein reading a second multi-magnification image of a full-scan image of a pathological section to be examined, extracting a second image feature vector of the second multi-magnification image, inputting the second image feature vector into the trained pathology image-text large model, and outputting a predicted text sequence and a corresponding image region heat map comprises: reading the full-scan image $I_{\mathrm{test}}$ of the pathological section to be examined, acquiring the corresponding 10× magnification image, and extracting a second image feature vector $F_{\mathrm{test}}$ of the 10× magnification image; and inputting the second image feature vector $F_{\mathrm{test}}$ into the trained pathology image-text large model, and outputting a predicted token sequence $\hat{S}$ and a corresponding image region heat map $\hat{H}$; and wherein matching the predicted text sequence against each pathology report in a pathology report RAG retrieval library and outputting the pathology report with the highest similarity to the predicted text sequence comprises: constructing a pathology report RAG retrieval library $R = \{R_k\}$ on the scale of 100,000 reports; calculating the semantic similarity between $\hat{S}$ and each $R_k$: $\mathrm{sim}(\hat{S}, R_k) = \cos\big(E(\hat{S}), E(R_k)\big)$, wherein $E(\cdot)$ is a semantic embedding function and $\cos(\cdot,\cdot)$ is the cosine similarity; and outputting the pathology report $R^{*} = \arg\max_{R_k} \mathrm{sim}(\hat{S}, R_k)$ with the highest similarity to $\hat{S}$, together with $\hat{H}$.
7. The pathology prediction method based on a pathology image-text large model according to claim 2, wherein the vim4path model is a Transformer-based pathological image feature extraction model with an output feature dimension of 384.
8. The pathology prediction method based on a pathology image-text large model according to claim 3, wherein the custom vocabulary library $V$ comprises terms specific to pathological diagnosis, lesion description vocabulary and anatomical nouns, and a word-frequency threshold is used to filter out semantic units with a low occurrence frequency.
9. The pathology prediction method based on a pathology image-text large model according to claim 4, wherein the image region heat map $H_{i,t}$ is expressed as: $H_{i,t}(x, y) = \mathrm{Attn}\big(F_i(x, y),\, w_{i,t}\big),\ t = 1, \dots, L$, wherein $(x, y)$ are the pixel coordinates of the 10× magnification image, $F_i(x, y)$ is the feature vector of the 10× magnification image at those coordinates, $w_{i,t}$ is the weight coefficient of the $t$-th token $s_{i,t}$, $\mathrm{Attn}(\cdot)$ is the attention weight calculation function, and $L$ is the sequence length of $S_i$.
10. The pathology prediction method based on a pathology image-text large model according to claim 6, wherein constructing the pathology report RAG retrieval library comprises: collecting pathology reports $\{R_k\}$ on the scale of 100,000 reports; performing structured parsing on each pathology report $R_k$ and extracting core semantic units, wherein the core semantic units comprise the diagnosis conclusion, lesion characteristics and pathological classification; and storing the extracted core semantic units after their formats have been aligned.
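
The loss recited in claim 5 is a weighted sum of a masked cross-entropy term and a heat map (thermodynamic diagram) constraint term. The following is a minimal Python sketch of that structure only, not the patented implementation. The function names `masked_cross_entropy` and `total_loss`, the parameter `lam` (standing in for the balance coefficient) and its default value are assumptions made for illustration. Because the form of the heat map constraint term is not recoverable from the text, the sketch accepts it as an already computed number, `heat_term`.

```python
import math


def masked_cross_entropy(true_token_probs, mask):
    """Average negative log-probability over valid (mask == 1) positions.

    true_token_probs[t] is the probability the model assigns to the true
    token at step t; mask[t] is 1 for a valid token position and 0 for a
    padding position.
    """
    if len(true_token_probs) != len(mask):
        raise ValueError("probabilities and mask must have the same length")
    n_valid = sum(mask)  # N: total number of valid positions
    if n_valid == 0:
        raise ValueError("mask contains no valid positions")
    total = 0.0
    for p, m in zip(true_token_probs, mask):
        if m:  # padding positions contribute nothing to the loss
            total += -math.log(p)
    return total / n_valid


def total_loss(true_token_probs, mask, heat_term, lam=0.1):
    """Masked cross-entropy plus lam times a precomputed heat map term.

    heat_term is a hypothetical placeholder for the heat map constraint,
    whose exact form is not given here; lam = 0.1 is an arbitrary default.
    """
    return masked_cross_entropy(true_token_probs, mask) + lam * heat_term
```

For example, `masked_cross_entropy([0.5, 0.25, 0.9], [1, 1, 0])` averages only the first two positions and ignores the padded third one.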

Description

Pathological prediction method based on pathological image-text large model

Technical Field

The invention relates to the intersecting technical fields of artificial intelligence, medical image processing and natural language processing, and in particular to a pathology prediction method based on a pathology image-text large model.

Background

The existing training process for pathology image-text large models has the following technical defects:

1) Strong data dependence: a good training result can be ensured only with tens of thousands of WSI images, and high-quality pathological section data is costly and slow to acquire.

2) Extremely high GPU memory demand: the training process usually needs more than 100 GB of GPU memory, which limits the popularization and application of the model.

3) Poor interpretability: the association between the model output and the image features is unclear, which makes it difficult to meet the strict requirements of medical diagnosis.

4) The diagnostic text output lacks the standardization of clinical reports and relies on secondary correction by a physician.

Disclosure of Invention

In view of the technical problems in the prior art, the invention provides a pathology prediction method based on a pathology image-text large model, which addresses the high data cost and high GPU memory requirement caused by the large data volume needed in the existing training process for pathology image-text large models.
According to a first aspect of the present invention, there is provided a pathology prediction method based on a pathology image-text large model, comprising: acquiring a plurality of pathological section full-scan images and corresponding diagnosis text data; reading a first multi-magnification image of each pathological section full-scan image, and extracting a first image feature vector of the first multi-magnification image based on a vim4path model, wherein the vim4path model is a feature extraction network pre-trained on a pathological image data set; screening, based on a GPT model, the information in each piece of diagnosis text data that can be represented by an image, to obtain screened diagnosis text data; constructing a training set based on the first image feature vector corresponding to each pathological section full-scan image and the screened diagnosis text data, and training a pathology image-text large model based on the training set; reading a second multi-magnification image of a full-scan image of a pathological section to be examined, extracting a second image feature vector of the second multi-magnification image, inputting the second image feature vector into the trained pathology image-text large model, and outputting a predicted text sequence and a corresponding image region heat map; matching the predicted text sequence against each pathology report in a pathology report RAG retrieval library, and outputting the pathology report with the highest similarity to the predicted text sequence; and returning the pathology report corresponding to the full-scan image of the pathological section to be examined, together with the corresponding image region heat map.
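
The retrieval step above, which matches the predicted text sequence against every report in the library and returns the most similar one, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the method as implemented: the semantic embedding function is not specified in the text, so `embed` below is a hypothetical bag-of-words stand-in, and the names `cosine` and `retrieve` are likewise chosen for illustration. A real system would presumably replace `embed` with a learned text embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for the semantic embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def retrieve(predicted_text, reports):
    """Return (index, score) of the report most similar to predicted_text."""
    if not reports:
        raise ValueError("the report library is empty")
    query = embed(predicted_text)
    scores = [cosine(query, embed(report)) for report in reports]
    best = max(range(len(reports)), key=scores.__getitem__)
    return best, scores[best]


# Toy library; the report texts are invented for illustration only.
library = [
    "invasive ductal carcinoma grade 2",
    "chronic gastritis with intestinal metaplasia",
    "benign fibroadenoma",
]
best_index, best_score = retrieve("ductal carcinoma grade 2", library)
```

Here the query shares four words with the first report and none with the others, so `best_index` is 0. With a library on the scale described, scoring every report in a loop would likely give way to a precomputed embedding index, but the selection rule stays the same.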
The pathology prediction method based on a pathology image-text large model provided by the invention achieves efficient feature extraction from a small amount of data by relying on a feature extraction model pre-trained specifically on pathological images. It screens highly relevant text by means of the GPT model, which improves data quality and training efficiency and prevents invalid data from diluting model performance. It builds a custom vocabulary library for the tokenizer, which therefore needs no separate training, greatly lowering the data threshold for model training. Multi-magnification image feature extraction reduces the volume of original image data (a 16-fold reduction compared with the data volume of 40× magnification images), which lowers the hardware threshold and the GPU memory requirement of training the pathology image-text large model.

Drawings

FIG. 1 is a flowchart of a pathology prediction method based on a pathology image-text large model according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a pathology prediction system based on a pathology image-text large model according to an embodiment of the present invention;

FIG. 3 is a schematic hardware structure diagram of one possible electronic device according to the present invention;

FIG. 4 is a schematic hardware structure diagram of one possible computer-readable storage medium according to the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embo