CN-122025083-A - Intelligent diagnosis method for spine infection and tuberculosis based on image large model
Abstract
The invention relates to an intelligent diagnosis method for spinal infection and tuberculosis based on an image large model. Its core idea is "pre-train a base, then fine-tune in layers": massive unlabeled spine MRI data are first used to train a general-purpose large vision model of spine images based on the DINOv2 algorithm, so that the model learns the anatomical structure and basic pathological characteristics of the spine; a two-stage cascaded diagnosis network is then built on this base to answer, in turn, "whether infection exists" and "what infection exists". The method addresses the scarcity of labeled data and the insufficient feature extraction capability in the differential diagnosis of spinal infection. It uses DINOv2 self-supervised pre-training to construct a visual feature base model, and realizes automatic screening of spinal infectious diseases and accurate differential diagnosis of infection subtypes through a hierarchical two-stage inference architecture, which greatly reduces the dependence on labeled data and markedly improves recognition accuracy on difficult cases.
Inventors
- CHEN JIE
- CHEN XIAOHUA
- LI HU
- ZHANG RUIPENG
- ZHU YUEQI
Assignees
- Shanghai Sixth People's Hospital (上海市第六人民医院)
Dates
- Publication Date: 20260512
- Application Date: 20260115
Claims (5)
- 1. An intelligent diagnosis method for spinal infection and tuberculosis based on an image large model, characterized by comprising the following steps: Step one, multi-modal data construction and standardized preprocessing: constructing a pre-training dataset, Dataset A, consisting of general spine MRI examination data without any diagnostic label; and constructing a downstream task dataset, Dataset B, consisting of spine MRI examination data of spinal infection cases and control groups, wherein Dataset B requires fine-grained annotation. Step two, constructing a spine image base model based on DINOv2, the spine image base model specifically comprising: a backbone network, which adopts ViT-Large and captures long-range dependencies between pixels of the input 2D MRI slice through the self-attention mechanism; a student-teacher asymmetric network architecture, in which two networks with identical structures are constructed, namely a student network and a teacher network, wherein the teacher network does not perform back-propagation and its parameters are an exponential moving average (EMA) of the parameters of the student network; a multi-view cropping strategy, comprising a global view, namely a high-resolution crop covering a large range of anatomical structures, which is input into the teacher network, and a local view, namely a low-resolution crop containing only a part of the image, which is input into the student network; a training objective, namely forcing the student network to predict, from local information alone, the feature distribution of the global information seen by the teacher network; and ultra-high-resolution training, namely fine-tuning for a short time with high-resolution images in the later stage of pre-training, so that the model adapts to the high-frequency texture details of MRI. Step three, hierarchical two-stage diagnosis tasks: constructing downstream task networks using the trained DINOv2-based spine image base model as a feature extractor. The first stage is an infection detection network, namely a 2.5D classification network, which extracts basic image features from each slice of a patient based on the first step and then performs an overall multi-slice classification through a 2-layer transformer decoder; input: multi-modal MRI images comprising T2, T1 and STIR sequences; task definition: a classification task, non-infection versus infection, wherein non-infection includes the commonly confused conditions of normal spine, degenerative change and spinal tumor; design logic: this stage focuses on high sensitivity, and the model captures infection-sensitive features such as "edema signal" and "endplate destruction"; loss function: weighted cross-entropy loss is adopted to increase the penalty for missed diagnoses; output: if the infection probability is greater than a set threshold 1, the case is judged as suspected infection and enters the second stage; otherwise a normal/degeneration report is output. The second stage is an etiology identification network, whose structure is the same as that of the first stage, the two stages sharing the base; input: only the positive cases screened by the first stage; task definition: a classification task, suppurative infection versus spinal tuberculosis; feature fusion and enhancement: extracting attention maps from different layers of ViT-Large, wherein shallow-layer features attend to edges and deep-layer features attend to semantics; discrimination logic: tuberculosis feature learning, wherein the model focuses on the morphology of paravertebral abscesses, whether multiple non-contiguous vertebral bodies are involved, and whether the intervertebral discs are relatively preserved; suppurative feature learning, wherein the model focuses on the degree of intervertebral disc destruction and on whether the abscess is localized and thick-walled; output: a tuberculosis probability is output; if it is greater than a set threshold 2, spinal tuberculosis is diagnosed; otherwise suppurative spondylitis is diagnosed.
- 2. The intelligent diagnosis method for spinal infection and tuberculosis based on an image large model according to claim 1, wherein step one specifically comprises: Step 1.1, constructing the data cohorts: the pre-training dataset, Dataset A, gathers general spine MRI examination data without any diagnostic label and is used to let the model learn normal spine morphology, degeneration, scoliosis and various common lesions; the downstream task dataset, Dataset B, contains spinal infection cases confirmed by pathological biopsy, operative records or clinical follow-up, divided into a suppurative group and a tuberculosis group, together with a matched control group of normal or non-infectious lesions. Step 1.2, image standardization: N4 bias field correction, namely eliminating image brightness deviations caused by the non-uniform magnetic field of the MRI equipment; resampling and registration, namely resampling all DICOM images to a uniform voxel spacing and registering the multiple sequences (T1 and T2) with the ANTsPy tools, starting from rigid registration and using the SyNRA registration mode for fine multi-sequence registration, so as to ensure alignment of anatomical structures; intensity normalization, namely applying Z-score normalization to adjust the pixel intensity distribution to a standard normal distribution with mean 0 and variance 1, eliminating scale differences among different scanners.
- 3. The intelligent diagnosis method for spinal infection and tuberculosis based on an image large model according to claim 1, wherein in step two, DINOv2 includes the following optimization mechanisms: a loss function, namely cross-entropy loss, which measures the difference between the output probability distribution of the student and that of the teacher; KoLeo regularization, namely a Kozachenko-Leonenko differential entropy regularization term, which encourages the features to be uniformly distributed within a batch and prevents feature collapse; and a patch-level objective, namely that DINOv2 introduces patch-level feature alignment in addition to the global feature alignment of the class ([CLS]) token.
- 4. The intelligent diagnosis method for spinal infection and tuberculosis based on an image large model according to claim 1, wherein in step three, the feature fusion and enhancement in the etiology identification network of the second stage further comprises embedding clinical prior knowledge, namely encoding the patient's clinical indicators, including age, fever duration and erythrocyte sedimentation rate, into vectors and fusing them with the image features by concatenation (Concat).
- 5. An intelligent diagnosis system for spinal infection and tuberculosis based on an image large model, characterized in that it is used to implement the intelligent diagnosis method for spinal infection and tuberculosis based on an image large model according to any one of claims 1-4, in which massive unlabeled spine MRI data are first used to train a general-purpose large vision model of spine images based on the DINOv2 algorithm, so that it learns the anatomical structures and basic pathological characteristics of the spine, and a two-stage cascaded diagnosis network is then designed on this base to answer, respectively, "whether infection exists" and "what infection exists"; the system comprises five functional modules: a data acquisition and intelligent preprocessing module, responsible for cleaning and standardizing multi-source MRI data and extracting a region of interest (ROI); a self-supervised pre-training base construction module, for training a ViT-Large feature extractor on unlabeled data based on the DINOv2 framework; a primary infection screening module, for identifying with high sensitivity whether spinal infection is present; a secondary pathogen identification module, for distinguishing tuberculosis from suppurative inflammation with high specificity among infection cases; and a visualization and decision-support interaction module, for generating a diagnosis report and a lesion attention heat map.
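The intensity normalization of claim 2 (step 1.2) can be illustrated with a minimal sketch. It assumes a volume has already been bias-corrected and resampled and is available as a flat list of voxel intensities; the function name `zscore_normalize`, the `eps` guard and the sample values are hypothetical and do not come from the claims.

```python
from statistics import fmean, pstdev


def zscore_normalize(intensities, eps=1e-8):
    """Rescale voxel intensities to mean 0 and variance 1 (Z-score)."""
    mean = fmean(intensities)
    std = pstdev(intensities, mu=mean)
    # eps (an assumption) avoids division by zero on a constant volume.
    return [(v - mean) / (std + eps) for v in intensities]


# Two made-up scanners reporting the same anatomy on different intensity scales.
scanner_a = [100.0, 120.0, 140.0, 160.0]
scanner_b = [1000.0, 1200.0, 1400.0, 1600.0]

norm_a = zscore_normalize(scanner_a)
norm_b = zscore_normalize(scanner_b)
```

After normalization the two lists are numerically almost identical, which is the sense in which Z-scoring removes scale differences between scanners.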
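The self-supervised objective in claims 1 and 3 (a student that matches the teacher's output distribution, a teacher updated by EMA, and a KoLeo term that spreads features apart) can be sketched in plain Python. This is only an illustration under stated assumptions: the temperatures, the momentum of 0.996 and the `eps` guard are illustrative values not given in the claims, and the output centering and patch-level objective of DINO-style training are omitted.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(student_logits, teacher_logits,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross entropy -sum_k p_teacher[k] * log p_student[k].

    The teacher sees the global crop, the student a local crop.
    Both temperatures are assumed values.
    """
    p_teacher = [math.exp(v) for v in log_softmax(teacher_logits, teacher_temp)]
    log_p_student = log_softmax(student_logits, student_temp)
    return -sum(pt * ls for pt, ls in zip(p_teacher, log_p_student))


def ema_update(teacher_params, student_params, momentum=0.996):
    """Teacher weights follow the student by EMA; the teacher gets no gradient."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


def koleo_loss(features, eps=1e-8):
    """KoLeo term: minus the mean log distance to the nearest neighbour.

    Features are L2-normalized first; needs at least two feature vectors.
    """
    unit = []
    for f in features:
        norm = math.sqrt(sum(v * v for v in f)) + eps
        unit.append([v / norm for v in f])
    total = 0.0
    for i, a in enumerate(unit):
        nearest = min(math.dist(a, b) for j, b in enumerate(unit) if j != i)
        total += math.log(nearest + eps)
    return -total / len(unit)
```

The distillation loss is small when the student's prediction from the local crop agrees with the teacher's prediction from the global crop. The KoLeo term is smaller for well-spread features than for clustered ones, which is how it discourages collapse.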
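The decision flow of the two-stage cascade in claim 1 can be sketched as follows. The two networks are represented by hypothetical callables that return a probability, and the default values for threshold 1 and threshold 2 are assumptions chosen for illustration; the claims do not state concrete values.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CascadeResult:
    infection_prob: float
    tb_prob: Optional[float]  # None when stage two was never run
    report: str


def cascade_diagnose(
    study: object,
    stage1: Callable[[object], float],
    stage2: Callable[[object], float],
    threshold_1: float = 0.3,  # assumed; set low to favour sensitivity
    threshold_2: float = 0.5,  # assumed
) -> CascadeResult:
    """Stage one screens for infection; stage two runs only on positives."""
    p_infection = stage1(study)
    if p_infection <= threshold_1:
        return CascadeResult(p_infection, None, "normal/degeneration")

    p_tb = stage2(study)
    if p_tb > threshold_2:
        return CascadeResult(p_infection, p_tb, "spinal tuberculosis")
    return CascadeResult(p_infection, p_tb, "suppurative spondylitis")


# Stub models standing in for the two fine-tuned networks.
result = cascade_diagnose("study-001",
                          stage1=lambda s: 0.82,
                          stage2=lambda s: 0.35)
print(result.report)  # prints "suppurative spondylitis"
```

Because stage two is only called for cases that pass threshold 1, the etiology network never sees studies that the screening stage has already reported as normal or degenerative.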
Description
Intelligent diagnosis method for spine infection and tuberculosis based on image large model
Technical Field
The invention relates to medical image processing technology, in particular to an intelligent analysis method for spinal diseases based on magnetic resonance imaging (MRI) data.
Background
Clinical background
Spinal infection is a serious and potentially devastating disease in orthopedic practice, mainly including vertebral osteomyelitis, discitis and epidural abscess. With population aging, the widespread use of immunosuppressants and the increase of drug-resistant bacteria, the incidence of spinal infection is rising year by year. Clinically, it is mainly divided into suppurative (pyogenic) spondylitis (PS) and specific infections, chiefly spinal tuberculosis (Pott's disease, STB). Although MRI has become the "gold standard" for diagnosing spinal infections by virtue of its excellent soft tissue resolution and high sensitivity to bone marrow edema, distinguishing suppurative infection from spinal tuberculosis in actual clinical work remains a great challenge for radiologists. The two diseases overlap heavily in their imaging appearance. Common features: both present with vertebral endplate destruction, vertebral bone marrow edema (high signal on T2-weighted images, low signal on T1-weighted images), disc signal changes, and paraspinal soft tissue swelling or abscess formation. Difficulty of differentiation: although textbooks point to certain specific features (e.g., tuberculosis tends to produce "skip" lesions and paravertebral abscesses, with bone destruction more severe than disc destruction; suppurative infection tends to destroy the disc early and form liquefactive necrosis), these features are often ambiguous in early or atypical cases.
Clinical pain points
The consequences of misdiagnosis are disastrous.
Suppurative infection usually requires potent antibiotic treatment and strict immobilization, whereas spinal tuberculosis requires long-term combined anti-tuberculosis chemotherapy. If tuberculosis is misdiagnosed as suppurative infection and treated with antibiotics alone, the treatment is ineffective and can lead to spread and drug resistance of the tubercle bacillus, and even to spinal deformity and paraplegia; conversely, unnecessary anti-tuberculosis drugs can cause avoidable hepatic and renal toxicity. At present, clinical differentiation relies mainly on needle biopsy, but biopsy is an invasive procedure whose positive rate is only 50%-70%. Therefore, there is an urgent need for a non-invasive, objective and accurate auxiliary diagnostic tool.
Related prior art and analysis of its limitations
Traditional machine learning and hand-crafted feature engineering
Early computer-aided diagnosis systems relied mainly on hand-crafted features. Researchers extracted image texture features (such as the gray-level co-occurrence matrix, GLCM, and wavelet transform coefficients), shape descriptors and histogram statistics, and classified them with support vector machines (SVMs) or random forests. Defects: hand-crafted features mainly reflect shallow texture information and struggle to capture complex three-dimensional anatomical structures (such as the vertebral sequence and the non-Euclidean geometry of the paraspinal soft tissues) and deep pathological semantics in MRI images. In addition, feature engineering depends heavily on expert experience and generalizes poorly: once MRI scan parameters (e.g., TE/TR) change, model performance tends to drop significantly.
Convolutional neural networks (CNNs) based on supervised learning
In recent years, deep learning (particularly CNNs) has made breakthroughs in the field of medical imaging.
In the prior art, studies have classified spinal MRI (e.g., distinguishing tumor from infection) using classical architectures such as ResNet and DenseNet.
Defect one: the long-tail dilemma of data annotation. The performance of a supervised learning model is positively correlated with the scale and labeling quality of its training data. However, compared with lung nodules or fractures, diseases such as spinal tuberculosis are "small-sample" data. It is extremely difficult to acquire a large number of pathologically confirmed (gold-standard) MRI images of spinal infections. Training a deep CNN on small-sample data is extremely prone to overfitting, i.e., the model memorizes the noise of the training set rather than the essential features of the disease.
Defect two: "domain shift" in transfer learning. Most medical AI models are currently initialized with weights pre-trained on ImageNet (a natural image dataset of, e.g., cats, dogs and cars). However, natural images and medical images have great differences in imaging principles, gray scale