CN-121983318-A - Osteoporosis compression fracture screening method based on multi-mode large language model


Abstract

The invention relates to the technical field of early screening for osteoporotic vertebral compression fracture (OVCF) and provides an OVCF screening method based on a multimodal large language model. The method takes unstructured visual data, such as body images and action videos of a patient, as input. A multimodal large language model quantitatively evaluates key functions such as posture alignment, movement coordination, and pain-related reactions from the images and videos and outputs them as standardized scores, so that structured features with clear clinical meaning are extracted automatically. A machine learning model is then built on these structured features to assess the risk of OVCF. SHAP feature contribution analysis and decision tree visualization are combined to identify the key discriminating factors and the direction of their effect, making the prediction process interpretable. The method provides a safe, low-cost, interpretable, and easily popularized technical scheme for early OVCF screening and is suitable for application scenarios such as community and home screening.

Inventors

TANG CHAO
JIN XIAOQING
ZHONG DEJUN
LIAO YEHUI
TANG QIANG
HUANG XINLIN

Assignees

Affiliated Hospital of Southwest Medical University (西南医科大学附属医院)

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (10)

1. A method for screening osteoporotic compression fracture based on a multimodal large language model, characterized by comprising the following steps:
S1, data acquisition and posture sequence construction: acquiring patient information for osteoporotic vertebral compression fracture screening; preprocessing the image data and video data in the patient information; performing human posture analysis on the preprocessed image and video data using a posture estimation algorithm; extracting the spatial coordinates of human skeletal key points and the time-series features of those key points; and constructing posture sequence data containing human key point labels, the posture sequence data representing the static posture features and dynamic functional action features of the patient;
S2, multimodal semantic feature extraction: designing, for the posture sequence data, a structured prompt with explicit clinical semantic constraints; inputting the posture sequence data into a multimodal large language model; and having the multimodal large language model comprehensively analyze the postural structure features and movement behavior features of the patient during posture maintenance and functional action execution, to generate a structured quantitative scoring result reflecting the functional state of the patient;
S3, feature aggregation and patient-level feature vector construction: collecting the multiple groups of structured scoring results obtained for the same patient from images at different viewing angles and from different functional action tasks, and aggregating the score values of each semantic index with a robust statistical method, to form a patient-level feature vector representing the overall posture and functional state of the patient;
S4, machine learning risk modeling and probability prediction: taking the patient-level feature vector and basic clinical information as model input, constructing and training a supervised learning classification model, modeling whether the patient has an osteoporotic vertebral compression fracture, and outputting a predicted probability that the patient has an osteoporotic vertebral compression fracture;
S5, fracture screening and risk classification: calculating screening performance indexes under different classification thresholds based on the predicted probability, determining an optimal classification threshold for osteoporotic vertebral compression fracture screening using the Youden index, and outputting a risk assessment result or risk classification result for the patient according to the optimal classification threshold.
2. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 1, wherein in the data acquisition and posture sequence construction, the patient information comprises standard static standing-position photos, functional videos, and basic clinical information of the patient; the standard static standing-position photos comprise front, back, and side views of the patient; the functional videos comprise dynamic videos of the patient lying supine, turning over to the left, turning over to the right, and sitting up; and the basic clinical information comprises the age, sex, height, and weight of the patient.
3. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 2, wherein the standard static standing-position photos and functional videos of the patient are processed by OpenPose posture estimation to extract human posture information and generate posture sequence data containing human key point labels.
4. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 1, wherein in the multimodal semantic feature extraction, the posture sequence data is semantically scored, based on the structured prompt, in at least four dimensions: posture alignment, posture symmetry, motion coordination, and pain response; wherein the posture alignment characterizes the overall alignment of the spinal force line and the trunk posture of the patient; the posture symmetry characterizes the left-right consistency of the limbs and trunk of the patient in static structure and during dynamic movement; the motion coordination characterizes the cooperative control capability of the patient when completing actions that involve multiple joints; and the pain response characterizes protective postures, slowed motion, or abnormal movement patterns that the patient shows because of pain while performing an action.
5. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 4, wherein the multimodal large language model is constrained to output the structured scoring result in JSON format, and the scoring result is mapped to normalized continuous values from 0 to 1, where 0 denotes normal and 1 denotes abnormal, for use as input features of the subsequent machine learning model analysis.
6. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 1, wherein in the feature aggregation and patient-level feature vector construction, the structured scoring results from images at different viewing angles and videos of different functional tasks are collected for each patient and robustly aggregated, and the scoring results of the same index obtained from multiple generations or multiple viewing angles are aggregated by median or mean to reduce noise caused by model randomness.
7. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 1, wherein in the machine learning risk modeling and probability prediction, a plurality of supervised learning classification models are constructed, including logistic regression, support vector machine, random forest, gradient boosting, and decision tree models; the input of the models is the multimodal scoring results, and the output is the predicted probability that an osteoporotic vertebral compression fracture is present.
8. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 7, wherein in the machine learning risk modeling, the models are evaluated by stratified K-fold cross-validation, which keeps the ratio of cases to controls consistent across folds; in each fold, feature standardization and missing-value imputation are performed only on the training set, and the model is tested independently on the validation set; the performance of the machine learning models is evaluated using the area under the curve (AUC), accuracy, F1 score, sensitivity, and specificity; the diagnostic capabilities of the different models are compared; and a machine learning model that offers both performance and interpretability is selected as the final scheme.
9. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 1, wherein the fracture screening and risk classification comprises the following steps: model output probability calculation, in which the machine learning model outputs, for each patient, a predicted probability of osteoporotic vertebral compression fracture, denoted p, and the sensitivity and specificity are calculated at a plurality of thresholds τ; and Youden index calculation, in which the Youden index is calculated, the optimal classification threshold is selected, and the threshold may be adjusted according to clinical screening requirements.
10. The method for screening osteoporotic compression fracture based on the multimodal large language model according to claim 9, wherein in the Youden index calculation, the Youden index at each threshold is given by: J(τ) = Sensitivity(τ) + Specificity(τ) − 1; the threshold that maximizes J(τ) is selected as the optimal classification threshold; wherein J(τ) is the Youden index corresponding to the threshold τ; Sensitivity(τ) is the sensitivity at the threshold τ, i.e. the true positive rate (TPR), calculated as: Sensitivity(τ) = TP / (TP + FN), wherein TP is the number of true positives and FN is the number of false negatives; Specificity(τ) is the specificity at the threshold τ, i.e. the true negative rate (TNR), calculated as: Specificity(τ) = TN / (TN + FP), wherein TN is the number of true negatives and FP is the number of false positives.
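
The threshold selection described in claims 9 and 10 can be illustrated with a minimal sketch. It is not the applicant's implementation: the function names, the rule that a patient is called positive when p >= τ, the use of the observed probabilities as candidate thresholds, and the tie-breaking toward the lowest threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def confusion_counts(probs: List[float], labels: List[int], tau: float) -> Tuple[int, int, int, int]:
    """Return (TP, FN, TN, FP), assuming a patient is called positive when p >= tau."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= tau
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def youden_index(probs: List[float], labels: List[int], tau: float) -> float:
    """J(tau) = Sensitivity(tau) + Specificity(tau) - 1."""
    tp, fn, tn, fp = confusion_counts(probs, labels, tau)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # TP / (TP + FN)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # TN / (TN + FP)
    return sensitivity + specificity - 1.0


def best_threshold(probs: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (tau, J) for the candidate threshold that maximizes J(tau).

    Candidates are the distinct predicted probabilities (an assumption).
    Because candidates are scanned in ascending order and max() keeps the
    first maximum, ties resolve to the lowest threshold.
    """
    candidates = sorted(set(probs))
    scored = [(youden_index(probs, labels, tau), tau) for tau in candidates]
    best_j, best_tau = max(scored, key=lambda pair: pair[0])
    return best_tau, best_j
```

Resolving ties toward the lower threshold favors sensitivity, which is one plausible choice for a screening setting; claim 9 leaves room to adjust the threshold according to clinical screening requirements.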

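The JSON-constrained scoring of claim 5 and the median aggregation of claim 6 can be sketched as follows. This is only an illustration under stated assumptions: the JSON key names in `DIMENSIONS` are hypothetical (the claims name the four dimensions but not their field names), and silently skipping an index that a given view did not score is a design choice of this sketch rather than something the source specifies.

```python
import json
from statistics import median
from typing import Dict, List

# Hypothetical field names for the four scoring dimensions of claim 4.
DIMENSIONS = (
    "posture_alignment",
    "posture_symmetry",
    "motion_coordination",
    "pain_response",
)


def parse_scores(raw_json: str) -> Dict[str, float]:
    """Parse one model output and keep only valid scores in [0, 1]."""
    data = json.loads(raw_json)
    scores: Dict[str, float] = {}
    for name in DIMENSIONS:
        if name not in data:
            continue  # this view or task did not score the index
        value = float(data[name])
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
        scores[name] = value
    return scores


def aggregate_patient(raw_outputs: List[str]) -> Dict[str, float]:
    """Median-aggregate each index across views, tasks, or repeated generations."""
    collected: Dict[str, List[float]] = {name: [] for name in DIMENSIONS}
    for raw in raw_outputs:
        for name, value in parse_scores(raw).items():
            collected[name].append(value)
    # Indexes with no score at all are omitted and left for later imputation.
    return {name: median(values) for name, values in collected.items() if values}
```

The median is used here because a single outlying generation shifts it less than it shifts the mean; claim 6 allows either.
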
Description

Osteoporosis compression fracture screening method based on multi-mode large language model

Technical Field

The invention relates to the technical field of early screening for osteoporotic vertebral compression fracture, and in particular to a method for screening osteoporotic compression fracture based on a multimodal large language model.

Background

Osteoporotic compression fracture, also called osteoporotic vertebral compression fracture (OVCF), is one of the most common serious complications of osteoporosis. Reduced bone density and damaged bone microstructure allow the vertebral body to be compressed and deformed under slight external force, which often causes chronic pain, spinal deformity, and movement dysfunction, seriously affects the patient's quality of life, and increases the risk of secondary fracture. OVCF currently presents three main clinical pain points: ① a high risk of misdiagnosis because the fracture is often occult; ② disabling disease progression and a chain of potentially fatal complications; ③ barriers to diagnostic resources. At present, early diagnosis of OVCF relies mainly on magnetic resonance imaging (MRI), X-ray, or CT. MRI is sensitive to early lesions such as bone marrow edema, but the equipment is expensive, examinations are costly, operation is complex, and availability in primary medical institutions is low, so it cannot meet the needs of large-scale population screening. X-ray is inexpensive but insufficiently sensitive to early minor fractures, so diagnoses are easily missed. In recent years, physical examination methods based on pain responses induced by functional actions have been proposed clinically for preliminary screening of osteoporotic vertebral compression fracture risk without imaging.
Such methods generally guide the patient through specific actions such as lying supine, turning over, and sitting up, and judge whether there is a risk of vertebral compression fracture from the pain scores; they are simple to perform and inexpensive. However, these physical examination methods have not been widely adopted in community or home settings, and their use is still largely limited to medical institutions. The main reasons are as follows. First, existing physical examination methods depend heavily on the patient's subjective scoring of pain intensity. Patients differ considerably in how they perceive and express pain, and without professional guidance the scores are easily biased, which undermines the reliability of the screening result. Second, the examinations usually place high demands on the sequence of actions and on posture standardization. Patients have difficulty performing the standard actions accurately at home, so the result is strongly affected by the quality of the actions. Third, while completing functional actions such as turning over and sitting up, the patient may experience marked pain and may even risk aggravating a latent fracture or causing secondary injury, and safety is hard to ensure in a non-medical environment that lacks real-time professional assessment and intervention. For these reasons, a physical examination that relies on manual assessment alone cannot meet the practical needs of home self-testing and large-scale primary-level screening.
With the development of artificial intelligence technology, using intelligent perception and data analysis to assist and enhance the physical examination process has become an important direction for improving the safety, objectivity, and accessibility of screening. By analyzing the patient's posture, movement process, and pain-related behavioral reactions in real time, an artificial intelligence model can compensate to some extent for the subjectivity of manual evaluation and can identify and flag high-risk actions and abnormal pain reactions in time. By comprehensively analyzing multidimensional behavioral features, it can also convert the subjective experience of traditional physical examination into quantifiable, reproducible objective indexes that provide a basis for subsequent risk assessment and medical decisions. Therefore, how to comprehensively analyze the posture characteristics, the action quality and the pain related reaction of a patient in the physical examination process by utilizing an artificial intelligence technology under the condition of no need of professional imaging equipment and the whole-course participation of doctors, and to construct a safe, standardized and clinically significant early-stage screening method and system for the compression fracture of the osteoporos