CN-121983293-A - Multi-modal agent-based auxiliary assessment system and method for cerebrovascular lesions


Abstract

The invention relates to a multi-modal agent-based auxiliary assessment system and method for cerebrovascular lesions, belonging to the technical field of artificial intelligence and intelligent medical data processing. It addresses two problems in the prior art: the inability to perform deep fusion and conflict verification against clinical text while preserving the complete spatial information of three-dimensional medical images, and the lack of deterministic safety guarantees. The system comprises a data processing module, a cross-modal consistency check agent, a visual refocusing and feature inverse check agent, and a multi-factor collaborative reasoning and decision agent. The data processing module acquires three-dimensional image data and electronic medical record text data of a patient and generates structured image feature data and structured clinical feature data. The cross-modal consistency check agent performs a medical logic consistency evaluation to obtain a consistency evaluation result, and a visual query instruction is generated if that result meets a preset condition. The visual refocusing and feature inverse check agent generates a structured inverse check result. The multi-factor collaborative reasoning and decision agent generates an auxiliary evaluation result through counterfactual reasoning and a safety gating mechanism. Auxiliary evaluation of cerebrovascular disease is thereby achieved.

Inventors

SONG XIAOWEI
WU JI
WU JIAN
LI MIAO
YANG SHAOSHUAI

Assignees

Beijing Tsinghua Changgung Hospital
Tsinghua University

Dates

Publication Date: 20260505
Application Date: 20260407

Claims (10)

1. A multi-modal agent-based auxiliary assessment system for cerebrovascular lesions, comprising: a data processing module, configured to acquire three-dimensional non-enhanced CT image data of a patient's cerebral vessels and corresponding electronic medical record text data, and to generate structured image feature data and structured clinical feature data; a cross-modal consistency check agent, configured to perform a medical logic consistency evaluation on the structured image feature data and the structured clinical feature data to obtain a consistency evaluation result; a visual refocusing and feature inverse check agent, configured to perform a targeted re-examination of the three-dimensional non-enhanced CT image according to the visual query instruction to generate a structured inverse check result; and a multi-factor collaborative reasoning and decision agent, configured to fuse the structured image feature data, the structured clinical feature data and/or the structured inverse check result, and to generate a final auxiliary evaluation result through counterfactual reasoning and a safety gating mechanism.
2. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 1, wherein the data processing module comprises: a data receiving module, configured to acquire the three-dimensional non-enhanced CT image data of the patient's cerebral vessels and the corresponding electronic medical record text data; a primary image interpretation agent, configured to perform structured analysis of the three-dimensional non-enhanced CT image, extract image features related to cerebrovascular lesions, and generate the structured image feature data; and a clinical information structuring agent, configured to perform semantic analysis of the electronic medical record text and convert unstructured medical record information into structured clinical feature data with predefined fields.
3. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 2, wherein the primary image interpretation agent comprises: a preprocessing unit, configured to perform physical value conversion, spatial standardization and three-dimensional data construction on the three-dimensional non-enhanced CT image to generate a standardized three-dimensional image tensor; an anatomical partitioning unit, configured to partition the standardized three-dimensional image tensor into a plurality of anatomical regions based on predefined brain anatomical partitioning rules, generating region feature data labeled with anatomical region identifiers; and a multi-modal large model reasoning unit, configured to perform feature extraction on each anatomical region based on the region feature data labeled with anatomical region identifiers and to generate the structured image feature data.
4. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 3, wherein the structured image feature data comprises at least one abnormal item, each abnormal item comprising an abnormality cue, an anatomical location, an abnormality coordinate range, density features and morphological features; and the multi-modal large model reasoning unit is obtained by training a multi-modal large model with a mixed-source multi-task training mechanism, the training tasks being report generation, visual question answering, target localization and cross-modal alignment.
5. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 1, wherein the clinical information structuring agent is built on a large language model, extracts key information from the electronic medical record text by means of prompt engineering or supervised fine-tuning, and converts the key information into standardized structured clinical feature data, wherein the structured clinical feature data comprises basic patient demographic information, a key stroke-related timeline, clinical manifestations and neurological assessment, past medical history and risk factors, contraindications and medication history, laboratory and imaging examination results, and a logical consistency and data quality assessment.
6. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 1, wherein the cross-modal consistency check agent is built on a large language model and adopts a natural language inference framework in which the structured clinical feature data serves as the premise and the structured image feature data serves as the hypothesis, evaluates the degree of neuroanatomical logical agreement between them, and outputs a consistency evaluation result; and wherein, when the consistency evaluation result meets a preset condition, the cross-modal consistency check agent generates a visual query instruction based on neuroanatomical prior rules and the structured clinical feature data, the visual query instruction comprising a target side, a target anatomical region identifier and an imaging sign type to be verified.
7. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 1, wherein the visual refocusing and feature inverse check agent comprises: a preliminary data processing module, configured to locate and extract a three-dimensional region of interest to be re-examined from the three-dimensional non-enhanced CT image based on the visual query instruction, and to generate a sub-tensor of that region; and a fine-grained multi-modal reasoning module, configured to perform targeted feature extraction and semantic understanding on the three-dimensional region of interest to be re-examined, based on the visual query instruction and the sub-tensor of the three-dimensional region of interest, to generate a structured inverse check result, wherein the structured inverse check result comprises refined image features and a local state evaluation of the re-examined region.
8. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 7, wherein the preliminary data processing module generates the sub-tensor of the three-dimensional region of interest to be re-examined by: extracting the target side, the target anatomical region identifier and the imaging sign type to be verified from the visual query instruction, and obtaining a standardized three-dimensional image tensor from the three-dimensional non-enhanced CT image; determining the three-dimensional coordinate range of the target anatomical region in the standardized three-dimensional image tensor according to the target anatomical region identifier; and cropping the corresponding sub-tensor from the standardized three-dimensional image tensor according to the three-dimensional coordinate range, performing high-resolution resampling on the sub-tensor, and outputting the enhanced sub-tensor as the sub-tensor of the three-dimensional region of interest to be re-examined.
9. The multi-modal agent-based auxiliary assessment system for cerebrovascular lesions of claim 7, wherein the multi-factor collaborative reasoning and decision agent comprises: an information fusion unit, configured to concatenate the structured image feature data, the structured clinical feature data and/or the structured inverse check result to form multi-source heterogeneous information evidence; a counterfactual reasoning unit, configured to evaluate the multi-source heterogeneous information evidence against preset clinical diagnosis and treatment standards, yielding a preliminary diagnostic conclusion and a verification result thereof; a safety gating unit, configured to traverse the contraindication-related fields in the structured clinical feature data, compare them with preset clinical treatment safety rules, and generate a structured contraindication verification result; and a result generation unit, configured to generate a final auxiliary evaluation result based on the multi-source heterogeneous information evidence, the preliminary diagnostic conclusion and its verification result, and the structured contraindication verification result, wherein the final auxiliary evaluation result comprises a final diagnostic conclusion, a severity grade, a final treatment recommendation, a reasoning path description, a logical confidence level, a safety warning list, and an indication of whether manual review is required.
10. A multi-modal agent-based auxiliary assessment method for cerebrovascular lesions, characterized by comprising the following steps: acquiring three-dimensional non-enhanced CT image data of a patient's cerebral vessels and corresponding electronic medical record text data, and generating structured image feature data and structured clinical feature data; performing a medical logic consistency evaluation on the structured image feature data and the structured clinical feature data to obtain a consistency evaluation result, and generating a visual query instruction if the consistency evaluation result meets a preset condition; and fusing the structured image feature data, the structured clinical feature data and/or the structured inverse check result, and generating a final auxiliary evaluation result through counterfactual reasoning and a safety gating mechanism.
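
For illustration only, and not as part of the claims, the region-of-interest extraction described in claims 6 to 8 might be sketched in Python as follows. The names `VisualQuery`, `REGION_BOUNDS`, `crop_roi` and `upsample_nearest` are hypothetical, the coordinate table is invented for the example, and nearest-neighbor upsampling merely stands in for the high-resolution resampling step, whose actual method the claims do not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualQuery:
    """The three fields claim 6 lists for a visual query instruction."""
    side: str        # target side, e.g. "left"
    region_id: str   # target anatomical region identifier
    sign_type: str   # imaging sign type to be verified


# Hypothetical lookup: (side, region) -> half-open (z, y, x) index ranges in
# the standardized tensor. Real ranges would come from the anatomical
# partitioning rules, which are not given in the claims.
REGION_BOUNDS = {
    ("left", "insula"): ((1, 3), (0, 2), (2, 4)),
}


def crop_roi(volume, query):
    """Cut the sub-tensor for the queried region out of a [z][y][x] volume."""
    (z0, z1), (y0, y1), (x0, x1) = REGION_BOUNDS[(query.side, query.region_id)]
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]


def upsample_nearest(sub, factor=2):
    """Repeat every voxel `factor` times along each axis (assumed resampler)."""
    out = []
    for plane in sub:
        new_plane = []
        for row in plane:
            new_row = [v for v in row for _ in range(factor)]
            new_plane.extend(list(new_row) for _ in range(factor))
        out.extend([list(r) for r in new_plane] for _ in range(factor))
    return out


# Toy 4x4x4 volume whose voxel value encodes its own (z, y, x) position.
volume = [[[z * 100 + y * 10 + x for x in range(4)] for y in range(4)]
          for z in range(4)]
query = VisualQuery(side="left", region_id="insula", sign_type="ribbon_sign")
roi = crop_roi(volume, query)
roi_hi = upsample_nearest(roi)
```

Because the crop is taken from the full three-dimensional tensor rather than from a projected image, the sub-tensor keeps its extent along the scan direction, which is the property the re-examination step relies on.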

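As a further illustration outside the claims, the safety gating unit of claim 9 can be read as a deterministic rule check over the contraindication-related fields. The sketch below assumes a hypothetical rule table `SAFETY_RULES` and hypothetical field names; the thresholds are placeholders for the example and are not clinical guidance. Fields that are missing are reported separately, so that absent data triggers manual review instead of passing silently.

```python
from dataclasses import dataclass, field

# Hypothetical rules: field name -> predicate returning True when the value
# should raise a safety warning. Field names and thresholds are assumptions.
SAFETY_RULES = {
    "hours_since_onset": lambda v: v > 4.5,
    "on_anticoagulant": lambda v: v is True,
    "recent_major_surgery": lambda v: v is True,
}


@dataclass
class GateResult:
    """Structured contraindication verification result (assumed shape)."""
    warnings: list = field(default_factory=list)    # rules that fired
    unchecked: list = field(default_factory=list)   # fields with no data

    @property
    def needs_manual_review(self):
        return bool(self.warnings or self.unchecked)


def safety_gate(clinical):
    """Traverse every rule and compare it with the structured clinical data."""
    result = GateResult()
    for name, violates in SAFETY_RULES.items():
        value = clinical.get(name)
        if value is None:
            result.unchecked.append(name)
        elif violates(value):
            result.warnings.append(name)
    return result


flagged = safety_gate({"hours_since_onset": 6.0, "on_anticoagulant": False})
clear = safety_gate({
    "hours_since_onset": 2.0,
    "on_anticoagulant": False,
    "recent_major_surgery": False,
})
```

Keeping the gate as plain rules rather than model output is one way to obtain the deterministic behavior the abstract calls for: the same structured input always yields the same warning list.
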
Description

Multi-modal agent-based auxiliary assessment system and method for cerebrovascular lesions

Technical Field

The invention relates to the technical field of artificial intelligence and intelligent medical data processing, and in particular to a multi-modal agent-based auxiliary assessment system and method for cerebrovascular lesions.

Background

Stroke is an acute cerebrovascular disease in which cerebral blood flow is interrupted by the sudden rupture or blockage of a cerebral blood vessel. The disease is extremely time-sensitive, and clinical practice emphasizes that "time is brain". In the emergency workflow, the physician must distinguish ischemic stroke from hemorrhagic stroke within a very narrow time window (usually within 4.5 hours of onset) by rapidly combining the patient's imaging examination (mainly non-enhanced CT, NCCT) with clinical history information, in order to formulate a targeted treatment plan (such as intravenous thrombolysis). This process places stringent demands on the accuracy and efficiency of diagnosis.

Current auxiliary diagnostic techniques are built mainly around single-modality data. In medical image analysis, most studies focus on segmentation or classification of NCCT images using convolutional neural networks (e.g., U-Net, ResNet) or vision transformers, for example using a 3D CNN to identify large vessel occlusions. In clinical text analysis, information such as symptoms, chief complaints and neurological function scores is extracted from electronic medical records mainly by natural language processing techniques (such as BERT and RNN).

To overcome the limitations of a single modality, multi-modal fusion methods are receiving attention. Early techniques fused images and text statically, for example by feature concatenation, but lacked deep cross-modal interaction.

In addition, with the development of multi-modal large language models, auxiliary diagnosis using general-purpose models (such as GPT-4V) has been attempted, but such models are designed mainly for natural images and offer limited support for 3D medical images. To fit such models, the 3D image is often reduced in dimension (e.g., by selecting a middle slice or computing a maximum intensity projection), which loses spatial information along the scan direction.

In stroke emergency scenarios, existing methods have the following shortcomings. First, models that rely only on images are insufficiently sensitive to hyperacute subtle signs (such as blurring of the gray-white matter boundary and the insular ribbon sign) and are prone to missed diagnoses, while models that rely only on text carry a high risk of misjudgment because other conditions can present with symptoms similar to those of ischemic stroke, which may lead to erroneous treatment. Second, conventional feature concatenation lacks dynamic cross-modal verification: when the imaging findings clearly conflict with the clinical symptoms, the model cannot close the reasoning loop with a "re-check" as a clinician would, which limits diagnostic robustness. Third, existing general-purpose multi-modal models are optimized mainly for 2D images and support the spatial continuity of 3D medical images poorly; dimension reduction easily loses small lesion features that span slices, which limits their use in fine-grained diagnosis. Finally, when a general-purpose large language model is used directly for auxiliary diagnosis, strict medical safety gating is often absent, contraindications cannot be screened systematically, the reasoning process does not follow clinical guideline pathways, and there are risks of hallucinated output and poor interpretability, so clinical credibility is insufficient.
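
The loss of spatial information along the scan direction can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions (a toy volume stored as nested lists indexed `[z][y][x]`, with a single bright voxel standing in for a lesion); the helper names are hypothetical and do not come from the application.

```python
def max_intensity_projection(volume):
    """Collapse a [z][y][x] volume along z, keeping the per-pixel maximum."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(width)]
            for y in range(height)]


def middle_slice(volume):
    """Keep only the central axial slice."""
    return volume[len(volume) // 2]


def make_volume(lesion_z, depth=5, size=3):
    """Empty volume with one bright voxel at slice `lesion_z`."""
    vol = [[[0] * size for _ in range(size)] for _ in range(depth)]
    vol[lesion_z][1][1] = 9
    return vol


top = make_volume(lesion_z=0)     # lesion in the first slice
bottom = make_volume(lesion_z=4)  # lesion in the last slice

# The two volumes differ, yet their projections are identical, so the
# slice position of the lesion cannot be recovered from the projection.
same_projection = max_intensity_projection(top) == max_intensity_projection(bottom)

# The middle slice misses a lesion located in any other slice entirely.
missed_by_middle = all(v == 0 for row in middle_slice(top) for v in row)
```

In this toy case the projection cannot tell the two volumes apart and the middle slice contains no lesion voxel at all, which mirrors the missed cross-slice features described above.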

Disclosure of Invention

In view of the above analysis, embodiments of the invention aim to provide a multi-modal agent-based auxiliary assessment system and method for cerebrovascular lesions, to solve the problems that existing methods cannot perform deep fusion and conflict verification against clinical text while preserving the complete spatial information of three-dimensional medical images, and that they lack deterministic safety guarantees conforming to medical standards.

In one aspect, an embodiment of the invention provides a multi-modal agent-based auxiliary assessment system for cerebrovascular lesions, comprising: a data processing module, configured to acquire three-dimensional non-enhanced CT image data of a patient's cerebral vessels and corresponding electronic medical record text data, and to generate structured image feature data and structured clinical feature data; a cross-modal consistency check agent, configured to perform a medical logic consistency evaluation on the structured image featur