CN-122025086-A - Diagnosis and treatment decision reasoning method and system integrating LLM (large language model) and knowledge graph


Abstract

The invention relates to the technical field of diagnosis and treatment decision reasoning for brain diseases, and in particular to a diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph. The method comprises the following steps: extracting, by means of a language model, a plurality of standard entities from brain disease cases, the standard entities comprising at least symptom entities and case entities; using the symptom entities and the case entities respectively to query the knowledge graph for the corresponding suspected diseases and candidate diseases; calculating the disease similarity between each suspected disease and each candidate disease; defining the candidate diseases whose disease similarity is greater than or equal to a similarity threshold, together with all the suspected diseases, as potential diseases to form a potential disease set; setting each potential disease in the potential disease set as a starting node and building a directed associated subgraph by means of the knowledge graph; defining the plurality of examination means corresponding to each potential disease in the directed associated subgraph as evaluation means; and screening the evaluation means that have effective discrimination capability to build an evaluation means set.
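
The construction of the potential disease set described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not name a similarity metric, so Jaccard similarity over symptom sets is assumed here, and a candidate is assumed to qualify when it reaches the threshold against at least one suspected disease. The function names, the toy feature table and the threshold value 0.4 are all hypothetical.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (assumed metric, not named in the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_potential_set(
    suspected: Set[str],
    candidates: Set[str],
    features: Dict[str, Set[str]],
    threshold: float,
) -> Set[str]:
    """All suspected diseases plus every candidate whose similarity to at
    least one suspected disease is >= threshold."""
    potential = set(suspected)
    for cand in candidates:
        if any(
            jaccard(features[cand], features[s]) >= threshold for s in suspected
        ):
            potential.add(cand)
    return potential


# Hypothetical toy data: disease -> symptom set taken from a knowledge graph.
features = {
    "cerebral infarction": {"limb weakness", "slurred speech", "facial droop"},
    "cerebral hemorrhage": {"limb weakness", "headache", "vomiting"},
    "brain tumor": {"headache", "vomiting", "seizure"},
    "migraine": {"headache", "photophobia"},
}
suspected = {"cerebral infarction", "cerebral hemorrhage"}  # found via symptom entities
candidates = {"brain tumor", "migraine"}                    # found via case entities

potential = build_potential_set(suspected, candidates, features, threshold=0.4)
print(sorted(potential))
```

In this toy data "brain tumor" shares two of four symptoms with "cerebral hemorrhage" (similarity 0.5) and is kept, while "migraine" reaches at most 0.25 and is dropped.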

Inventors

Sun Leyong
Ren Xiaohui
Zhao Yanlei

Assignees

众保健康科技服务(济南)有限公司

Dates

Publication Date: 20260512
Application Date: 20260123

Claims (10)

1. A diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph, characterized by comprising the following steps: S1, extracting, by means of a language model, a plurality of standard entities from brain disease cases, the standard entities comprising at least symptom entities and case entities; S2, using the symptom entities and the case entities respectively to query the knowledge graph for the corresponding suspected diseases and candidate diseases, forming a corresponding suspected disease set and a corresponding candidate disease set; defining the candidate diseases in the candidate disease set whose disease similarity is greater than or equal to a similarity threshold, together with all the suspected diseases in the suspected disease set, as potential diseases, which jointly form a potential disease set; S3, setting each potential disease in the potential disease set as a starting node in turn, and constructing, by means of the knowledge graph, a directed associated subgraph comprising potential diseases, examination means and identification features; and S4, retrieving each potential disease in the directed associated subgraph and the plurality of examination means corresponding to each potential disease, defining each examination means in turn as an evaluation means, performing targeted analysis on the evaluation means, screening the evaluation means that have effective discrimination capability to construct an evaluation means set, and analyzing the priority of each evaluation means in the evaluation means set.
2. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 1, wherein step S1 extracts a plurality of standard entities comprising at least symptom entities and case entities through the following specific steps: step one, preprocessing the unstructured text, the preprocessing comprising at least word segmentation, stop-word removal and correction of medical-term segmentation, to obtain a text sequence; step two, taking the preprocessed text sequence as input, locating the fragments in the text sequence that belong to medical entities, and outputting an original entity set comprising a plurality of original entities; step three, matching each original entity in the original entity set against the standard entities in a standardized medical term library, each original entity that is successfully mapped being a standard entity conforming to the medical specification.
3. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 2, wherein step S2 screens, for the plurality of entities obtained in step S1, the potential diseases related to brain diseases in order to construct the potential disease set, specifically: receiving the symptom entities from step S1, using the symptom entities to search the knowledge graph for the brain disease nodes directly associated with them, defining each brain disease node directly associated with a symptom entity as a suspected disease, and constructing the suspected disease set from all the suspected diseases; and wherein, after the suspected disease set has been constructed, step S2 introduces the case entities from step S1 to construct the candidate disease set, specifically: receiving the case entities from step S1, using the case entities to search the knowledge graph again for the brain disease nodes that have the case entities as risk factors, defining each brain disease node retrieved through a case entity as a candidate disease, and constructing the candidate disease set from all the candidate diseases.
4. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 3, wherein step S2 calculates in turn the disease similarity between each candidate disease in the candidate disease set and each suspected disease in the suspected disease set, sets a similarity threshold, and compares the disease similarity with the similarity threshold: if the disease similarity is greater than or equal to the similarity threshold, the candidate disease and the corresponding suspected disease are indicated to have a multi-cause, co-occurrence relationship; if the disease similarity is less than the similarity threshold, the candidate disease is indicated to differ significantly from the suspected disease in the diagnostic dimension.
5. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 4, wherein step S2 constructs the potential disease set from a plurality of potential diseases, specifically: after the disease similarity between each candidate disease and each suspected disease has been calculated, the candidate diseases in the candidate disease set whose disease similarity is greater than or equal to the similarity threshold, together with all the suspected diseases in the suspected disease set, are defined as potential diseases and form the potential disease set.
6. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 5, wherein step S3 constructs the directed associated subgraph, specifically: each potential disease in the potential disease set is set as a starting node in turn, and the directed associated subgraph is constructed from the directed-edge associations among potential diseases, examination means and identification features that are already stored in structured form in the knowledge graph, through the following specific steps: step one, inputting each potential disease in the potential disease set into the knowledge graph in turn, matching the corresponding disease node in the knowledge graph, traversing from that disease node all the examination means nodes that are directly and directionally associated with it in the knowledge graph, establishing a first-level directed link between the potential disease and each examination means, and retaining the association attributes of each link; step two, for each associated examination means, further traversing all the identification feature nodes that are directly and directionally associated with the examination means in the knowledge graph, establishing a second-level directed link between the examination means and each identification feature, and likewise retaining the association attributes of each link; step three, integrating, for each single potential disease, the potential disease, its examination means, its identification features and the complete directed links among them to form a local directed subgraph specific to that potential disease, and merging all the local subgraphs to form the directed associated subgraph comprising all the potential diseases, the corresponding examination means, the specific identification features and the complete association logic.
7. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 6, wherein step S4 retrieves the examination means applicable to the different diseases in the directed associated subgraph, defines each examination means in turn as an evaluation means, performs targeted analysis on the evaluation means, screens the evaluation means that have effective discrimination capability to construct the evaluation means set, and analyzes the priority of the evaluation means, through the following specific steps: step one, receiving the potential disease set output by step S2 and the plurality of examination means corresponding to all the potential diseases in the potential disease set, counting the number of potential diseases that each examination means can cover, sorting all the examination means by the number covered, selecting the examination means to be evaluated in turn according to that order, and defining each selected examination means as an evaluation means; step two, for the evaluation means, randomly selecting one potential disease from the potential diseases covered by the evaluation means as the target disease, and defining the remaining, unselected potential diseases as control diseases; step three, setting an exclusive intensity threshold, screening the evaluation means whose exclusive intensity is greater than the intensity threshold, and counting the number of control diseases corresponding to each such evaluation means; step four, retrieving, for each evaluation means, its corresponding control diseases and their number, and, under the constraint that the selected set contains the fewest evaluation means while completely covering the potential diseases, selecting some of the evaluation means to form the evaluation means set; step five, mapping each evaluation means in the evaluation means set to its corresponding potential diseases to form a potential disease subset, and assigning the priority of each evaluation means in the evaluation means set according to the number of potential diseases contained in its potential disease subset.
8. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 7, wherein the exclusive intensity with which the evaluation means in step S4 distinguishes the target disease from each control disease is obtained specifically as follows: determining the specific identification features of the evaluation means, that is, analyzing the specific identification features by which the evaluation means distinguishes the target disease from the control diseases; calculating a feature specificity score, which measures the probability that the specific identification feature occurs in the control disease; calculating a disease discrimination score, which judges whether the specific identification feature can clearly discriminate the target disease from the control disease; and integrating the feature specificity score and the disease discrimination score by weighted summation to obtain the exclusive intensity.
9. The diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to claim 8, wherein determining the specific identification features of the evaluation means in step S4 specifically comprises: receiving all the identification features corresponding to the evaluation means in the directed associated subgraph, and analyzing the specific identification features by which the evaluation means distinguishes the target disease from the control diseases, namely the features specific to the target disease that are explicitly recommended in authoritative medical guidelines and clinical consensus.
10. A diagnosis and treatment decision reasoning system integrating an LLM and a knowledge graph, applied to the diagnosis and treatment decision reasoning method integrating an LLM and a knowledge graph according to any one of claims 1-9, characterized by comprising: a standard entity extraction module (100), which extracts, by means of a language model, a plurality of standard entities from brain disease cases, the standard entities comprising at least symptom entities and case entities; a potential disease set construction module (200), which uses the symptom entities and the case entities respectively to query the knowledge graph for the corresponding suspected diseases and candidate diseases, forming a corresponding suspected disease set and a corresponding candidate disease set, and which defines the candidate diseases in the candidate disease set whose disease similarity is greater than or equal to a similarity threshold, together with all the suspected diseases in the suspected disease set, as potential diseases that jointly form a potential disease set; a directed associated subgraph construction module (300), which sets each potential disease in the potential disease set as a starting node in turn and constructs, by means of the knowledge graph, a directed associated subgraph whose core link comprises potential diseases, examination means and identification features; and an evaluation means screening and sorting module (400), which retrieves each potential disease in the directed associated subgraph and the plurality of examination means corresponding to each potential disease, defines each examination means in turn as an evaluation means, performs targeted analysis on the evaluation means, screens the evaluation means that have effective discrimination capability to construct an evaluation means set, and analyzes the priority of each evaluation means in the evaluation means set.
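
The screening and prioritization of evaluation means set out in claims 7 and 8 can be illustrated with the following sketch. It rests on several assumptions that the claims do not fix: the feature specificity score is taken as one minus the probability that the feature occurs in the control disease, the weights default to 0.5 each, and the "fewest evaluation means with complete coverage" constraint is approximated with a greedy set-cover heuristic rather than solved exactly. All function names, scores and the threshold 0.6 are hypothetical.

```python
from typing import Dict, List, Set


def exclusive_intensity(
    p_feature_in_control: float,
    discrimination: float,
    w_specificity: float = 0.5,
    w_discrimination: float = 0.5,
) -> float:
    """Weighted sum of a feature specificity score and a discrimination score."""
    # Assumption: specificity = 1 - probability the feature occurs in the control.
    specificity = 1.0 - p_feature_in_control
    return w_specificity * specificity + w_discrimination * discrimination


def select_evaluation_set(
    coverage: Dict[str, Set[str]], potential: Set[str]
) -> List[str]:
    """Greedy approximation of the smallest set of means covering all diseases."""
    uncovered = set(potential)
    chosen: List[str] = []
    while uncovered and coverage:
        # Pick the means covering the most still-uncovered diseases
        # (sorted() makes ties deterministic).
        best = max(sorted(coverage), key=lambda m: len(coverage[m] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # the remaining diseases are covered by no available means
        chosen.append(best)
        uncovered -= gained
    return chosen


def prioritize(
    chosen: List[str], coverage: Dict[str, Set[str]], potential: Set[str]
) -> List[str]:
    """Order means by the size of their potential disease subset, largest first."""
    return sorted(chosen, key=lambda m: (-len(coverage[m] & potential), m))


# Hypothetical toy data: examination means -> potential diseases it covers.
potential = {"cerebral infarction", "cerebral hemorrhage", "brain tumor", "epilepsy"}
coverage = {
    "head CT": {"cerebral infarction", "cerebral hemorrhage", "brain tumor"},
    "MRI DWI": {"cerebral infarction"},
    "EEG": {"epilepsy"},
    "skull X-ray": {"cerebral hemorrhage"},
}
intensity = {
    "head CT": exclusive_intensity(0.1, 1.0),
    "MRI DWI": exclusive_intensity(0.2, 1.0),
    "EEG": exclusive_intensity(0.3, 1.0),
    "skull X-ray": exclusive_intensity(0.9, 0.0),
}

# Keep only means whose exclusive intensity exceeds the (assumed) threshold.
usable = {m: coverage[m] for m, v in intensity.items() if v > 0.6}
evaluation_set = prioritize(select_evaluation_set(usable, potential), usable, potential)
print(evaluation_set)
```

Greedy set cover does not guarantee the true minimum, but it is a common and simple stand-in when the number of examination means is small; an exact solver could be substituted without changing the surrounding steps.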

Description

Diagnosis and treatment decision reasoning method and system integrating LLM (large language model) and knowledge graph

Technical Field

The invention relates to the technical field of diagnosis and treatment decision reasoning for brain diseases, and in particular to a diagnosis and treatment decision reasoning method and system integrating an LLM and a knowledge graph.

Background

Diagnosis and treatment decision reasoning is the process of combining a patient's clinical information, medical knowledge and evidence-based evidence to identify the patient's potential diseases through systematic reasoning, screen the core examination means, and assist in formulating a diagnosis and treatment plan.

In diagnosis and treatment decision reasoning for brain diseases, the traditional approach is for the doctor first to ask the patient about the symptoms, then to combine the symptoms described by the patient with the historical cases among brain disease cases; the doctor judges the diseases the patient may have on the basis of clinical experience and recommends routine examination items (such as general imaging examinations and basic laboratory examinations) according to that judgment.

However, the pathological mechanisms of brain diseases are complex and the clinical manifestations of many diseases overlap heavily (for example, weakness of a single limb may correspond to cerebral infarction, cerebral hemorrhage, brain tumor and other diseases). Although doctors can refer to historical cases to refine the diagnosis, those cases are stored as unstructured text. When there are too many of them, the doctor must spend a great deal of time manually screening massive amounts of text for the key information relevant to the current patient's condition, and it is difficult to quickly locate the core identification points of similar cases and the supporting evidence-based medical evidence. At the same time, the symptom-pathology-disease association logic in the cases cannot be integrated efficiently, and information overload easily causes key information to be missed or the focus of judgment to be scattered. This leads to low case-matching efficiency, insufficient basis for differentiation and a prolonged diagnosis cycle, and may even create a risk of missed diagnosis owing to the subjectivity and limitations of manual analysis, which affects the accuracy and efficiency of diagnosis and treatment.

Existing approaches therefore train a language model adapted to the medical field (relying on authoritative clinical diagnosis and treatment guidelines, evidence-based medical literature and standardized case libraries to optimize the model's medical entity recognition and text understanding abilities) so that standardized medical entities (including symptom entities, case entities, etc.) can be extracted more quickly from the unstructured text of brain disease cases (such as patient complaints, inquiry records and examination report descriptions), and construct a knowledge graph (integrating the structured associations symptom-pathology-disease, medical history-risk disease and disease-examination means-identification features, annotated with association strength, specificity, etc.).

The specific working principle of training the language model adapted to the medical field is as follows. Authoritative medical data (clinical diagnosis and treatment guidelines, evidence-based medical literature, standardized case libraries and medical term libraries) is used as the core training corpus. The corpus is first preprocessed (word segmentation, stop-word removal, correction and standardization of medical terms, and filtering of irrelevant non-medical content). Domain-adaptive fine-tuning is then performed on a general pre-trained language model (such as the BERT and GPT series), so that the model learns the semantics of medical terms, the logic of clinical text and the entity association rules through tasks such as masked language modeling, medical entity recognition and text classification. Next, optimization specific to the medical field is introduced (using the knowledge graph to constrain the consistency of the entities and relations output by the model, and performing supervised fine-tuning with high-quality data annotated by medical experts). Finally, model performance is verified using indicators such as medical entity recognition accuracy, clinical text understanding accuracy and medical question-answering accuracy, and the model is iteratively optimized until it meets the requirements of clinical application, yielding the language model adapted to the medical field.

The working principle of constructing the knowledge graph is that evidence-based medical evidence (such as the Cochrane database and research in core medical journals) and clinical diagnosis and treatment guidelines are taken as the core authoritative data sources, unstructured guide text and document content are