CN-122025171-A - Intelligent diabetes question-answering system based on knowledge graph and large language model
Abstract
The invention discloses an intelligent diabetes question-answering system based on a knowledge graph and a large language model, relating to the technical field of intelligent medical and health services. The system comprises a data integration module, a graph construction module, a model adaptation module, an interaction management and control module, and an iterative optimization module. The data integration module accesses authoritative resources and user data and forms a data pool after processing; the graph construction module generates a multi-dimensional knowledge graph from the data pool; the model adaptation module extracts a corpus, fine-tunes the model, and embeds a constraint mechanism; the interaction management and control module completes risk assessment and scheme adaptation; and the iterative optimization module monitors updated information and dynamically upgrades the knowledge graph, the algorithm parameters, and the model, improving the capability of the system.
Inventors
- YANG WENYANG
- Dun peng
Assignees
- Xi'an Shiyou University (西安石油大学)
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. The intelligent diabetes question-answering system based on a knowledge graph and a large language model, characterized by comprising: a data integration module, which accesses authoritative medical resources in the diabetes field and dynamic heterogeneous user data, performs format unification, semantic mapping and heterogeneity elimination on each of the two types of data, and integrates them into a standardized data pool; a graph construction module, which, based on the standardized data pool output by the data integration module, defines core semantic types and entity relations, completes knowledge extraction, fusion and storage, and generates and outputs a multi-dimensional diabetes knowledge graph fused with the user's real-time information; a model adaptation module, which extracts a dedicated training corpus from the output knowledge graph, uses that corpus to perform lightweight fine-tuning of a general large language model, and embeds a knowledge-graph-based constraint mechanism, so as to output a dedicated large language model adapted to the diabetes field; an interaction management and control module, which receives user interaction input, calls the standardized data and the knowledge graph parameters, calculates the user's comprehensive drug-diet-physiological state-complication risk value by a multidimensional diabetes risk algorithm, calculates the fitness score of a personalized intervention scheme by a diabetes intervention adaptation algorithm, combines the two results to complete risk assessment and scheme adaptation, and outputs an interaction result; and an iterative optimization module, which monitors external medical knowledge updates and the interaction data synchronized by the interaction management and control module, supplements and updates the generated knowledge graph, calibrates the algorithm parameters, iteratively optimizes the output dedicated large language model, and synchronizes the updated knowledge graph, algorithm parameters and model to the corresponding modules, realizing dynamic upgrading of system knowledge and capability.
- 2. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that, in the data integration module, the accessed authoritative medical resources in the diabetes field comprise diabetes diagnosis and treatment standards, hypoglycemic drug instructions, food nutrition composition data, medical research literature in the diabetes field, and clinical diagnosis and treatment cases of diabetes; and the accessed dynamic heterogeneous user data comprise diabetes type, duration of disease, records of past complications, details of daily meals, dietary time periods, blood glucose data, heart rate data, daily exercise data, night sleep duration, and stress state records.
- 3. The intelligent diabetes question-answering system based on a knowledge graph and a large language model, characterized in that, in the data integration module, format unification, semantic mapping and heterogeneity elimination are performed on each of the two types of data. For the authoritative medical resources: format unification converts resources in different original formats into a structured text format and standardizes field naming and field order; semantic mapping extracts the core entities and entity relations in the resources and establishes a semantic mapping table that maps different expressions of the same medical concept to a unified semantic tag; and heterogeneity elimination unifies the different terms used for the same entity in authoritative resources from different sources into standard medical terms. For the user data: format unification keeps the same number of decimal places for numerical physiological data, unifies time-related data into the same timestamp format, and unifies text data into a fixed description specification; semantic mapping converts unstructured text user data into structured entries containing entities and attribute values, and associates numerical user data with the corresponding semantic attribute tags; and heterogeneity elimination unifies user data collected by different acquisition devices, in different ways and of different types into uniformly labeled standard data, eliminating the differences among data of the same type.
- 4. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that, in the graph construction module, core semantic types and entity relations are defined and knowledge extraction, fusion and storage are completed, wherein: the core semantic types comprise diseases, drugs, foods, complications, exercise types, dietary time periods, physiological indexes, high-risk populations, diagnosis and treatment means, and data acquisition devices; the entity relations comprise the treatment relation between diseases and drugs, the comorbidity relation between diseases and complications, the contraindication relation between drugs and foods, the influence relation between foods and physiological indexes, the regulation relation between exercise types and physiological indexes, the association relation between diseases and high-risk populations, and the correspondence between physiological indexes and normal reference values; knowledge extraction combines direct mapping of structured data with semantic analysis of unstructured data, extracting core entities, entity attributes and entity relations from the standardized data pool to form structured knowledge triples; knowledge fusion is realized through data-source priority judgment, knowledge consistency verification and redundant knowledge rejection, the data-source priority being ranked by degree of authority and the association relations among entities being checked for consistency; and knowledge storage combines partitioned storage with index storage, with real-time information stored by partition.
- 5. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that the structure of the multi-dimensional diabetes knowledge graph in the graph construction module comprises a core entity layer, a relation association layer, an attribute description layer, a user association layer and an index structure, specifically: the core entity layer comprises the core entities of diseases, drugs, foods, complications, exercise types, dietary time periods, physiological indexes, high-risk populations, diagnosis and treatment means, and data acquisition devices; the relation association layer comprises the association relation data among the core entities, namely the treatment relation between diseases and drugs, the comorbidity relation between diseases and complications, the contraindication relation between drugs and foods, the influence relation between foods and physiological indexes, the regulation relation between exercise types and physiological indexes, the association relation between diseases and high-risk populations, and the correspondence between physiological indexes and normal reference values; the attribute description layer comprises the intrinsic attribute data of the core entities, wherein disease entities comprise the disease name, pathogenesis, and diagnosis and treatment standards, drug entities comprise the drug name, ingredients, dosage form, and usage and dosage, food entities comprise the food name, nutritional components and glycemic index, and physiological index entities comprise the measurement unit; the user association layer records the user's attribute information and real-time physiological index records and associates them with the core entities; and the index structure stores the entity and relation data in an associated manner.
- 6. The intelligent diabetes question-answering system based on a knowledge graph and a large language model, characterized in that, in the model adaptation module, the lightweight fine-tuning of the general large language model with the dedicated training corpus proceeds as follows: the dedicated training corpus is preprocessed, invalid corpus is removed, the corpus is labeled, and the labeled corpus is divided into a training set and a validation set; a lightweight fine-tuning strategy is then determined, in which the bottom-layer network parameters of the general large language model are frozen and only the parameters of the fully connected layer and the attention layer at the top of the model are adjusted, while an adapter layer is introduced to assist fine-tuning; reasonable training parameters are set and the learning rate is dynamically adjusted using a gradient descent strategy; fine-tuning is performed on the training set while the validation set is used to monitor model performance in real time; and training is stopped when the performance index on the validation set fails to improve for several consecutive evaluations, completing the lightweight fine-tuning of the general large language model.
- 7. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that the knowledge-graph-based constraint mechanism is embedded as follows: a constraint rule base is built from the multi-dimensional diabetes knowledge graph, covering entity association constraints, attribute range constraints and relation rationality constraints in the knowledge graph, wherein the entity association constraints specify the legal and forbidden association types between different entities, the attribute range constraints limit the legal value interval of each entity attribute, and the relation rationality constraints define the conditions under which association relations between entities hold; the constraint rule base is then converted into a constraint representation that the model can recognize and is integrated into the general large language model framework by adding a constraint adaptation layer, so that the model can call it during both training and inference; in the model training stage, the degree of constraint matching is used as an additional training objective and model outputs that violate the constraint rules are penalized, guiding the model to learn output patterns consistent with the logic of the knowledge graph; and in the model inference stage, a constraint verification step is added that checks the generated result against the constraint rules in real time and, if a violation exists, corrects or filters the result according to the rules before the final output.
- 8. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that the mathematical expression of the multidimensional diabetes risk algorithm in the interaction management and control module is: [equation not reproduced in the source text], wherein the quantities defined are, in order: the comprehensive risk value; the base weight of drug-food interaction; the drug-food metabolic interference factor; the GI value of the current diet; the safe GI threshold for diabetic patients; the weight of the physiological index; the user's current physiological index value; the normal reference value of the physiological index; the complication risk amplification coefficient; the synergistic risk factor between complications and diet/medication; the time decay factor; and the time interval between the user's medication and diet.
- 9. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that the mathematical expression of the diabetes intervention adaptation algorithm in the interaction management and control module is: [equation not reproduced in the source text], wherein the quantities defined are, in order: the fitness score of the intervention scheme; the comprehensive risk value calculated by the multidimensional diabetes risk algorithm; three weight coefficients, subject to a constraint that is likewise not reproduced in the source text; the feasibility score of the dietary alternative; the feasibility score of the exercise plan; the exercise-glycemic control synergy factor; the user's historical scheme compliance factor; and the degree of matching between the scheme and the knowledge graph.
- 10. The intelligent diabetes question-answering system based on a knowledge graph and a large language model according to claim 1, characterized in that the interaction management and control module combines the two calculation results to complete risk assessment and scheme adaptation as follows: a calculation rule for a comprehensive assessment score is first set, in which, according to the priority of the two indexes' influence on diabetes risk control, the weight of the comprehensive risk value is 0.6 and the weight of the fitness score is 0.4, so that comprehensive assessment score = comprehensive risk value × 0.6 + fitness score × 0.4; three threshold levels are preset, the low-risk level being a score of 60 or less, the medium-risk level a score of 61 to 85, and the high-risk level a score of 86 or more; the comprehensive assessment score is compared with the thresholds; a low-risk result is judged to pass adaptation, and the corresponding personalized intervention scheme and risk prompt are output; a medium-risk result is judged to require optimization, and the details of the intervention scheme are adjusted in combination with the knowledge graph parameters and the fitness score is recalculated until the comprehensive assessment score falls within the low-risk threshold; and a high-risk result is judged to fail adaptation, and it is recommended that adaptation be carried out again after risk management and control has reduced the risk.
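Claim 3 describes format unification and semantic mapping for user data: a fixed number of decimal places, a single timestamp format, and a mapping table from variant expressions to one semantic tag. The following is a minimal sketch of that idea, not the patented implementation. The mapping table entries, the choice of one decimal place, the ISO 8601 output format, and the assumption that incoming timestamps are in UTC are all hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical semantic mapping table: variant expressions -> unified semantic tag.
SEMANTIC_TAGS = {
    "blood sugar": "blood_glucose",
    "blood glucose": "blood_glucose",
    "glucose": "blood_glucose",
    "pulse": "heart_rate",
    "heart rate": "heart_rate",
}

DECIMALS = 1  # assumed number of decimal places kept for numeric readings


def normalize_record(name: str, value: float, timestamp: str, fmt: str) -> dict:
    """Return one standardized entry: unified tag, rounded value, ISO 8601 time."""
    tag = SEMANTIC_TAGS.get(name.strip().lower())
    if tag is None:
        raise KeyError(f"no semantic tag for {name!r}")
    # Assumption: device timestamps are naive and already expressed in UTC.
    parsed = datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)
    return {"tag": tag, "value": round(value, DECIMALS), "time": parsed.isoformat()}


if __name__ == "__main__":
    print(normalize_record("Blood Sugar", 7.26, "2026/01/28 08:30", "%Y/%m/%d %H:%M"))
```

Raising on an unknown name, rather than passing it through, keeps unmapped expressions out of the data pool until the mapping table is extended.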
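Claim 4 names three fusion steps: data-source priority judgment, knowledge consistency verification, and redundant knowledge rejection. One way these could fit together is sketched below, under stated assumptions: the source names and their ranking, the `SINGLE_VALUED` set of relations, and the placeholder values `range_A` and `range_B` are hypothetical and do not come from the patent.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str
    source: str


# Hypothetical authority ranking: a lower number means a more authoritative source.
SOURCE_PRIORITY = {"guideline": 0, "drug_label": 1, "literature": 2, "case": 3}

# Relations assumed to admit only one tail per (head, relation) pair.
SINGLE_VALUED = {"normal_reference_value"}


def fuse(triples: list[Triple]) -> list[Triple]:
    """Drop exact duplicates and resolve single-valued conflicts by source priority."""
    ordered = sorted(triples, key=lambda t: SOURCE_PRIORITY[t.source])
    seen_facts = set()
    seen_keys = set()
    fused = []
    for t in ordered:
        fact = (t.head, t.relation, t.tail)
        key = (t.head, t.relation)
        if fact in seen_facts:
            continue  # redundant knowledge: the same fact was already kept
        if t.relation in SINGLE_VALUED and key in seen_keys:
            continue  # inconsistent value from a less authoritative source
        seen_facts.add(fact)
        seen_keys.add(key)
        fused.append(t)
    return fused


if __name__ == "__main__":
    candidates = [
        Triple("fasting_glucose", "normal_reference_value", "range_B", "case"),
        Triple("fasting_glucose", "normal_reference_value", "range_A", "guideline"),
        Triple("metformin", "treats", "type_2_diabetes", "literature"),
        Triple("metformin", "treats", "type_2_diabetes", "drug_label"),
    ]
    for triple in fuse(candidates):
        print(triple)
```

Sorting by priority first means that, whenever two sources disagree or overlap, the copy that survives is the one from the more authoritative source.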
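Claim 6 stops fine-tuning once the validation metric has failed to improve several times in a row. The sketch below shows only that stopping rule, assuming a higher metric is better. The class name, the `patience` and `min_delta` parameters, and the sample metric values are illustrative and are not specified in the claim.

```python
class EarlyStopping:
    """Signal a stop when the validation metric has not improved
    for `patience` consecutive evaluations (higher metric = better)."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience      # allowed consecutive non-improvements
        self.min_delta = min_delta    # minimum gain that counts as an improvement
        self.best = float("-inf")
        self.stale = 0

    def should_stop(self, metric: float) -> bool:
        if metric > self.best + self.min_delta:
            self.best = metric
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


if __name__ == "__main__":
    stopper = EarlyStopping(patience=2)
    for epoch, score in enumerate([0.70, 0.75, 0.74, 0.75], start=1):
        if stopper.should_stop(score):
            print(f"stop after evaluation {epoch}, best={stopper.best}")
            break
```

In a real training loop, `should_stop` would be called once per validation pass, and the checkpoint with the best metric would normally be the one kept.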
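Claim 7 adds an inference-stage verification step that checks generated results against the constraint rule base and corrects or filters violations. The following minimal sketch covers two of the three rule kinds (forbidden entity associations and attribute ranges) and implements filtering only, not correction. The rule contents, the entity names `drug_X` and `food_Y`, the `dose_mg` interval, and the fallback message are placeholders, not medical facts or patent text.

```python
from dataclasses import dataclass, field


@dataclass
class ConstraintRules:
    # (head, relation, tail) triples that must never be asserted.
    forbidden_associations: set = field(default_factory=set)
    # attribute name -> (min, max) legal value interval.
    attribute_ranges: dict = field(default_factory=dict)


def violations(rules: ConstraintRules, triples: list, attributes: dict) -> list:
    """Describe every rule broken by the facts extracted from a candidate answer."""
    found = []
    for triple in triples:
        if triple in rules.forbidden_associations:
            found.append(f"forbidden association: {triple}")
    for name, value in attributes.items():
        low, high = rules.attribute_ranges.get(name, (float("-inf"), float("inf")))
        if not low <= value <= high:
            found.append(f"{name}={value} outside [{low}, {high}]")
    return found


def verify(rules, answer, triples, attributes, fallback="Unable to give a safe answer."):
    """Return the answer unchanged if it breaks no rule; otherwise filter it."""
    return fallback if violations(rules, triples, attributes) else answer


if __name__ == "__main__":
    rules = ConstraintRules(
        forbidden_associations={("drug_X", "taken_with", "food_Y")},  # placeholder rule
        attribute_ranges={"dose_mg": (0, 100)},                        # placeholder range
    )
    print(verify(rules, "Take drug_X with food_Y.", [("drug_X", "taken_with", "food_Y")], {}))
```

This sketch assumes that the triples and attribute values have already been extracted from the model's draft answer; that extraction step is outside its scope.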
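Claim 10 gives the combination rule explicitly: comprehensive assessment score = comprehensive risk value × 0.6 + fitness score × 0.4, compared against three threshold levels. A minimal sketch of that rule follows. The weights and thresholds are taken from the claim; the function names and the action labels are illustrative. The claim does not say how a non-integer score between 60 and 61, or between 85 and 86, is classified, so the sketch assumes that anything above 60 and below 86 is medium risk.

```python
RISK_WEIGHT = 0.6     # weight of the comprehensive risk value (claim 10)
FITNESS_WEIGHT = 0.4  # weight of the fitness score (claim 10)
LOW_MAX = 60          # low risk: score of 60 or less
HIGH_MIN = 86         # high risk: score of 86 or more


def composite_score(risk_value: float, fitness_score: float) -> float:
    """Weighted sum of the two calculation results."""
    return risk_value * RISK_WEIGHT + fitness_score * FITNESS_WEIGHT


def risk_level(score: float) -> str:
    if score <= LOW_MAX:
        return "low"
    if score >= HIGH_MIN:
        return "high"
    return "medium"  # assumption: covers every score strictly between 60 and 86


# Illustrative action labels for the three outcomes described in the claim.
ACTIONS = {"low": "pass", "medium": "optimize", "high": "fail"}


def decide(risk_value: float, fitness_score: float) -> tuple:
    score = composite_score(risk_value, fitness_score)
    level = risk_level(score)
    return score, level, ACTIONS[level]


if __name__ == "__main__":
    for pair in [(50, 70), (80, 60), (90, 90)]:
        print(pair, decide(*pair))
```

For a medium-risk result, the claim calls for adjusting the intervention scheme and recalculating the fitness score until the composite falls to the low-risk level; that loop would wrap `decide` and is not shown here.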
Description
Intelligent diabetes question-answering system based on knowledge graph and large language model
Technical Field
The invention relates to the technical field of intelligent medical and health services, and in particular to an intelligent diabetes question-answering system based on a knowledge graph and a large language model.
Background
Diabetes is a chronic disease requiring long-term intervention. Its health management spans multiple dimensions, including compliance with diagnosis and treatment standards, rational drug use, regulation of diet and exercise, and monitoring of physiological indexes, and it places very high demands on the continuity and accuracy of professional medical guidance. With the development of medical informatization, the diabetes field has accumulated massive authoritative medical resources covering diagnosis and treatment standards, drug information, nutrition data, clinical cases and more. At the same time, the spread of health monitoring devices has made it convenient to collect users' personal dynamic health data, such as physiological indexes, diet and exercise records, and complications. Knowledge graph technology, with its strong structured knowledge representation and association capability, can effectively integrate scattered domain knowledge and support knowledge management in the medical field. Large language models show outstanding advantages in natural language interaction, professional content generation and information extraction. Combining the two lays a technical foundation for building an intelligent diabetes service system, and using intelligent technology to improve the quality of diabetes consultation responses, risk assessment and personalized intervention is of great significance.
Existing diabetes-related technologies and services have many limitations in practice. At the data processing level, authoritative medical resources and users' dynamic data are heterogeneous: their sources are scattered, their formats differ, and their semantic expressions vary. Without an effective integration mechanism, knowledge and data cannot support each other, and no unified, comprehensive data basis is available for subsequent intelligent analysis. At the knowledge application level, the knowledge representation of traditional systems is relatively simple; it cannot fully describe the complex associations among entities such as diseases, drugs, foods and physiological indexes in the diabetes field, and it lacks dynamic fusion of users' real-time information, so the knowledge service is poorly targeted. At the model adaptation level, general language models are not optimized for the domain and lack the professional constraints of the medical field; their outputs can be non-standard and inaccurate, and they struggle to meet the diabetes field's requirements for professionalism and rigor. At the service level, traditional risk assessment is limited to a single dimension, and intervention schemes lack comprehensive consideration of factors such as individual user differences and historical compliance. Such systems are also difficult to update in step with advances in medical research, so they cannot fully meet the need for accurate, dynamic diabetes management.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing an intelligent diabetes question-answering system based on a knowledge graph and a large language model. The system integrates authoritative medical resources in the diabetes field with users' dynamic data and, after standardized processing, builds a multi-dimensional knowledge graph fused with the user's real-time information. It extracts a dedicated structured corpus to perform lightweight fine-tuning of a general large language model and embeds a knowledge graph constraint mechanism to form a domain-specific model. Through the interaction management and control module, it links the related data with the graph parameters, completes the user's comprehensive risk assessment and the adaptation of a personalized intervention scheme, and dynamically optimizes the scheme according to the risk level. By means of the iterative optimization module, it tracks external medical knowledge updates and user interaction data to realize dynamic upgrading of the system's knowledge, algorithms and model. The system can provide diagnosis and treatment assistance for medical staff and health management guidance for patients, improving the professionalism, accuracy and continuity of diabetes management. The invention provides a technical scheme for solving the techni