CN-122025100-A - Traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing

Abstract

The invention discloses a traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing, and relates in particular to the technical field of medical information processing. The method comprises: segmenting the traditional Chinese medicine medical record text of a patient into clauses and symptom sentences to construct a symptom sentence sequence data set; analyzing the co-occurrence relations and appearance position distribution of the symptom sentences in the patient description context to generate context-related features of the symptom sentences; identifying, from these context-related features, the functional role differences of the same symptom under different description contexts to generate symptom context role marking data; dynamically reassigning, based on the symptom context role marking data, the participation degrees of symptom features in different syndrome type reasoning processes; performing a constrained update of the traditional Chinese medicine dialectical language model; and, in the reasoning stage, inputting the medical record text of a patient to be analyzed into the trained model and outputting the corresponding traditional Chinese medicine syndrome type judgment result.

Inventors

CUI HAIYANG
CUI PEIYUAN
WANG LIE
WEI LINYOU
WEI QUNLI

Assignees

河南经方云科技有限公司

Dates

Publication Date: 20260512
Application Date: 20260130

Claims (7)

1. A traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing, characterized by comprising the following steps: S1, segmenting the traditional Chinese medicine medical record text of a patient into clauses and symptom sentences to construct a symptom sentence sequence data set containing the patient description context; S2, analyzing, based on the symptom sentence sequence data set, the co-occurrence relations and appearance position distribution of the symptom sentences in the patient description context, and generating context-related features of the symptom sentences; S3, identifying, according to the context-related features of the symptom sentences, the functional role differences of the same symptom under different description contexts, distinguishing core symptom roles from accompanying symptom roles, and generating symptom context role marking data; S4, dynamically reassigning, based on the symptom context role marking data, the participation degrees of symptom features in different syndrome type reasoning processes to generate symptom weight adjustment data; S5, performing a constrained update of the traditional Chinese medicine dialectical language model according to the symptom weight adjustment data, and establishing a mapping relation describing how symptom weights change with the description context; S6, inputting the medical record text of a patient to be analyzed into the trained traditional Chinese medicine dialectical language model, dynamically adjusting the symptom weights based on the mapping relation, and outputting the corresponding traditional Chinese medicine syndrome type judgment result.
2. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 1, wherein S1 specifically comprises: performing syntactic boundary recognition on the continuous natural language descriptions in the patient's traditional Chinese medicine medical record text, and dividing the text into a medical record sentence set using complete semantic expressions as boundaries; extracting from the medical record sentence set, based on the combinations of traditional Chinese medicine symptom words and state description words, the symptom-related sentences that express the patient's subjective feelings and objective manifestations or the doctor's recorded judgments; and arranging the symptom-related sentences in the order in which they were originally described in the patient's medical record text to form the symptom sentence sequence data set.
3. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 2, wherein S2 specifically comprises: counting the adjacency relations and interval distribution of the symptom-related sentences in the symptom sentence sequence data set to obtain sentence co-occurrence features among the symptom-related sentences; determining the appearance position distribution features of the symptom-related sentences in the patient description context based on their order in the symptom sentence sequence data set; and generating the context-related features of the symptom sentences based on the sentence co-occurrence features and the appearance position distribution features.
4. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 3, wherein S3 specifically comprises: extracting, based on the context-related features of the symptom sentences, the traditional Chinese medicine symptom words in the symptom-related sentences and merging synonyms to form same-symptom sets; comparing the sentence co-occurrence features and appearance position distribution features of the symptom-related sentences corresponding to each same-symptom set, and determining the functional role differences of the same-symptom set under different patient description contexts; and marking the same-symptom set with a core symptom role or an accompanying symptom role according to the functional role differences, to generate the symptom context role marking data.
5. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 4, wherein S4 specifically comprises: collecting, based on the symptom context role marking data, the core symptom role labels and accompanying symptom role labels of the same-symptom set under different patient description contexts; determining the change in the participation degree of the same-symptom set in the syndrome type reasoning process from the distribution differences of the core symptom role labels and accompanying symptom role labels across patient description contexts; and assigning differentiated weights to the symptom features corresponding to the same-symptom set based on that change, to generate symptom weight adjustment data reflecting the change in participation degree of the symptom features.
6. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 5, wherein S5 specifically comprises: associating the differentiated weight assignments in the symptom weight adjustment data with the corresponding patient description contexts, and determining the trend of symptom weight change under different patient description context conditions; establishing, according to that trend, a correspondence mapping between the symptom weights of the same-symptom set and the patient description contexts; and using the correspondence mapping to perform the constrained update of the traditional Chinese medicine dialectical language model, to obtain the trained traditional Chinese medicine dialectical language model.
7. The traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing according to claim 6, wherein S6 specifically comprises: segmenting the traditional Chinese medicine medical record text of the patient to be analyzed into clauses and symptom sentences to form a symptom sentence sequence data set to be analyzed; inputting the symptom sentence sequence data set to be analyzed into the trained traditional Chinese medicine dialectical language model, and dynamically adjusting the symptom weights of the symptom sentences in that data set based on the correspondence mapping between symptom weights and patient description contexts; and performing comprehensive reasoning over the symptom sentence sequence data set to be analyzed based on the dynamically adjusted symptom weights, and outputting the traditional Chinese medicine syndrome type judgment result matching the medical record text of the patient to be analyzed.
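
The feature-construction steps recited in claims 2 to 4 (S1 to S3) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The mini-lexicon `SYMPTOM_TERMS`, the punctuation-based clause splitting, the one-sentence adjacency window, and the scoring rule with its 0.7 threshold are all assumptions introduced here for illustration; the claims do not specify them.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-lexicon: surface form -> canonical symptom (synonym merging).
SYMPTOM_TERMS = {
    "headache": "headache",
    "head pain": "headache",
    "insomnia": "insomnia",
    "poor sleep": "insomnia",
    "fatigue": "fatigue",
    "dry mouth": "dry mouth",
}


def split_symptom_sentences(record: str) -> list[tuple[int, str, set[str]]]:
    """S1 (assumed form): split on sentence punctuation and keep, in original
    order, the clauses that mention at least one lexicon term."""
    clauses = [c.strip() for c in re.split(r"[.;!?。；！？]", record) if c.strip()]
    kept = []
    for clause in clauses:
        lowered = clause.lower()
        found = {canon for term, canon in SYMPTOM_TERMS.items() if term in lowered}
        if found:
            kept.append((len(kept), clause, found))
    return kept


def context_features(sentences, window: int = 1):
    """S2 (assumed form): co-occurrence counts for symptoms in the same or
    nearby sentences, plus each symptom's relative positions (0 = first)."""
    n = len(sentences)
    cooc = Counter()
    positions = {}
    for i, _, symptoms in sentences:
        rel = i / (n - 1) if n > 1 else 0.0
        for s in symptoms:
            positions.setdefault(s, []).append(rel)
        # Pairs inside the same sentence.
        for a, b in combinations(sorted(symptoms), 2):
            cooc[(a, b)] += 1
        # Pairs with symptoms in the next `window` sentences.
        for _, _, later in sentences[i + 1 : i + 1 + window]:
            for a in symptoms:
                for b in later:
                    if a != b:
                        cooc[tuple(sorted((a, b)))] += 1
    return cooc, positions


def label_roles(cooc, positions, threshold: float = 0.7) -> dict[str, str]:
    """S3 (assumed heuristic): a symptom is 'core' when it is mentioned early
    and co-occurs often with other symptoms; otherwise 'accompanying'."""
    degree = Counter()
    for (a, b), count in cooc.items():
        degree[a] += count
        degree[b] += count
    max_degree = max(degree.values(), default=0) or 1
    roles = {}
    for symptom, rel_positions in positions.items():
        earliness = 1.0 - min(rel_positions)
        connectivity = degree[symptom] / max_degree
        score = 0.5 * earliness + 0.5 * connectivity
        roles[symptom] = "core" if score >= threshold else "accompanying"
    return roles


record_a = "Headache for three days. Poor sleep and fatigue. Dry mouth at night."
record_b = "Fatigue for a month. Dry mouth. Occasional headache."
roles_a = label_roles(*context_features(split_symptom_sentences(record_a)))
roles_b = label_roles(*context_features(split_symptom_sentences(record_b)))
```

With these two illustrative records, headache is labeled a core symptom in `roles_a` and an accompanying symptom in `roles_b`, which is the kind of context-dependent role difference S3 is meant to capture.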

Description

Traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing

Technical Field

The invention relates to the technical field of medical information processing, and in particular to a traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing.

Background

Traditional Chinese medicine diagnosis and treatment, a key link in clinical practice, generally relies on the physician's subjective inquiry and observation to obtain the patient's symptoms; the physician then judges from experience the correlations among the symptoms and their relative importance to determine the corresponding traditional Chinese medicine syndrome type. Physicians often differ subjectively in how they understand symptoms, and in clinical practice patients' symptom descriptions are complex in content and highly uncertain in expression, so it is difficult for physicians to apply a uniform standard when identifying symptoms and determining their weights, which in turn reduces the accuracy and consistency of traditional Chinese medicine syndrome differentiation. The prior art lacks an effective method for accurately characterizing how symptom weights change dynamically with the patient description context, so syndrome differentiation results are disconnected from actual clinical diagnosis and treatment needs.

Disclosure of Invention

In order to overcome the above drawbacks of the prior art, embodiments of the present invention provide a traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing, to solve the problems set forth in the background.
In order to achieve the above purpose, the present invention provides the following technical solution: a traditional Chinese medicine dialectical language model training and reasoning method based on natural language processing, comprising the following steps: S1, segmenting the traditional Chinese medicine medical record text of a patient into clauses and symptom sentences to construct a symptom sentence sequence data set containing the patient description context; S2, analyzing, based on the symptom sentence sequence data set, the co-occurrence relations and appearance position distribution of the symptom sentences in the patient description context, and generating context-related features of the symptom sentences; S3, identifying, according to the context-related features of the symptom sentences, the functional role differences of the same symptom under different description contexts, distinguishing core symptom roles from accompanying symptom roles, and generating symptom context role marking data; S4, dynamically reassigning, based on the symptom context role marking data, the participation degrees of symptom features in different syndrome type reasoning processes to generate symptom weight adjustment data; S5, performing a constrained update of the traditional Chinese medicine dialectical language model according to the symptom weight adjustment data, and establishing a mapping relation describing how symptom weights change with the description context; S6, inputting the medical record text of a patient to be analyzed into the trained traditional Chinese medicine dialectical language model, dynamically adjusting the symptom weights based on the mapping relation, and outputting the corresponding traditional Chinese medicine syndrome type judgment result.
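As a rough illustration of steps S4 to S6 above, the following Python sketch replaces the language model with a simple weighted-scoring stand-in, so it shows only the data flow from role labels to weights, to a context-to-weight mapping, to a syndrome type decision. The role weights in `ROLE_WEIGHT`, the averaging used to build the mapping, the placeholder table `SYNDROME_ASSOC`, and the names `syndrome_A` and `syndrome_B` are hypothetical values chosen for the example, not clinical data and not details given in this document.

```python
from collections import defaultdict

# Assumed role weights (S4): core symptoms participate more than accompanying ones.
ROLE_WEIGHT = {"core": 1.0, "accompanying": 0.4}

# Hypothetical symptom-to-syndrome association strengths (placeholder numbers).
SYNDROME_ASSOC = {
    "syndrome_A": {"headache": 0.9, "fatigue": 0.2},
    "syndrome_B": {"fatigue": 0.9, "headache": 0.2},
}


def weight_adjustments(role_labels: dict[str, str]) -> dict[str, float]:
    """S4 (assumed form): turn the role labels of one narrative into
    normalized symptom weights that sum to 1."""
    raw = {symptom: ROLE_WEIGHT[role] for symptom, role in role_labels.items()}
    total = sum(raw.values())
    return {symptom: weight / total for symptom, weight in raw.items()}


def fit_weight_mapping(training_cases: list[dict[str, str]]) -> dict[tuple[str, str], float]:
    """S5 stand-in: average the adjusted weight observed for each
    (symptom, role) pair across training narratives. The described method
    instead uses such a mapping to constrain a language model update."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for role_labels in training_cases:
        for symptom, weight in weight_adjustments(role_labels).items():
            key = (symptom, role_labels[symptom])
            sums[key] += weight
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def infer_syndrome(role_labels, mapping, default_weight: float = 0.1):
    """S6 stand-in: look up each symptom's weight for its current role,
    score every syndrome type, and return the best one with all scores."""
    scores = {}
    for syndrome, assoc in SYNDROME_ASSOC.items():
        scores[syndrome] = sum(
            mapping.get((symptom, role), default_weight) * assoc.get(symptom, 0.0)
            for symptom, role in role_labels.items()
        )
    best = max(scores, key=scores.get)
    return best, scores


training = [
    {"headache": "core", "fatigue": "accompanying"},
    {"fatigue": "core", "headache": "accompanying"},
]
mapping = fit_weight_mapping(training)
result_1, _ = infer_syndrome({"headache": "core", "fatigue": "accompanying"}, mapping)
result_2, _ = infer_syndrome({"headache": "accompanying", "fatigue": "core"}, mapping)
```

In this toy setup the same two symptoms yield `syndrome_A` when headache is the core symptom and `syndrome_B` when fatigue is, which mirrors the idea that symptom weights should follow the description context rather than stay fixed.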
In a preferred embodiment, S1 specifically comprises: performing syntactic boundary recognition on the continuous natural language descriptions in the patient's traditional Chinese medicine medical record text, and dividing the text into a medical record sentence set using complete semantic expressions as boundaries; extracting from the medical record sentence set, based on the combinations of traditional Chinese medicine symptom words and state description words, the symptom-related sentences that express the patient's subjective feelings and objective manifestations or the doctor's recorded judgments; and arranging the symptom-related sentences in the order in which they were originally described in the patient's medical record text to form the symptom sentence sequence data set.
In a preferred embodiment, S2 specifically comprises: counting the adjacency relations and interval distribution of the symptom-related sentences in the symptom sentence sequence data set to obtain sentence co-occurrence features among the symptom-related sentences; determining the appearance position distribution features of the symptom-related sentences in the patient description context based on their order in the symptom sentence sequence data set; based on the sentence co-occurrence feature and the appearance position distribution feature, context-related features of the symp