
CN-121983330-A - Knowledge graph construction method and system based on traditional Chinese medicine classical prescription

CN121983330A

Abstract

The invention relates to the technical field of knowledge graph construction and provides a knowledge graph construction method and system based on traditional Chinese medicine classical prescriptions. The method comprises: obtaining multi-source heterogeneous data consisting of digitized texts of classical ancient books, teaching materials, an authoritative traditional Chinese medicine dictionary, academic journal literature and structured clinical data; performing hybrid-strategy knowledge extraction on the standard text data to separate out discrete knowledge elements, including entities, relations and attributes, which form the original fact units of the knowledge graph; performing modeling and formal representation on the extracted discrete knowledge elements to construct a multi-dimensional semantic network with formula-syndrome-disease-mechanism as its core; and comprehensively judging semantic similarity by calculating the string similarity of entity names in combination with their context features. The system comprises a heterogeneous data processing module, an entity identification module, a semantic network construction module and a similarity judgment module. The invention realizes the structured organization, deep association and intelligent use of classical prescription knowledge.
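
As a rough illustration of the similarity judgment described above, the following Python sketch combines a character-level similarity of two entity names with an overlap score over their context features. The text does not specify the similarity measures, the weights or the decision threshold, so the use of `difflib.SequenceMatcher`, the Jaccard overlap, the 0.5/0.5 weights, the 0.6 threshold and all function names are assumptions made only for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Character-level similarity of two entity names, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def context_similarity(ctx_a: set, ctx_b: set) -> float:
    """Jaccard overlap of the context features observed around each entity."""
    union = ctx_a | ctx_b
    if not union:
        return 0.0  # no context available on either side
    return len(ctx_a & ctx_b) / len(union)


def similarity_score(name_a: str, ctx_a: set, name_b: str, ctx_b: set,
                     w_name: float = 0.5, w_ctx: float = 0.5) -> float:
    """Weighted combination of name and context similarity (weights are assumed)."""
    return (w_name * name_similarity(name_a, name_b)
            + w_ctx * context_similarity(ctx_a, ctx_b))


def is_same_entity(name_a: str, ctx_a: set, name_b: str, ctx_b: set,
                   threshold: float = 0.6) -> bool:
    """Decide whether two mentions should be linked and merged (threshold is assumed)."""
    return similarity_score(name_a, ctx_a, name_b, ctx_b) >= threshold
```

In this sketch a context could be, for example, the set of medicinal names that co-occur with a prescription name in each source; two mentions whose combined score reaches the threshold would be treated as the same entity and merged.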

Inventors

  • BAI WEIMIN

Assignees

  • 安顿健康科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-12

Claims (10)

  1. A knowledge graph construction method based on traditional Chinese medicine classical prescriptions, characterized by comprising the following steps: performing hybrid-strategy knowledge extraction on standard text data derived from multi-source heterogeneous data, and separating out the discrete knowledge elements that form the original fact units of the knowledge graph, including extracting fixed-pattern entities, extracting relations by type, extracting monarch-minister-assistant-guide compatibility relations, and extracting attributes; performing ontology modeling and formal representation on the extracted discrete knowledge elements to construct a multi-dimensional semantic network with formula-syndrome-disease-mechanism as its core, wherein the multi-dimensional semantic network constitutes the model layer of the knowledge graph and a data layer of initial scale, thereby realizing the structured organization of the knowledge graph; and calculating the string similarity of entity names and combining it with their context features to comprehensively judge semantic similarity, thereby linking and merging differently named entities from different data sources.
  2. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 1, wherein the process of separating out the discrete knowledge elements of entities, relations and attributes comprises the following steps: processing the standard text data with a parallel two-path extraction mechanism, wherein the first path uses a pattern rule base predefined by summarizing the fixed sentence patterns with which ancient traditional Chinese medicine books describe prescriptions and syndromes, and directly extracts the entities contained in matched sentence patterns and assigns them type labels; obtaining the entity set and the positions at which its entities appear in the original text, locating the sentences or adjacent paragraphs in which two entities co-occur as an analysis window, and handling relations that carry explicit linguistic cues, such as composition and main indication, with a relation classifier trained by distant supervision; for the complex monarch-minister-assistant-guide relations, invoking joint syntactic-semantic analysis and inferring the functional compatibility relations between entities by analyzing the dependency structure and deep semantic roles of the text; and attaching attribute information to the set of relation candidate pairs, wherein relevant attribute slots are predefined for each relation type, the numerical values, units and descriptive phrases associated with a relation mention in the text are identified by a conditional random field model and filled into the corresponding attribute slots, each relation instance together with its complete attributes is packaged into a structured knowledge unit according to the core structure of head entity-relation type-tail entity plus all verified attribute information, and a set consisting of all packaged knowledge units is constructed.
  3. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 1, wherein the process of realizing the structured organization of the knowledge graph comprises the following steps: systematically reorganizing the discrete knowledge elements according to the logic of theory, method, prescription and medicinal and of syndrome differentiation and treatment in traditional Chinese medicine formula theory, establishing prescription, Chinese medicinal, syndrome and pathogenesis as the core concept categories, constructing hierarchies of Chinese medicinals and syndromes according to the efficacy classification of Chinese medicinals and the syndrome differentiation system, defining the relation types of composition, main indication and corresponding pathogenesis according to formula theory and clinical practice, and presetting extensible attribute slots for relations and entities, to obtain a formalized domain ontology schema; using the domain ontology schema as the conversion rule, assigning each entity identifier to a concept node according to the concept categories defined in the schema, instantiating the relation assertions between entities as connections with definite semantic labels according to the relation types and constraints defined in the schema, filling and associating attribute information according to the predefined attribute slots, and outputting a set of triples of the form (subject concept, relation type, object concept or attribute value), which logically forms a semantic network framed by formula-syndrome-disease-mechanism; and using the triple set as the data source, mapping the logical concept categories to node labels in a graph database, mapping the relation types to edge types, mapping the attributes of entities and of relations to attribute key-value pairs on nodes and edges, reading each RDF triple, creating or locating the corresponding head node and tail node in the graph database, establishing a typed edge between the head node and the tail node and attaching the related attributes to the node or edge, and explicitly expressing the concept hierarchy defined in the ontology schema in the graph structure through edge types, thereby converting the logical multi-dimensional semantic network into an initial graph data layer and completing the conversion of the knowledge graph from logical model to physical storage.
  4. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 3, wherein the process of filling and associating according to the predefined attribute slots comprises the following steps: matching each entity identifier against the concept categories defined in the ontology schema, and performing a consistency check and final category determination based on the category information that the entity preliminarily carries in the original knowledge element, checked against the definitions and hierarchical constraints of the concept categories in the ontology schema; for each original relation assertion, checking whether the concept categories of its head and tail entities fully conform to the domain and range concept categories specified by the relation type to be instantiated; and associating and packaging the normalized relation triple set with the extracted discrete attribute information, designating the additional attribute types for each type of relation or entity according to the preset attribute slot definitions, integrating relations, entities and attributes into structurally complete information units, and outputting the triple set of the multi-dimensional semantic network framed by formula-syndrome-disease-mechanism.
  5. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 4, wherein the process of checking the concept categories to which the head and tail entities belong comprises the following steps: for each assertion, looking up in the concept-category-labeled entity set the concept categories of its head entity and tail entity, and attaching that category information to the relation assertion; extracting, from the relation type constraint part defined in the domain ontology schema, the domain concept category and range concept category permitted for each relation assertion, and associating these two constraints with the relation assertion as the verification criteria; and performing a verification operation on each relation assertion by comparing the concept category of the head entity with the associated domain constraint category and the concept category of the tail entity with the associated range constraint category, wherein a relation assertion passes verification only when the concept category of the head entity falls entirely within the domain constraint category and the concept category of the tail entity falls entirely within the range constraint category, and the relation assertions that pass verification form the output set to be instantiated as normalized relation triples.
  6. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 5, wherein the process of extracting the domain concept category and range concept category permitted for each relation assertion type comprises the following steps: searching the structured definition library stored in the ontology schema by the relation type indicated in the relation assertion, locating the complete schema definition fragment corresponding to that relation type, and binding it to the relation assertion; parsing the explicitly declared domain and range parts from the complete schema definition fragment of the bound relation type, wherein these parts are specified by a predefined ontology schema modeling language specification and indicate, in machine-readable form, the concept category identifiers permitted as the starting point (domain) and end point (range) of the relation; and, taking each category identifier in the explicit constraint as a starting point, traversing the concept hierarchy of the ontology schema to obtain all subcategories of the category it denotes, thereby forming the complete set of categories permitted by the constraint, and processing the complete permitted category set generated for each relation assertion.
  7. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 6, wherein the process of using the parsed pair of concept category identifiers as explicit constraints comprises the following steps: performing a category identifier separation operation, in which the two category identifiers in the pair are extracted and stored separately as two independent data items, a domain identifier and a range identifier, while their association with the original relation assertion is maintained; performing a constraint binding operation, in which the domain identifier is bound as the constraint on the relation assertion in the domain dimension, the binding operation establishing, at the data-structure level, a direct reference from the relation assertion instance to its two kinds of constraint conditions; and performing a standardized packaging operation, in which the domain constraint and the range constraint are packaged according to a predefined constraint expression specification to generate a structured constraint description object, and the constraint description object is associated with the relation assertion as a single attribute, so that the relation assertion carries an explicit constraint set.
  8. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 7, wherein the process of generating the structured constraint description object comprises the following steps: parsing the specification document, extracting the field definitions, type requirements and relational logic that it prescribes for the constraint description structure, and generating a format template to guide the packaging process; according to the field definitions in the format template, performing type conversion or format verification on the domain identifier against the type requirement that the template specifies for the domain field, to generate a conforming domain constraint field; organizing and packaging the domain constraint field and the range constraint field according to the structural relational logic defined in the format template, to form the structured constraint description object; and performing an association operation that attaches the constraint description object, as a single attribute, to the original relation assertion.
  9. The knowledge graph construction method based on traditional Chinese medicine classical prescriptions according to claim 1, characterized by: obtaining multi-source heterogeneous data consisting of digitized texts of classical ancient books, teaching materials, an authoritative traditional Chinese medicine dictionary, academic journal literature and structured clinical data; performing automatic sentence segmentation and preliminary recognition with a BiLSTM-CRF-based sequence labeling model; normalizing the prescription names, Chinese medicinal names and syndrome names from different sources against the pre-constructed authoritative traditional Chinese medicine dictionary; and performing text cleaning to obtain standard text data with unified structure and terminology.
  10. A knowledge graph construction system based on traditional Chinese medicine classical prescriptions, for implementing the knowledge graph construction method according to any one of claims 1 to 9, characterized in that the system comprises: a heterogeneous data processing module, used for obtaining multi-source heterogeneous data consisting of digitized texts of classical ancient books, teaching materials, an authoritative traditional Chinese medicine dictionary, academic journal literature and structured clinical data, performing automatic sentence segmentation and preliminary recognition with a BiLSTM-CRF-based sequence labeling model, normalizing the prescription names, Chinese medicinal names and syndrome names from different sources against the pre-constructed authoritative traditional Chinese medicine dictionary, and performing text cleaning to obtain standard text data with unified structure and terminology; an entity identification module, used for performing hybrid-strategy knowledge extraction on the standard text data and separating out the discrete knowledge elements of entities, relations and attributes, which form the original fact units of the knowledge graph; a semantic network construction module, used for performing ontology modeling and formal representation on the extracted discrete knowledge elements to construct a multi-dimensional semantic network with formula-syndrome-disease-mechanism as its core, wherein the multi-dimensional semantic network constitutes the model layer of the knowledge graph and a data layer of initial scale, thereby realizing the structured organization of the knowledge graph; and a similarity judgment module, used for calculating the string similarity of entity names and combining it with their context features to comprehensively judge semantic similarity, thereby linking and merging differently named entities from different data sources.
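
The domain and range verification recited in claims 5 and 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the category names, the relation schema, the sample entities and the helper names (`SUBCLASS_OF`, `RELATION_SCHEMA`, `allowed_categories`, `verify`) are all hypothetical, and the sketch assumes that the permitted category set contains the constraint category itself in addition to its subcategories.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

# Hypothetical concept hierarchy: (child category, parent category) pairs.
SUBCLASS_OF = [
    ("ClassicalFormula", "Formula"),
    ("ExteriorSyndrome", "Syndrome"),
]

# Hypothetical relation schema: relation type -> (domain category, range category).
RELATION_SCHEMA = {
    "composed_of": ("Formula", "Medicinal"),
    "treats": ("Formula", "Syndrome"),
}


@dataclass(frozen=True)
class Assertion:
    head: str
    relation: str
    tail: str


def allowed_categories(root: str) -> set:
    """Return the constraint category plus all of its direct and indirect subcategories."""
    children = defaultdict(list)
    for child, parent in SUBCLASS_OF:
        children[parent].append(child)
    seen, queue = {root}, deque([root])
    while queue:  # breadth-first traversal of the concept hierarchy
        for child in children[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def verify(assertion: Assertion, entity_category: dict) -> bool:
    """Pass only if the head fits the domain and the tail fits the range."""
    schema = RELATION_SCHEMA.get(assertion.relation)
    if schema is None:
        return False  # unknown relation type: nothing to verify against
    domain, range_ = schema
    head_cat = entity_category.get(assertion.head)
    tail_cat = entity_category.get(assertion.tail)
    return (head_cat in allowed_categories(domain)
            and tail_cat in allowed_categories(range_))
```

With illustrative data such as `{"guizhi_tang": "ClassicalFormula", "guizhi": "Medicinal"}`, the assertion `Assertion("guizhi_tang", "composed_of", "guizhi")` passes because `ClassicalFormula` is a subcategory of the `Formula` domain, whereas an assertion whose head is a medicinal fails the same check.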

Description

Knowledge graph construction method and system based on traditional Chinese medicine classical prescription

Technical Field

The invention relates to the technical field of knowledge graph construction, in particular to a knowledge graph construction method and system based on traditional Chinese medicine classical prescriptions.

Background

Classical prescriptions of traditional Chinese medicine are a core component of its theoretical system and carry thousands of years of clinical experience and academic thought. However, knowledge management for classical prescriptions still faces many challenges. First, prescription knowledge is scattered and isolated across classical ancient books such as the Shanghan Lun (Treatise on Cold Damage) and the Jingui Yaolue (Synopsis of the Golden Chamber), modern teaching materials and research literature, and lacks systematic integration. Second, the complex associations between prescriptions and symptoms, pathogenesis and Chinese medicinals are insufficiently mined and expressed. Third, traditional knowledge representation methods can hardly support intelligent applications such as prescription similarity calculation, analysis of the derivation of related formulas, and clinical decision support.

Existing applications of knowledge graph technology in the field of traditional Chinese medicine focus mainly on the basic expression of entity relations. They lack in-depth modeling of the knowledge structures specific to traditional Chinese medicine prescriptions, such as monarch-minister-assistant-guide compatibility, dosage adjustment and decoction methods, and their capacity for consistency processing and conflict resolution over multi-source heterogeneous data during knowledge fusion is insufficient. In addition, knowledge graphs constructed by traditional methods show obvious limitations in supporting prescription knowledge discovery and clinical reasoning.

In a first piece of prior art, publication No. CN120353938A discloses a knowledge-graph-based visual display system for traditional Chinese medicine acupuncture knowledge. In that system, a traditional Chinese medicine ancient book data acquisition unit, a clinical case data acquisition unit and an ultrasonic image data acquisition unit acquire data and integrate it into traditional Chinese medicine acupuncture data; the acupuncture data is preprocessed to obtain complete acupuncture data, from which a knowledge graph is constructed; an optimization network model is built to optimize the knowledge graph into a high-quality knowledge graph; an incremental learning framework is built and used to update the high-quality knowledge graph, outputting a knowledge graph updated in real time; an interactive three-dimensional human body model is constructed from the real-time updated knowledge graph; and an executable medical strategy for an application case is output based on the real-time updated knowledge graph and the interactive three-dimensional human body model and sent to a student terminal.

Although that scheme achieves knowledge visualization and improves teaching quality, its data preprocessing stage is not designed for unstructured or semi-structured texts such as traditional Chinese medicine classical ancient books and prescription literature, so its term normalization and text cleaning can hardly ensure that core entities such as prescriptions and Chinese medicinals are uniform and standardized at the terminology level. It also lacks a fine-grained extraction strategy for the complex relations of traditional Chinese medicine prescriptions: no hybrid extraction method is designed for prescription-specific knowledge (such as monarch-minister-assistant-guide compatibility relations, medicinal dosage attributes and decoction method descriptions), and no distinction is made between extraction models for specific relations and for complex semantic relations, which may impair the completeness and accuracy of the relations and attributes in the knowledge graph. Nor does it specifically address semantic conflicts in multi-source data, such as differently named entities, dosage conflicts and divergent interpretations of pathogenesis, which may impair the logical consistency of the knowledge graph.

In a second piece of prior art, publication No. CN120386831A discloses an informatization processing system and method for traditional Chinese medicine treatment of acute lung injury, wherein the system comprises a traditional Chinese medicine information slice storage module, a traditional Chinese medicine knowledge graph construction module,