CN-121388238-B - Medication guidance system and method based on knowledge graph and vector retrieval fusion
Abstract
The application provides a medication guidance system and method based on knowledge graph and vector retrieval fusion, relating to the technical field of artificial intelligence. The system comprises an application access layer, a service interface layer and a hybrid retrieval engine layer. The application access layer receives external demand information and transmits it to a large model platform. The service interface layer receives a query instruction from the large model platform, transmits it to the hybrid retrieval engine layer, and returns the retrieval result to the large model platform so that the large model platform can output it. The hybrid retrieval engine layer generates a graph database query statement and a query semantic vector from the query instruction, performs a structured query on the graph database using the query statement, performs a semantic similarity search on the vector database using the query semantic vector, and returns the resulting graph retrieval result and vector retrieval result to the large model platform. The medication guidance system thereby enables specialized medication guidance.
Inventors
- ZHANG LI
- Rao Fenghao
- WANG JUNKE
- ZHANG LONGQIAN
- SHU JIE
- SHAO TONG
- WU JINJING
Assignees
- South-Central Minzu University (中南民族大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20251225
Claims (6)
- 1. A medication guidance system based on knowledge graph and vector retrieval fusion, characterized by comprising a data preprocessing layer, a data processing layer, an application access layer, a service interface layer and a hybrid retrieval engine layer; the data preprocessing layer comprises a document processing module and an intelligent blocking module, wherein the document processing module is used for extracting data from a medicine material file, sequentially performing data cleaning and unified coding format processing on the extracted data, converting fields at preset positions of the medicine material file into predefined standard fields based on a standard-field mapping mechanism, and converting the obtained data into structured data in a target format; the data processing layer comprises a medical data enhancement module, a knowledge graph construction module and a vector coding module; the medical data enhancement module is used for receiving the document fragments output by the data preprocessing layer, and automatically identifying and standardizing variant names in the document fragments based on a standardized mapping dictionary to obtain medical enhanced document fragments; the knowledge graph construction module is used for adding a structured label to each medical enhanced document fragment, identifying medical entities and relationship types based on medical domain knowledge, constructing standard triples and embedding the standard triples into a graph database, wherein the structured label supports subsequent accurate retrieval and result traceability verification; the vector coding module is used for performing semantic vectorization on the medical enhanced document fragments with a pre-trained semantic embedding model to obtain semantic vector representations, and embedding the semantic vector representations into a vector database; the application access layer is used for receiving external demand information and transmitting the external demand information to a large model platform; the service interface layer provides a unified tool interface based on an interface protocol, and is used for receiving a query instruction from the large model platform, transmitting the query instruction to the hybrid retrieval engine layer, receiving a retrieval result from the hybrid retrieval engine layer, and transmitting the retrieval result to the large model platform so that the large model platform outputs reliable drug interaction analysis, drug safety assessment and personalized medication suggestions; the hybrid retrieval engine layer is used for generating a graph database query statement and a query semantic vector according to the query instruction, performing a structured query on the graph database according to the graph database query statement, performing semantic similarity retrieval on the vector database based on the query semantic vector, and returning the resulting graph retrieval result and vector retrieval result to the large model platform, wherein the graph query supports complex multi-drug relationship reasoning, and the system predefines multi-drug query templates that are used for processing complex drug interaction analysis.
- 2. The medication guidance system based on knowledge graph and vector retrieval fusion of claim 1, wherein the hybrid retrieval engine layer comprises a graph query engine and a vector query engine; the graph query engine is used for converting the query instruction into a graph database query statement with an embedded statement conversion model, performing a structured query on the graph database according to the graph database query statement, and converting the query result into a structure of a target format to obtain the graph retrieval result; the vector query engine is used for converting the query instruction into a query semantic vector with a vectorization model, performing semantic similarity retrieval on the vector database based on the query semantic vector, screening the semantic retrieval results, and converting the screened results into a structure of a target format to obtain the vector retrieval result.
- 3. The medication guidance system based on knowledge graph and vector retrieval fusion of claim 2, wherein the vector query engine comprises a retrieval module and a reordering module; the retrieval module is used for converting the query instruction into a query semantic vector with a vectorization model, calculating the cosine similarity between the query semantic vector and each document vector in the vector database, and returning the top-k documents with the highest similarity based on a retrieval data threshold to obtain a preliminary screening result; the reordering module is used for re-ranking the vector retrieval results with a BGE-Reranker-Base model, wherein the re-ranking process calculates a relevance score between the query and each candidate document by encoding the query and the document jointly with a Cross-Encoder architecture and outputting the relevance score through a fully connected layer, and the preliminary screening result is reordered based on the scores to obtain a final screening result.
- 4. A medication guidance method based on knowledge graph and vector retrieval fusion, wherein the medication guidance method is performed based on the medication guidance system according to any one of claims 1 to 3, the medication guidance method comprising: acquiring a medicine material file, extracting data from the medicine material file, sequentially performing data cleaning and unified coding format processing on the extracted data, converting fields at preset positions of the medicine material file into predefined standard fields based on a standard-field mapping mechanism, and converting the obtained data into structured data in a target format; performing overlapping blocking on the structured data in the target format based on preset overlap parameters to obtain document fragments, and splitting array-type fields in the document fragments item by item according to the standard separators used in medical documents, so as to decompose composite medical concepts into independent semantic units; receiving the document fragments, and automatically identifying and standardizing variant names in the document fragments based on a standardized mapping dictionary to obtain medical enhanced document fragments; adding a structured label to each medical enhanced document fragment, identifying medical entities and relationship types based on medical domain knowledge, constructing standard triples, and embedding the standard triples into a graph database, wherein the structured label supports subsequent accurate retrieval and result traceability verification; performing semantic vectorization on the medical enhanced document fragments with a pre-trained semantic embedding model to obtain semantic vector representations, and embedding the semantic vector representations into a vector database; wherein the structured data in the target format is subjected to blocking processing, the document fragments obtained by the blocking processing are converted into semantic vector representations that are embedded into the vector database, and the semantic vector representations are converted into standard triples that are embedded into the graph database; receiving external demand information, and transmitting the external demand information to a large model platform; receiving a query instruction from the large model platform, and generating a graph database query statement and a query semantic vector according to the query instruction; and performing a structured query on the graph database according to the graph database query statement, performing semantic similarity retrieval on the vector database based on the query semantic vector, and returning the resulting graph retrieval result and vector retrieval result to the large model platform so that the large model platform outputs reliable drug interaction analysis, drug safety assessment and personalized medication suggestions, wherein the graph query supports complex multi-drug relationship reasoning, and the system predefines multi-drug query templates that are used for processing complex drug interaction analysis.
- 5. An electronic device comprising a processor and a memory, the memory storing a computer program, wherein the computer program, when executed by the processor, implements the medication guidance method based on knowledge graph and vector retrieval fusion of claim 4.
- 6. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the medication guidance method based on knowledge graph and vector retrieval fusion of claim 4.
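The overlapping blocking and separator-based splitting recited in claim 4 can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the character-based window, the default sizes and the separator set are all assumptions chosen for illustration, and the claimed system may chunk on other units.

```python
import re


def overlap_chunks(text, size=200, overlap=50):
    """Cut text into windows of `size` characters; consecutive windows
    share `overlap` characters (both values are illustrative defaults)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def split_array_field(value, separators=";；、,"):
    """Split a composite field (for example a list of adverse reactions)
    into independent units. The separator set is an assumption."""
    pattern = "[" + re.escape(separators) + "]"
    return [part.strip() for part in re.split(pattern, value) if part.strip()]
```

For example, `overlap_chunks("abcdefghij", size=4, overlap=2)` yields four windows that each repeat the last two characters of the previous one, so a sentence cut at a window boundary still appears whole in a neighbouring fragment.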
Description
Medication guidance system and method based on knowledge graph and vector retrieval fusion
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a medication guidance system and method based on knowledge graph and vector retrieval fusion.
Background
In modern medical practice, accurate medication guidance and safe-medication consultation are critical for formulating personalized treatment plans and improving the quality of care. With the development of medical technology, drug categories are becoming increasingly abundant, and drug package inserts (drug labels) have become a major means by which medical institutions manage and use drug information. A drug label contains rich data such as indications, usage, drug interactions, contraindications and adverse reactions, and this data is a valuable resource for predicting the safe use of a drug. However, extracting valuable information from this large and complex body of data and turning it into knowledge that supports clinical decisions remains a great challenge.
In recent years, with the rapid development of natural language processing (NLP) and deep learning, methods based on large language models have gradually been applied in the medical field to mine the potential value of drug labels. With their strong representation capability, large language models can understand and generate medical text, which offers a new route to automated medication guidance. However, traditional text retrieval methods have difficulty handling complex drug-related queries and multi-drug interaction analysis, which limits their deployment in practical medical environments.
Disclosure of Invention
In view of the above, the application provides a medication guidance system and a medication guidance method based on the fusion of knowledge graph and vector retrieval.
In a first aspect, the application provides a medication guidance system based on knowledge graph and vector retrieval fusion, which comprises an application access layer, a service interface layer and a hybrid retrieval engine layer. The application access layer is used for receiving external demand information and transmitting the external demand information to a large model platform. The service interface layer provides a unified tool interface based on an interface protocol and is used for receiving a query instruction from the large model platform, transmitting the query instruction to the hybrid retrieval engine layer, receiving a retrieval result from the hybrid retrieval engine layer, and transmitting the retrieval result to the large model platform so that the large model platform outputs the retrieval result. The hybrid retrieval engine layer is used for generating a graph database query statement and a query semantic vector according to the query instruction, performing a structured query on the graph database according to the graph database query statement, performing semantic similarity retrieval on the vector database based on the query semantic vector, and returning the resulting graph retrieval result and vector retrieval result to the large model platform.
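The two retrieval paths of the hybrid retrieval engine layer described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name a graph query language, so the Cypher-like template, the `Drug` label and the `INTERACTS_WITH` relationship are hypothetical, `run_graph_query` is a hypothetical callable standing in for a graph database client, and an in-memory dictionary stands in for the vector database.

```python
import math

# Hypothetical predefined multi-drug template in a Cypher-like syntax.
# The label and relationship names are assumptions, not from the patent.
MULTI_DRUG_TEMPLATE = (
    "MATCH (a:Drug)-[r:INTERACTS_WITH]-(b:Drug) "
    "WHERE a.name IN $names AND b.name IN $names "
    "RETURN a.name, r.description, b.name"
)


def build_graph_query(drug_names):
    """Return (statement, parameters) for the structured graph query."""
    return MULTI_DRUG_TEMPLATE, {"names": sorted(set(drug_names))}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=3, threshold=0.0):
    """Score every document vector, drop scores below `threshold`,
    and return the k best (doc_id, score) pairs, best first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored = [pair for pair in scored if pair[1] >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def hybrid_retrieve(drug_names, query_vec, doc_vecs, run_graph_query, k=3):
    """Run both retrieval paths and return the two results side by side."""
    statement, params = build_graph_query(drug_names)
    return {
        "graph": run_graph_query(statement, params),
        "vector": top_k(query_vec, doc_vecs, k=k),
    }
```

Returning the graph result and the vector result as separate entries mirrors the text, which passes both to the large model platform instead of merging them in the engine.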
In one embodiment, the medication guidance system based on knowledge graph and vector retrieval fusion further comprises a data preprocessing layer and a data processing layer. The data preprocessing layer is used for preprocessing the medicine material file to generate structured data in a target format. The data processing layer is used for receiving the structured data in the target format, performing blocking processing on the structured data in the target format, converting the document fragments obtained by the blocking processing into semantic vector representations, embedding the semantic vector representations into the vector database, converting the semantic vector representations into standard triples, and embedding the standard triples into the graph database.
In one embodiment, the data preprocessing layer comprises a document processing module and an intelligent blocking module. The document processing module is used for extracting data from the medicine material file, sequentially performing data cleaning and unified coding format processing on the extracted data, converting fields at preset positions of the medicine material file into predefined standard fields based on a standard-field mapping mechanism, and converting the obtained data into structured data in a target format. The intelligent blocking module is used for performing overlapping blocking on the structured data in the target format based on preset overlap parameters, and splitting array-type fields in the document fragments item by item according to the standard separators used in medical documents, so as to decompose composite medical concepts into independent semantic