CN-121979929-A - Information retrieval method, medium and equipment
Abstract
The invention relates to the technical field of information retrieval, and in particular to an information retrieval method, medium and equipment. A substance relation knowledge graph comprising multi-dimensional nodes and association relations is constructed in advance. By supporting retrieval with three types of keywords and integrating multiple node types in the graph, the method meets a user's association query requirements in multiple scenarios without repeated cross-platform retrieval. By combining a substance information table with visual output of the association sub-graph, the logic chain between a drug and each associated element is displayed visually. Integrating authoritative data during graph construction gives the associations between drugs, treatment schemes and diseases clinical rationality and authority. Extracting the association sub-graph by association expansion filters out false associations and irrelevant information, improving the accuracy and reliability of the retrieval results. A first search result page and a second search result page are output in stages, so that the user can quickly screen target substances and then obtain in-depth detailed information, balancing retrieval efficiency against information depth.
Inventors
- YANG YUMEI
- LI YUECHAO
- MAO JIE
- LI LINHAO
- HAN CAIYUAN
Assignees
- 愚扬智数(北京)科技有限公司
Dates
- Publication Date: 20260505
- Application Date: 20260128
Claims (10)
- 1. An information retrieval method, characterized in that the information retrieval method comprises the steps of: S1, according to a retrieval request initiated by a user with at least one keyword among a substance name, a gene name or a disease name, determining at least one target substance node associated with the keyword from a pre-constructed substance relation knowledge graph, wherein the substance relation knowledge graph comprises at least preset substance nodes, target nodes, gene nodes, pathway nodes, disease nodes and treatment scheme nodes; S2, according to the target substance node, retrieving corresponding data on marketed substances and/or substances under research from a preset substance information database, and generating a structured substance information table; S3, outputting to the user a first search result page presenting the substance information table; S4, in response to the user's selection of a target substance in the first search result page, acquiring detailed information of the target substance from the substance information database, wherein the detailed information comprises at least basic substance information, related substance information and related target information; S5, performing association expansion in the substance relation knowledge graph centered on the substance node corresponding to the target substance, and extracting an association sub-graph containing the associated treatment scheme nodes; and S6, outputting to the user a second search result page presenting the association sub-graph and the detailed information.
- 2. The information retrieval method as recited in claim 1, wherein the substance relation knowledge graph is constructed by: S10, performing entity recognition and relation extraction on original data from a plurality of data sources to obtain an entity node set and a set of association relations between the entity nodes, wherein the data sources comprise a substance database, a genomics database, a disease database and standardized diagnosis and treatment guideline data, and the entity node set comprises at least substance nodes, target nodes, gene nodes, pathway nodes and disease nodes; S20, for each association relation in the association relation set, determining the confidence weight of the current association relation according to the reliability level of the data source corresponding to it; S30, constructing corresponding treatment scheme nodes, and the association relations between each treatment scheme node and the disease nodes, according to treatment recommendation information extracted from the standardized diagnosis and treatment guideline data; S40, setting the patient applicable conditions contained in the treatment recommendation information as constraint attributes on the association relations; S50, generating the substance relation knowledge graph according to the entity node set, the treatment scheme nodes, and the confidence weight and constraint attribute corresponding to each association relation.
- 3. The information retrieval method as recited in claim 2, wherein S20 includes the steps of: S201, obtaining the reliability level and basic weight coefficient corresponding to each data source; S202, for each association relation, determining at least one data source from which the current association relation originates; and S203, calculating the confidence weight of the current association relation according to the reliability level and basic weight coefficient of each data source corresponding to the current association relation.
- 4. The information retrieval method as recited in claim 2, wherein S30 includes the steps of: S301, parsing treatment recommendation information from the standardized diagnosis and treatment guideline data, wherein the treatment recommendation information comprises at least a treatment scheme description, an applicable disease identifier and one or more substance identifiers contained in the treatment scheme; S302, creating or updating the corresponding treatment scheme node in the entity node set according to the treatment scheme description; S303, establishing a first association relation between the treatment scheme node and the disease node determined by the applicable disease identifier; S304, establishing a second association relation between the treatment scheme node and the substance node determined by each substance identifier.
- 5. The information retrieval method as recited in claim 4, wherein the treatment recommendation information further includes patient applicable conditions, the patient applicable conditions including at least one of disease stage, performance status score, specific biomarker status, genetic variation type and prior treatment history, and wherein S40 includes the steps of: S401, converting the patient applicable conditions into a structured logic expression that can be parsed by a graph query engine; S402, storing the structured logic expression as a constraint attribute of the first association relation.
- 6. The information retrieval method as recited in claim 2, wherein S50 includes the steps of: S501, mapping each entity in the entity node set to one node in the substance relation knowledge graph; S502, mapping each association relation in the association relation set to a directed edge or an undirected edge connecting the two corresponding nodes; and S503, constructing the substance relation knowledge graph by taking the confidence weight and the constraint attribute as attributes of the corresponding directed edge or undirected edge.
- 7. The information retrieval method as recited in claim 2, wherein S1 includes the steps of: when the keyword is a gene name, querying the gene node corresponding to the gene name in the substance relation knowledge graph; if the gene node has a target functional attribute, determining the substance nodes directly connected with the gene node as the target substance nodes according to the association relation between target nodes and substance nodes, wherein the target functional attribute is determined from a preset target gene database; if the gene node does not have the target functional attribute, taking the plurality of target nodes located on the same pathway as the gene node as candidate nodes according to the association relation between gene nodes and pathway nodes, and determining the substance nodes directly connected with each candidate node as the target substance nodes according to the association relation between target nodes and substance nodes.
- 8. The information retrieval method as recited in claim 2, wherein S5 includes the steps of: traversing along relation edges in the substance relation knowledge graph with the substance node corresponding to the target substance as the starting point, and determining as associated nodes the target nodes, gene nodes and disease nodes directly connected with that substance node, together with the treatment scheme nodes that are associated through the disease nodes and whose association relation carries a constraint attribute matching the current context.
- 9. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the information retrieval method of any one of claims 1-8.
- 10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
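The gene-keyword lookup of claim 7 and the association expansion of claim 8 can be illustrated with a minimal Python sketch. It is not the patented implementation. The toy graph, the node identifiers (drugA, G1, R1 and so on), the TARGET_GENES set standing in for the preset target gene database, and the representation of a constraint attribute as a dictionary of required values are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy graph: node id -> node type.
NODE_TYPE = {
    "drugA": "substance", "drugB": "substance",
    "T1": "target", "T2": "target",
    "G1": "gene", "G2": "gene",
    "P1": "pathway", "D1": "disease",
    "R1": "regimen", "R2": "regimen",
}

# Assumed stand-in for the preset target gene database of claim 7.
TARGET_GENES = {"G1"}

# Undirected edges; the third item holds edge attributes. A "constraint"
# here is an assumed simplification: required values for context keys.
EDGES = [
    ("drugA", "T1", {}), ("drugA", "G1", {}), ("drugA", "D1", {}),
    ("drugB", "T2", {}),
    ("T1", "P1", {}), ("T2", "P1", {}), ("G1", "P1", {}), ("G2", "P1", {}),
    ("D1", "R1", {"constraint": {"stage": "IV"}}),
    ("D1", "R2", {"constraint": {"stage": "I"}}),
]

ADJ = defaultdict(list)
for u, v, attrs in EDGES:
    ADJ[u].append((v, attrs))
    ADJ[v].append((u, attrs))


def neighbors(node, node_type):
    """Return (neighbor, edge attributes) pairs of the given node type."""
    return [(n, a) for n, a in ADJ[node] if NODE_TYPE[n] == node_type]


def substances_for_gene(gene):
    """Claim 7 sketch: resolve a gene keyword to target substance nodes."""
    if gene in TARGET_GENES:
        # Target gene: take the substances directly connected to it.
        return sorted(n for n, _ in neighbors(gene, "substance"))
    # Non-target gene: go through shared pathways to candidate targets,
    # then to the substances connected to those targets.
    found = set()
    for pathway, _ in neighbors(gene, "pathway"):
        for target, _ in neighbors(pathway, "target"):
            found.update(n for n, _ in neighbors(target, "substance"))
    return sorted(found)


def matches(constraint, context):
    """An empty constraint matches any context."""
    return all(context.get(k) == v for k, v in constraint.items())


def associated_nodes(substance, context):
    """Claim 8 sketch: nodes of the association sub-graph of a substance."""
    nodes = {substance}
    for node_type in ("target", "gene", "disease"):
        nodes.update(n for n, _ in neighbors(substance, node_type))
    # Regimens are reached through disease nodes and kept only when the
    # edge constraint matches the current context.
    for disease, _ in neighbors(substance, "disease"):
        for regimen, attrs in neighbors(disease, "regimen"):
            if matches(attrs.get("constraint", {}), context):
                nodes.add(regimen)
    return sorted(nodes)
```

In this toy graph, querying the non-target gene G2 reaches both drugA and drugB through pathway P1, whereas the target gene G1 resolves only to drugA. Expanding drugA with the context {"stage": "IV"} keeps regimen R1 and drops R2. A production system would more likely delegate both traversals to a graph database query engine, as the claims suggest.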
Description
Information retrieval method, medium and equipment
Technical Field
The present invention relates to the field of information retrieval technologies, and in particular to an information retrieval method, medium and equipment.
Background
In scenarios such as drug research and development, clinical diagnosis and treatment, and medical research, drug information retrieval is a core basic step, and users need to obtain accurate drug-related data quickly. In the prior art, drug information retrieval relies mainly on keyword matching or simple association queries against a single database, which has several significant drawbacks.
First, most methods support only single-dimension retrieval by drug name or disease name, so they can hardly meet the cross-dimension gene-target-drug-disease association query requirements of precision medicine scenarios, and cannot provide integrated retrieval of multi-dimensional information.
Second, search results are presented as plain-text lists or isolated data items, without visual display of the association logic among elements such as drugs, genes, pathways, diseases and treatment schemes. Users must manually integrate data from multiple sources to work out the core logic chain, which is extremely inefficient.
In addition, existing methods do not fully integrate authoritative evidence such as clinical diagnosis and treatment guidelines. The associations between drugs, diseases and treatment schemes therefore lack support for their clinical rationality, and association strength is not quantified, so false associations or irrelevant information easily interfere with the results.
Furthermore, prior-art queries on non-target genes mostly return a "no related drugs" result and do not mine indirect drug associations at the gene pathway level, so they cannot meet the needs of researchers and clinicians exploring potential drug targets.
Therefore, how to support multi-dimensional keyword retrieval and integrate authoritative clinical evidence so as to improve the accuracy and reliability of information retrieval is an urgent problem to be solved.
Disclosure of Invention
To address the above technical problems, the technical solution adopted by the invention is an information retrieval method comprising the following steps:
S1, according to a retrieval request initiated by a user with at least one keyword among a substance name, a gene name or a disease name, determining at least one target substance node associated with the keyword from a pre-constructed substance relation knowledge graph, wherein the substance relation knowledge graph comprises at least preset substance nodes, target nodes, gene nodes, pathway nodes, disease nodes and treatment scheme nodes.
S2, according to the target substance node, retrieving corresponding data on marketed substances and/or substances under research from a preset substance information database, and generating a structured substance information table.
S3, outputting to the user a first search result page presenting the substance information table.
S4, in response to the user's selection of a target substance in the first search result page, acquiring detailed information of the target substance from the substance information database, wherein the detailed information comprises at least basic substance information, related substance information and related target information.
S5, performing association expansion in the substance relation knowledge graph centered on the substance node corresponding to the target substance, and extracting an association sub-graph containing the associated treatment scheme nodes.
S6, outputting to the user a second search result page presenting the association sub-graph and the detailed information.
The invention also provides a non-transitory computer readable storage medium in which at least one instruction or at least one program is stored, the at least one instruction or the at least one program being loaded and executed by a processor to implement the above information retrieval method.
The invention also provides an electronic device comprising a processor and the non-transitory computer readable storage medium described above.
The method has the following advantages. By supporting cross-dimensional retrieval with three types of keywords (substance name, gene name and disease name) and integrating multiple node types in the substance relation knowledge graph, the method meets the user's association query requirements in multiple scenarios, improves the comprehensiveness of the retrieval dimensions, and provides complete information without repeated cross-platform retrieval. By pre-constructing a substance relation knowledge graph containing multi-dimensional nodes and association relations, and combining the vis