CN-122024912-A - Method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows
Abstract
The invention relates to the technical field of cheminformatics, and in particular to a method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows. The method first constructs an FTO knowledge graph from chemical structure data and patent text data, establishing association relationships among molecular nodes, claim nodes and target nodes in the graph. Markush templates are then generated from the claims in the knowledge graph that contain structural formulas, an FTO distance vector is calculated for each molecule relative to each Markush template, and the Markush templates and distance vectors are written into the knowledge graph. On this basis, a target analysis workflow is constructed that drives a candidate molecule generation agent and an FTO evaluation agent to perform FTO risk evaluation and recommendation for candidate molecules. Finally, a causal analysis model is established from workflow logs, patent examination results and infringement judgment results, and is used to calibrate the aggregation mode of the distance vectors and the risk evaluation rules, achieving continuous optimization and traceable management of FTO analysis.
Inventors
- JIN XIA
- TIAN KAIGE
- LIU WEI
Assignees
- 杭州慧医道科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260128
Claims (10)
- 1. A method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows, characterized by comprising the following steps: constructing an FTO knowledge graph that establishes association relationships among molecular nodes, claim nodes and target nodes based on chemical structure data and patent text data; generating Markush templates based on the claims with structural formulas in the FTO knowledge graph, calculating an FTO distance vector of each molecule relative to each Markush template, and writing the Markush templates and the FTO distance vectors into the FTO knowledge graph; constructing a target analysis workflow based on the FTO knowledge graph and the FTO distance vectors, and invoking a candidate molecule generation agent and an FTO evaluation agent to obtain candidate molecule FTO risk evaluation results and recommendation results; and establishing a causal analysis model based on workflow logs, patent examination results and infringement judgment results, and calibrating the aggregation mode of the FTO distance vectors and the FTO risk evaluation rules according to the causal analysis model.
- 2. The method of claim 1, wherein constructing the freedom-to-operate (FTO) knowledge graph comprises: creating molecular nodes based on the chemical structure data, the molecular nodes recording molecular structural formulas and molecular identifiers; creating claim nodes based on the patent text data, the claim nodes recording claim text and publication numbers; and creating the association relationships among the molecular nodes, claim nodes and target nodes in a graph database.
- 3. The method of claim 1, wherein generating a Markush template based on a claim with a structural formula in the freedom-to-operate (FTO) knowledge graph comprises: identifying a structural formula reference tag in the claim text; extracting the structural formula image corresponding to the reference tag; converting the structural formula image into a molecular scaffold graph and a set of substituent positions by a structure recognition algorithm; and storing the molecular scaffold graph and the set of substituent positions as the Markush template.
- 4. The method of claim 1, wherein the freedom-to-operate (FTO) distance vector includes: a symbol constraint distance, calculated from the result of matching the substituent definitions in the Markush template against the molecular structure; an embedding space distance, calculated from the distance between a molecular vector generated by a molecular graph representation and the corresponding vector of the Markush template; and a structure edit distance, calculated from the minimum number of editing steps required for structure editing operations on the molecular graph to make it satisfy the definition of the Markush template.
- 5. The method of claim 1, wherein writing the Markush template and the freedom-to-operate (FTO) distance vector into the FTO knowledge graph comprises: representing the Markush template as a Markush template node; creating a relationship edge between the molecular node and the Markush template node; recording the FTO distance vector in an attribute of the relationship edge; and providing a query interface that reads the FTO distance vector from the relationship edge attribute.
- 6. The method of claim 1, wherein constructing the target analysis workflow based on the freedom-to-operate (FTO) knowledge graph and the FTO distance vectors comprises: receiving a target identifier; generating a target-associated molecule set based on the molecular nodes and claim nodes associated with the target node in the FTO knowledge graph; generating a workflow context and recording the target identifier and the target-associated molecule set by writing them into the workflow context; and configuring, in the workflow context, a call interface for accessing the FTO distance vectors.
- 7. The method of claim 6, wherein the candidate molecule generation agent generates a candidate molecule set by performing structure editing operations on the molecular structures corresponding to the molecular nodes in the workflow context, the structure editing operations including at least one of adding a substituent to the molecular scaffold, replacing a substituent on the molecular scaffold, and changing a chemical bond type on the molecular scaffold.
- 8. The method of claim 1, wherein the freedom-to-operate (FTO) evaluation agent, based on the candidate molecule set generated by the candidate molecule generation agent and the FTO distance vectors stored in the FTO knowledge graph, performs weighted aggregation of the FTO distance vectors of each candidate molecule with respect to a set of Markush templates to obtain the candidate molecule FTO risk evaluation result.
- 9. The method of claim 1, wherein the causal analysis model uses candidate molecule freedom-to-operate (FTO) risk evaluation results, candidate molecule recommendation results, patent examination results and infringement judgment results as nodes in a causal graph, and constructs statistical relationships among the candidate molecule FTO risk evaluation results, the patent examination results and the infringement judgment results based on the workflow logs.
- 10. The method of claim 9, wherein calibrating the aggregation mode of the freedom-to-operate (FTO) distance vectors and the FTO risk evaluation rules according to the causal analysis model comprises: adjusting the aggregation weight of each distance component in the FTO distance vector and the decision thresholds for the FTO risk evaluation results based on the deviations between the candidate molecule FTO risk evaluation results and the patent examination results and infringement judgment results, as output by the causal analysis model; and writing the adjusted aggregation weights and decision thresholds into a configuration library for use in subsequent executions of the target analysis workflow.
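The weighted aggregation and calibration recited in claims 4, 8 and 10 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the class and function names, the scaling of each component to the range [0, 1], the rule that a molecule is flagged by its closest template, and the use of a simple search over candidate thresholds in place of the causal analysis model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FtoDistanceVector:
    """The three components named in claim 4 (assumed scaled to [0, 1])."""
    symbol_constraint: float
    embedding_space: float
    structure_edit: float


def aggregate(vec, weights):
    """Weighted mean of the components; larger means farther from the template."""
    w_sym, w_emb, w_edit = weights
    total = w_sym + w_emb + w_edit
    weighted = (w_sym * vec.symbol_constraint
                + w_emb * vec.embedding_space
                + w_edit * vec.structure_edit)
    return weighted / total


def risk_level(vectors, weights, threshold):
    """Assumed rule: a molecule is high-risk if its closest template is under the threshold."""
    closest = min(aggregate(v, weights) for v in vectors)
    return "high" if closest < threshold else "low"


def calibrate_threshold(history, weights, candidates):
    """Pick the candidate threshold that agrees most often with recorded outcomes.

    history is a list of (vectors, infringed) pairs, where infringed is a bool
    taken from past examination or infringement results (hypothetical format).
    """
    def agreement(threshold):
        return sum(
            (risk_level(vectors, weights, threshold) == "high") == infringed
            for vectors, infringed in history
        )
    return max(candidates, key=agreement)


# Hypothetical data: one molecule close to a template, one mid-range, one far away.
close = FtoDistanceVector(0.0, 0.2, 0.1)
middle = FtoDistanceVector(0.5, 0.5, 0.5)
far = FtoDistanceVector(1.0, 0.8, 0.9)
weights = (2.0, 1.0, 1.0)

history = [([close], True), ([middle], False), ([far], False)]
best = calibrate_threshold(history, weights, candidates=[0.3, 0.6, 0.95])
print(best, risk_level([close, far], weights, best))
```

The aggregation weights could be searched in the same way as the threshold. The adjusted values would then be written to the configuration library mentioned in claim 10, which this sketch does not model.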
Description
Method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows
Technical Field
The invention relates to the technical field of cheminformatics, and in particular to a method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows.
Background
In innovative drug development, freedom-to-operate (FTO) analysis has become an important step in determining whether candidate molecules can enter clinical trials and reach the market. With the rapid growth in the number of patents worldwide, large numbers of structure patents using Markush claims overlap across multiple countries and legal systems. Manual retrieval and experience-based judgment struggle to identify the potential scope of coverage comprehensively and tend toward either omissions or excessive conservatism, which affects project decisions. Most existing information systems only support retrieval by document or structural fragment. They lack the ability to model chemical structures, claim text, targets and historical legal outcomes in a unified way, lack a measurable and comparable way to characterize FTO risk, and struggle to support continuous analysis from molecular design through to project approval decisions.
Disclosure of Invention
To address these problems in the prior art, the invention provides a method for constructing an FTO analysis knowledge base based on agents and workflows.
One or more embodiments of the present disclosure provide a method for constructing a freedom-to-operate (FTO) analysis knowledge base based on agents and workflows, including the following steps: constructing an FTO knowledge graph that establishes association relationships among molecular nodes, claim nodes and target nodes based on chemical structure data and patent text data; generating Markush templates based on the claims with structural formulas in the FTO knowledge graph, calculating an FTO distance vector of each molecule relative to each Markush template, and writing the Markush templates and the FTO distance vectors into the FTO knowledge graph; constructing a target analysis workflow based on the FTO knowledge graph and the FTO distance vectors, and invoking a candidate molecule generation agent and an FTO evaluation agent to obtain candidate molecule FTO risk evaluation results and recommendation results; and establishing a causal analysis model based on workflow logs, patent examination results and infringement judgment results, and calibrating the aggregation mode of the FTO distance vectors and the FTO risk evaluation rules according to the causal analysis model.
Compared with the prior art, the invention has the following advantages:
By constructing an FTO knowledge graph from molecular nodes, claim nodes and target nodes, the method expresses the associations among chemical structures, patent text and target information in a single graph model, which addresses the prior-art problem that structural data and legal information are scattered and difficult to analyze together.
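The single-graph model described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the graph database the disclosure refers to. The class name, node identifiers, relation names, the placeholder structure string and publication number, and the distance values are all hypothetical.

```python
class FtoKnowledgeGraph:
    """Tiny in-memory stand-in for a graph database (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"kind": ..., plus recorded attributes}
        self.edges = {}  # (src_id, dst_id) -> {"relation": ..., plus attributes}

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = {"kind": kind, **attrs}

    def add_edge(self, src, dst, relation, **attrs):
        # Refuse edges whose endpoints have not been created yet.
        for node_id in (src, dst):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[(src, dst)] = {"relation": relation, **attrs}

    def neighbors(self, node_id, kind):
        """Sorted ids of nodes of the given kind linked to node_id in either direction."""
        found = set()
        for src, dst in self.edges:
            if src == node_id:
                other = dst
            elif dst == node_id:
                other = src
            else:
                continue
            if self.nodes[other]["kind"] == kind:
                found.add(other)
        return sorted(found)

    def distance_vector(self, molecule_id, template_id):
        """Query interface: read the vector stored on the molecule-template edge."""
        return self.edges[(molecule_id, template_id)]["distance_vector"]


graph = FtoKnowledgeGraph()
graph.add_node("mol:1", "molecule", structure="CCO")  # placeholder structure
graph.add_node("claim:1", "claim", text="(claim text)",
               publication_number="EXAMPLE-0001")  # placeholder number
graph.add_node("target:HER2", "target")
graph.add_node("tmpl:1", "markush_template")

graph.add_edge("mol:1", "target:HER2", "acts_on")
graph.add_edge("claim:1", "mol:1", "covers")
graph.add_edge("claim:1", "target:HER2", "relates_to")
graph.add_edge("mol:1", "tmpl:1", "distance_to",
               distance_vector={"symbol_constraint": 0.0,
                                "embedding_space": 0.2,
                                "structure_edit": 2})

print(graph.neighbors("target:HER2", "molecule"))
print(graph.distance_vector("mol:1", "tmpl:1"))
```

Storing the distance vector as an edge attribute means a target analysis workflow can collect the molecules linked to a target and then read each molecule's distances to the relevant templates through the same graph.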
By extracting Markush templates from the claims with structural formulas and calculating multi-component freedom-to-operate (FTO) distance vectors, the method gives a quantifiable characterization of the coverage relationship between molecules and patents, overcoming the drawback that the size of the FTO space of different candidate molecules is hard to compare when it is judged mainly by experience.
By constructing a target analysis workflow on top of the knowledge graph and introducing a candidate molecule generation agent and an FTO evaluation agent, the method automates risk screening and recommendation from target to candidate molecules and reduces the workload of repeated manual novelty searches and comparisons.
By using workflow logs, patent examination results and infringement judgment results to establish a causal analysis model and periodically calibrating the distance aggregation mode and the evaluation rules, the method updates the FTO evaluation rules adaptively, so that evaluation results stay consistent with actual legal risk and remain traceable.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a HER2 FTO analysis flow in an embodiment of the invention;
FIG. 3 is a comparison chart of risk level hit rates before and after calibration according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the descrip