CN-122023068-A - LLM-based medical patent infringement risk multidimensional early warning method
Abstract
The method analyzes the user's risk attention level and configures risk dimension weights according to the decision scenario; acquires the technical features of a medical patent and a target product; parses the claims with BERT to generate a multidimensional substructure map; quantifies the technical features by combining atomic stability with the BERT attention matrix; evaluates evasion design (design-around) feasibility by calculating the bond energy perturbation values and distribution density of candidate sites; fuses the technical features with multi-source information to construct a multidimensional risk index set; and generates a multidimensional risk early warning result by weighted aggregation. By constructing the substructure map with BERT, quantifying contribution weights through atomic stability, calculating bond-energy-based evasion design feasibility, mapping and fusing multi-source data by means of molecular fingerprints, and performing weighted aggregation according to the scenario, the method eliminates subjective bias, uncovers microscopic structural differences, and achieves accurate, quantitative early warning of medical patent infringement risk.
Inventors
- JIN XIA
- TIAN KAIGE
- LIU WEI
Assignees
- 杭州慧医道科技有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20260202
Claims (10)
- 1. An LLM-based medical patent infringement risk multidimensional early warning method, characterized by comprising the following steps: S1, acquiring the technical features of a medical patent and of a target product to be evaluated, performing user stratification analysis on user interaction behavior data to determine a risk attention level, configuring a corresponding risk dimension weight table according to the decision scenario category, and parsing the patent claim text with BERT to extract chemical reaction descriptions and structural features, thereby generating a multidimensional substructure map; S2, based on the multidimensional substructure map, calculating atomic stability gradient values and adjacent-bond reactivity parameters, and quantifying the technical contribution weight of each structural site by combining the attention matrix obtained from the BERT analysis, to generate a technical feature key score; S3, based on the technical feature key score, calculating bond energy perturbation values of candidate substitution sites, screening the replaceable sites that meet a preset threshold, and computing their distribution density and energy cost to generate an evasion design feasibility score; S4, invoking the evasion design feasibility score, calculating the similarity between the target product and the patent reference structure with a molecular fingerprint algorithm, and performing feature mapping with external legal status data, market data and multi-source business intelligence data to construct a multidimensional risk index set; and S5, invoking the multidimensional risk index set and performing a weighted aggregation of the risk index data according to the risk dimension weight table, to generate a multidimensional risk early warning result adapted to the decision scenario and the user level.
- 2. The LLM-based medical patent infringement risk multidimensional early warning method of claim 1, wherein the multidimensional substructure map comprises reaction center atomic nodes, molecular skeleton topological edges and substituent three-dimensional parameters; the technical feature key score comprises atomic thermodynamic stability parameters, chemical bond reactivity indices and structural site importance weights; the evasion design feasibility score comprises the magnitude of change in bond dissociation energy, the spatial density of substitution sites and the reaction activation energy barrier; the multidimensional risk index set comprises Tanimoto similarity coefficients, patent legal status data and market competition intelligence data; and the multidimensional risk early warning result comprises a comprehensive risk rating, a decision confidence and a risk warning signal; the multidimensional risk index set further comprises a molecular similarity index and a semantic coupling index, which characterize patent relevance; an indication consistency factor and a target relevance factor, which characterize the degree of matching of drug properties; a family patent coverage factor, which characterizes the regional layout; and a legal stability factor, a market value factor, a technical quality factor, a strategic aggressiveness factor, a drug regulatory relevance factor, an evidence risk factor and a timeliness factor, which are obtained by mapping external data.
- 3. The LLM-based medical patent infringement risk multidimensional early warning method of claim 1, wherein the specific steps of S1 are as follows: S101, obtaining, from a pharmaceutical enterprise patent database, the technical features of the medical patent to be analyzed and of the target product to be evaluated, parsing and extracting the chemical structures, constructing an atomic node set and a bond connection set, comparing the atomic state parameters with the bond order parameters on the basis of a bond order judgment reference value, and converting the comparison data into matrix form to generate a structural bonding matrix; S102, invoking the structural bonding matrix, acquiring the interaction behavior data of the current user, analyzing the risk attention level to which the user belongs, indexing the corresponding risk dimension weight table according to the decision scenario category, parsing the feature sequence of the claim text with BERT, retrieving chemical reaction description fragments, mapping and comparing the fragment parameters against the bonding parameters of the structural bonding matrix, and screening the mapped vector groups according to a vector consistency judgment reference value to obtain a reaction association feature set; S103, invoking the reaction association feature set, cross-comparing it with the substituent parameter sets in the structural bonding matrix, performing a site index intersection operation to obtain a site association data set, and topologically serializing the data set to build the multidimensional substructure map.
- 4. The LLM-based medical patent infringement risk multidimensional early warning method of claim 3, wherein the bond order judgment reference value is a value determined from the statistics of the chemical bond types corresponding to the chemical structural general formulas disclosed in the pharmaceutical enterprise patent database; the vector consistency judgment reference value is a value determined from the similarity distribution of the feature vectors of the bonding parameters in the structural bonding matrix and the fragment parameters in the chemical reaction description fragments; and, in addition to patent relevance, the risk dimensions covered by the risk dimension weight table are quantitatively defined as follows: the technical quality dimension is a quantified value determined from the citation frequency of the patent and the coverage breadth of the IPC classification numbers to which it belongs; the strategic aggressiveness dimension is a quantified value determined from the regional coverage of the patentee's patent family layout and its infringement litigation history; the drug regulatory relevance dimension is a quantified value determined from the association flag between the patent and the drug registration approval documents or the patent linkage system; the evidence risk dimension is a quantified value determined from the concealment characteristics of the infringing act and the technical difficulty coefficient of evidence collection; and the timeliness dimension is a quantified value determined from the overlap interval between the patent application date and the R&D timeline of the target product.
- 5. The LLM-based medical patent infringement risk multidimensional early warning method of claim 3, wherein the specific steps of S2 are as follows: S201, retrieving the atomic node parameters from the multidimensional substructure map, performing attribution conversion on the bond order parameters according to the number of bond connections, performing attribution conversion on the valence electron occupancy ratios according to their occupancy distribution, and aggregating the two kinds of converted data by atomic index to generate an atomic steady-state factor set; S202, invoking the atomic steady-state factor set, discretely splitting the adjacent-bond reactivity parameters according to the number of adjacent bonds, combining the discrete activity values with the atomic steady-state factor set sequence by atomic index, and arranging the combined sequence into a matrix in node order to generate an atomic gradient mapping matrix; S203, invoking the atomic gradient mapping matrix, matching gradient sequences by the row and column indices of the attention matrix obtained from the BERT analysis, rearranging them by node index, performing weighted integration of the rearranged sequences with the weight distribution of the attention matrix, and calculating the mean integrated attention weight so as to quantify the technical contribution weight of each structural site and generate the technical feature key score.
- 6. The LLM-based medical patent infringement risk multidimensional early warning method of claim 5, wherein the specific steps of S3 are as follows: S301, invoking the technical feature key score, parsing the chemical bond types and bond energy attributes of the atomic nodes in the map, constructing a bond energy parameter set for the candidate substitution sites, calculating the site bond energy perturbation values, comparing them item by item with a preset bond energy perturbation threshold, and screening the sites whose values are below the threshold to generate a compliant site index set; S302, invoking the compliant site index set, adjusting the amplitude of the positions in the technical feature key score that belong to the index set, reassigning the weight elements that do not belong to the compliant site index set according to a normalization reference value, and aggregating the adjusted weight elements in vector order to obtain an adjusted weight distribution matrix; S303, invoking the adjusted weight distribution matrix, counting the spatial distribution density of the compliant sites in the matrix, calculating the energy cost from the reciprocal of the site bond energy perturbation value, and computing the weighted sum of the distribution density and the energy cost to obtain the evasion design feasibility score.
- 7. The LLM-based medical patent infringement risk multidimensional early warning method of claim 6, wherein the specific steps of S4 are as follows: S401, invoking the evasion design feasibility score, retrieving the adjusted weight matrix mapped to that score, comparing the weight parameters in the matrix item by item with the molecular structure bonding matrix of the target product to be evaluated, and marking the regions where the comparison is consistent to generate a weighted structure indication vector; S402, invoking the weighted structure indication vector, performing intersection and union operations, according to the molecular fingerprint algorithm, on the molecular fingerprint bit set of the target product to be evaluated and the technical features of the patented chemical structure, and applying normalized weighting to the molecular fingerprint bit set and the dimension parameters of the text semantic embedding vector to generate basic risk data; S403, invoking the basic risk data, obtaining legal status data, market data and multi-source business intelligence data, mapping them to the corresponding risk factors, and fusing each risk factor with the basic risk data to construct the multidimensional risk index set.
- 8. The LLM-based medical patent infringement risk multidimensional early warning method of claim 7, wherein obtaining the legal status data, market data and multi-source business intelligence data and mapping them to the corresponding risk factors means: obtaining, through a data interface, legal status data, market data, family patent layout data, citation data, drug regulatory registration data, patentee litigation data, drug clinical attribute data, infringement evidence feature data and R&D timeline data; and, using a preset mapping model, mapping the legal status data to the legal stability factor, the market data to the market value factor, the family patent layout data to the family patent coverage factor, the citation data to the technical quality factor, the drug regulatory registration data to the drug regulatory relevance factor, the patentee litigation data to the strategic aggressiveness factor, the drug clinical attribute data to the indication consistency factor and the target relevance factor, the infringement evidence feature data to the evidence risk factor, and the R&D timeline data to the timeliness factor, and then completing vector concatenation and normalization of these factors with the basic risk data.
- 9. The LLM-based medical patent infringement risk multidimensional early warning method of claim 7, wherein the specific steps of S5 are as follows: S501, invoking the risk feature vector group in the multidimensional risk index set, reading the risk dimension weight table configured according to the risk attention level and the decision scenario, extracting the weight coefficient of each dimension, and aggregating the numerical components of the vector group according to the weight coefficients to generate a feature hotspot matrix; S502, invoking the feature hotspot matrix, calculating, for the semantic embedding values of the patent text term sequence, the Euclidean distance between each embedding value and the vectors of the feature hotspot matrix, judging the distance against a semantic matching reference value, recording the positions where the Euclidean distance is smaller than the reference value, and serializing and integrating them to obtain a semantic consistency index sequence; and S503, according to the semantic consistency index sequence, sorting the risk vector value groups to which the sequence points, judging whether the sorted results fall within the warning interval according to the risk warning reference value corresponding to the decision scenario, and recording and aggregating the indices to obtain the multidimensional risk early warning result adapted to the decision scenario and the user level.
- 10. The LLM-based medical patent infringement risk multidimensional early warning method of claim 1, wherein analyzing the risk attention level to which the user belongs and indexing the corresponding risk dimension weight table according to the decision scenario category means: when the risk attention level is the strategic planning level, increasing the weight proportions of the market value factor, the strategic aggressiveness factor and the legal stability factor; when the risk attention level is the technical execution level, increasing the weight proportions of the technical feature key score and the evasion design feasibility score; and, on this basis, performing a second-level adjustment according to the decision scenario category, starting from the weight template of the current level: when the decision scenario category is investment screening, giving first priority to the weight proportions of the market value factor and the technical feature key score; when the decision scenario category is technical research and development, giving first priority to the weight proportions of the evasion design feasibility score and the technical feature key score; and when the decision scenario category is compliance review, giving first priority to the weight proportions of the legal stability factor and the technical feature key score.
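The quantitative steps recited in claims 6, 7 and 10, together with the weighted aggregation of S5, can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the patent. The function names, the subset of risk dimensions, the boost factors 1.5 and 2.0, and the readings of "distribution density" (share of candidate sites that pass the threshold) and "energy cost" (mean reciprocal of the perturbation values) are all assumptions made for illustration. Fingerprints are represented as plain sets of on-bit positions rather than the output of any particular cheminformatics library, and every indicator is assumed to be pre-scaled to [0, 1] with higher values meaning higher risk.

```python
from typing import Dict, List, Set

# Illustrative subset of the risk dimensions named in claims 2 and 10.
DIMENSIONS = (
    "molecular_similarity",
    "legal_stability",
    "market_value",
    "strategic_aggressiveness",
    "technical_key_score",
    "evasion_feasibility",
)

# Which weights each user level and decision scenario raises (claim 10).
LEVEL_BOOST = {
    "strategic_planning": ("market_value", "strategic_aggressiveness", "legal_stability"),
    "technical_execution": ("technical_key_score", "evasion_feasibility"),
}
SCENARIO_BOOST = {
    "investment_screening": ("market_value", "technical_key_score"),
    "technical_rd": ("evasion_feasibility", "technical_key_score"),
    "compliance_review": ("legal_stability", "technical_key_score"),
}


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0


def feasibility_score(perturbations: List[float], threshold: float,
                      w_density: float = 0.5, w_cost: float = 0.5) -> float:
    """Sketch of S301-S303 for positive bond energy perturbation values."""
    # S301: keep candidate sites whose perturbation is below the preset threshold.
    compliant = [p for p in perturbations if 0.0 < p < threshold]
    if not compliant:
        return 0.0
    # Assumption: distribution density = share of candidate sites that comply.
    density = len(compliant) / len(perturbations)
    # Assumption: energy cost = mean reciprocal of the compliant perturbations.
    cost = sum(1.0 / p for p in compliant) / len(compliant)
    # S303: weighted sum of density and energy cost (weights are hypothetical).
    return w_density * density + w_cost * cost


def weight_table(level: str, scenario: str,
                 level_factor: float = 1.5,
                 scenario_factor: float = 2.0) -> Dict[str, float]:
    """Two-stage weight adjustment (claim 10), normalized to sum to 1."""
    weights = {d: 1.0 for d in DIMENSIONS}
    for d in LEVEL_BOOST[level]:
        weights[d] *= level_factor
    for d in SCENARIO_BOOST[scenario]:
        weights[d] *= scenario_factor
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


def aggregate(indicators: Dict[str, float], weights: Dict[str, float]) -> float:
    """S5: weighted aggregation of indicators assumed pre-scaled to [0, 1]."""
    return sum(weights[d] * indicators[d] for d in DIMENSIONS)


if __name__ == "__main__":
    indicators = {d: 0.5 for d in DIMENSIONS}
    indicators["molecular_similarity"] = tanimoto({1, 2, 3, 4}, {3, 4, 5, 6})
    weights = weight_table("strategic_planning", "investment_screening")
    print(round(aggregate(indicators, weights), 3))
```

Normalizing the table after both adjustment stages keeps the aggregated value on the same [0, 1] scale as the individual indicators, so a single warning threshold per decision scenario remains comparable across user levels.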
Description
LLM-based medical patent infringement risk multidimensional early warning method
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an LLM-based medical patent infringement risk multidimensional early warning method.
Background
The field of patent analysis and drug research and development (R&D) decision support covers the deep mining and association analysis of global pharmaceutical patent information. By analyzing elements of the patent literature such as compound structure, preparation process, pharmacological action and legal status, it aims to provide an objective basis for project approval evaluation, technical route planning and market avoidance strategies in new drug R&D, and it identifies potential technical barriers and R&D opportunities by constructing a patent intelligence network.
The traditional medical patent infringement risk early warning method refers to evaluating the freedom-to-operate risk of a specific drug project during its R&D or marketing stage. A professional searcher retrieves a set of target patents from a database using chemical structural formulas and classification numbers; the independent and dependent claims are sorted out by manual reading; a feature comparison table containing all essential technical features is constructed; the active ingredient structure, crystal form, formulation or indication of the drug under analysis is compared manually, item by item, with the features of the retrieved patent claims; whether the drug project falls within the protection scope of the patent claims is judged according to the all-elements rule or the doctrine of equivalents; and the risk warning is completed by manually writing a due diligence report.
The prior art relies heavily on manually reading and sorting out the claims and constructing feature comparison tables. Screening is inefficient when processing massive numbers of patent documents and complex chemical structural general formulas; item-by-item judgment based purely on subjective experience easily misses key technical features or applies inconsistent standards; quantitative analysis of chemical bond energy perturbation and of the technical contribution of structural sites is lacking; the static, qualitative assessment mode can hardly integrate external legal status and market intelligence dynamically; and evasion design feasibility and the multidimensional risk distribution cannot be characterized objectively, so that new drug R&D decisions lack accurate data support and forward-looking guidance.
Disclosure of Invention
The invention aims to solve the technical problems of the prior art set out above: heavy reliance on manual claim reading and feature comparison tables; low screening efficiency for massive numbers of patent documents and complex chemical structural general formulas; missed key technical features and inconsistent judgment standards caused by purely subjective item-by-item assessment; the lack of quantitative analysis of chemical bond energy perturbation and of the technical contribution of structural sites; the difficulty of dynamically integrating external legal status and market intelligence in a static, qualitative assessment mode; and the inability to characterize evasion design feasibility and the multidimensional risk distribution objectively, which leaves new drug R&D decisions without accurate data support and forward-looking guidance.
In order to achieve the above purpose, the invention adopts an LLM-based medical patent infringement risk multidimensional early warning method, which comprises the following steps: S1, acquiring the technical features of a medical patent and of a target product to be evaluated, performing user stratification analysis on user interaction behavior data to determine a risk attention level, configuring a corresponding risk dimension weight table according to the decision scenario category, and parsing the patent claim text with BERT to extract chemical reaction descriptions and structural features, thereby generating a multidimensional substructure map; S2, based on the multidimensional substructure map, calculating atomic stability gradient values and adjacent-bond reactivity parameters, and quantifying the technical contribution weight of each structural site by combining the attention matrix obtained from the BERT analysis, to generate a technical feature key score; S3, based on the technical feature key score, calculating bond energy perturbation values of candidate substitution sites, screening the replaceable sites that meet a preset threshold, and computing their distribution density and energy cost to generate an evasion design feasibility score; S4, invoking the evasion design feasibility score, calculating the similarity between the target product and the patent reference structure by adopting a molecular fingerprint