CN-121998055-A - Causal federated meta-reinforcement learning intelligent decision-making method and system for Huiminbao claims

Abstract

The application discloses a causal federated meta-reinforcement learning intelligent decision-making method and system for Huiminbao claims, relating to the field of intelligent insurance claim settlement. The method comprises: constructing a regional causal knowledge graph from multidimensional policy text and adapting it to each region; adopting a causal federated meta-reinforcement learning framework in which local models are initialized from a meta-policy network and fine-tuned on small samples under encryption through federated learning, with data kept within its own domain, so as to adapt a local intention recognition model; performing intention recognition on a user claim consultation text by combining neuro-symbolic dual-track reasoning with causal chain matching over the knowledge graph; and training a decision tree and generating a visual report through counterfactual reasoning to assist decision-making. The application improves the accuracy of user intention recognition and the interpretability of rules, speeds up adaptation to dynamic policy updates, protects privacy by keeping data within its domain, enables small-sample learning, and thereby improves Huiminbao claim processing efficiency and customer satisfaction while reducing operating costs.

Inventors

MIAO WEI
WANG BUYI
CHANG BO
SUN ZHAOMIN
MA JIE
XING XIN
LI WENXIAO
SHEN YIHUI

Assignees

南京市智慧医疗投资运营服务有限公司

Dates

Publication Date: 20260508
Application Date: 20260107

Claims (10)

1. A causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims, characterized by comprising the following steps:
constructing a three-dimensional entity node set from policy text by using a causal knowledge graph engine, constructing an acyclic knowledge graph by pruning loops through association screening, time-sequence correction, and counterfactual verification, and performing regional adaptation on the acyclic knowledge graph by entity mapping to obtain a region-adapted causal knowledge graph;
constructing and training a meta-policy network, initializing a local adaptation network for each region from the trained meta-policy network, and, with data kept within its own domain, performing policy-update adaptation on the local adaptation network based on small-sample data through federated learning with encrypted fine-tuning, to obtain a policy-updated local intention recognition model;
acquiring and parsing a user claim consultation text, performing intention recognition on the user claim consultation text by neuro-symbolic dual-track reasoning, traversing the region-adapted causal knowledge graph to perform causal chain matching and conclusion integration, and outputting an intention recognition result;
training a decision tree on relevant features of historical claim cases, obtaining a routing decision result from the trained causal decision tree, constructing a causal path graph through CausalML counterfactual reasoning based on the intention recognition result, the routing decision result, and the region-adapted causal knowledge graph, and generating a visual report; and
making a claim settlement decision for the Huiminbao claim according to the visual report.
2. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 1, characterized in that, after the decision tree is trained on relevant features of historical claim cases, the routing decision result is obtained from the trained causal decision tree, the causal path graph is constructed through CausalML counterfactual reasoning based on the intention recognition result, the routing decision result, and the region-adapted causal knowledge graph, and the visual report is generated, the method further comprises: adaptively adjusting the causal path graph and the local adaptation network according to feedback results from a manual review platform.
3. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 1, characterized in that constructing the three-dimensional entity node set from the policy text by using the causal knowledge graph engine, constructing the acyclic knowledge graph by pruning loops through association screening, time-sequence correction, and counterfactual verification, and performing regional adaptation on the acyclic knowledge graph by entity mapping to obtain the region-adapted causal knowledge graph specifically comprises:
inputting the policy text into the causal knowledge graph engine to construct a three-dimensional causal entity set;
performing preliminary association screening on the three-dimensional causal entity set based on partial correlation coefficients to obtain a preliminarily screened causal entity set;
performing time-sequence association correction on the preliminarily screened causal entity set by using a time-sequence effect to obtain candidate causal links, wherein the time-sequence effect is the temporal match between the policy execution time and the occurrence time of the claim case;
performing counterfactual reasoning verification on the candidate causal links by simulating removal of the pre-existing condition node and tracking how the recognition accuracy of the claim rejection intention changes, to obtain an accuracy change amount;
pruning the candidate causal links by using the accuracy change amount to obtain pruned causal links;
calculating the confidence of the pruned causal links, selecting the causal links whose confidence exceeds a confidence threshold, and constructing the acyclic knowledge graph;
establishing, in layers, a basic causal model and regional extension models based on the acyclic knowledge graph and the differing policy terms of each region; and
establishing entity mapping associations between the basic causal model and the regional extension models by using an entity mapping table, and obtaining the region-adapted causal knowledge graph through adaptation to the different regions.
4. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 1, characterized in that constructing and training the meta-policy network, initializing the local adaptation network of each region from the trained meta-policy network, and, with data kept within its own domain, performing policy-update adaptation on the local adaptation network based on small-sample data through federated learning with encrypted fine-tuning to obtain the policy-updated local intention recognition model specifically comprises:
constructing the meta-policy network;
pre-training the meta-policy network with a general intention recognition strategy, based on policy commonality data and local regional claim data, to obtain the trained meta-policy network;
initializing the local adaptation network with the general strategy parameters output by the trained meta-policy network;
fine-tuning the parameters of the local adaptation network with a model-agnostic meta-learning algorithm, based on the newly added policy cases of each region;
encrypting the gradient parameters of the local adaptation network with the Paillier homomorphic encryption algorithm and uploading them;
receiving the encrypted gradient parameters of each region at a federated learning coordination node and aggregating them, the coordination node feeding the aggregated gradient parameters back to the local adaptation network of each region, to obtain an intention recognition model adapted to each region;
collecting the local regional claim data and the gradient parameters of each region;
storing the local regional claim data on the local node, encrypting the gradient parameters of each region with the Paillier homomorphic encryption algorithm, sharing only the encrypted gradients, and adding differential privacy noise to the encrypted gradients to obtain the encrypted gradient parameters;
optimizing the local adaptation network parameters through a causality-driven reward function that uses the intention recognition accuracy and a causal consistency score, to obtain optimized local adaptation network parameters;
acquiring policy update announcements from the medical insurance bureau website and manually uploaded newly added policy files;
extracting policy features from the policy update announcements and the manually uploaded newly added policy files by using a policy file parsing engine, to obtain a policy update feature vector; and
triggering the fine-tuning process of the local adaptation network with the policy update feature vector and updating the local intention recognition model, to obtain the policy-updated local intention recognition model.
5. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 4, characterized in that the meta-policy network is a 3-layer fully connected neural network, wherein the input layer dimension = policy feature vector dimension + intention label vector dimension, the hidden layer dimension = 512, and the output layer dimension = general intention recognition strategy parameter dimension.
6. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 1, characterized in that acquiring and parsing the user claim consultation text, performing intention recognition on the user claim consultation text by neuro-symbolic dual-track reasoning, traversing the region-adapted causal knowledge graph to perform causal chain matching and conclusion integration, and outputting the intention recognition result specifically comprises:
acquiring and parsing the user claim consultation text;
inputting the user claim consultation text into the local adaptation network, predicting the user intention through neural-track reasoning, and using a confidence threshold to single out low-confidence user intentions;
decomposing each low-confidence user intention into a core intention and sub-intentions by using a first-order predicate logic rule base;
traversing the region-adapted causal knowledge graph and matching the causal link corresponding to each sub-intention, to obtain a causal chain matching result; and
integrating conclusions according to the core intention and the causal chain matching result, and outputting the intention recognition result.
7. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 6, characterized in that the first-order predicate logic rule base is constructed by converting business logic into well-defined expressions using predicates, quantifiers, and logical connectives.
8. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 1, characterized in that training the decision tree on relevant features of historical claim cases, obtaining the routing decision result from the trained causal decision tree, constructing the causal path graph through CausalML counterfactual reasoning based on the intention recognition result, the routing decision result, and the region-adapted causal knowledge graph, and generating the visual report specifically comprises:
acquiring relevant features of historical claim cases;
constructing a decision tree;
training the decision tree with the relevant features of the historical claim cases;
inputting the intention recognition result, the routing decision result, and the region-adapted causal knowledge graph into CausalML for counterfactual inference, to generate a counterfactual conclusion;
constructing an interactive causal path graph with D3.js based on the counterfactual conclusion; and
generating a visual decision report from the interactive causal path graph.
9. The causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to claim 8, characterized in that the decision tree is constructed as follows:
taking the causal link complexity as the root node, wherein causal link complexity = causal link length × average causal strength;
performing a first split at the root node to generate a first branch node, wherein, if the causal link complexity is greater than a complexity threshold, the case is judged to be a high-risk complex case and is routed directly to the manual review branch without further judgment;
entering the next layer of the decision tree to generate a second branch node, wherein, if the causal link complexity is smaller than the complexity threshold and the single claim amount is greater than a claim amount threshold, the case still belongs to the manual review branch and needs no further judgment;
entering the next layer of the decision tree to generate a third branch node, wherein, if the causal link complexity is smaller than the complexity threshold and the single claim amount is not greater than the claim amount threshold, the case belongs to the automatic claim settlement branch; and
integrating the splitting rules of the root node, the first branch node, the second branch node, and the third branch node to construct the decision tree.
10. A computer system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims according to any one of claims 1 to 9.
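
The graph construction of claim 3 ends by keeping only high-confidence causal links and building a graph without loops. The following is a minimal sketch of that last step only, under stated assumptions: links are (cause, effect, confidence) tuples, the threshold value 0.8 is a placeholder, and loops are avoided by adding links in descending confidence and skipping any link that would close a cycle. That greedy rule is an illustrative choice; the claims do not prescribe how loops are removed beyond the screening, correction, and counterfactual verification steps. All names are hypothetical.

```python
from collections import defaultdict


def reaches(graph, start, target):
    """Return True if target can be reached from start along directed edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return False


def build_acyclic_graph(candidate_links, confidence_threshold=0.8):
    """Keep links at or above the threshold, skipping any that would form a loop.

    candidate_links: iterable of (cause, effect, confidence) tuples.
    The threshold default is an assumed placeholder, not a value from the patent.
    """
    graph = defaultdict(list)
    kept = []
    # Strongest links first, so a weaker link is the one dropped when a loop appears.
    for cause, effect, confidence in sorted(candidate_links, key=lambda l: -l[2]):
        if confidence < confidence_threshold:
            continue
        if reaches(graph, effect, cause):
            continue  # adding cause -> effect would close a cycle
        graph[cause].append(effect)
        kept.append((cause, effect, confidence))
    return kept


links = [
    ("special_drug_listed", "reimbursable", 0.95),
    ("reimbursable", "claim_approved", 0.90),
    ("claim_approved", "special_drug_listed", 0.85),  # would close a loop
    ("unrelated_factor", "claim_approved", 0.40),     # below the threshold
]
acyclic_links = build_acyclic_graph(links)
```

In this example only the first two links survive: the third is confident enough but would create a loop, and the fourth falls below the threshold.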
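
Claim 4 has each region encrypt its gradient parameters with the Paillier homomorphic encryption algorithm so that the coordination node can aggregate them without seeing plaintext. The sketch below illustrates the additive property this relies on, using a toy pure-Python Paillier scheme. Several things here are assumptions for illustration: the key is built from two small Mersenne primes and is not secure, gradients are encoded as fixed-point integers with an assumed scale, the regions are assumed to share the private key while the coordinator holds none, and the differential privacy noise step of claim 4 is omitted. A real deployment would use a vetted cryptographic library and a key of adequate size.

```python
import math
import secrets

# Toy key material: two Mersenne primes. Far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is g = n + 1
SCALE = 10**6         # assumed fixed-point scale for float gradients


def encrypt(value):
    """Encrypt one gradient component under the public key (N, g = N + 1)."""
    m = round(value * SCALE) % N  # negative values wrap modulo N
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    g_to_m = (1 + m * N) % N_SQ  # (N + 1)^m mod N^2
    return g_to_m * pow(r, N, N_SQ) % N_SQ


def decrypt(cipher):
    """Decrypt with the private key (LAM, MU) and undo the fixed-point encoding."""
    m = (pow(cipher, LAM, N_SQ) - 1) // N * MU % N
    if m > N // 2:  # map the upper half of the range back to negatives
        m -= N
    return m / SCALE


def aggregate(encrypted_vectors):
    """Coordinator side: multiplying ciphertexts adds the underlying plaintexts."""
    summed = []
    for column in zip(*encrypted_vectors):
        acc = 1
        for c in column:
            acc = acc * c % N_SQ
        summed.append(acc)
    return summed


# Three regions, each with a two-component gradient (made-up numbers).
region_gradients = [[0.12, -0.5], [0.08, 0.25], [-0.05, 0.1]]
uploads = [[encrypt(g) for g in vec] for vec in region_gradients]
summed_gradient = [decrypt(c) for c in aggregate(uploads)]
```

Decrypting the aggregate yields the component-wise sum of the regional gradients, here approximately 0.15 and -0.15, although the coordinator only ever handled ciphertexts.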
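
The layer dimensions stated in claim 5 can be made concrete with a small forward-pass sketch. It assumes that "3-layer" means one input, one hidden, and one output layer, and it uses a ReLU activation and random initialization, none of which the claim specifies. The example dimensions (8 policy features, 4 intention labels, 16 output parameters) and all function names are hypothetical.

```python
import random

HIDDEN_DIM = 512  # hidden layer dimension stated in claim 5


def init_layer(n_in, n_out, rng):
    """Random weights (one row per output unit) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def build_meta_policy_network(policy_dim, intent_dim, param_dim, seed=0):
    """Input dimension = policy feature dimension + intention label dimension."""
    rng = random.Random(seed)
    return [
        init_layer(policy_dim + intent_dim, HIDDEN_DIM, rng),
        init_layer(HIDDEN_DIM, param_dim, rng),
    ]


def forward(network, policy_features, intent_labels):
    """Concatenate both inputs, apply the hidden layer with ReLU, then the output layer."""
    x = list(policy_features) + list(intent_labels)
    hidden = [max(0.0, h) for h in dense(x, network[0])]  # ReLU is an assumption
    return dense(hidden, network[1])


network = build_meta_policy_network(policy_dim=8, intent_dim=4, param_dim=16)
strategy_params = forward(network, [0.1] * 8, [1, 0, 0, 0])
```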
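
The dual-track flow of claim 6 can be sketched as a simple control structure: a confident neural prediction is returned directly, while a low-confidence one is decomposed and matched against causal links. This is a simplified illustration only. The rule base is reduced to a dictionary lookup rather than first-order predicate logic, the causal knowledge graph is reduced to a mapping from sub-intention to causal link, and the threshold 0.85 and all names and data are hypothetical.

```python
def recognize_intent(neural_prediction, rule_base, causal_links, threshold=0.85):
    """Route a neural-track prediction through the symbolic track when confidence is low.

    neural_prediction: (intent_label, confidence) from the local adaptation network.
    rule_base: maps an intent label to (core_intention, [sub_intentions]); a stand-in
        for the first-order predicate logic rule base.
    causal_links: maps a sub-intention to its causal link; a stand-in for traversing
        the region-adapted causal knowledge graph.
    """
    intent, confidence = neural_prediction
    if confidence >= threshold:
        return {"intent": intent, "track": "neural", "links": []}

    # Symbolic track: decompose, then match one causal link per sub-intention.
    core, sub_intentions = rule_base.get(intent, (intent, []))
    matched = [causal_links[s] for s in sub_intentions if s in causal_links]
    return {"intent": core, "track": "symbolic", "links": matched}


rule_base = {
    "special_drug_query": (
        "special_drug_reimbursement",
        ["drug_in_catalog", "prescription_provided"],
    ),
}
causal_links = {
    "drug_in_catalog": ["drug_in_catalog", "eligible_for_reimbursement"],
    "prescription_provided": ["prescription_provided", "materials_complete"],
}

confident = recognize_intent(("special_drug_query", 0.95), rule_base, causal_links)
uncertain = recognize_intent(("special_drug_query", 0.40), rule_base, causal_links)
```

The confident prediction passes through the neural track unchanged, while the uncertain one is resolved to its core intention together with the two matched causal links.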
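
The routing rules of claim 9 can be illustrated with a short function. It assumes that the automatic claim settlement branch applies when the causal link complexity does not exceed the complexity threshold and the single claim amount does not exceed the claim amount threshold, and that a value exactly equal to a threshold is treated as not exceeding it. The threshold values and function names are hypothetical.

```python
def causal_link_complexity(causal_strengths):
    """Complexity = causal link length x average causal strength (claim 9)."""
    if not causal_strengths:
        return 0.0
    length = len(causal_strengths)
    return length * (sum(causal_strengths) / length)


def route_claim(causal_strengths, claim_amount, complexity_threshold, amount_threshold):
    """Return 'manual_review' or 'automatic_settlement' following the branch rules."""
    complexity = causal_link_complexity(causal_strengths)
    if complexity > complexity_threshold:
        return "manual_review"         # first branch: high-risk complex case
    if claim_amount > amount_threshold:
        return "manual_review"         # second branch: simple case, large amount
    return "automatic_settlement"      # third branch: simple case, small amount


# Assumed thresholds for illustration only.
COMPLEXITY_THRESHOLD = 2.5
AMOUNT_THRESHOLD = 10_000

complex_case = route_claim([0.9, 0.8, 0.7, 0.6], 3_000, COMPLEXITY_THRESHOLD, AMOUNT_THRESHOLD)
large_case = route_claim([0.5, 0.5], 20_000, COMPLEXITY_THRESHOLD, AMOUNT_THRESHOLD)
simple_case = route_claim([0.5, 0.5], 3_000, COMPLEXITY_THRESHOLD, AMOUNT_THRESHOLD)
```

Because the complexity test comes first, a complex case goes to manual review regardless of its amount, which matches the "without further judgment" wording of the first branch.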

Description

Causal federated meta-reinforcement learning intelligent decision-making method and system for Huiminbao claims

Technical Field

The application relates to the field of intelligent insurance claim settlement, and in particular to a causal federated meta-reinforcement learning intelligent decision-making method and system for Huiminbao claims.

Background

Huiminbao, as an inclusive commercial health insurance product, covers a broad population, varies widely in policy from region to region, and involves complex claim scenarios (more than 50 fine-grained intentions, including special-drug reimbursement, pre-existing condition determination, and cross-province medical treatment filing). Its claim processing efficiency and customer satisfaction directly affect the promotion and operation of the insurance product. Existing techniques for Huiminbao claim intention recognition and routing have the following core defects:

1. Broken causal logic in rules and poor policy adaptability. The prior art uses association-type knowledge graphs (such as GraphRAG dynamic graphs) or fixed rule engines, which can only encode entity-association relationships and cannot capture the causal constraints in policy clauses (for example, the necessary-condition relationship between special-drug eligibility and the required materials). When policies iterate (for example, a special-drug catalog is added or the rules for out-of-area medical treatment filing are adjusted), the original association rules readily fail: the intention recognition error rate is as high as 18%, and the response cycle for a policy update exceeds 48 hours, so dynamic adaptation requirements cannot be met.

2. A prominent conflict between multi-region data privacy and sample sparsity. Huiminbao policies differ markedly by region, and privacy protection requirements prevent the claim data of different regions from being pooled centrally. As a result, some regions (especially newly added pilot regions) have too few new cases, traditional incremental learning that depends on centralized big data fails completely, the intention recognition model generalizes poorly, and region-specific requirements are difficult to meet.

3. "Black box" intention recognition with low decision transparency. Existing intention recognition techniques based on graph neural networks (GNN) and deep learning can only output intention labels and routing results; they cannot explain why an intention was identified or why a routing path was selected. Manual review therefore lacks visual logical support, which easily leads to customer disputes: the dispute rate in the prior art reaches 15%-20%, and manual decision-making accounts for 40% of the total claim settlement time.

4. Homogeneous technical solutions and low barriers to innovation. Existing patents mostly apply GraphRAG, causal graphs, meta-reinforcement learning, and similar techniques in isolation. For example, some patents use causal graphs for insurance fraud detection but do not address claim intention recognition, and some use meta-reinforcement learning for robot control without addressing policy adaptation scenarios.

In summary, the prior art cannot resolve the four core conflicts of dynamic policy adaptation, data privacy protection, interpretable reasoning, and efficient decision-making, and therefore cannot achieve accurate, efficient, and transparent intention recognition and routing for Huiminbao claims.

Disclosure of Invention

The application aims to provide a causal federated meta-reinforcement learning intelligent decision-making method and system for Huiminbao claims that simultaneously improves the accuracy of user intention recognition and the interpretability of rules, speeds up adaptation to dynamic policy updates, protects privacy by keeping data within its own domain, and enables small-sample learning, thereby improving Huiminbao claim processing efficiency and customer satisfaction and reducing operating costs. To achieve the above object, the application provides the following solutions. A first aspect of the application provides a causal federated meta-reinforcement learning intelligent decision-making method for Huiminbao claims, comprising: constructing a three-dimensional entity node set from policy text by using a causal knowledge graph engine, constructing an acyclic knowledge graph by pruning loops through association screening, time-sequence correction, and counterfactual verification, and performing regional adaptation on the acyclic knowledge graph by entity mapping to obtain a region-adapted causal knowledge graph;