CN-121998089-A - Automatic causal structure generation method based on large language model semantic representation and logical reasoning
Abstract
The application discloses an automatic causal structure generation method based on large language model semantic representation and logical reasoning. The method comprises: obtaining input text; performing semantic encoding and clustering on the input text using a large language model to establish a candidate causal variable set; performing causal relation detection on the candidate causal variable set based on counterfactual intervention and the do operator; performing causal direction judgment to generate a directed acyclic causal graph that satisfies logical consistency; and, based on the generated directed acyclic causal graph, generating a natural language explanation using the large language model and performing closed-loop logical consistency verification. The method achieves automatic generation from natural language to causal structure: the causal variable set is extracted and constructed automatically from unstructured natural language text, which avoids the manual variable definition and reliance on domain-expert modeling required by existing causal modeling processes, and significantly reduces the labor cost and expertise threshold of causal structure construction.
Inventors
- WU TONG
- KAN ZONGTING
- HU XU
Assignees
- 杭州团好猫科技有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260122
Claims (10)
- 1. An automatic causal structure generation method based on large language model semantic representation and logical reasoning, characterized by comprising the following steps: acquiring an input text; performing semantic encoding and clustering on the acquired input text using a large language model to establish a candidate causal variable set; performing causal relation detection on the established candidate causal variable set based on counterfactual intervention and the do operator; performing causal direction judgment to generate a directed acyclic causal graph that satisfies logical consistency; and, based on the generated directed acyclic causal graph, generating a natural language explanation using the large language model and performing closed-loop logical consistency verification.
- 2. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 1, wherein performing semantic encoding and clustering on the acquired input text using a large language model to establish a candidate causal variable set comprises: performing semantic encoding on the input text using the large language model to obtain word-level or subword-level vector representations, sentence vectors, and the corresponding attention weight matrices, which represent the semantic association relations and context dependency structures among the semantic units in the text; clustering the semantic unit embedding vectors in the text and automatically inducing potential candidate causal variables from the semantic representation space; and performing preliminary screening and enhancement of the causal semantic validity of the candidate causal variables to establish the candidate causal variable set.
- 3. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 2, wherein clustering the semantic unit embedding vectors in the text and automatically inducing potential candidate causal variables from the semantic representation space comprises: clustering the semantic unit embedding vectors in the text to form semantic clusters that are relatively independent semantically and represent the set of potential candidate causal variables; mapping each semantic cluster into a candidate causal variable representation in the form of a logical predicate, expressed as $X_k = \mathrm{Event}(\mathrm{type}_k, \mathrm{attribute}_k, \mathrm{time}_k)$, where $\mathrm{type}_k$, $\mathrm{attribute}_k$, and $\mathrm{time}_k$ respectively denote the event type, attribute, and time characteristics obtained by statistics and induction over the co-occurring semantic units in the semantic cluster, and Event denotes the logical predicate form; and normalizing the candidate causal variables through semantic similarity measurement and word sense disambiguation rules, so as to merge semantically equivalent variable expressions and eliminate redundant variables introduced by synonyms or ambiguity.
- 4. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 3, wherein the algorithm for clustering the semantic unit embedding vectors in the text comprises at least one of the following: the density-based HDBSCAN algorithm and the k-means algorithm.
- 5. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 2, wherein the preliminary screening and enhancement of the causal semantic validity of the candidate causal variables comprises: locating the corresponding causal context region in the attention weight matrix based on a set of causal marker words in natural language; computing, for the semantic units located within the causal context region, a causal trigger function $T : S \to \mathbb{R}_{\ge 0}$, where $T$ is a causal semantic intensity function that maps a semantic unit to a non-negative real number and characterizes the extent to which the semantic unit assumes a causal role in natural language, $\mathbb{R}_{\ge 0}$ is the set of non-negative real numbers, and $S$ is the set of segments obtained by event-phrase segmentation; and retaining the candidate causal variables, or components thereof, whose semantic units have a value of $T$ above a preset threshold.
- 6. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to any one of claims 1-5, wherein performing causal relation detection on the established candidate causal variable set based on counterfactual intervention and the do operator comprises: expressing the established candidate causal variable set as $V = \{X_1, X_2, \dots, X_n\}$, where each $X_i$ is a valid causal variable retained after consistency screening; building a structural causal model by mapping the candidate causal variables to nodes of the structural causal model and establishing a parameterized structural equation for each node variable: $X_i = f_i(\mathrm{Pa}(X_i), U_i)$, where $f_i$ denotes the causal function generating the variable $X_i$, $\mathrm{Pa}(X_i)$ denotes the set of parent variables of $X_i$, $X_j$ denotes the j-th causal variable, and $U_i$ denotes an independent exogenous noise term; generating a structural causal graph from all structural equations: $G = (V, E)$, where V is the variable node set and E is the edge set, subject to the condition that G is a directed acyclic graph (DAG); performing intervention simulation based on the do operator by constructing, for any candidate causal edge $(X_i, X_j)$, a counterfactual intervention operation $\mathrm{do}(X_i = x_i')$, where $x_i'$ is the constant value assigned under the counterfactual intervention and replaces the structural equation of $X_i$; performing Pearl's three-step counterfactual computation in sequence: abduction, inferring the latent noise variables; action, executing the intervention $\mathrm{do}(X_i = x_i')$; and prediction, computing the post-intervention distribution of the variable $X_j$; and judging the existence of a causal influence: when the change in the distribution of the target variable before and after the intervention exceeds a preset threshold $\tau$, judging that a causal relation exists between $X_i$ and $X_j$, and outputting the result as a set of undirected candidate causal edges.
- 7. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 6, wherein constructing the counterfactual intervention operation further comprises: applying perturbations to the embedding vectors in the semantic representation space to form continuous do-intervention samples that assist in estimating the change in the target variable distribution before and after intervention.
- 8. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 6 or 7, wherein performing causal direction judgment to generate the directed acyclic causal graph satisfying logical consistency comprises: for each undirected candidate causal edge $(X_i, X_j)$, constructing the two counterfactual interventions $\mathrm{do}(X_i = x_i')$ and $\mathrm{do}(X_j = x_j')$, where $x_i'$ is the constant value assigned under counterfactual intervention on the intervention variable $X_i$ and $x_j'$ is the constant value assigned under counterfactual intervention on the observed target variable $X_j$; executing the three-step counterfactual reasoning for each intervention and comparing the asymmetry of the intervention effects to determine the causal direction; performing, for the determined direction, a conditional independence check between $X_i$ and $X_j$ given $Z$ to verify that the direction satisfies the d-separation constraint, where $Z$ is a candidate conditioning variable set that does not contain $X_i$ or $X_j$; and adding this constraint as a soft constraint term to the objective function of the causal graph structure optimization, so as to generate a directed acyclic causal graph that satisfies logical consistency.
- 9. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 8, wherein performing the causal direction judgment further comprises: using the continuous counterfactual intervention results formed by embedding perturbation as auxiliary evidence for the causal direction judgment.
- 10. The automatic causal structure generation method based on large language model semantic representation and logical reasoning according to claim 9, wherein generating a natural language explanation using the large language model and performing closed-loop logical consistency verification based on the generated directed acyclic causal graph comprises: based on the finally generated directed acyclic causal graph and the structural causal model, performing counterfactual path analysis on a target variable Y of interest to the user and its upstream causal paths, computing the change before and after intervention by executing counterfactual interventions, and generating, using the large language model, a natural language causal explanation text that satisfies the logical path constraints; converting the directed acyclic causal graph and the structural equation constraints into a first-order logic and satisfiability constraint set $\Phi$, and verifying whether $\Phi$ contains a contradiction using a propositional satisfiability (SAT) solver or a satisfiability modulo theories (SMT) solver; and, if a conflict is detected, backtracking to locate the conflicting edge or equation, returning to the causal direction judgment, and performing optimized reconstruction, expressed as $\mathrm{SAT}(\Phi) \in \{\mathrm{valid}, \mathrm{conflict}\}$ with $\Phi$ formed from $G$ and $F$, where SAT is a propositional satisfiability solver, $\Phi$ is the constraint set, $G$ is the final causal graph, $F$ is the constraint set of the structural equations, $G$ and $F$ together form the first-order logic and satisfiability constraint set, valid indicates that an assignment exists under which all constraints hold simultaneously, and conflict indicates that no such assignment exists, i.e., a logical contradiction occurs.
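The abduction, action, and prediction steps recited in claim 6, and the asymmetry comparison of claim 8, can be illustrated on a toy model. The following is a minimal sketch and not the claimed implementation: it assumes a hypothetical two-variable linear structural causal model (X := U_x, Y := beta * X + U_y), and the names `LinearSCM`, `counterfactual_shift`, `beta`, and `tau` are illustrative choices that do not appear in the application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LinearSCM:
    """Toy SCM assumed for illustration: X := U_x, Y := beta * X + U_y."""
    beta: float

    def abduct(self, x: float, y: float) -> Tuple[float, float]:
        # Step 1 (abduction): recover the exogenous noise terms that are
        # consistent with the observed pair (x, y).
        return x, y - self.beta * x

    def predict(self, u_x: float, u_y: float,
                do_x: Optional[float] = None,
                do_y: Optional[float] = None) -> Tuple[float, float]:
        # Steps 2 and 3 (action, prediction): an intervened variable has its
        # structural equation replaced by a constant; the recovered noise is
        # then propagated through the remaining equations.
        x = u_x if do_x is None else do_x
        y = self.beta * x + u_y if do_y is None else do_y
        return x, y


def counterfactual_shift(scm: LinearSCM, obs: Tuple[float, float],
                         do_x: Optional[float] = None,
                         do_y: Optional[float] = None) -> float:
    """Absolute change of the non-intervened variable under the intervention."""
    u_x, u_y = scm.abduct(*obs)
    x_cf, y_cf = scm.predict(u_x, u_y, do_x=do_x, do_y=do_y)
    if do_x is not None:
        return abs(y_cf - obs[1])  # effect of do(X) on Y
    return abs(x_cf - obs[0])      # effect of do(Y) on X


scm = LinearSCM(beta=2.0)
obs = (1.0, 3.0)   # observed (x, y)
tau = 0.5          # hypothetical preset threshold

shift_y = counterfactual_shift(scm, obs, do_x=0.0)  # intervene on X, watch Y
shift_x = counterfactual_shift(scm, obs, do_y=0.0)  # intervene on Y, watch X
print(shift_y, shift_x)  # 2.0 0.0
```

In this toy case, intervening on X shifts Y by 2.0, which exceeds the threshold, whereas intervening on Y leaves X unchanged, so the asymmetry favors the direction X → Y.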
Description
Automatic causal structure generation method based on large language model semantic representation and logical reasoning
Technical Field
The application relates to the technical field of artificial intelligence and causal reasoning, and in particular to an automatic causal structure generation method based on large language model semantic representation and logical reasoning.
Background
Current large language models, such as the GPT series, Gemini, and BERT, are capable of powerful contextual understanding, reasoning, and language generation, but their reasoning relies primarily on statistical inference and contextual correlation learned from large amounts of training text. They infer the answer to a question by analyzing patterns in the context. Because this reasoning is strongly influenced by the training data, it tends to over-attend to spurious correlations, and its predictive generalization to unseen data is only moderate. The underlying reason is that the reasoning of a large language model has no explicit causal structure: it does not explicitly express the relationships between variables as a causal chain, nor does it have built-in support for causal reasoning such as intervention and counterfactual reasoning. It can only generate answers based on the patterns and probabilities learned in training.
A causal reasoning system, by contrast, focuses on explicitly modeling the causal relations among the variables in a system; it can infer the causal links among the variables and show these relations clearly through structures such as a causal graph. Its purpose is to enable the model to understand causal mechanisms, not just associations.
The core advantages of a causal reasoning system are: its reasoning about causal mechanisms is independent of statistical correlation; it can perform intervention and counterfactual reasoning, and thereby simulate the results of different decisions; and constructing a causal graph makes the reasoning process more interpretable.
However, existing causal structure learning methods (such as the PC algorithm, LiNGAM, and NOTEARS) rely mainly on structured numerical data and require manual definition of node variables and independence testing. In unstructured natural language scenarios, causal variables exist implicitly in the form of semantic units and have no explicit numerical structure, so traditional causal reasoning methods are difficult to apply directly.
Disclosure of Invention
The application aims to provide an automatic causal structure generation method based on large language model semantic representation and logical reasoning, which solves one or more technical problems in the prior art, or at least provides a beneficial alternative or creates favorable conditions.
To achieve this purpose, the application adopts the following technical scheme:
The application provides an automatic causal structure generation method based on large language model semantic representation and logical reasoning, comprising the following steps: acquiring an input text; performing semantic encoding and clustering on the acquired input text using a large language model to establish a candidate causal variable set; performing causal relation detection on the established candidate causal variable set; performing causal direction judgment to generate a directed acyclic causal graph that satisfies logical consistency; and, based on the generated directed acyclic causal graph, generating a natural language explanation using the large language model and performing closed-loop logical consistency verification.
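The requirement that the generated causal graph be directed and acyclic can be checked mechanically. A minimal sketch using Kahn's topological-sort algorithm follows; it is a generic check assumed here for illustration rather than the procedure of the application, and the function name `is_acyclic` and the example variable names are hypothetical.

```python
from collections import deque
from typing import Hashable, Iterable, List, Tuple


def is_acyclic(nodes: List[Hashable],
               edges: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """Return True if the directed graph (nodes, edges) has no directed cycle."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Every node can be removed if and only if the graph contains no cycle.
    return removed == len(nodes)


# Hypothetical candidate causal variables and directed edges.
variables = ["rain", "wet_road", "accident"]
chain = [("rain", "wet_road"), ("wet_road", "accident")]
looped = chain + [("accident", "rain")]

print(is_acyclic(variables, chain))   # True
print(is_acyclic(variables, looped))  # False
```

A candidate edge whose addition makes this check fail would have to be rejected or re-oriented before the graph can serve as a structural causal graph.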
Further, performing semantic encoding and clustering on the acquired input text using a large language model to establish a candidate causal variable set comprises: performing semantic encoding on the input text using the large language model to obtain word-level or subword-level vector representations, sentence vectors, and the corresponding attention weight matrices, which represent the semantic association relations and context dependency structures among the semantic units in the text; clustering the semantic unit embedding vectors in the text and automatically inducing potential candidate causal variables from the semantic representation space; and performing preliminary screening and enhancement of the causal semantic validity of the candidate causal variables to establish the candidate causal variable set.
Further, clustering the semantic unit embedding vectors in the text and automatically inducing potential candidate causal variables from the semantic representation space comprises: clustering the semantic unit embedding vectors in the text to form semantic clusters that are relatively independent semantically and represent the set of potential candidate causal variables; mapping each semantic cluster into a candidate causal variable represen