CN-121684070-B - Automatic theorem proving method based on neural network guidance and quantum optimization
Abstract
The invention discloses an automatic theorem proving method based on neural network guidance and quantum optimization. The method comprises: extracting logic structure features of a propositional logic proving task; predicting applicability weights of natural deduction rules with a preconfigured neural network; executing a forward reasoning iterative process that dynamically identifies potential intermediate conclusions to determine the variable space; constructing a quadratic unconstrained binary optimization (QUBO) model from those potential intermediate conclusions, using the applicability weights during construction to adjust the penalty coefficients of the rule constraint terms and thereby reshape the energy landscape of the solution space; adaptively scheduling computational resources according to problem scale; mapping the model to a coherent Ising machine or a classical simulation backend for solving; and reconstructing the proof path through two-tier verification. The invention effectively addresses the blind rule selection of traditional methods and the inability of static optimization models to handle dynamic multi-step reasoning, improving proving efficiency and scalability.
Inventors
- WANG KESONG
- ZHANG XI
Assignees
- Zhongke Nanjing Artificial Intelligence Innovation Research Institute (中科南京人工智能创新研究院)
- Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Dates
- Publication Date
- 20260508
- Application Date
- 20260211
Claims (9)
- 1. An automatic theorem proving method based on neural network guidance and quantum optimization, characterized by comprising the following steps: acquiring a propositional logic proving task containing a target proposition formula to be proved and an axiom set, and extracting a logic structure feature vector reflecting the propositional logic proving task; inputting the logic structure feature vector into a preconfigured rule selection model to generate rule applicability weights for a natural deduction rule set; constructing a quadratic unconstrained binary optimization (QUBO) model based on the target proposition formula, the axiom set and the rule applicability weights, wherein the model comprises rule constraint terms, and the rule applicability weights are configured to adjust the penalty coefficients of the rule constraint terms so as to reshape the energy landscape of the solution space; mapping the quadratic unconstrained binary optimization model to a computational solution backend to execute quantum computation optimization and obtain a binary variable assignment minimizing the model energy; verifying the validity of the binary variable assignment, and reconstructing a logic proof path from the axiom set to the target proposition formula based on the valid binary variable assignment. The energy function H of the quadratic unconstrained binary optimization model is constructed as: H = H_axiom + H_goal + H_structure + λ_rule · Σ_r w_r · C_r; wherein H_axiom is an axiom constraint term forcing the variables corresponding to the axiom set to be true, H_goal is a target constraint term forcing the variable corresponding to the target proposition formula to be true, H_structure is a structural constraint term guaranteeing semantic consistency of the logical operators, λ_rule is a base rule penalty coefficient, w_r is the weight corresponding to the r-th rule in the rule applicability weights, and C_r is the rule constraint term corresponding to the r-th rule, configured to contribute non-zero penalty energy if and only if the rule's precondition variables are true and its conclusion variable is false.
- 2. The method of claim 1, wherein constructing the quadratic unconstrained binary optimization model comprises executing a forward reasoning iterative process to determine the variable set required by the model: initializing a current knowledge base containing the axiom set; traversing the natural deduction rule set and detecting whether a proposition combination in the current knowledge base satisfies the precondition of a given rule, and if so, deriving an intermediate proposition conclusion from that rule and adding it to the current knowledge base; repeating the detection and derivation steps until the current knowledge base no longer expands or contains the target proposition formula; and mapping all propositions accumulated in the current knowledge base to QUBO variables, constructing a corresponding rule constraint term for each derivation relation established during the forward reasoning iterative process.
- 3. The method of claim 1, wherein the rule constraint terms C_r include a constraint C_MP for modus ponens and a constraint C_MT for modus tollens, calculated respectively as: C_MP = x_P · x_{P->Q} · (1 - x_Q); C_MT = x_{P->Q} · x_{~Q} · (1 - x_{~P}); where x_P is the binary variable corresponding to the antecedent proposition P in the rule premise, x_{P->Q} is the auxiliary binary variable corresponding to the implication proposition P->Q, x_Q is the binary variable corresponding to the consequent proposition Q in the rule conclusion, x_{~Q} is the auxiliary binary variable corresponding to the negated consequent ~Q, and x_{~P} is the auxiliary binary variable corresponding to the negated antecedent ~P; each constraint is configured to generate an energy penalty of value 1 if and only if the combination of truth values of the related propositions violates the corresponding logical inference rule.
- 4. The method of claim 1, wherein mapping the quadratic unconstrained binary optimization model to the computational solution backend comprises adaptively scheduling heterogeneous computational resources according to model variable scale: counting the total number N of binary variables contained in the quadratic unconstrained binary optimization model; if N is smaller than a first preset threshold, mapping the model to a CPU-based classical simulator of a coherent Ising machine and solving its dynamical equations with a numerical integration algorithm; if N is larger than the first preset threshold and smaller than a second preset threshold, mapping the model to a GPU-based parallel coherent Ising machine simulator and executing massively parallel evolution on a single-instruction multiple-data architecture; and if N is greater than the second preset threshold, mapping the model to a physical coherent Ising machine, the QUBO model mapping naturally to an equivalent Ising model.
- 5. The method of claim 3, wherein the structural constraint terms comprise semantic consistency constraints for auxiliary variables introduced by logical operators, and the structural penalty functions C_struct for negation, conjunction and implication in propositional logic are constructed as: C_{~P} = (x_{~P} + x_P - 1)^2; C_{P∧Q} = (x_{P∧Q} - x_P · x_Q)^2; C_{P->Q} = (x_{P->Q} - 1 + x_P - x_P · x_Q)^2; wherein x_{P∧Q} is the auxiliary binary variable corresponding to the conjunctive proposition P∧Q; each structural penalty function is configured to generate a positive energy penalty whenever the value of an auxiliary binary variable disagrees with the truth-table definition of its logical operator, forcing the auxiliary binary variables to remain semantically correct in the energy-minimized state.
- 6. The method of claim 4, wherein prior to mapping the model to the computational solution backend, the coupling matrix Q of the quadratic unconstrained binary optimization model is stored and managed using a global memory resource optimization strategy, the strategy comprising storing the non-zero elements of the coupling matrix Q in compressed sparse row (CSR) format such that the storage space complexity S_mem satisfies: S_mem = O(N + N_nz); wherein N is the total number of binary variables in the model and N_nz is the number of non-zero elements in the coupling matrix Q; the computational solution backend is provided with a memory pool reuse mechanism that maintains and reuses the allocated sparse matrix memory blocks across multiple sampling tasks.
- 7. The method of claim 1, wherein extracting the logic structure feature vector reflecting the propositional logic proving task comprises computing multi-dimensional logic features to form the vector: computing basic statistical features, including the number of axioms, the total number of variables and the formula tree depth of the axiom set; computing operator distribution features, including the occurrence frequencies of implication, negation, conjunction and disjunction operators in the axiom set and the target proposition formula; computing inference pattern matching features, generating modus ponens potential and modus tollens potential indicators by detecting whether clause structures satisfying the premises of the inference rules exist in the axiom set; and computing variable association features, including the variable overlap ratio and the number of isolated variables between the axiom set and the target proposition formula.
- 8. The method of claim 1, wherein verifying the validity of the binary variable assignment comprises executing a hierarchical verification policy: performing energy constraint verification by calculating the total energy value of the binary variable assignment under the quadratic unconstrained binary optimization model, and judging the verification passed if the total energy equals zero; if the total energy is greater than zero, judging the energy constraint verification failed and triggering a backup verification process based on semantic models; the backup verification process comprises constructing the set M of truth assignment models that make all propositions in the axiom set true, and checking whether the target proposition formula is true under every truth assignment model in M: if so, the verification passes, otherwise it fails.
- 9. The method of claim 1, wherein the preconfigured rule selection model is obtained by supervised pre-training, the training objective being to minimize the difference between the predicted weight distribution and the actual application of rules in historical successful proof paths, with the training loss function L defined as: L = -Σ_i Σ_r [y_{i,r} · log(w_{i,r}) + (1 - y_{i,r}) · log(1 - w_{i,r})]; wherein i indexes the i-th proving task sample in the training data set, r indexes the r-th rule in the natural deduction rule set, y_{i,r} is a binary label taking the value 1 when the r-th rule is used in a successful proof path of the i-th task and 0 otherwise, and w_{i,r} is the applicability weight predicted by the model for the r-th rule on the i-th task; model parameters are updated during training by back-propagation until the loss function L converges.
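As an illustrative (non-claimed) sketch, the claim-1 energy function H = H_axiom + H_goal + H_structure + λ_rule · Σ_r w_r · C_r can be evaluated over a binary assignment as below. All names (`energy`, `mp_penalty`), the example weight 0.9, and the omission of H_structure are assumptions for demonstration only.

```python
# Minimal sketch of the claim-1 energy function
# H = H_axiom + H_goal + H_structure + lambda_rule * sum_r w_r * C_r.
# H_structure is omitted here for brevity; names and weights are illustrative.

def energy(assign, axioms, goal, rule_terms, lambda_rule=1.0):
    """assign: dict proposition -> 0/1; rule_terms: list of (w_r, C_r) pairs."""
    h_axiom = sum(1 - assign[a] for a in axioms)   # penalize any false axiom
    h_goal = 1 - assign[goal]                      # penalize a false goal
    h_rules = lambda_rule * sum(w * c(assign) for w, c in rule_terms)
    return h_axiom + h_goal + h_rules

def mp_penalty(p, imp, q):
    """C_MP = x_P * x_{P->Q} * (1 - x_Q): nonzero only on a modus ponens violation."""
    return lambda a: a[p] * a[imp] * (1 - a[q])

axioms = ["P", "P->Q"]
rule_terms = [(0.9, mp_penalty("P", "P->Q", "Q"))]
good = {"P": 1, "P->Q": 1, "Q": 1}   # consistent proof state
bad = {"P": 1, "P->Q": 1, "Q": 0}    # violates modus ponens
print(energy(good, axioms, "Q", rule_terms))  # 0.0
print(energy(bad, axioms, "Q", rule_terms))   # 1.9
```

A larger w_r raises the energy of assignments that skip rule r, which is how the applicability weights reshape the energy landscape.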
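The forward reasoning iteration of claim 2, which fixes the QUBO variable set, can be sketched as a standard forward-chaining closure. The `((premises), conclusion)` rule representation is an assumption for illustration.

```python
# Sketch of the claim-2 forward reasoning iteration that determines the
# QUBO variable set: every derivable proposition becomes a variable.

def forward_close(axioms, rules, goal, max_iter=100):
    """rules: list of ((premise, ...), conclusion) pairs."""
    kb = set(axioms)
    for _ in range(max_iter):
        new = {concl for prem, concl in rules
               if set(prem) <= kb and concl not in kb}
        if not new or goal in kb:   # no expansion, or goal already derived
            break
        kb |= new
    return kb

# Chained deduction: A and A->B give B, then B and B->C give C.
rules = [(("A", "A->B"), "B"), (("B", "B->C"), "C")]
kb = forward_close({"A", "A->B", "B->C"}, rules, "C")
print(sorted(kb))  # ['A', 'A->B', 'B', 'B->C', 'C']
```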
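The claim-3 rule constraints can be checked exhaustively over all truth assignments; each penalty should equal 1 on exactly the assignment that violates its inference rule. A minimal sketch:

```python
# Truth-table check of the claim-3 rule constraints.
from itertools import product

def c_mp(x_p, x_imp, x_q):
    """Modus ponens violation: P true, P->Q true, but Q false."""
    return x_p * x_imp * (1 - x_q)

def c_mt(x_imp, x_nq, x_np):
    """Modus tollens violation: P->Q true, ~Q true, but ~P false."""
    return x_imp * x_nq * (1 - x_np)

violations = [bits for bits in product((0, 1), repeat=3) if c_mp(*bits)]
print(violations)  # [(1, 1, 0)] - the single forbidden assignment
```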
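The adaptive scheduling of claim 4 amounts to a threshold dispatch on the variable count N. The threshold values 1000 and 20000 below are assumptions for illustration, not values from the patent.

```python
# Illustrative backend dispatch for claim 4's adaptive resource scheduling.

def choose_backend(n_vars, t1=1000, t2=20000):
    if n_vars < t1:
        return "cpu-cim-simulator"   # numerical integration of CIM dynamics
    if n_vars < t2:
        return "gpu-cim-simulator"   # SIMD massively parallel evolution
    return "coherent-ising-machine"  # physical hardware, QUBO -> Ising model

print(choose_backend(500))    # cpu-cim-simulator
print(choose_backend(5000))   # gpu-cim-simulator
print(choose_backend(50000))  # coherent-ising-machine
```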
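The structural penalties of claim 5 can likewise be verified against the operators' truth tables: the penalty is zero exactly when the auxiliary variable takes the truth-table value.

```python
# Truth-table check of the claim-5 structural penalty functions.
from itertools import product

def c_neg(x_np, x_p):
    return (x_np + x_p - 1) ** 2

def c_and(x_pq, x_p, x_q):
    return (x_pq - x_p * x_q) ** 2

def c_imp(x_imp, x_p, x_q):
    return (x_imp - 1 + x_p - x_p * x_q) ** 2

# For every (P, Q), the truth-table value of P->Q incurs zero penalty.
ok = all(c_imp(1 if (not p or q) else 0, p, q) == 0
         for p, q in product((0, 1), repeat=2))
print(ok)  # True
```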
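The CSR storage of claim 6 keeps N+1 row pointers plus a value and a column index for each of the N_nz non-zero entries, giving O(N + N_nz) space. A stdlib-only sketch (a production implementation might instead use `scipy.sparse.csr_matrix`):

```python
# Minimal CSR encoding of a QUBO coupling matrix, per claim 6.

def to_csr(dense):
    values, cols, rowptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                cols.append(j)
        rowptr.append(len(values))   # running count of non-zeros per row
    return values, cols, rowptr

Q = [[0, 2, 0],
     [0, 0, -1],
     [3, 0, 0]]
values, cols, rowptr = to_csr(Q)
print(values, cols, rowptr)  # [2, -1, 3] [1, 2, 0] [0, 1, 2, 3]
```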
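A toy version of the claim-7 feature extractor is sketched below. The single-letter variable convention, the `~` and `->` operator spellings, and the particular feature subset are illustrative assumptions.

```python
# Toy sketch of the claim-7 logic structure feature vector.

def logic_features(axioms, goal):
    text = " ".join(axioms) + " " + goal
    variables = {c for c in text if c.isupper()}
    goal_vars = {c for c in goal if c.isupper()}
    axiom_vars = {c for a in axioms for c in a if c.isupper()}
    return {
        "n_axioms": len(axioms),        # basic statistics
        "n_vars": len(variables),
        "n_implies": text.count("->"),  # operator distribution
        "n_neg": text.count("~"),
        "var_overlap":                  # variable association
            len(goal_vars & axiom_vars) / max(len(goal_vars | axiom_vars), 1),
    }

print(logic_features(["P", "P->Q"], "Q"))
# {'n_axioms': 2, 'n_vars': 2, 'n_implies': 1, 'n_neg': 0, 'var_overlap': 0.5}
```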
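The backup (semantic) verification of claim 8 checks that the goal holds in every truth assignment satisfying all axioms. Representing formulas as callables over an assignment dict is an assumption made for brevity.

```python
# Sketch of the claim-8 backup semantic verification.
from itertools import product

def semantic_backup(axioms, goal, variables):
    """True iff the goal holds in every model of the axiom set."""
    models = [m for bits in product((0, 1), repeat=len(variables))
              for m in [dict(zip(variables, bits))]
              if all(ax(m) for ax in axioms)]
    return bool(models) and all(goal(m) for m in models)

# P together with P->Q entails Q:
axioms = [lambda m: m["P"] == 1, lambda m: not m["P"] or m["Q"]]
print(semantic_backup(axioms, lambda m: m["Q"] == 1, ["P", "Q"]))  # True
```

Enumerating all 2^n assignments is exponential, which is why claim 8 uses it only as a fallback after the zero-energy check.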
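The claim-9 loss is a standard binary cross-entropy summed over tasks i and rules r. The `eps` guard against log(0) below is an added numerical assumption.

```python
# The claim-9 cross-entropy loss for the rule selection model.
import math

def rule_selection_loss(y, w, eps=1e-12):
    """y[i][r]: 0/1 label (rule r used in task i's proof); w[i][r]: predicted weight."""
    return -sum(
        y[i][r] * math.log(w[i][r] + eps)
        + (1 - y[i][r]) * math.log(1 - w[i][r] + eps)
        for i in range(len(y)) for r in range(len(y[0]))
    )

y = [[1, 0]]
w = [[0.9, 0.1]]
print(round(rule_selection_loss(y, w), 4))  # 0.2107
```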
Description
Automatic theorem proving method based on neural network guidance and quantum optimization
Technical Field
The invention belongs to the intersection of artificial intelligence and quantum computation, and in particular relates to an automatic theorem proving method based on neural network guidance and quantum optimization.
Background
Automated theorem proving (ATP) is a cornerstone of formal methods, providing rigorous mathematical guarantees for computer science, integrated circuit design and artificial intelligence. By automatically deducing logical conclusions algorithmically, ATP can verify the security of operating system kernels, check for logic flaws in chip designs and support complex reasoning in intelligent systems. As software systems and hardware architectures grow more complex, the industry's need for efficient proof techniques that can handle large-scale propositions becomes increasingly pressing. Current mainstream ATP technology relies mainly on symbolic reasoning based on the resolution principle and on SAT solvers based on DPLL/CDCL (conflict-driven clause learning) algorithms. The former resolves clauses by repeatedly applying logic rules; the latter finds satisfying assignments by combining depth-first search with conflict learning strategies. In addition, early optimization methods attempted to convert the logic problem into a constraint satisfaction problem (CSP) or a plain Boolean satisfiability problem and solve it with traditional heuristics or early simulated annealing algorithms. Existing research focuses on optimizing a single symbolic search strategy or improving low-level variable branching heuristics.
Specifically, traditional symbolic reasoning lacks a global intuition of the solution space, resulting in blind rule selection and exponential amounts of ineffective search, while existing optimization-based methods (such as static QUBO encodings) cannot adapt to the dynamics of logical reasoning and struggle with chained problems that require multi-step deduction to generate new intermediate conclusions. The prior art lacks a mechanism for mapping the dynamic process of logical reasoning into a static physical energy model in advance, and lacks an effective means of converting the heuristic experience of artificial intelligence into a guiding force on the energy of the physical solving process. When facing deep logical deduction problems, existing quantum or classical optimization methods often fail to converge because the variable space is undefined or the search direction is lost.
Disclosure of Invention
The invention aims to provide an automatic theorem proving method based on neural network guidance and quantum optimization, so as to solve at least one of the problems in the prior art.
According to one aspect of the application, an automatic theorem proving method based on neural network guidance and quantum optimization comprises the following steps: acquiring a propositional logic proving task containing a target proposition formula to be proved and an axiom set, and extracting a logic structure feature vector reflecting the propositional logic proving task; inputting the logic structure feature vector into a preconfigured rule selection model to generate rule applicability weights for a natural deduction rule set; constructing a quadratic unconstrained binary optimization model based on the target proposition formula, the axiom set and the rule applicability weights, wherein the model comprises rule constraint terms, and the rule applicability weights are configured to adjust the penalty coefficients of the rule constraint terms so as to reshape the energy landscape of the solution space; mapping the quadratic unconstrained binary optimization model to a computational solution backend to execute quantum computation optimization and obtain a binary variable assignment minimizing the model energy; verifying the validity of the binary variable assignment, and reconstructing a logic proof path from the axiom set to the target proposition formula based on the valid binary variable assignment. The beneficial effect of the method is that it effectively overcomes the blind rule selection of conventional methods and the difficulty static optimization models have with dynamic multi-step reasoning, improving proving efficiency and scalability.
Drawings
Fig. 1 is a schematic diagram of the overall flow of the automatic theorem proving method based on neural network guidance and quantum optimization. Fig. 2 is a schematic flow chart of the process of constructing the quadratic unconstrained binary optimization model. Fig. 3 is a flow chart of computing multi-dimensional logic features to form the logic structure feature vector. Fig. 4 is a schematic diagram of a process flow for perfor