
CN-121981263-A - Knowledge-enhanced large language model reasoning method based on experiential knowledge

CN-121981263-A

Abstract

The application relates to a knowledge-enhanced large language model reasoning method based on experiential knowledge. The method comprises: constructing a knowledge base by obtaining reasoning problems and their corresponding problem-solving experience, and refining that experience into structured knowledge items; extracting the reasoning structure of each problem in a structure abstraction process; generalizing the experience into reusable solving, judging, and correcting patterns in a pattern generalization process; retrieving knowledge by using the reasoning structure of a new problem to retrieve related items; and then selecting a final item set in a diversity reranking step, providing strategically diverse context for reasoning. By grounding retrieval in the problem's core reasoning structure and adopting a diversity-aware retrieval mechanism, the method not only improves the reasoning accuracy of the LLM but also enables the LLM to learn and evolve from practice without retraining.

Inventors

  • ZENG WEIXIN
  • ZHOU MINGJUN
  • ZHAO XIANG
  • HUANG HONGBIN
  • WU JIBING
  • LIU LIHUA
  • TANG JIUYANG

Assignees

  • National University of Defense Technology (中国人民解放军国防科技大学)

Dates

Publication Date
2026-05-05
Application Date
2026-01-15

Claims (7)

  1. A knowledge-enhanced large language model reasoning method based on experiential knowledge, characterized by comprising the following steps: acquiring original reasoning problems covering subject knowledge in the education field; constructing a knowledge base by obtaining the problem-solving experience corresponding to each reasoning problem and refining that experience into structured knowledge items, wherein a prompt converts the original reasoning problem into its reasoning structure, and a pattern generalization process refines the experience through a solving stage, a judging stage, and an iterative correction stage: the solving stage generates an initial solution to the reasoning problem, the judging stage critically examines the given solution and produces a structured feedback tuple, and the iterative correction stage generates a refined solution from the problem, the flawed solution, and its corresponding feedback; a diversity reranking step then selects a final item set to provide strategically diverse context for reasoning; the knowledge enhancement comprises presenting the retrieved solving patterns, which supply verified solving methods, as a pattern set in the prompt, providing in-context examples for the model and thereby raising the probability of generating a correct answer on the first attempt; and outputting the solution answer to the new problem.
  2. The knowledge-enhanced large language model reasoning method as claimed in claim 1, wherein the solving stage, given a reasoning problem q, generates an initial solution s₀ = (r, a), where r is a reasoning trace and a is a candidate answer: s₀ = LLM(p_solve ⊕ q), where ⊕ denotes string concatenation, LLM is the large language model, and p_solve is a predefined solving prompt; the judging stage critically examines a given solution s = (r, a), where r is the previously generated reasoning path and a is the previously generated candidate answer, and produces a structured feedback tuple f = (c, v), where c is a critique in natural language and v is a binary quality score: f = LLM(p_judge ⊕ q ⊕ s), where p_judge is a predefined judging prompt; if the solution is judged defective, i.e. v = 0, the iterative correction stage generates a refined solution s′ from the problem q, the defective solution s, and its corresponding feedback f: s′ = LLM(p_correct ⊕ q ⊕ s ⊕ f), where p_correct is a predefined correction prompt; the judge-correct cycle is repeated until an acceptable solution is generated or a preset maximum number of attempts is reached, and the record of the whole cycle forms a complete reflection trajectory that serves as the raw material for structured knowledge.
  3. The knowledge-enhanced large language model reasoning method as claimed in claim 2, wherein the pattern generalization process further refines the reusable strategies in the trajectory into a pattern set P = (p_sol, p_jud, p_cor), these being a solving pattern, a judging pattern, and a correcting pattern respectively, which together form a single knowledge item; the reflection trajectory is thereby converted into a structured knowledge item.
  4. The knowledge-enhanced large language model reasoning method as claimed in claim 3, wherein for a given problem q and its complete experience trajectory T, T being the solve-judge-correct trajectory, the original problem is first converted into its reasoning structure σ using an abstraction prompt p_abs: σ = LLM(p_abs ⊕ q); subsequently, the specific experience T is refined into cognitive patterns: for each stage of the cognitive process, comprising the solving stage, the judging stage, and the correcting stage, a specialized generalization prompt generates the corresponding pattern; the generated pattern set P is then paired with the reasoning structure and stored as a single entry in the knowledge base: e = (σ, P); when a new entry is added, it is checked for similarity against existing entries on the basis of its reasoning structure σ, preventing redundancy.
  5. The knowledge-enhanced reasoning method based on experiential knowledge as claimed in claim 4, wherein the knowledge retrieval step ensures that the most appropriate knowledge is retrieved by selecting knowledge items that combine high relevance with strategic diversity, and specifically comprises: first, to establish relevance to the input query and achieve accurate matching beyond surface-level terms, applying the structure abstraction process to extract the underlying reasoning structure σ_q of the query; then, using this abstract representation, performing a similarity search over the knowledge base entries to retrieve an initial candidate set C, all of whose candidates are highly logically relevant; a carefully chosen set of knowledge items is generated, each knowledge item encapsulating a complete set of patterns for solving, judging, and correcting; specifically, the retrieved solving patterns guide the solving stage, the judging patterns guide the judging stage, and the correcting patterns provide examples for the correction stage.
  6. The knowledge-enhanced reasoning method as claimed in claim 5, wherein selecting a final optimal subset S from the pool of related entries comprises: adopting the MMR (Maximal Marginal Relevance) algorithm to select the final k items, wherein MMR iteratively selects the candidate d that maximizes the following objective function: argmax_{d ∈ C \ S} [ λ · sim(σ_q, d) − (1 − λ) · max_{d′ ∈ S} sim(d, d′) ], where σ_q is the abstracted query, d is a candidate entry to be examined, S is the set of already selected items, and sim is a similarity function; the first term prioritizes relevance, the second term penalizes redundancy, and the hyperparameter λ balances this trade-off.
  7. The knowledge-enhanced reasoning method as claimed in claim 6, wherein the knowledge enhancement step comprises solving, judging, and correcting stages; the solving step presents the retrieved solving patterns as in-context guidance; then, in the knowledge-enhanced judging stage, the judging patterns summarize experience of common pitfalls and guide the critique; finally, the correction stage provides effective repair strategies for specific error types, a set of correcting patterns being used to guide the model to systematically diagnose and repair defects; the steps described above progressively refine the solution from the initial candidate to the final answer.
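The solve-judge-correct loop of claims 1-2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `call_llm` is a hypothetical stand-in for any LLM API, and the prompt templates are placeholders for the patent's predefined prompts p_solve, p_judge, and p_correct.

```python
from dataclasses import dataclass, field

SOLVE_PROMPT = "Solve step by step, then give a final answer:\n"
JUDGE_PROMPT = "Critique the solution below. Reply 'OK' if correct, else explain the flaw:\n"
CORRECT_PROMPT = "Revise the flawed solution using the critique:\n"

def call_llm(prompt: str) -> str:
    # Placeholder: in practice this would call a large language model service.
    return "OK" if prompt.startswith(JUDGE_PROMPT) else "trace: ... answer: 42"

@dataclass
class Trajectory:
    """Record of the whole cycle: the raw material for structured knowledge."""
    problem: str
    steps: list = field(default_factory=list)  # (solution, critique, accepted) per attempt

def solve_judge_correct(problem: str, max_attempts: int = 3) -> Trajectory:
    traj = Trajectory(problem)
    solution = call_llm(SOLVE_PROMPT + problem)  # solving stage: initial solution
    for _ in range(max_attempts):
        # judging stage: critique plus a binary quality score
        critique = call_llm(JUDGE_PROMPT + problem + "\n" + solution)
        accepted = critique.strip().startswith("OK")
        traj.steps.append((solution, critique, accepted))
        if accepted:
            break
        # iterative correction stage: refine from problem, flawed solution, feedback
        solution = call_llm(CORRECT_PROMPT + problem + "\n" + solution + "\n" + critique)
    return traj
```

The loop terminates either on an accepted solution or after the preset maximum number of attempts, and the accumulated trajectory is what the pattern generalization process later distills into knowledge items.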
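Claims 3-5 pair each pattern triple with the reasoning structure of its source problem and retrieve by structure rather than surface wording. A sketch under stated assumptions: the word-overlap (Jaccard) similarity and the 0.9 deduplication threshold are illustrative stand-ins, since the patent does not fix a concrete similarity function here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KnowledgeItem:
    structure: str         # abstract reasoning structure of the source problem
    solve_pattern: str     # reusable solving strategy
    judge_pattern: str     # reusable critique strategy
    correct_pattern: str   # reusable repair strategy

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

class KnowledgeBase:
    def __init__(self):
        self.items: list[KnowledgeItem] = []

    def add(self, item: KnowledgeItem, dedup_threshold: float = 0.9):
        # Claim 4: check new entries against existing structures to prevent redundancy.
        if all(jaccard(item.structure, e.structure) < dedup_threshold for e in self.items):
            self.items.append(item)

    def retrieve(self, query_structure: str, k: int = 5) -> list[KnowledgeItem]:
        # Claim 5: match on the abstracted reasoning structure, not surface terms.
        ranked = sorted(self.items,
                        key=lambda e: jaccard(query_structure, e.structure),
                        reverse=True)
        return ranked[:k]
```

Because matching happens on the abstracted structure, two "train and speed" word problems with different underlying mathematics (a linear system versus a unit conversion) land in different entries, addressing the surface-similarity noise discussed in the description.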
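The diversity reranking of claim 6 can be illustrated with a plain Maximal Marginal Relevance loop. The word-overlap similarity below is an assumption for self-containment; any sim function over abstracted structures can be substituted.

```python
def mmr_select(query: str, candidates: list[str], k: int, lam: float = 0.7) -> list[str]:
    """Iteratively pick the candidate maximizing
    lam * sim(query, d) - (1 - lam) * max_{d' in selected} sim(d, d')."""
    def sim(a: str, b: str) -> float:
        sa, sb = set(a.lower().split()), set(b.lower().split())
        return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

    selected: list[str] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(d: str) -> float:
            # First term rewards relevance; second penalizes redundancy
            # against items already selected.
            redundancy = max((sim(d, s) for s in selected), default=0.0)
            return lam * sim(query, d) - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With λ near 1 the selection degenerates to pure relevance ranking; lowering λ trades relevance for the strategic diversity the claim calls for, so a near-duplicate of an already selected entry loses to a less similar but non-redundant one.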

Description

Knowledge-enhanced large language model reasoning method based on experiential knowledge

Technical Field

The invention belongs to the field of educational knowledge question answering, and particularly relates to a knowledge-enhanced large language model reasoning method based on experiential knowledge.

Background

Recent advances in large language models (LLMs) have demonstrated their excellent capabilities in complex reasoning and problem solving. In the field of educational knowledge question answering, covering subjects taught in primary and secondary schools and universities, large language models have been widely applied. However, once deployed, their parameters are essentially fixed, and owing to the great computational cost of training and fine-tuning, it is difficult for them to exploit past experience (problems, successes, errors, and the like) at the reasoning stage. They treat each problem instance in isolation and reason from first principles, often re-deriving the same patterns and repeatedly making the same mistakes. Inspired by human cognition, in which past experience is continuously internalized into persistent, reusable memory, one promising paradigm is to dynamically distill a compact memory bank (reusable strategies, solution sketches, code fragments, and the like) from the LLM's own operational history, and to augment the LLM with these memories at new reasoning stages without updating its frozen parameters. Specifically, existing methods work by building a continuously evolving library of past problem-solving experience. When faced with new problems, these methods retrieve relevant memories from the library and supply them as contextual examples to guide LLM reasoning, thus letting the model learn from practice. While this experience-learning paradigm represents a significant advance, its existing implementations still present significant problems.
First, existing methods are limited in the memory content they build. The memory bank structure is too simple, and its content is limited to prior question-answer (QA) pairs. These QA pairs, by their nature, lack generalizability and may not be effective for new problems. More importantly, beyond simple QA, a large language model can obtain further knowledge from past experience by self-evaluating answer quality and proposing potential correction directions, and these knowledge fragments can be consulted when facing new questions. Furthermore, these methods do not take full advantage of memory: they retrieve relevant memory content based on surface semantic similarity between new queries and past experiences, which may retrieve problems that are similar in surface features but radically different in reasoning structure or solution procedure. For example, two mathematical questions may both mention "train" and "speed", yet one requires solving a linear system of equations while the other is a simple unit conversion. In such cases, retrieval based on surface-form similarity may introduce significant noise and even mislead the LLM into erroneous inferences.

Disclosure of Invention

While large language models have been successful in complex reasoning, they treat each problem instance separately, generating solutions from scratch. In the prior art, past experience is stored as episodic memory so that the LLM can reuse gradually accumulated memories at the reasoning stage, realizing the shift from isolated reasoning to language agents with experience awareness. However, the memories these methods store are primarily specific question-answer pairs that lack generalization capability, and they are mainly searched by surface-form similarity during reasoning, often failing to provide experience with a truly similar reasoning structure.
To address these limitations, the present application builds on cognitive structure migration theory and aims to enhance LLM reasoning with knowledge summarized from past experience. Specifically, the application is directed to problems in the education field, including subject problems of primary and secondary schools, universities, and the like, in areas including but not limited to engineering and literature. It constructs a knowledge base of generalizable patterns, where each pattern is derived from the LLM's complete self-reflection on a past example, comprising its solution, judgment, and correction, and is associated with the core reasoning structure of the source problem. The application also provides an efficient and discriminative knowledge retriever: by matching the reasoning structure of a new problem against the structures in the knowledge base, it retrieves highly relevant and diverse patterns, thereby providing robust guidance for the LLM's reasoning process. In order to achieve the above object, the knowledge enhancement method based on experience knowled