CN-121981249-A - Large model reasoning enhancement method combining knowledge graph embedding and gated residual connection
Abstract
The invention discloses a large model reasoning enhancement method combining knowledge graph embedding and gated residual connection, relating to the technical fields of artificial intelligence and natural language processing. The method obtains initial knowledge embeddings through knowledge graph embedding learning and constructs a knowledge unit representation library through semantic space mapping; it performs knowledge demand identification on text processing training data to obtain a query triple, retrieves a candidate knowledge set from the knowledge graph, and constructs enhanced training samples; the enhanced samples are input into a large language model integrating a knowledge-gated residual connection module, which performs dynamic gated fusion of the current-layer hidden state and the knowledge unit representations to obtain an enhanced hidden state after knowledge injection; a prediction result is generated from the enhanced hidden state, a joint loss function is determined, and the model parameters are optimized to obtain a reasoning-enhanced large language model. The invention achieves deep alignment and dynamic regulation between knowledge semantics and language representations, improving reasoning accuracy and factual consistency.
Inventors
- Bai Kuntai
- Tan Qiao
- Zhang Miaozhi
- Gong Mengchun
- Shi Wenzhao
Assignees
- 神州医疗科技股份有限公司
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2025-12-04
Claims (10)
- 1. A large model reasoning enhancement method combining knowledge graph embedding and gated residual connection, characterized by comprising the following steps: S1, performing knowledge graph embedding learning on a knowledge graph to obtain initial knowledge embeddings, and performing semantic space mapping on the initial knowledge embeddings to obtain a knowledge unit representation library aligned with the hidden space of a language model; S2, performing knowledge demand recognition on text processing training data to obtain a query triple, retrieving in the knowledge graph according to the query triple to obtain a candidate knowledge set, and constructing enhanced training samples based on the candidate knowledge set (retrieval sketched after the claims); S3, inputting the enhanced training samples into a large language model integrating a knowledge-gated residual connection module, and performing dynamic gated fusion of the current-layer hidden state and the knowledge unit representations to obtain an enhanced hidden state after dynamic knowledge injection; S4, performing forward propagation on the enhanced hidden state to generate a text processing prediction result, determining a joint loss function from the prediction result, and optimizing the model parameters of the large language model according to the joint loss function to obtain a reasoning-enhanced large language model.
- 2. The large model reasoning enhancement method combining knowledge graph embedding and gated residual connection of claim 1, wherein S1 specifically comprises: performing complex-domain embedding learning on the entity set and relation set of the knowledge graph using a rotational embedding method to obtain complex-valued embeddings; and, in the semantic space mapping, linearly transforming the initial knowledge embeddings through a lightweight projection network to obtain a knowledge unit representation library whose dimension matches the hidden space of the language model (see the embedding sketch after the claims).
- 3. The large model reasoning enhancement method combining knowledge graph embedding and gated residual connection of claim 1, wherein the dynamic gated fusion comprises: concatenating the current-layer hidden state with the knowledge unit representation; linearly transforming the concatenated representation with a learnable gating parameter matrix and bias vector; generating a gating weight with a Sigmoid activation function; multiplying the gating weight element-wise with the knowledge unit representation to obtain a weighted knowledge injection signal; and adding the knowledge injection signal to the current-layer hidden state through a residual connection to obtain the enhanced hidden state after dynamic knowledge injection (see the gating sketch after the claims).
- 4. The large model reasoning enhancement method combining knowledge graph embedding and gated residual connection of claim 1, wherein the joint loss function is a weighted combination of a language modeling loss and a knowledge consistency regularization loss; the language modeling loss is determined by computing the cross entropy between the text processing prediction result and the ground-truth labels; and the text processing prediction result is encoded to obtain a prediction encoding representation, the knowledge unit representations are aggregated to obtain a knowledge aggregation representation, the L2 distance between the prediction encoding representation and the knowledge aggregation representation is computed, and this L2 distance serves as the knowledge consistency regularization loss (see the loss sketch after the claims).
- 5. A large model reasoning enhancement system combining knowledge graph embedding and gated residual connection, characterized by comprising a knowledge alignment module, a knowledge construction module, a gated fusion module, and a loss optimization module; the knowledge alignment module is configured to perform knowledge graph embedding learning on the knowledge graph to obtain initial knowledge embeddings, and to perform semantic space mapping on the initial knowledge embeddings to obtain a knowledge unit representation library aligned with the hidden space of the language model; the knowledge construction module is configured to perform knowledge demand recognition on text processing training data to obtain a query triple, to retrieve in the knowledge graph according to the query triple to obtain a candidate knowledge set, and to construct enhanced training samples based on the candidate knowledge set; the gated fusion module is configured to input the enhanced training samples into a large language model integrating a knowledge-gated residual connection module, and to perform dynamic gated fusion of the current-layer hidden state and the knowledge unit representations to obtain the enhanced hidden state after dynamic knowledge injection; and the loss optimization module is configured to generate a text processing prediction result through forward propagation of the enhanced hidden state, to determine a joint loss function from the prediction result, and to optimize the model parameters of the large language model according to the joint loss function to obtain a reasoning-enhanced large language model.
- 6. The large model reasoning enhancement system combining knowledge graph embedding and gated residual connection of claim 5, wherein the knowledge alignment module specifically: performs complex-domain embedding learning on the entity set and relation set of the knowledge graph using a rotational embedding method to obtain complex-valued embeddings; and, in the semantic space mapping, linearly transforms the initial knowledge embeddings through a lightweight projection network to obtain a knowledge unit representation library whose dimension matches the hidden space of the language model.
- 7. The large model reasoning enhancement system combining knowledge graph embedding and gated residual connection of claim 5, wherein the dynamic gated fusion comprises: concatenating the current-layer hidden state with the knowledge unit representation; linearly transforming the concatenated representation with a learnable gating parameter matrix and bias vector; generating a gating weight with a Sigmoid activation function; multiplying the gating weight element-wise with the knowledge unit representation to obtain a weighted knowledge injection signal; and adding the knowledge injection signal to the current-layer hidden state through a residual connection to obtain the enhanced hidden state after dynamic knowledge injection.
- 8. The large model reasoning enhancement system combining knowledge graph embedding and gated residual connection of claim 5, wherein the joint loss function is a weighted combination of a language modeling loss and a knowledge consistency regularization loss; the language modeling loss is determined by computing the cross entropy between the text processing prediction result and the ground-truth labels; and the text processing prediction result is encoded to obtain a prediction encoding representation, the knowledge unit representations are aggregated to obtain a knowledge aggregation representation, the L2 distance between the prediction encoding representation and the knowledge aggregation representation is computed, and this L2 distance serves as the knowledge consistency regularization loss.
- 9. A computer device comprising a processor coupled to a memory, the memory having stored therein at least one computer program that is loaded and executed by the processor to cause the computer device to implement the method of any of claims 1-4.
- 10. A computer readable storage medium having stored therein at least one computer program that is loaded and executed by a processor to cause a computer to implement the method of any one of claims 1 to 4.
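Illustrative sketches (non-limiting)
The complex-domain rotational embedding and semantic space mapping of claims 2 and 6 can be sketched as follows. This is a minimal PyTorch-style illustration assuming a RotatE-style rotation in the complex plane; the class names, the distance-based scoring function, and all dimensions are assumptions, not the patented implementation.

```python
import torch
import torch.nn as nn

class RotationalKGEmbedding(nn.Module):
    """Complex-domain rotational embedding: a relation acts as a rotation
    in the complex plane, so for a true triple (h, r, t), t ~ h rotated by r."""
    def __init__(self, num_entities: int, num_relations: int, dim: int):
        super().__init__()
        self.ent_re = nn.Embedding(num_entities, dim)      # real parts
        self.ent_im = nn.Embedding(num_entities, dim)      # imaginary parts
        self.rel_phase = nn.Embedding(num_relations, dim)  # rotation angles

    def score(self, h, r, t):
        # Rotate the head entity by the relation phase and compare with the
        # tail; smaller distance (larger score) means a more plausible triple.
        hr, hi = self.ent_re(h), self.ent_im(h)
        tr, ti = self.ent_re(t), self.ent_im(t)
        cos, sin = torch.cos(self.rel_phase(r)), torch.sin(self.rel_phase(r))
        rot_re = hr * cos - hi * sin
        rot_im = hr * sin + hi * cos
        return -torch.sqrt((rot_re - tr) ** 2 + (rot_im - ti) ** 2 + 1e-9).sum(-1)

class KnowledgeProjector(nn.Module):
    """Lightweight projection network: linearly maps an initial knowledge
    embedding into the language model's hidden space, producing entries of
    the knowledge unit representation library."""
    def __init__(self, kg_dim: int, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * kg_dim, hidden_dim)  # [real; imag] -> hidden

    def forward(self, re_part, im_part):
        return self.proj(torch.cat([re_part, im_part], dim=-1))
```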
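Step S2 of claim 1 (knowledge retrieval and enhanced sample construction) admits many realizations. A minimal sketch follows, in which the wildcard matching, the linear scan over triples, and the [KNOWLEDGE]/[TEXT] serialization format are all assumptions; the claims require only that an enhanced sample be constructed from the candidate knowledge set.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]                               # (head, relation, tail)
Query = Tuple[Optional[str], Optional[str], Optional[str]]  # None acts as a wildcard

def retrieve_candidates(kg: List[Triple], query: Query) -> List[Triple]:
    """Match a query triple against the graph. A linear scan stands in for
    whatever retrieval index the patented method actually uses."""
    h, r, t = query
    return [(H, R, T) for (H, R, T) in kg
            if (h is None or H == h)
            and (r is None or R == r)
            and (t is None or T == t)]

def build_enhanced_sample(text: str, candidates: List[Triple]) -> str:
    """Serialize the candidate knowledge set and prepend it to the training text."""
    knowledge = " ; ".join(f"{h} {r} {t}" for h, r, t in candidates)
    return f"[KNOWLEDGE] {knowledge} [TEXT] {text}"
```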
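The dynamic gated fusion of claims 3 and 7 maps almost directly onto code. A minimal sketch, assuming the knowledge unit representation has already been projected to the language model's hidden dimension and broadcast to the token positions:

```python
import torch
import torch.nn as nn

class KnowledgeGatedResidual(nn.Module):
    """Gated residual connection for dynamic knowledge injection:
    concatenate, gate with Sigmoid, weight the knowledge, add residually."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        # learnable gating parameter matrix W_g and bias vector b_g
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, hidden: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # hidden, knowledge: (batch, seq_len, hidden_dim)
        g = torch.sigmoid(self.gate(torch.cat([hidden, knowledge], dim=-1)))
        injection = g * knowledge   # element-wise weighted knowledge injection signal
        return hidden + injection   # residual addition -> enhanced hidden state
```

Because the gate is computed from both the hidden state and the knowledge, the injection strength varies per token: a gate near zero leaves the hidden state essentially unchanged for general descriptive text, while a gate near one injects knowledge strongly for text that depends on background knowledge.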
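The joint objective of claims 4 and 8 combines a cross-entropy language modeling loss with an L2 knowledge consistency regularizer. In the sketch below, the mean-pooling aggregation, the reg_weight hyperparameter, and the tensor shapes are assumptions; the claims specify only the cross entropy, an aggregation of the knowledge unit representations, and the L2 distance.

```python
import torch
import torch.nn.functional as F

def joint_loss(logits, labels, pred_encoding, knowledge_units, reg_weight=0.1):
    """Weighted joint objective.
    logits:          (batch, seq_len, vocab)  language model predictions
    labels:          (batch, seq_len)         ground-truth token ids
    pred_encoding:   (batch, d)               encoded prediction representation
    knowledge_units: (batch, k, d)            retrieved knowledge unit representations
    """
    # language modeling loss: cross entropy against the ground-truth labels
    lm_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
    # knowledge consistency regularizer: L2 distance between the prediction
    # encoding and the aggregated (here: mean-pooled) knowledge representation
    knowledge_agg = knowledge_units.mean(dim=1)
    consistency = torch.norm(pred_encoding - knowledge_agg, p=2, dim=-1).mean()
    return lm_loss + reg_weight * consistency
```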
Description
Large model reasoning enhancement method combining knowledge graph embedding and gated residual connection
Technical Field
The invention relates to the technical fields of artificial intelligence and natural language processing, and in particular to a large model reasoning enhancement method combining knowledge graph embedding and gated residual connection.
Background
In the field of text processing, large language models still have fundamental shortcomings on complex tasks that require deep semantic understanding and logical reasoning. Taking long-document summarization as an example, the model must identify key entities, understand causal relationships between events, and verify factual consistency, yet existing models often omit core information for lack of structured knowledge support, or fail to build cross-paragraph logic chains in multi-hop question answering. Worse, the opaque reasoning paths inside the model make the generated content hard to trace, and factual hallucination means the model may fabricate details never mentioned in the text; these defects are especially critical in high-precision scenarios such as legal document analysis or technical document processing. The core contradiction is that although the model has strong language representation ability, it cannot establish a deep fusion mechanism with external structured knowledge, so knowledge utilization remains inefficient and uncontrollable.
The prior art mainly mitigates these problems with three kinds of schemes. The first is the retrieval-augmented generation architecture, in which a vector retriever extracts relevant fragments from a knowledge base and splices them into the input context to supplement knowledge during generation. The second is the knowledge graph embedding method, which encodes entity relations into low-dimensional vectors through geometric transformations, capturing structured knowledge implicit in the text. The third is the fixed gating mechanism, which introduces static residual connections into a pre-trained language model and regulates information flow through learnable parameters. These techniques provide basic knowledge injection for text processing tasks and have some effect on simple question answering or fact completion.
However, these approaches expose significant limitations in complex text processing scenarios. In the retrieval-augmented generation framework, retrieval is decoupled from generation, fine-grained alignment between retrieval results and language representations is difficult to achieve, and retrieval noise directly pollutes the generated content, causing summaries to deviate from the core semantics of the source text. The vector space of knowledge graph embedding methods is distributionally heterogeneous with the hidden semantic space of large language models; the geometric properties of the embedded representations are incompatible with the Transformer attention mechanism, and direct splicing or weighted fusion captures only shallow associations and cannot support deep logical reasoning.
The fixed gating structure cannot dynamically adjust the strength of knowledge injection according to the semantic requirements of the input text: excessive injection of structured knowledge causes redundant interference when processing general descriptive text, while insufficient injection causes reasoning failures when processing professional text that depends heavily on background knowledge, and the static architecture struggles to adapt to the diversity and dynamics of text processing tasks. Together, these drawbacks prevent the prior art from simultaneously achieving generation accuracy, logical rigor, and computational efficiency. In view of the foregoing, the prior art fails to overcome the dual bottleneck of deep knowledge fusion and adaptive regulation in text processing, and an innovative architecture that performs knowledge structure alignment, dynamic gating, and joint optimization inside the model is needed to support highly trusted, traceable, and interpretable text generation and reasoning tasks.
Disclosure of Invention
Aiming at the defects of the prior art, and at the problems of low reasoning accuracy and factual hallucination caused by the lack of a dynamic deep fusion mechanism for structured knowledge in complex reasoning tasks of large language models, the invention provides a large model reasoning enhancement method combining knowledge graph embedding and gated residual connection, as follows: 1) In a first aspect, the invention provides a large model reasoning enhancement method combining knowledge graph embedding and gated residual connection, which comprises the following steps: