
CN-116166779-B - Multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention

CN 116166779 B

Abstract

The invention discloses a multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention, relating to the technical field of knowledge base question answering. The model comprises a retrieval source construction module, a question representation module, an IR-MH model reasoning module, and an answer generation module. It addresses shortcomings of the prior art, which retrieves a question-specific subgraph from a knowledge graph and then runs a ranking algorithm to reach an answer step by step from the topic entity. On the basis of a retrieval-based multi-hop knowledge base question-answering framework, an intermediate reasoning attention mechanism is introduced: when generating the reasoning instruction for each hop, more attention is paid to the parts of the original question not yet resolved by the previous hops, and the attention weights are tied to the reasoning state of the previous hop. The reasoning states of intermediate steps can thus be fully exploited, the reasoning instruction of each step can be dynamically updated, tight interaction between intermediate reasoning states and reasoning instructions is promoted, and effective attention feedback is provided for optimizing the intermediate-step reasoning instructions.

Inventors

  • ZHANG WENBIN
  • WANG KAI
  • ZHANG HAN
  • SHE JIAJU
  • WANG YI
  • LIU CHAO
  • WANG YINGQIU
  • LI YE
  • ZHAO MENG
  • TIAN RUNZE

Assignees

  • 国网绿色能源有限公司 (State Grid Green Energy Co., Ltd.)
  • 国网天津市电力公司 (State Grid Tianjin Electric Power Company)

Dates

Publication Date
2026-05-05
Application Date
2022-12-22

Claims (8)

  1. The multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention is characterized by comprising a retrieval source construction module, a question representation module, an IR-MH model reasoning module, and an answer generation module. The question representation module is used for receiving an initial question Q = {w_1, w_2, …, w_n}. The retrieval source construction module is used for constructing a subgraph for the initial question Q = {w_1, w_2, …, w_n} by understanding the knowledge base K = {(e_s, r, e_0)} of the field to which the initial question Q belongs, wherein the subgraph comprises a plurality of candidate entities. The IR-MH model reasoning module comprises an instruction component and a reasoning component: the instruction component is used for converting the initial question Q into an instruction vector and sending it to the reasoning component; the reasoning component is provided with an IR-MH model and is used for deducing entity distributions and learning entity representations from the instruction vector using the IR-MH model, extracting answer entities from all candidate entities in the subgraph, and transmitting the answer entities to the answer generation module. The extraction result for the relation r of each reasoning step is calculated as: (3) wherein R is the candidate relation set of the current reasoning step, the question vector is concatenated with the reasoning instruction vector of the previous step, and F is a classification model. For a fact triple given in the knowledge base, its vector representation matched against the instruction vector i_k of the current reasoning step is calculated as: (4) wherein W_R is a training parameter. Then, taking the probability of the relation r of the current reasoning step as the attention weight, the relation aggregate score transmitted from the previous step is obtained: (5) wherein N_e is the set of possible fact triples in the knowledge base and p(r) is the probability of each relation in the candidate relation set. Then, combined with the entity distribution e_{k-1} output by the previous reasoning state, the entity distribution of the current reasoning state is obtained: (6) wherein FFN is a feed-forward network. The final answer entity probability is calculated as: (7) wherein w is a trained parameter. The answer generation module is used for displaying the answer entities.
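The reasoning step described in claim 1 can be sketched as follows. Since the claim's equations (3)-(7) are not reproduced in this text, the exact parameterization below is an assumption: relation scoring is modeled as a linear classifier over the concatenated question and previous-instruction vectors, message passing weights each fact triple by the relation probability p(r) and the head entity's previous probability, and a tanh layer stands in for the FFN. All parameter names (W_cls, W_R, W_ffn, w) are illustrative.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

def reasoning_hop(q, i_prev, rel_emb, triples, p_prev, ent_emb,
                  W_cls, W_R, W_ffn, w):
    """One hop of the reasoning component (hypothetical shapes).
    q, i_prev: question / previous-instruction vectors, shape (d,)
    rel_emb:   dict mapping relation name -> (d,) vector
    triples:   list of (head_idx, relation, tail_idx)
    p_prev:    entity distribution from the previous hop, shape (n,)
    ent_emb:   entity representations, shape (n, d)"""
    # Eq. (3) sketch: score candidate relations with a classifier F over
    # the concatenated question and previous-instruction vectors.
    ctx = W_cls @ np.concatenate([q, i_prev])                  # (d,)
    rels = list(rel_emb)
    p_r = dict(zip(rels, softmax(np.array([rel_emb[r] @ ctx for r in rels]))))
    # Eq. (4)/(5) sketch: propagate messages along fact triples, weighting
    # each by p(r) (attention) and the head entity's previous probability.
    n, d = ent_emb.shape
    agg = np.zeros((n, d))
    for h, r, t in triples:
        agg[t] += p_prev[h] * p_r[r] * (W_R @ rel_emb[r])
    # Eq. (6) sketch: an FFN combines old entity vectors with the aggregate.
    ent_new = np.tanh(np.concatenate([ent_emb, agg], axis=1) @ W_ffn)
    # Eq. (7) sketch: final answer-entity probabilities via a trained vector w.
    p_k = softmax(ent_new @ w)
    return p_k, ent_new
```

The returned distribution p_k and representations ent_new then serve as the reasoning state s_k for the next hop.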
  2. The multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention as set forth in claim 1, wherein the retrieval source construction module specifically includes: a Personalized PageRank algorithm is run from the topic entity {e_0 | e_0 ∈ Q} to retrieve the subgraph, and each question is answered over the retrieved subgraph.
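The subgraph retrieval of claim 2 can be illustrated with a minimal Personalized PageRank iteration. The function names, the restart probability alpha, the iteration count, and the top-k cutoff are all assumptions for illustration, not values given in the patent.

```python
import numpy as np

def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Run Personalized PageRank restarting on the topic entities (seeds)
    over a symmetric 0/1 adjacency matrix adj of shape (n, n)."""
    n = adj.shape[0]
    deg = adj.sum(axis=0)
    P = adj / np.where(deg == 0, 1.0, deg)     # column-stochastic transitions
    s = np.zeros(n)
    s[list(seeds)] = 1.0 / len(seeds)          # restart mass on topic entities
    p = s.copy()
    for _ in range(iters):
        p = alpha * (P @ p) + (1 - alpha) * s  # power iteration with restart
    return p

def retrieve_subgraph(adj, seeds, top_k):
    """Keep the top_k highest-scoring entities as the question's subgraph."""
    return set(np.argsort(-personalized_pagerank(adj, seeds))[:top_k].tolist())
```

Entities close to the topic entity accumulate probability mass, so truncating to the top scores yields a compact candidate subgraph for the reasoning module.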
  3. The multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention as claimed in claim 1, wherein the instruction component employs the pre-trained model BERT to obtain a word vector representation of the initial question Q, which is then mapped to a task-specific word vector space via a bidirectional LSTM network to obtain the final question vector representation H, wherein l denotes the length of the initial question Q.
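The encoding pipeline of claim 3 (BERT word vectors mapped into a task-specific space by a bidirectional recurrent pass) can be sketched as below. Two simplifications are assumed: the input matrix X stands in for BERT's contextual output, and a plain tanh RNN cell stands in for the LSTM cells; parameter names are illustrative.

```python
import numpy as np

def bi_rnn_encode(X, W_f, W_b):
    """Map contextual word vectors X of shape (l, d_in) to task-specific
    representations H of shape (l, 2*d_h) via forward and backward passes.
    W_f, W_b: recurrent weights of shape (d_h, d_in + d_h)."""
    l, d_in = X.shape
    d_h = W_f.shape[0]
    h_f = np.zeros((l, d_h)); h_b = np.zeros((l, d_h))
    h = np.zeros(d_h)
    for t in range(l):                 # forward (left-to-right) pass
        h = np.tanh(W_f @ np.concatenate([X[t], h]))
        h_f[t] = h
    h = np.zeros(d_h)
    for t in reversed(range(l)):       # backward (right-to-left) pass
        h = np.tanh(W_b @ np.concatenate([X[t], h]))
        h_b[t] = h
    # Each word's final vector concatenates both directions, as in a BiLSTM.
    return np.concatenate([h_f, h_b], axis=1)
```

The resulting matrix H plays the role of the question representation that the instruction component attends over at every reasoning step.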
  4. The multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention of claim 1, wherein the instruction vector of each reasoning step is denoted i_k, and the specific calculation process is as follows: (1) (2) wherein H is the word vector representation of the initial question Q and s_{k-1} is the reasoning state vector of the previous reasoning step; the attention weights constitute the intermediate reasoning attention, and repeating the above process yields the list of instruction vectors for the n reasoning steps.
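Since equations (1) and (2) are not reproduced in this text, the sketch below shows one assumed form of the instruction update: the previous instruction i_prev and previous reasoning state s_prev drive an attention distribution over the question word vectors H, so that parts of the question left unresolved by earlier hops can receive more weight. The parameter names W_q and W_a are hypothetical.

```python
import numpy as np

def instruction_step(H, i_prev, s_prev, W_q, W_a):
    """Compute the instruction vector i_k for the current reasoning step.
    H: question word vectors, shape (l, d)
    i_prev, s_prev: previous instruction / reasoning-state vectors, shape (d,)
    W_q: (d, 2d), W_a: (d, d) -- assumed trainable parameters."""
    # Guide vector from the previous instruction and reasoning state.
    guide = np.tanh(W_q @ np.concatenate([i_prev, s_prev]))   # (d,)
    # Attention scores over the question words, then a softmax.
    scores = H @ (W_a @ guide)                                # (l,)
    z = np.exp(scores - scores.max())
    alpha = z / z.sum()                                       # attention weights
    # i_k is the attention-weighted sum of question word vectors.
    return alpha @ H, alpha
```

Repeating this step n times, feeding each hop's reasoning state back in, yields the list of n instruction vectors the claim describes.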
  5. The multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention as set forth in claim 4, wherein the specific working steps of the reasoning component are: after the IR-MT model obtains the instruction vector i_k through the instruction component, the instruction vector i_k of the current step and the reasoning state s_{k-1} of the previous reasoning step serve as guiding signals for the reasoning component; the input of the reasoning component comprises the instruction vector i_k of the current step and the reasoning state s_{k-1}, which comprises the entity distribution p_{k-1} and the entity vector representations {e_{k-1}}; the output of the reasoning component comprises the entity distribution p_k and the entity vector representations {e_k} of the current reasoning step, where e_0 is the topic entity.
  6. A multi-hop knowledge base question-answering IR-MT method based on intermediate reasoning attention, characterized in that it is applied to the multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention as claimed in any one of claims 1-5, the method comprising: first, receiving an initial question Q = {w_1, w_2, …, w_n}; constructing, by the retrieval source construction module, a subgraph for the initial question Q = {w_1, w_2, …, w_n} by understanding the knowledge base K = {(e_s, r, e_0)} of the field to which the initial question Q belongs, wherein the subgraph comprises a plurality of candidate entities; and extracting, by the IR-MH model reasoning module, answer entities from all candidate entities in the subgraph through the IR-MH model, and transmitting the extracted answer entities to the answer generation module for display.
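The end-to-end method of claim 6 amounts to chaining the four modules. The driver below makes that flow explicit with the modules passed in as callables; all interfaces (argument lists, the state dictionary, the fixed hop count) are assumptions for illustration, not the patent's actual signatures.

```python
def answer_question(question, n_hops, retrieve, encode, instruct, reason):
    """Orchestrate one question through the claimed pipeline (hypothetical
    interfaces):
    retrieve(question)            -> candidate entities (subgraph)
    encode(question)              -> (question vector q, word vectors H)
    instruct(H, i_prev, state)    -> next instruction vector i_k
    reason(candidates, i_k, state)-> new state, containing distribution "p"."""
    candidates = retrieve(question)      # retrieval source construction module
    q, H = encode(question)              # question representation module
    i_prev, state = None, None
    for _ in range(n_hops):              # IR-MH model reasoning module
        i_prev = instruct(H, i_prev, state)
        state = reason(candidates, i_prev, state)
    p = state["p"]                       # final entity distribution
    return max(p, key=p.get)             # answer generation: top-scoring entity
```

With stub modules supplied for the four callables, the driver returns the highest-probability candidate entity as the displayed answer.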
  7. An electronic device comprising a processor, a memory, and a computer program stored in the memory, wherein the processor, when executing the computer program, performs the multi-hop knowledge base question-answering IR-MT method based on intermediate reasoning attention as claimed in claim 6.
  8. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the multi-hop knowledge base question-answering IR-MT method based on intermediate reasoning attention as claimed in claim 6.

Description

Multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention

Technical Field

The invention relates to the technical field of knowledge base question answering, in particular to a multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention.

Background

For the multi-hop knowledge base question-answering task, early research mostly adopted a pipeline framework. Although relation-chain-based methods can excellently complete the single-hop knowledge base question-answering task, they run into difficulty when the multi-hop task is posed over a large-scale knowledge graph. First, the number of candidate paths (relations) grows exponentially with path length (relation hops), rendering relation computation over large-scale knowledge graphs impractical. More importantly, such a model cannot adaptively determine the path length needed to process the original question, i.e., decide when to terminate reasoning. To address the defects of the pipeline framework, end-to-end multi-hop knowledge base question-answering frameworks have recently received extensive attention from researchers. According to differences in technical paradigm, they can be classified into traditional neural network approaches based on key-value pair modeling and the emerging graph neural network approaches. Although the end-to-end framework alleviates the shortcomings of the pipeline framework, model performance remains unstable due to the lack of intermediate supervisory signals. The absence of supervisory signals for the intermediate steps of multi-hop reasoning means the model receives feedback only on the final answer, the reasoning instructions of intermediate steps cannot be effectively optimized, and forward propagation of the reasoning state is weakened.
To solve these problems, the invention provides a multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention, built on a retrieval-based multi-hop knowledge base question-answering framework.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems existing in the prior art. Therefore, the invention provides a multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention, wherein the IR-MT model can fully utilize the reasoning states of the intermediate steps, dynamically update the reasoning instruction of each step, promote tight interaction between the intermediate reasoning states and the reasoning instructions, and provide effective attention feedback for optimizing the intermediate-step reasoning instructions. To achieve the above object, an embodiment according to a first aspect of the present invention proposes a multi-hop knowledge base question-answering IR-MT model based on intermediate reasoning attention, comprising a retrieval source construction module, a question representation module, an IR-MH model reasoning module, and an answer generation module. The question representation module is used for receiving a question Q = {w_1, w_2, …, w_n}. The retrieval source construction module is used for constructing a subgraph for the initial question Q = {w_1, w_2, …, w_n} by understanding the knowledge base K = {(e_s, r, e_0)} of the field to which the initial question Q belongs, wherein the subgraph comprises a plurality of candidate entities. The IR-MH model reasoning module is used for extracting answer entities from all candidate entities in the subgraph through the IR-MH model and transmitting the extracted answer entities to the answer generation module for display; it consists of an instruction component and a reasoning component, wherein the instruction component is used for sending instruction vectors to the reasoning component, and the reasoning component is used for deducing entity distributions and learning entity representations. Further, the retrieval source construction module specifically includes: a Personalized PageRank algorithm is run from the topic entity {e_0 | e_0 ∈ Q} to retrieve the subgraph, and each question is answered over the retrieved subgraph. Further, the specific working steps of the instruction component are as follows: the instruction component first converts a given initial question Q into a series of instruction vectors that control the reasoning process; the input of the instruction component consists of the word vectors of the initial question Q and the instruction vector of the previous reasoning step, and the initial instruction vector is set to a zero vector. Further, the instruction component employs the pre-trained model BERT to obtain a word vector representation of the initial question Q, which is then mapped to a task-specific word vector space via a bidirectional LSTM network to obtain a final question vector representation H, wherein l denotes the l