CN-116226352-B - Answer selection method considering question-answer space-time dependency relationship

CN116226352B

Abstract

The invention discloses an answer selection method considering the question-answer space-time dependency relationship, comprising: step 1, constructing question-answer data and performing data preprocessing; step 2, obtaining word-level similarity matrices of question-answer pairs with a BERT model; step 3, splicing the similarity matrices of the multiple answers under the same question thread to obtain a question-answer space-time tensor; and step 4, predicting the question-answer matching degree with a ConvLSTM model. The invention uses BERT to obtain question-answer word-level similarity matrices carrying implicit semantic relevance, constructs the question-answer space-time tensor from these matrices, and learns the space-time dependency information in the question-answer data with the ConvLSTM model, finally realizing answer selection for the question so that the best answer with the highest matching degree to the question can be accurately recommended.
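Read as an algorithm, the abstract's four steps can be sketched end to end. The sketch below is a hypothetical illustration only: random vectors stand in for BERT token embeddings, a column-wise softmax of scaled dot products stands in for the word-level similarity matrix, and a simple peak-alignment score stands in for the trained ConvLSTM; none of the names or numbers come from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def word_similarity_matrix(q_emb, a_emb):
    """Step 2 stand-in: scaled dot-product similarity between answer and
    question token embeddings, softmax applied to each column."""
    d = q_emb.shape[1]
    scores = a_emb @ q_emb.T / np.sqrt(d)      # (answer_len, question_len)
    e = np.exp(scores - scores.max(axis=0))    # column-wise softmax
    return e / e.sum(axis=0)

# Step 1 stand-in: one question with T = 3 answers, toy embeddings (dim 8).
q = rng.normal(size=(5, 8))                    # 5 question tokens
answers = [rng.normal(size=(n, 8)) for n in (4, 6, 3)]

# Step 3 stand-in: pad each matrix to a common answer length, stack over time.
mats = [word_similarity_matrix(q, a) for a in answers]
max_len = max(m.shape[0] for m in mats)
tensor = np.stack([np.pad(m, ((0, max_len - m.shape[0]), (0, 0)))
                   for m in mats])             # shape (3, 6, 5)

# Step 4 stand-in: score each answer by its peak alignment per question token.
scores = tensor.max(axis=1).mean(axis=1)
best = int(np.argmax(scores))
print(tensor.shape, best)
```

The padding here mirrors the "packing treatment to obtain tensors of uniform size" in the claims; a real implementation would feed the stacked tensor to a ConvLSTM rather than a hand-written score.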

Inventors

  • JIANG YUANCHUN
  • CUI FULAI
  • SONG LIRONG
  • LI LIANG
  • GAO JING
  • NIE ZHONGYI
  • ZHENG SHU

Assignees

  • Hefei University of Technology (合肥工业大学)

Dates

Publication Date
2026-05-08
Application Date
2023-03-28

Claims (3)

  1. An answer selection method considering the question-answer space-time dependency relationship, characterized by comprising the following steps:

     Step 1, constructing question-answer data and preprocessing the data: collect question-answer data on a community forum, comprising M questions Q = {Q_i, i = 1, 2, …, M} and the answers A = {A_i, i = 1, 2, …, M} under all questions, while recording the timestamp of each question posting and each answer response, where Q_i denotes the i-th question and A_i = {A_it, t = 1, 2, …, T} denotes the T answers of the i-th question Q_i, A_it being the t-th answer; the question Q_i and its answers A_i form the i-th question-answer pair set QA_i = {(Q_i, A_it), t = 1, 2, …, T}, in which (Q_i, A_it) is the t-th question-answer pair of QA_i;

     Step 2, obtaining word-level similarity matrices of the question-answer pairs with the BERT model:

     Step 2.1, segmenting the question and each answer of the i-th question-answer pair set QA_i into words: using the BERT model's marking tool, obtain for the t-th question-answer pair (Q_i, A_it) the word segmentation set of the question Q_i, in which k_i is the total number of word segments of Q_i, and the word segmentation set of the t-th answer A_it, in which p_it is the total number of word segments of A_it; perform a union operation on the word segmentation sets of the T answers of the i-th question Q_i to obtain the word segmentation set of the answer set A_i, in which P_i is the total number of word segments of A_i;

     Step 2.2, constructing a BERT model consisting of 12 self-attention layers, each self-attention layer having H heads; input the t-th question-answer pair (Q_i, A_it) into the BERT model for processing; the 11th self-attention layer outputs the self-attention weight matrix of the t-th answer to the question Q_i, which is multiplied by a weight matrix to obtain the embedding matrix; the 12th self-attention layer multiplies the embedding matrix by two weight matrices to be trained, generating the initial query embedding matrix and the initial key embedding matrix, and then multiplies these, respectively, by the query weight matrices and key embedding weight matrices to be trained of the H heads, obtaining the query embedding matrix Q_s and the key embedding matrix K_s of each head, where Q_s and K_s denote the query and key embedding matrices of the s-th head; the 12th self-attention layer then obtains the multi-head word-level similarity matrix of the t-th question-answer pair by formula (1):

     S_s = softmax(Q_s K_s^T / sqrt(d_k))    (1)

     In formula (1), softmax denotes the normalized exponential function, operating on each column of the matrix; S_s is the word-level similarity matrix of the s-th head; d_k denotes the dimension of the weight matrix of each head, d_k = d/H, where d denotes the dimension of each element of the embedding matrix; from S_s, the sub-matrix whose columns correspond to the word segmentation set of the question Q_i and whose rows correspond to the word segmentation set of the t-th answer A_it is taken out, obtaining the local similarity matrix of the s-th head;

     Step 3, splicing the similarity matrices of the multiple answers under the same question to obtain the question-answer space-time tensor:

     Step 3.1, creating a word segmentation dictionary for the answer set of each question: take each word segment of the word segmentation set of the T answers A_i as a key of the dictionary and initialize the value of each key to 0, constructing the word segmentation dictionary of the answer set A_i of the i-th question Q_i, each element being the key-value pair of the g-th word segment and its initial value 0;

     Step 3.2, expanding the similarity matrix: according to the word segmentation dictionary, construct the initial similarity matrix dictionary corresponding to the local similarity matrix; the column index of the local similarity matrix is the word segmentation set of the question Q_i and the row index is the word segmentation set of the t-th answer A_it; compare each word segment of A_it with the key names of the dictionary: if a word segment is identical to a key, take the corresponding row of the local similarity matrix as the value of that key in the similarity matrix dictionary; otherwise fill the value with a zero vector of length k_i; if a word segment occurs repeatedly, accumulate the corresponding rows of the local similarity matrix and take their average; this yields the similarity matrix dictionary set of the H attention heads of the t-th question-answer pair (Q_i, A_it), and thereby the dictionary set of the T similarity matrices of the i-th question-answer pair set QA_i;

     Step 3.3, converting the data format to obtain the question-answer space-time tensor: convert each similarity matrix dictionary into a 2-dimensional matrix; stack the similarity matrix dictionaries of the H attention heads of the t-th question-answer pair along the head dimension to obtain a 3-dimensional tensor; stack the T similarity matrix dictionaries of the i-th question-answer pair set QA_i along the time dimension to obtain a 4-dimensional tensor; after packing to a uniform size, obtain the 4-dimensional space-time tensor of the i-th question Q_i and its T answers on the H heads;

     Step 4, predicting the question-answer matching degree with the ConvLSTM model and selecting answers:

     Step 4.1, input the space-time tensor into the ConvLSTM model for prediction to obtain the prediction score set of the T answers of the i-th question Q_i, in which the t-th score is the prediction score of the question-answer pair (Q_i, A_it) being a valid answer;

     Step 4.2, train the ConvLSTM model by gradient descent, computing the cross-entropy loss function and updating the parameters by the back-propagation algorithm; stop training when the cross-entropy loss converges or the number of training iterations reaches a preset threshold, thereby obtaining the optimal answer selection model, and output the best answer of each question in the community question-answering system.
  2. An electronic device comprising a memory and a processor, characterized in that the memory is configured to store a program supporting the processor in performing the answer selection method of claim 1, and the processor is configured to execute the program stored in the memory.
  3. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, performs the steps of the answer selection method of claim 1.
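The least obvious part of claim 1 is the expansion in step 3.2: the local similarity matrices of different answers have different row sets, and aligning their rows on the shared word-segmentation dictionary of the answer set is what makes them stackable into one tensor in step 3.3. A rough numpy sketch of that alignment, assuming word segments are plain strings, with zero rows for absent segments and averaging for repeated segments as the claim states (all function and variable names are illustrative, not from the patent):

```python
import numpy as np

def expand_local_matrix(local, answer_tokens, vocab, q_len):
    """Step 3.2 sketch: align a local similarity matrix (rows indexed by
    this answer's word segments) onto the shared answer-set dictionary.
    Absent segments get zero rows; repeated segments are averaged."""
    out = np.zeros((len(vocab), q_len))
    counts = np.zeros(len(vocab))
    index = {tok: j for j, tok in enumerate(vocab)}
    for row, tok in zip(local, answer_tokens):
        out[index[tok]] += row          # accumulate rows of repeated segments
        counts[index[tok]] += 1
    nz = counts > 0
    out[nz] /= counts[nz][:, None]      # take the average
    return out

q_tokens = ["how", "install", "python"]
a1 = ["use", "pip", "pip"]              # "pip" repeats -> its rows are averaged
a2 = ["download", "installer"]
vocab = sorted(set(a1) | set(a2))       # shared answer-set word dictionary

rng = np.random.default_rng(1)
locals_ = [rng.random((len(a), len(q_tokens))) for a in (a1, a2)]
expanded = [expand_local_matrix(m, a, vocab, len(q_tokens))
            for m, a in zip(locals_, (a1, a2))]
stacked = np.stack(expanded)            # uniform (T, |vocab|, q_len) tensor
print(stacked.shape)                    # (2, 4, 3)
```

Once every answer is expanded onto the same vocabulary, stacking over answers (time) and over attention heads yields the 4-dimensional space-time tensor that is fed to the ConvLSTM.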

Description

Answer selection method considering question-answer space-time dependency relationship

Technical Field

The invention relates to the technical field of community question-answering systems, in particular to an answer selection method considering the question-answer space-time dependency relationship.

Background

With the rapid development of deep learning, question-answer matching (Question Answering, QA) has become an important problem in the field of natural language processing (Natural Language Processing, NLP). Deep-learning question-answer matching models fall mainly into two categories: models based on neural networks and models based on pre-trained language models. Question-answer matching models based on neural networks (e.g., CNN, RNN) typically represent questions and answers as vectors and classify them with a neural network; CNNs have strong local-feature learning ability, while RNNs learn long-term dependencies well. Question-answer matching models based on pre-trained language models (e.g., BERT, GPT) typically represent questions and answers with the pre-trained model and then perform matching with a task-specific fine-tuned model, which greatly improves the representation of natural language. However, in a real community question-answer scenario there may be complex temporal interactions between a question and its candidate answers, such as heated discussion among the answers to a question; jointly considering the deep semantic matching between question and answers and the temporal interactions among the candidate answers therefore plays an important role in aggregating knowledge to support answer selection.
Using either type of model alone cannot simultaneously and accurately describe the hidden features between a question and its candidate answers and the trend changes among the answers. As a result, the answer recommendation accuracy of current community question-answering systems is low; their high requirements on data volume and text quality make them impractical in specific domains, increase the cost of system development, and give users a poor experience. How to capture the latent semantic features within time-series question-answer data, and thereby complete answer selection, is therefore a difficulty.

Disclosure of Invention

The invention provides an answer selection method considering the question-answer space-time dependency relationship, so as to mine the space-time dependency relationship between questions and answers and thereby accurately recommend the best answer with the highest matching degree to the question. To achieve this aim, the invention adopts the following technical scheme:

The answer selection method considering the question-answer space-time dependency relationship of the invention is characterized by comprising the following steps:

Step 1, constructing question-answer data and preprocessing the data: collect question-answer data on a community forum, comprising M questions Q = {Q_i, i = 1, 2, …, M} and the answers A = {A_i, i = 1, 2, …, M} under all questions, while recording the timestamp of each question posting and each answer response, where Q_i denotes the i-th question, A_i denotes the T answers of the i-th question Q_i, A_i = {A_it, t = 1, 2, …, T}, and A_it denotes the t-th answer of the i-th question Q_i; the question Q_i and its answers A_i form the i-th question-answer pair set QA_i = {(Q_i, A_it), t = 1, 2, …, T}, where (Q_i, A_it) denotes the t-th question-answer pair in the i-th question-answer pair set QA_i.

Step 2, obtaining word-level similarity matrices of the question-answer pairs with the BERT model:

Step 2.1, obtaining the question word segmentation and each answer's word segmentation of the i-th question-answer pair set QA_i: using the BERT model's self-contained marking tool, obtain the word segmentation set of the i-th question Q_i in the t-th question-answer pair (Q_i, A_it), in which k_i is the total number of word segments of the i-th question Q_i, and the word segmentation set of the t-th answer A_it, in which p_it is the total number of word segments of A_it; perform a union operation on the word segmentation sets {A_it, t = 1, 2, …, T} of the T answers of the i-th question Q_i to obtain the word segmentation set of the T answers A_i of the i-th question Q_i, in which P_i denotes the total number of word segments of the answer set A_i.

Step 2.2, constructing a BERT model consisting of 12 self-attention layers, wherein the number of heads in each self-attention layer