CN-115408603-B - Online question-answering community expert recommendation method based on multi-head self-attention mechanism
Abstract
The invention discloses an online question-answering community expert recommendation method based on a multi-head self-attention mechanism, and relates to the technical field of intelligent recommendation. The method constructs a question encoder and a user encoder, each composed of a convolutional neural network and an attention mechanism, which process the target question and the questions the user has answered historically and extract question features. The user encoder uses a multi-head self-attention mechanism to learn the dynamic interest features hidden in the user's historical answer sequence, then combines them with the user's static interest features to obtain the user's comprehensive features. Finally, the similarity between the target question features and the user's comprehensive features is calculated to generate recommendation results, providing the platform with accurate, personalized and real-time recommendation services and improving the question answer rate.
Inventors
- LIN GENG
- CHEN YINGTING
Assignees
- 闽江学院
Dates
- Publication Date
- 20260421
- Application Date
- 20220727
- Priority Date
- 20220727
Claims (4)
- 1. An online question-answering community expert recommendation method based on a multi-head self-attention mechanism, characterized by comprising a question encoder construction process, a user encoder construction process, a predictor construction process, a deep learning model training process and a prediction process;
  The question encoder construction process comprises feature-encoding the question, extracting the information in the title and topic of the question as question labels, and matching the question labels with the interest labels of expert users;
  The user encoder construction process comprises feature-encoding the user's historical answered-question sequence and the topics the user follows, extracting the user's dynamic interest feature information from the historical answered-question sequence using a multi-head self-attention mechanism, extracting the user's static interest feature information from the followed topics, and concatenating the user's dynamic interest representation vector and static interest representation vector to obtain the user's comprehensive representation vector;
  The user encoder construction process specifically comprises the following steps:
  Step B1: mining the user's historical answered-question sequence information: first arranging the historical answered questions in chronological order, then processing each question in the sequence with the question encoder described above to obtain its question vector representation, and finally obtaining the sequence $\{e_1, e_2, \dots, e_L\}$, where L is the sequence length;
  Step B2: for the multi-head self-attention mechanism, adding a position vector to each question representation vector in the sequence to supply timing information, obtaining a position-encoded sequence; the position vector is computed from pos, the position of the question in the sequence, and from the position vector dimension;
  Step B3: inputting the position-encoded sequence into the multi-head self-attention network structure to capture the user's dynamic interest changes and obtain a new sequence $\{z_1, z_2, \dots, z_L\}$, where each output element $z_i$ is the user dynamic interest representation (vector $u_d$) learned from the input element $e_i$ through the multi-head self-attention mechanism;
  Step B4: learning the user's long-term interest representation vector from the topics the user follows: extracting the followed topics from the user information, J denoting the number of followed topics; first obtaining their word-embedding representation and then applying global average pooling to obtain the user's static interest representation vector $u_s$, the calculation using the word-embedding representation of the topics and a parameter matrix for the global pooling;
  Step B5: concatenating the user's short-term dynamic interest representation vector and long-term static interest representation vector to obtain the user's comprehensive representation vector, namely the concatenation $[u_d; u_s]$;
  The predictor construction process comprises, for a given question and an invited user, judging whether the user will accept the invitation to answer the question by calculating the similarity between the representation vector of the target question and the comprehensive representation vector of the invited user;
  The deep learning model training process comprises collecting training data from the question-answering community, constructing training samples by combining user-question invitation records with the user profile description information and the user answer records, labeling each sample 0 or 1, and thereby converting the expert identification problem into a classification problem;
  The prediction process comprises, for a given question and an invited user, using the trained model to calculate the similarity between the representation vector of the target question and the representation vector of the invited user and judging whether the user will accept the invitation to answer the question, thereby generating the expert recommendation results.
- 2. The method of claim 1, wherein the question encoder construction process comprises:
  Step A1: segmenting the title of the question into words and obtaining their word-embedding representation through an embedding layer, converting the title into a word vector representation in the latent semantic space;
  Step A2: using a CNN to capture local semantic information and learn contextual word representations; the contextual representation of each title word is obtained by applying a nonlinear activation function to a convolution over the concatenation of the word embeddings located between two positions around that word, wherein C and b are the filter parameters of the convolutional neural network, and M is 1;
  Step A3: assigning weights to the title words through an attention mechanism; the attention weight of the i-th word is $a_i$, computed from an intermediate variable $A_i$ generated by the attention mechanism, wherein v and $v_b$ are trainable parameters; the representation vector of the question title is obtained by weighting the contextual word representations produced by the CNN with the attention weights;
  Step A4: segmenting the topic to which the question belongs into words and obtaining their word-embedding representation, converting the topic into a word vector representation in the latent semantic space, wherein L denotes the length (number of words) of the question topic t;
  Step A5: applying global average pooling to the word vectors of the topic to which the question belongs to obtain an average word vector $e_t$ representing the topic information of the question, the calculation using a parameter matrix for the global average pooling;
  Step A6: for each question, executing the above steps to generate the vector representations of the question's title and topic, and generating the final question vector representation e by concatenating the two vectors.
- 3. The method according to claim 1, wherein in step B3 the multi-head self-attention network comprises a multi-head self-attention network layer, a first residual connection and layer normalization layer, a feedforward neural network layer, a second residual connection and layer normalization layer, and a global average pooling layer; the multi-head self-attention network layer comprises h self-attention networks computed in parallel, and the resulting sequence matrix is global-average-pooled to obtain the user's dynamic interest representation vector $u_d$.
- 4. The method of claim 1, wherein the predictor, for a given question and invited user, calculates the similarity between the representation vector of the target question and the comprehensive representation vector of the invited user and judges whether the user will accept the invitation to answer the question; the result of the similarity function is a probability score with a value between 0 and 1.
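The user encoder of claim 1 (steps B1 to B5), the network layout of claim 3 and the predictor of claim 4 can be illustrated with a minimal NumPy sketch. The formulas themselves are not legible in this text, so several choices below are assumptions rather than the patented formulas: the sinusoidal position vectors and scaled dot-product attention follow the standard Transformer form, the feedforward layer uses ReLU, the static vector is a plain mean of topic embeddings (the pooling parameter matrix is omitted), and the predictor applies a sigmoid to a dot product after a hypothetical projection `W` that aligns dimensions. All names, sizes and random weights are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(length, dim):
    """Sinusoidal position vectors (assumed form), shape (length, dim); dim must be even."""
    pos = np.arange(length)[:, None]
    even = np.arange(0, dim, 2)[None, :]          # even dimension indices 0, 2, 4, ...
    angle = pos / np.power(10000.0, even / dim)
    pe = np.zeros((length, dim))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)         # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-6):
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def init_params(d, d_ff):
    shapes = {"Wq": (d, d), "Wk": (d, d), "Wv": (d, d), "Wo": (d, d),
              "W1": (d, d_ff), "W2": (d_ff, d)}
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}

def multi_head_self_attention(x, params, h):
    """h scaled dot-product attention heads computed in parallel over the sequence x."""
    L, d = x.shape
    d_k = d // h
    def split(w):                                  # (L, d) -> (h, L, d_k)
        return (x @ w).reshape(L, h, d_k).transpose(1, 0, 2)
    q, k, v = split(params["Wq"]), split(params["Wk"]), split(params["Wv"])
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)      # (h, L, L)
    heads = softmax(scores) @ v                            # (h, L, d_k)
    return heads.transpose(1, 0, 2).reshape(L, d) @ params["Wo"]

def user_encoder(history, topic_emb, params, h):
    L, d = history.shape
    x = history + positional_encoding(L, d)                        # step B2
    a = layer_norm(x + multi_head_self_attention(x, params, h))    # attention, residual, layer norm
    ffn = np.maximum(0.0, a @ params["W1"]) @ params["W2"]         # feedforward layer (ReLU assumed)
    z = layer_norm(a + ffn)                                        # step B3 output sequence z_1..z_L
    u_d = z.mean(axis=0)                                           # global average pooling (claim 3)
    u_s = topic_emb.mean(axis=0)                                   # step B4, parameter matrix omitted
    return np.concatenate([u_d, u_s])                              # step B5: [u_d; u_s]

def predict(e_question, u, W):
    """Sigmoid of a dot product; W is a hypothetical projection aligning the two dimensions."""
    return 1.0 / (1.0 + np.exp(-(e_question @ (W @ u))))

# Toy data standing in for question-encoder outputs and topic embeddings.
d, L, J, h = 8, 5, 3, 4
history = rng.normal(size=(L, d))     # e_1..e_L in chronological order (step B1)
topics = rng.normal(size=(J, d))      # embeddings of the J followed topics
params = init_params(d, 4 * d)
u = user_encoder(history, topics, params, h)
W = rng.normal(scale=0.1, size=(d, u.shape[0]))
p = predict(rng.normal(size=d), u, W)
print(u.shape, float(p))
```

In training, a score such as `p` would be compared with the 0/1 invitation label of each sample, which is how the claims turn expert identification into a classification problem; the loss function is not stated in this text.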
Description
Online question-answering community expert recommendation method based on multi-head self-attention mechanism
Technical Field
The invention relates to the technical field of intelligent recommendation, in particular to an online question-answering community expert recommendation method based on a multi-head self-attention mechanism.
Background
With the popularization of the Internet, online question-answering communities have become an important knowledge-sharing platform. However, with the explosive growth of platform data, effectively recommending the massive volume of questions to expert users so that they get answered is a serious challenge for these platforms.
General expert recommendation methods mainly comprise link analysis methods and text analysis methods. Link analysis methods find experts through the question-and-answer relationships among community users; representative methods are the PageRank algorithm and the Hyperlink-Induced Topic Search (HITS) algorithm. Text analysis methods model the users' answer records, mine the users' interests, and calculate the degree of matching between interest labels and question labels; representative methods are the probabilistic latent semantic analysis (PLSA) model and the latent Dirichlet allocation (LDA) topic model. In addition, some works convert the expert recommendation problem into a classification problem and solve it with machine learning methods such as decision trees and support vector machines, which allows a variety of features to be applied to expert recommendation.
These methods have the following disadvantages: (1) they rely on the quality of manually constructed complex features, which hampers the scalability of the recommendation; (2) they have difficulty learning abstract high-order feature interaction information; (3) they cannot integrate heterogeneous multi-source information such as images and text, so the data is insufficiently mined.
In recent years, deep learning technology has developed continuously. Its advantage is that it lets a machine learn to handle complex problems in a human-like way, extracting high-order feature interaction information through deep network structures without manually constructed complex features. Among the expert recommendation applications of deep learning, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are the most popular. A CNN is good at learning the contextual feature information of input text: a CNN is used to turn the word embeddings of the user's interests and of the given question into feature representations, and the result is passed through a softmax layer to predict whether the user is an expert. An RNN is good at processing time-series features and learning the dependencies within a sequence: the user's historical answer information is abstracted into sequence information and fed into the RNN to capture the user's dynamic interest changes.
These methods have the following defects: (1) they ignore the short-term interest drift of users in real scenarios and do not fully consider the users' personalized requirements, which affects recommendation quality; (2) because of their sequential structure, RNNs must be computed step by step, cannot be computed in parallel, and easily lose information from the front of the sequence, which affects recommendation accuracy. Therefore, existing expert recommendation methods often ignore the dynamic interest changes of users and cannot extract dynamic interest features well, which reduces recommendation accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an online question-answering community expert recommendation method based on a multi-head self-attention mechanism, in which the user's dynamic interest representation is extracted through the multi-head self-attention mechanism and the user's short-term interest changes are captured dynamically from the user's historical answered-question sequence, so as to provide accurate, personalized and real-time expert recommendation services for the online question-answering community and improve the question answer rate.
To solve the above technical problem, the invention is realized as follows:
An online question-answering community expert recommendation method based on a multi-head self-attention mechanism comprises a question encoder construction process, a user encoder construction process, a predictor construction process, a deep learning model training process and a prediction process;
The question encoder construction process comprises feature-encoding the question, extracting the information in the title and topic of the question as question labels, and matching the question labels with the interest labels of expert users;
The user encoder construction process comprises feature-encoding the historical answered-question sequence of a user