CN-115329064-B - Method and device for generating abstract and storage medium
Abstract
The invention discloses a method, a device and a storage medium for generating summaries, belonging to the technical field of natural language processing. The summary generation method comprises the following steps: S1, a decoder step by step obtains, from the vocabulary constructed over the training database, the output probabilities of the words related to the corresponding role; S2, the k words with the largest output probabilities are extracted and spliced onto the word sequence already decoded by the model to form candidate summaries; S3, the model sorts the candidate summaries by output probability while keeping the beam size at k, and, after model prediction finishes, takes the candidate sequence with the largest output probability as the summary. The invention also comprises a device for generating summaries and a storage medium on which a computer program is stored that, when executed by a processor, implements the steps of the above summary generation method. The method helps the model generate better role-oriented summaries, and the quality of the generated summary content is significantly improved.
Inventors
- ZHOU YU
- LIN HAITAO
- XIANG LU
- ZONG CHENGQING
Assignees
- 北京中科凡语科技有限公司
Dates
- Publication Date: 20260508
- Application Date: 20220419
Claims (4)
- 1. A method for generating a summary, characterized by comprising the following steps. A summary generation model is first built and optimized: T1, different encoders are adopted to encode the dialogue content according to the different roles; T2, each decoder acquires the sentence representations of its corresponding role, and uses separate attention modules to attend to the sentence representations of the other roles while decoding; T3, the KL divergence between the attention distributions with which different decoders attend to the sentences of the same role is calculated, yielding the cross-attention interaction loss function; T4, in the self-attention module of each decoder, the hidden-layer representation of that decoder attends to the hidden-layer representations of the other decoders, forming the role self-attention interaction; T5, each decoder predicts the output probability of the word at each position, and the summary loss function is obtained from the output probabilities by maximum likelihood estimation; T6, the cross-attention interaction loss function and the summary loss function are combined, and the model is trained and optimized by gradient descent (a minimal sketch of steps T3 and T6 follows the claims). After the summary generation model has been built and optimized, the following steps are executed: S1, the decoder step by step obtains, from the vocabulary constructed over the training database, the output probabilities of the words related to its role; S2, the k words with the largest output probabilities are extracted and spliced onto the word sequence already decoded by the model to form candidate summaries, where k is greater than 1; S3, the model sorts the candidate summaries by output probability while keeping the beam size at k. In step T1, encoding the dialogue content with different encoders according to the different roles comprises: step T11, splicing the speaker-role information of each dialogue sentence together with the dialogue content in turn order, and obtaining the word embedding representation through a word embedding layer; step T12, encoding the word embedding representation with an encoder to obtain a representation of each word in the dialogue. In step T2, each decoder acquiring the sentence representations of its corresponding role, and attending to the sentence representations of the other roles through separate attention modules while decoding, comprises: step T21, representing the dialogue separately by role according to the speaker role of each sentence; step T22, each decoder attending to the sentences of every role to obtain the attention distribution representations and the encoder context representation. In step T4, forming the role self-attention interaction comprises: step T41, each decoder obtaining its hidden-layer state at the current moment from its decoding state at the previous moment and the current input information; step T42, the hidden-layer state of the decoder at the current moment attending to the hidden-layer states of all previous moments of the other decoders, to obtain the context representation of the role summary. In step T5, the decoder predicting the output probability of the word at each position comprises: the decoder predicting the output probability of the word at each position of the summary from the hidden-state information, the encoder context representation and the context representation of the role summary. In step T6, combining the cross-attention interaction loss function and the summary loss function, and training and optimizing the model by gradient descent, comprises: T61, performing a weighted fusion of the cross-attention interaction loss function and the summary loss function; T62, training and optimizing the model with a gradient descent algorithm until the loss no longer decreases on the validation set.
- 2. The summary generation method according to claim 1, wherein in step S1, the roles include a user role and a customer service role.
- 3. A summary generation apparatus, comprising: a decoder which, in the model building and optimizing stage, splices the speaker-role information of each dialogue sentence together with the dialogue content in turn order, obtains the word embedding representation through a word embedding layer, and encodes the word embedding representation with different encoders according to the different roles to obtain a representation of each word in the dialogue; represents the dialogue separately by role according to the speaker role of each sentence, and attends to the sentences of every role to obtain the attention distribution representations and the encoder context representation; calculates the KL divergence between the attention distributions with which different decoders attend to the sentence representations of the same role, yielding the cross-attention interaction loss function; in its own self-attention module, obtains its hidden-layer state at the current moment from the decoding state at the previous moment and the current input information, and makes that hidden-layer state attend to the hidden-layer states of all previous moments of the other decoders, forming the role self-attention interaction and obtaining the context representation of the role summary (see the interaction sketch after the claims); predicts the output probability of each word in the summary from the hidden-layer information, the encoder context representation and the context representation of the role summary, and obtains the summary loss function by maximum likelihood estimation; performs a weighted fusion of the cross-attention interaction loss function and the summary loss function, and trains and optimizes the model with a gradient descent algorithm until the loss no longer decreases on the validation set; and, in the model application stage, step by step obtains, from the vocabulary constructed over the training database, the output probabilities of the words related to its role; a data processing unit which, in the model application stage, extracts the k words with the highest output probabilities from the decoder and splices them onto the word sequence already decoded by the model to form candidate summaries, where k is greater than 1; and a summary generating unit which, in the model application stage, sorts the candidate summaries obtained by the data processing unit by output probability while keeping the beam size at k, and, after model prediction finishes, takes the candidate sequence with the largest output probability as the summary.
- 4. A storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the summary generation method according to claim 1 or claim 2.
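The following is a minimal PyTorch sketch, not the patent's implementation, of the cross-attention interaction loss of step T3 and the weighted loss fusion of steps T61 and T62 in claim 1. It assumes two role decoders (user and customer service) whose attention distributions over the same role's sentences are available as tensors; the function names and the weight alpha are illustrative assumptions.

```python
# Minimal sketch of steps T3 and T6, assuming two roles: user and agent.
import torch
import torch.nn.functional as F

def cross_attention_interaction_loss(attn_user: torch.Tensor,
                                     attn_agent: torch.Tensor) -> torch.Tensor:
    """Step T3: symmetric KL divergence between the attention distributions
    with which the two decoders attend to the same role's sentences.
    Both tensors have shape (batch, tgt_len, src_len); rows sum to 1."""
    log_user = attn_user.clamp_min(1e-9).log()
    log_agent = attn_agent.clamp_min(1e-9).log()
    # F.kl_div(log_q, p) computes KL(p || q)
    return (F.kl_div(log_agent, attn_user, reduction="batchmean")
            + F.kl_div(log_user, attn_agent, reduction="batchmean"))

def total_loss(summary_nll: torch.Tensor,
               interaction_loss: torch.Tensor,
               alpha: float = 0.5) -> torch.Tensor:
    """Step T61: weighted fusion of the maximum-likelihood summary loss
    (step T5) and the interaction loss; alpha is a hypothetical weight."""
    return summary_nll + alpha * interaction_loss
```

The fused loss would then be minimized by gradient descent, stopping when it no longer decreases on the validation set (step T62).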
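Below is a corresponding sketch of the role self-attention interaction of step T4, in which the current hidden state of one role's decoder attends to the previous hidden states of the other decoder to obtain the context representation of the role summary; the module structure and head count are illustrative assumptions rather than the patent's design.

```python
# Minimal sketch of step T4: one decoder's current hidden state attends to
# the other decoder's previous hidden states (all names are assumptions).
import torch
import torch.nn as nn

class RoleSelfAttentionInteraction(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, own_state: torch.Tensor,
                other_states: torch.Tensor) -> torch.Tensor:
        # own_state:    (batch, 1, d_model)  hidden state at the current moment
        # other_states: (batch, t, d_model)  other decoder's states so far
        context, _ = self.attn(own_state, other_states, other_states)
        return context  # context representation of the role summary
```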
Description
Method and device for generating abstract and storage medium

Technical Field
The invention relates to the technical field of natural language processing, in particular to a method, a device and a storage medium for generating a summary.

Background
Text summarization is the task of condensing the content of a long text into a short one. With this technology, people can quickly grasp the key information in a text. In recent years, with the continuous progress of communication media, more and more text information appears in the form of conversations. Dialogue text is characterized by many turns, discontinuous semantics and a spoken style of expression. Compared with reading the original dialogue, reading a short summary of it greatly improves reading efficiency, so dialogue summarization has gradually attracted attention. One feature of a conversation, compared with general text, is that it is made up of the utterances of several speakers, each of whom plays a particular role and holds a particular point of view in the conversation. Thus, in addition to producing a general overall summary of the discussion in a conversation, one can produce summary content related to each conversational role, i.e., a role-oriented dialogue summary. In the customer-service domain, role summaries have great practical value. A user-oriented summary mainly contains the questions raised and the difficulties encountered by the user, and reflects which questions users frequently raise, which facilitates statistical analysis by the platform. A customer-service-oriented summary mainly records the agent's problem-solving process, and can help the platform automatically evaluate the agent's service quality. Most existing role-oriented dialogue summarization methods treat the different roles in isolation, and how to exploit the information of other roles when generating a role-oriented dialogue summary is a problem to be solved in the prior art.

Disclosure of the Invention
Because the roles in a conversation interact with one another, the speech content of one role may provide essential help for summarizing another role. The invention therefore provides a method, a device and a storage medium for generating summaries, in which the different roles extract key information from one another at both the dialogue-sentence level and the summary level, helping the model generate role-oriented summary content more accurately. The technical scheme of the invention provides a summary generation method comprising the following steps (a beam-search sketch of these steps is given below): S1, the decoder step by step obtains, from the vocabulary constructed over the training database, the output probabilities of the words related to its role; S2, the k words with the largest output probabilities are extracted and spliced onto the word sequence already decoded by the model to form candidate summaries, where k is greater than 1; and S3, the model sorts the candidate summaries by output probability while keeping the beam size at k, and, after model prediction finishes, takes the candidate sequence with the largest output probability as the summary.
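A minimal beam-search sketch of steps S1 to S3 follows, assuming a trained model object that exposes a step(prefix) method returning next-word log-probabilities of shape (1, vocabulary size); model.step, bos_id, eos_id and the default beam size k are illustrative assumptions, not the patent's interface.

```python
# Minimal sketch of steps S1-S3: beam search with beam size k.
import torch

def beam_search(model, bos_id: int, eos_id: int, k: int = 4, max_len: int = 64):
    beams = [([bos_id], 0.0)]  # (decoded word sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos_id:            # finished candidates pass through
                candidates.append((seq, score))
                continue
            # S1: output probabilities over the constructed vocabulary
            logp = model.step(torch.tensor([seq])).squeeze(0)
            top_logp, top_ids = logp.topk(k)  # S2: the k most probable words
            for lp, wid in zip(top_logp.tolist(), top_ids.tolist()):
                candidates.append((seq + [wid], score + lp))  # splice onto sequence
        # S3: sort by output probability and keep the beam size at k
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
        if all(seq[-1] == eos_id for seq, _ in beams):
            break
    return beams[0][0]  # candidate sequence with the largest output probability
```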
Further, before step S1, the method further includes building and optimizing the summary generation model, specifically comprising the following steps: T1, different encoders are adopted to encode the dialogue content according to the different roles; T2, each decoder acquires the sentence representations of its corresponding role, and uses separate attention modules to attend to the sentence representations of the other roles while decoding; T3, the KL divergence between the attention distributions with which different decoders attend to the sentences of the same role is calculated, yielding the cross-attention interaction loss function; T4, in the self-attention module of each decoder, the hidden-layer representation of that decoder attends to the hidden-layer representations of the other decoders, forming the role self-attention interaction; T5, each decoder predicts the output probability of the word at each position, and the summary loss function is obtained from the output probabilities by maximum likelihood estimation; and T6, the cross-attention interaction loss function and the summary loss function are combined, and the model is trained and optimized by gradient descent. Further, in step T1, using different encoders to encode the dialogue content according to the different roles comprises (a sketch of these steps is given below): step T11, splicing the speaker-role information of each dialogue sentence together with the dialogue content in turn order, and obtaining the word embedding representation through a word embedding layer; step T12, encoding the word embedding representation with an encoder to obtain a representation of each word in the dialogue.
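A minimal sketch of steps T11 and T12 follows, assuming that role tags such as [USER] and [AGENT] are spliced before each utterance prior to embedding; the tag scheme and the Transformer configuration are illustrative assumptions.

```python
# Minimal sketch of steps T11-T12: speaker-role tags spliced with the
# dialogue content in turn order, embedded, then encoded.
import torch
import torch.nn as nn

class RoleEncoder(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # word embedding layer
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids encodes "[USER] utterance [AGENT] utterance ..." in turn order
        return self.encoder(self.embed(token_ids))  # representation of each word

# As in step T1, a separate encoder would be kept per role (names assumed):
# user_encoder = RoleEncoder(vocab_size)
# agent_encoder = RoleEncoder(vocab_size)
```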