KR-102962887-B1 - SEQUENCE TEXT SUMMARY PROCESSING DEVICE USING BART MODEL AND CONTROL METHOD THEREOF
Abstract
The present invention relates to a sequential text summarization processing device utilizing a BART model and a control method thereof. The sequential text summarization processing device according to the present invention, utilizing a BART model including an encoder and a decoder, comprises: a document topic embedding calculation unit that inputs sequential text into a topic model to generate topic modeling data and calculates a document topic embedding value for the generated topic modeling data; and a topic attention layer disposed between the encoder of the BART model and an input layer that transmits an input value to the encoder, wherein the topic attention layer receives the document topic embedding value calculated by the document topic embedding calculation unit and performs cross-attention processing based on the received document topic embedding value and an input value received from the input layer.
Inventors
- 김태균
- 이정하
- 김지현
- 김우주
- 조정제
Assignees
- LG Uplus Corp. (주식회사 엘지유플러스)
Dates
- Publication Date: 2026-05-12
- Application Date: 2022-04-12
Claims (16)
- A sequential text summary processing device utilizing a BART model including an encoder and a decoder, the device comprising: a document topic embedding calculation unit that inputs sequential text into a topic model to generate topic modeling data and calculates a document topic embedding value for the generated topic modeling data; and a topic attention layer positioned between the encoder of the BART model and an input layer that transmits input values to the encoder, wherein the topic attention layer receives the document topic embedding value calculated by the document topic embedding calculation unit and performs cross-attention processing based on the received document topic embedding value and an input value received from the input layer, the embedding data received from the input layer being used as a query and the document topic embedding value received from the document topic embedding calculation unit being used as both a key and a value of the cross-attention processing, as sketched in the illustrative code examples following the claims.
- The sequential text summary processing device of claim 1, wherein the topic model is a Latent Dirichlet Allocation (LDA) model.
- The sequential text summary processing device of claim 2, wherein the document topic embedding calculation unit adjusts the topic-wise word distribution data using the document-specific topic weight data among the topic modeling data generated by the latent Dirichlet allocation model, performs word embedding based on the sequential text, and calculates the document topic embedding value by taking the vector product of the adjusted topic-wise word distribution data and the generated word embedding data.
- The sequential text summary processing device of claim 3, wherein the word embedding processing uses the FastText method.
- The sequential text summary processing device of claim 1, wherein the sequential text is extracted from conversational text based on specific parts of speech and includes an utterance time token of the speaker.
- The sequential text summary processing device of claim 1, wherein the topic attention layer performs cross-attention processing using the document topic embedding value and the input value received from the input layer, passes the result of residual connection and layer normalization processing through a feedforward neural network, and then performs residual connection and layer normalization processing again.
- The sequential text summary processing device of claim 6, wherein the cross-attention processing is performed according to the following formula: Attention(Query, Key, Value) = softmax(Query · Key^T / √d_k) · Value, where Query is the embedding data (token embedding value) received from the input layer, Key is the document topic embedding value received from the document topic embedding calculation unit, Value is the document topic embedding value received from the document topic embedding calculation unit, and d_k is the dimension of the token and document topic embeddings.
- A method for controlling a sequential text summary processing device utilizing a BART model including an encoder and a decoder, the method comprising: (a) inputting sequential text into a topic model to generate topic modeling data and calculating a document topic embedding value for the generated topic modeling data; (b) placing a topic attention layer between the encoder of the BART model and an input layer that transmits input values to the encoder; and (c) transmitting the document topic embedding value calculated in step (a) to the topic attention layer so that the topic attention layer performs cross-attention processing based on the document topic embedding value and an input value received from the input layer, wherein the embedding data received from the input layer is used as a query, the document topic embedding value received from the document topic embedding calculation unit is used as both a key and a value, and cross-attention processing is then performed on the query, key, and value.
- The control method of claim 8, wherein the topic model is a Latent Dirichlet Allocation (LDA) model.
- The control method of claim 9, wherein step (a) comprises: (a1) adjusting the topic-wise word distribution data among the topic modeling data generated by the latent Dirichlet allocation model, using the per-document topic weight data as weights; (a2) performing word embedding based on the sequential text; and (a3) calculating the document topic embedding value by taking the vector product of the topic-wise word distribution data adjusted in step (a1) and the word embedding data generated in step (a2).
- The control method of claim 10, wherein the word embedding in step (a2) is performed using the FastText method.
- The control method of claim 8, wherein the sequential text is extracted from conversational text based on specific parts of speech and includes an utterance time token of the speaker.
- The control method of claim 8, wherein in step (c), cross-attention processing is performed using the document topic embedding value and the input value received from the input layer, the result of residual connection and layer normalization processing is passed through a feedforward neural network, and residual connection and layer normalization processing is then performed again.
- The control method of claim 13, wherein the cross-attention processing is performed according to the following formula: Attention(Query, Key, Value) = softmax(Query · Key^T / √d_k) · Value, where Query is the embedding data (token embedding value) received from the input layer, Key is the document topic embedding value received from the document topic embedding calculation unit, Value is the document topic embedding value received from the document topic embedding calculation unit, and d_k is the dimension of the token and document topic embeddings.
- A computer-readable recording medium having recorded thereon a program for executing the method of any one of claims 8 through 14.
- An application stored on a computer-readable recording medium in combination with hardware to execute the method of any one of claims 8 through 14.
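Claims 5 and 12 describe preparing the sequential text: content is extracted from conversational text based on specific parts of speech, and an utterance time token of the speaker is included. The snippet below is a minimal sketch of such preprocessing, assuming the KoNLPy Okt tagger, the kept tag set, and the "<speaker|time>" token format purely for illustration; the patent does not specify these choices.

```python
from konlpy.tag import Okt

okt = Okt()
KEEP_TAGS = {"Noun", "Verb", "Adjective"}   # assumed "specific parts of speech"

def to_sequential_text(utterances):
    """utterances: list of (speaker, time_string, raw_text) tuples."""
    lines = []
    for speaker, time, text in utterances:
        # Keep only words whose part of speech is in KEEP_TAGS.
        words = [w for w, tag in okt.pos(text) if tag in KEEP_TAGS]
        # Prepend a speaker/utterance-time token so temporal context is preserved.
        lines.append(f"<{speaker}|{time}> " + " ".join(words))
    return " ".join(lines)

print(to_sequential_text([
    ("상담사", "10:02", "안녕하세요, 무엇을 도와드릴까요?"),
    ("고객", "10:03", "어제 요금제를 변경한 뒤 데이터가 안 됩니다."),
]))
```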
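Claims 3 and 10 describe computing the document topic embedding by weighting the per-topic word distributions with the document's topic weights and taking a vector product with word embedding data. The following NumPy sketch illustrates one reading of that computation; the array names, the summation over topics, and the toy dimensions are assumptions for illustration, not the patent's reference implementation.

```python
import numpy as np

def document_topic_embedding(doc_topic_weights, topic_word_dist, word_embeddings):
    """Illustrative document topic embedding (claims 3 and 10).

    doc_topic_weights: (K,)   per-document topic weights from the LDA model
    topic_word_dist:   (K, V) per-topic word distributions from the LDA model
    word_embeddings:   (V, d) word embedding matrix (e.g., trained with FastText)
    """
    # Step (a1): adjust each topic's word distribution by the document's topic weight.
    adjusted = doc_topic_weights[:, None] * topic_word_dist        # (K, V)

    # Step (a3): vector product of the adjusted distributions and the word
    # embedding data. Summing over topics yields a single (d,) embedding value;
    # keeping the (K, d) per-topic vectors is an equally plausible reading.
    return (adjusted @ word_embeddings).sum(axis=0)                # (d,)

# Toy usage: K=4 topics, V=1000 vocabulary words, d=128 embedding dimensions.
rng = np.random.default_rng(0)
theta = rng.dirichlet(np.ones(4))             # document-topic weights
phi = rng.dirichlet(np.ones(1000), size=4)    # topic-word distributions
emb = rng.normal(size=(1000, 128))            # word embeddings (e.g., FastText)
print(document_topic_embedding(theta, phi, emb).shape)   # (128,)
```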
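Claims 1, 6, and 7 describe the topic attention layer placed between the input layer and the BART encoder: the input token embeddings serve as the query, the document topic embedding serves as the key and the value, and the scaled dot-product attention result passes through residual connection and layer normalization, a feedforward network, and a second residual connection and layer normalization. The PyTorch module below is a minimal sketch under those assumptions; the class name, the single-head formulation, the residual placement, and the feedforward hidden size are illustrative choices, not taken from the patent.

```python
import math
import torch
import torch.nn as nn

class TopicAttentionLayer(nn.Module):
    """Sketch of the topic attention layer of claims 1, 6, and 7 (single head)."""

    def __init__(self, d_model: int, d_ff: int = 2048):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, token_emb: torch.Tensor, topic_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (batch, seq_len, d)  embeddings from the input layer (Query)
        # topic_emb: (batch, n_topics, d) document topic embedding(s) (Key and Value);
        #            a single document topic embedding value corresponds to n_topics = 1.
        d_k = token_emb.size(-1)

        # Claim 7: Attention(Query, Key, Value) = softmax(Query · Key^T / sqrt(d_k)) · Value
        scores = token_emb @ topic_emb.transpose(-2, -1) / math.sqrt(d_k)
        attended = torch.softmax(scores, dim=-1) @ topic_emb

        # Claim 6: residual connection + layer normalization, then a feedforward
        # network, then residual connection + layer normalization again.
        x = self.norm1(token_emb + attended)
        x = self.norm2(x + self.ffn(x))
        return x

# Toy usage: batch of 2 sequences of 16 tokens, 8 topic vectors, model dimension 768.
layer = TopicAttentionLayer(d_model=768)
tokens = torch.randn(2, 16, 768)
topics = torch.randn(2, 8, 768)
print(layer(tokens, topics).shape)  # torch.Size([2, 16, 768])
```

In the device of claim 1, the output of such a layer would be passed to the BART encoder in place of the raw input embeddings.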
Description
Sequential text summarization processing device using a BART model and control method thereof

The present invention relates to a sequential text summary processing device and a control method thereof, and more specifically, to a sequential text summary processing device utilizing a BART model and a control method thereof.

Recently, as deep learning, a method of machine learning, has shown outstanding results, attempts are being made to introduce artificial intelligence into various fields. In particular, natural language processing is a meaningful and important field in that it enables machines to understand human language. Natural language refers to the language used in everyday life, and natural language processing is a general term for the process of analyzing the meaning of such natural language so that it can be processed by a computer. Natural language processing can be used in countless fields such as speech recognition, content summarization, translation, user sentiment analysis, text classification tasks (spam email classification, news article category classification), question answering systems, and chatbots.

Various models for natural language processing have been proposed, the most representative of which is the seq2seq (sequence-to-sequence) model using a Recurrent Neural Network (RNN). The seq2seq model converts one sentence (sequence) into another sentence (sequence) and is internally composed of an encoder and a decoder. The encoder encodes the input data, and the decoder decodes the encoded data. For example, as a specific sentence passes through the encoder, a context vector containing information that condenses the meaning of the sentence is created, and as this context vector passes through the decoder, an output sentence corresponding to the specific sentence (e.g., a translated sentence) is generated.

However, in these seq2seq models, the information transmitted from the encoder to the decoder is a fixed-length vector. "Fixed length" means that the sentence is always converted into a vector of the same length, regardless of how long it is. Consequently, a problem arises in that not all necessary information can be contained within a fixed vector of limited length; in other words, the required information is not properly included in the context vector. As a result, learning becomes inefficient as the input sentence becomes longer.

To address this, an attention mechanism was introduced, in which the decoder refers back to the entire input sentence from the encoder at every time step when predicting an output word. Instead of attending to all parts of the input sentence equally, greater weight is given to the input words related to the word being predicted at that time step. Accordingly, even if the input sentence becomes long, the condensed meaning of the input sentence can be conveyed to the output, thereby resolving the problem of information loss.

However, even with the application of an attention mechanism to the seq2seq model, natural language processing capabilities did not reach a satisfactory level, and a new model called the Transformer was proposed. The Transformer is the model introduced in the paper "Attention is all you need" published by Google in 2017. While it follows the encoder-decoder structure of the existing seq2seq, it is implemented using only attention, as the name of the paper suggests.
Even though this model implements an encoder-decoder structure without using RNNs, it demonstrated superior translation performance compared to RNN-based models. Models that extend this Transformer structure include BERT (Bidirectional Encoder Representations from Transformers), GPT (Generative Pre-trained Transformer), and BART (Bidirectional and Auto-Regressive Transformers).

The BERT model is a pre-trained model released by Google in 2018, and its basic structure is a stack of Transformer encoders. The GPT model uses only the decoder structure, excluding the encoder from the Transformer; in contrast to the BERT model, which excels at extracting sentence meaning, GPT excels at sentence generation. The BART model has a form similar to combining the aforementioned BERT and GPT into one, and is a sequence-to-sequence Transformer model trained with a new pre-training objective.

Significant progress has been made in natural language processing across various fields by utilizing these BART models, but the area of summary generation for sequential text remains somewhat lacking. Here, sequential text refers to text for which the analysis of a time-series context is essential, such as human conversations or counseling; therefore, generating summaries from such sequential text is a significantly important task for the manual or automated analysis of conversation or counseling content.
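As context for the encoder-decoder summarization setting described above, the following is a minimal sketch of plain abstractive summarization with a pretrained BART model using the Hugging Face transformers library. It illustrates only the baseline BART behaviour and does not include the claimed topic attention layer; the checkpoint name, the example dialogue, and the generation parameters are illustrative assumptions.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Publicly available BART checkpoint fine-tuned for news summarization
# (illustrative choice; any BART summarization checkpoint is used the same way).
model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

dialogue = (
    "Agent: Hello, how can I help you today? "
    "Customer: My mobile data stopped working after I changed my plan yesterday. "
    "Agent: I see. Let me reset the provisioning on your line. "
    "Customer: Thank you, it is working again now."
)

# Encode the input text; BART accepts up to 1024 input tokens.
inputs = tokenizer(dialogue, return_tensors="pt", truncation=True, max_length=1024)

# Generate an abstractive summary with beam search on the decoder side.
summary_ids = model.generate(
    inputs["input_ids"], num_beams=4, min_length=10, max_length=60
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```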