KR-20260063667-A - Device and method for automatically generating a persona-based dialogue dataset, and a chatbot system trained with a persona-based dialogue dataset

Abstract

A device for automatically constructing a persona-based conversation dataset and a chatbot system trained with a persona-based dataset are disclosed. The device summarizes an input conversation to determine a correct-answer persona, additionally selects a persona that is neutral with respect to the correct-answer persona, creates a persona memory from them, and links the persona memory to the conversation data to create a persona-based conversation dataset. The chatbot system receives the persona memory and the conversation as input, encodes them, and performs sentence-level attention on the encoded vectors to check whether a persona is present. If no persona is present, it decodes the conversation token vector to generate a response. If a persona is present, it also performs token-level attention, combines the sentence-level and token-level attention values with the conversation token vector, and decodes the result to generate a response. According to the present invention, a high-quality persona-based conversation dataset can be generated automatically and cost-effectively, and a chatbot system effective for long-term conversations can be trained.
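
The dataset construction flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `build_persona_sample`, `toy_summarize`, and `toy_nli` are hypothetical names, and the two toy functions are crude stand-ins for the summarization model and the natural language inference model, which the abstract does not specify. The paraphrasing of the summary described in the claims is omitted for brevity.

```python
import random
from typing import Callable, Dict, List, Optional


def build_persona_sample(
    conversation: List[str],
    summarize: Callable[[List[str]], str],
    nli_label: Callable[[str, str], str],
    persona_pool: List[str],
    num_neutral: int = 2,
    rng: Optional[random.Random] = None,
) -> Dict[str, object]:
    """Link one conversation to a persona memory (correct persona + neutral candidates)."""
    rng = rng or random.Random(0)
    # Persona extraction: the conversation summary serves as the correct-answer persona.
    correct = summarize(conversation)
    # Candidate selection: visit the pool in random order and keep only neutral personas.
    shuffled = list(persona_pool)
    rng.shuffle(shuffled)
    neutral: List[str] = []
    for persona in shuffled:
        if len(neutral) == num_neutral:
            break
        if nli_label(correct, persona) == "neutral":
            neutral.append(persona)
    # Persona memory generation: mix correct and candidate personas, then link to the data.
    memory = [correct] + neutral
    rng.shuffle(memory)
    return {
        "conversation": conversation,
        "persona_memory": memory,
        "correct_persona": correct,
    }


def _content_words(sentence: str) -> set:
    """Lowercased words longer than three characters, punctuation stripped."""
    words = (w.strip(".,!?").lower() for w in sentence.split())
    return {w for w in words if len(w) > 3}


def toy_summarize(conversation: List[str]) -> str:
    """Hypothetical stand-in for a summarizer: first utterance starting with 'I '."""
    for utterance in conversation:
        if utterance.startswith("I "):
            return utterance
    return conversation[0]


def toy_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for an NLI model: neutral when no content word is shared."""
    shared = _content_words(premise) & _content_words(hypothesis)
    return "neutral" if not shared else "non-neutral"
```

In practice both callables would wrap trained models. Passing them in as parameters keeps the linking logic independent of whichever summarizer or NLI classifier is used.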

Inventors

  • 김학수
  • 김홍진
  • 정근영
  • 이성희

Assignees

  • 건국대학교 산학협력단

Dates

Publication Date
20260507
Application Date
20241030

Claims (17)

  1. A device for automatically constructing a persona-based conversation dataset, comprising: a data input unit that receives conversation data between two or more speakers; a persona extraction unit that extracts a conversation summary summarizing the input conversation data; a correct-answer persona determination unit that determines a correct-answer persona based on the conversation summary; and a persona memory generation unit that stores the correct-answer persona in a persona memory and links the persona memory in which the correct-answer persona is stored with the input conversation data.
  2. The device of claim 1, further comprising a paraphrasing unit that receives the conversation summary generated by the persona extraction unit and outputs another expression having a meaning similar to the conversation summary, wherein the correct-answer persona determination unit determines the correct-answer persona using the other expression output from the paraphrasing unit.
  3. The device of claim 1, further comprising a candidate persona determination unit that selects, from a persona pool in which multiple personas are stored, a persona that is in a neutral relationship with the correct-answer persona and determines it as a candidate persona.
  4. The device of claim 3, wherein the candidate persona determination unit comprises a natural language inference unit that determines, through natural language inference, the relationship between a persona randomly selected from the persona pool and the correct-answer persona.
  5. A chatbot system trained with a persona-based dataset, comprising: a sentence-level attention unit that receives conversation token vectors and persona token vectors as input, mean-pools each, and performs an attention operation; a token-level attention unit that receives a conversation token vector and a persona token vector as input and performs an attention operation; a decoding unit that generates a response based on an input conversation token vector; and a persona identification unit that checks whether a correct persona exists based on a sentence-level attention weight determined by the sentence-level attention unit, inputs the conversation token vector into the decoding unit if no correct persona exists, and inputs the sentence-level attention value output from the sentence-level attention unit, the token-level attention value output from the token-level attention unit, and the conversation token vector into the decoding unit if a correct persona exists.
  6. The chatbot system of claim 5, further comprising: a conversation encoder that receives a conversation as input, inserts tokens between conversations, and generates a conversation token vector; and a persona encoder that receives a persona memory as input and generates a persona vector for each persona in the persona memory.
  7. The chatbot system of claim 6, further comprising a persona memory storage unit that stores a persona memory for each user.
  8. A method for automatically constructing a persona-based conversation dataset, comprising: a data input step of receiving conversation data between two or more speakers; a persona extraction step of extracting a conversation summary summarizing the input conversation data; a correct-answer persona determination step of determining a correct-answer persona based on the conversation summary; and a persona memory generation step of storing the correct-answer persona in a persona memory and linking the persona memory in which the correct-answer persona is stored with the input conversation data.
  9. The method of claim 8, further comprising a paraphrasing step of receiving the conversation summary generated in the persona extraction step and outputting another expression of the persona having a meaning similar to the conversation summary, wherein the correct-answer persona determination step determines the correct-answer persona using the other expression output in the paraphrasing step.
  10. The method of claim 8, further comprising a candidate persona determination step of selecting, from a persona pool in which multiple personas are stored, a persona that is in a neutral relationship with the correct-answer persona and determining it as a candidate persona, wherein the persona memory generation step comprises storing the candidate persona in the persona memory.
  11. The method of claim 10, wherein the candidate persona determination step comprises a natural language inference step of determining, through natural language inference, the relationship between a persona randomly selected from the persona pool and the correct-answer persona.
  12. A method for generating a response in a persona-based chatbot system, comprising: a sentence-level attention step of receiving conversation token vectors and persona token vectors as input, mean-pooling each, and performing an attention operation; a persona identification step of checking whether a correct persona exists based on a sentence-level attention weight determined in the sentence-level attention step; and a decoding step of generating a response based on an input conversation token vector, wherein, if the persona identification step confirms that a correct persona exists, the method further comprises: a token-level attention step of receiving a conversation token vector and a persona token vector as input and performing an attention operation; and an attention combining step of combining a sentence-level attention value output from the sentence-level attention step and a token-level attention value output from the token-level attention step.
  13. The method of claim 12, further comprising a conversation encoding step of receiving a conversation, inserting tokens between conversations, and inputting the conversation to a conversation encoder to generate a conversation token vector.
  14. The method of claim 13, further comprising a persona encoding step of inputting the persona memory of each user into a persona encoder to generate a persona vector.
  15. A method for training a chatbot system using a persona-based dataset, comprising: a sentence-level attention step of receiving conversation token vectors and persona token vectors as input, mean-pooling each, and performing an attention operation; a persona identification step of determining, as a predicted persona, the persona with the largest sentence-level attention weight determined in the sentence-level attention step; a token-level attention step of, if the persona identification step confirms that a predicted persona exists, receiving a conversation token vector and a persona token vector as input and performing an attention operation; an attention combining step of, if the persona identification step confirms that a predicted persona exists, combining the sentence-level attention value output from the sentence-level attention step and the token-level attention value output from the token-level attention step; and a decoding step of generating a predicted response based on an input conversation token vector.
  16. The method of claim 15, further comprising a conversation encoding step of receiving a conversation, inserting tokens between conversations, and inputting the conversation to a conversation encoder to generate a conversation token vector.
  17. The method of claim 15, wherein the learning parameters of the sentence-level attention step, the token-level attention step, and the decoding step are updated to minimize a loss function based on the difference between the correct persona and the predicted persona and the difference between the correct response and the predicted response.
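
The gating behavior recited in the chatbot claims (sentence-level attention decides whether a persona applies, and token-level attention is used only when one does) can be illustrated with the sketch below. Several details are assumptions made for illustration and are not stated in the claims: the "no persona" case is modeled with a hypothetical null vector `null_vec` competing in the sentence-level softmax, attention is plain scaled dot-product with the keys reused as values and no learned projections, and "combining" is done by concatenation. The function names are likewise hypothetical.

```python
import math
from typing import List, Optional, Tuple

Vector = List[float]


def mean_pool(tokens: List[Vector]) -> Vector:
    """Average a list of token vectors into one sentence vector."""
    return [sum(column) / len(tokens) for column in zip(*tokens)]


def softmax(scores: List[float]) -> List[float]:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, keys: List[Vector]) -> Tuple[List[float], Vector]:
    """Scaled dot-product attention of one query over keys (keys double as values)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    value = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(len(query))]
    return weights, value


def decoder_inputs(
    conv_tokens: List[Vector],
    personas: List[List[Vector]],
    null_vec: Vector,
) -> Tuple[Optional[int], List[Vector]]:
    """Return (predicted persona index or None, per-token vectors for the decoder)."""
    # Sentence-level attention: mean-pool the conversation and each persona.
    query = mean_pool(conv_tokens)
    pooled = [null_vec] + [mean_pool(p) for p in personas]  # slot 0 = "no persona" (assumed)
    weights, sentence_value = attend(query, pooled)
    # Persona identification: the slot with the largest sentence-level weight wins.
    best = max(range(len(weights)), key=weights.__getitem__)
    if best == 0:
        # No persona: the decoder sees the conversation token vectors only.
        return None, [list(tok) for tok in conv_tokens]
    # Token-level attention: each conversation token attends over the persona's tokens.
    persona_tokens = personas[best - 1]
    combined = []
    for tok in conv_tokens:
        _, token_value = attend(tok, persona_tokens)
        # Attention combining (assumed to be concatenation in this sketch).
        combined.append(list(tok) + token_value + sentence_value)
    return best - 1, combined
```

Under these assumptions, the sentence-level weights computed here are also the quantities a persona-prediction loss term of the kind described in claim 17 would be applied to during training, alongside the response loss on the decoder output.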

Description

Device and method for automatically generating a persona-based dialogue dataset, and a chatbot system trained with a persona-based dialogue dataset

The present invention relates to a device for automatically constructing a persona-based conversation dataset and to a chatbot system. In particular, it discloses a device that constructs a persona-based conversation dataset by extracting personas from a general conversation dataset, and a chatbot system trained on the persona-based conversation dataset to identify a persona in a conversation context and generate a response.

With the significant advancement of neural network models, artificial intelligence conversational systems have greatly improved, and active research is underway to develop more natural and human-like chatbots. In this regard, personas containing key information about the speaker, such as personal information, preferences, and values, are becoming increasingly important. In human conversation, people remember meaningful information about the other person and use it to respond in subsequent conversations. Chatbots can more closely mimic human conversational abilities by integrating personas into conversations. Therefore, various datasets have been developed to train chatbots for persona-based conversations.

However, such datasets are primarily generated by human annotators and incur significant costs. While substantial amounts of data can be obtained cheaply from online communities, the quality of this data is often remarkably poor. Furthermore, session-based datasets are known to be efficient for learning long-term conversational capabilities, but such data requires a much heavier workload from human annotators: the annotators are given a memory that stores personas obtained from previous sessions, and they take part in conversations while referencing it. Constructing such datasets is therefore extremely costly.

Meanwhile, various chatbot models are being researched to enhance persona-based chatting capabilities. Most chatbot models use a separate search engine to extract relevant elements from the persona memory. However, using a search engine reduces efficiency because it slows down both the training phase and the inference phase.

Published Patent No. 10-2023-0076012, disclosed on May 31, 2023, relates to "a method and system for generating persona conversation data using a super-large language model," and discloses a method for generating persona conversation data capable of finely controlling the conversation flow of a character chatbot. The method for constructing a conversation database using a language model includes the steps of receiving a plurality of predefined conversation purpose-conversation type pairs, generating a plurality of conversation situations associated with each conversation purpose using a language model, and constructing an initial conversation database by using a language model to generate a plurality of seed conversation sessions corresponding to a plurality of conversation purpose-conversation type-conversation situation triples.

Published Patent No. 10-2024-0076978, disclosed on May 31, 2024, relates to "an apparatus and method for generating a conversational model using knowledge and a persona," and discloses a method for generating a conversational model that reflects knowledge and a user's persona. The method for generating a conversational model includes: (a) receiving a conversation history including a user's utterance and a machine's utterance; (b) selecting a knowledge candidate using the conversation history; (c) selecting a persona candidate using the conversation history; (d) generating an input including a part of the conversation history, a knowledge candidate, and a persona candidate; and (e) generating an utterance corresponding to the input using the model.

FIG. 1 is a schematic diagram showing the configuration of a persona-based conversation dataset automatic construction device according to one embodiment.
FIG. 2 is a conceptual diagram showing the configuration and operation of a persona-based conversation dataset automatic construction device according to one embodiment.
FIG. 3 is a flowchart illustrating a method for automatically constructing a persona-based conversation dataset according to one embodiment.
FIG. 4 is a schematic diagram showing a chatbot system trained with a persona-based dataset according to one embodiment.
FIG. 5 is a conceptual diagram showing the configuration and operation of a chatbot system trained with a persona-based dataset according to one embodiment.
FIG. 6 is a flowchart illustrating the operation method of a chatbot system trained with a persona-based dataset according to one embodiment.
FIG. 7 is a flowchart illustrating a method for training a chatbot system with a persona-based dataset according to o