
KR-20260062457-A - DEVICE, METHOD AND COMPUTER PROGRAM FOR GENERATING ANSWER TO QUERY


Abstract

A device for generating an answer to a query comprises a decomposition unit that decomposes original data into at least one original decomposition data and decomposes query data into at least one query decomposition data, a document derivation unit that derives a correlation between the at least one original decomposition data and the at least one query decomposition data through a language model and derives a related document associated with the query based on the correlation, and an answer generation unit that generates an answer to the query data based on the related document, wherein the language model includes a plurality of attention layers and embeds the original decomposition data and the query decomposition data based on each attention score derived through the plurality of attention layers.

Inventors

  • 박국현

Assignees

  • KT Corporation (주식회사 케이티)

Dates

Publication Date
2026-05-07
Application Date
2024-10-29

Claims (17)

  1. An answer generation device comprising: a decomposition unit that decomposes original data into at least one original decomposed data and decomposes query data into at least one query decomposed data; a document derivation unit that derives a correlation between the at least one original decomposed data and the at least one query decomposed data through a language model, and derives a related document associated with the query based on the correlation; and an answer generation unit that generates an answer to the query data based on the related document, wherein the language model includes a plurality of attention layers and embeds the original decomposed data and the query decomposed data based on each attention score derived through the plurality of attention layers.
  2. The answer generation device of claim 1, wherein the decomposition unit decomposes the original data or the query data such that each of the at least one original decomposed data or the at least one query decomposed data contains only one piece of knowledge information.
  3. The answer generation device of claim 1, further comprising: an embedding unit that embeds the at least one original decomposed data into a first vector; and a storage unit that stores the first vector in a database.
  4. The answer generation device of claim 3, further comprising a judgment unit that determines, for each of the at least one query decomposed data, whether a search needs to be performed based on the degree of specificity of the answer content to be generated, wherein the embedding unit embeds, based on the determination result, the query decomposed data requiring a search among the at least one query decomposed data into a second vector.
  5. The answer generation device of claim 4, wherein the document derivation unit derives the related document associated with the at least one query from the database through an inner product between the first vector and the second vector.
  6. The answer generation device of claim 1, wherein the language model derives, through a first attention layer among the plurality of attention layers, a first attention score for the relationship between one original decomposed data and another original decomposed data or the relationship between one query decomposed data and another query decomposed data.
  7. The answer generation device of claim 1, wherein the language model derives, through a second attention layer among the plurality of attention layers, a second attention score for the relationship between the original data and the at least one original decomposed data or the relationship between the query data and the at least one query decomposed data.
  8. The answer generation device of claim 1, wherein the language model derives, through a third attention layer among the plurality of attention layers, a third attention score for the at least one original decomposed data or the at least one query decomposed data.
  9. A method for generating an answer to a query, performed by an answer generation device, the method comprising: decomposing original data into at least one original decomposed data; decomposing query data into at least one query decomposed data; deriving a correlation between the at least one original decomposed data and the at least one query decomposed data through a language model; deriving a related document associated with the query based on the correlation; and generating an answer to the query data based on the related document, wherein the language model includes a plurality of attention layers and embeds the original decomposed data and the query decomposed data based on each attention score derived through the plurality of attention layers.
  10. The method of claim 9, further comprising decomposing the original data or the query data such that each of the at least one original decomposed data or the at least one query decomposed data contains only one piece of knowledge information.
  11. The method of claim 9, further comprising: embedding the at least one original decomposed data into a first vector; and storing the first vector in a database.
  12. The method of claim 11, further comprising: determining, for each of the at least one query decomposed data, whether to perform a search based on the degree of specificity of the answer content to be generated; and embedding, based on the determination result, the query decomposed data requiring a search among the at least one query decomposed data into a second vector.
  13. The method of claim 12, wherein deriving the related document comprises deriving the related document associated with the at least one query from the database through an inner product between the first vector and the second vector.
  14. The method of claim 9, wherein the language model derives, through a first attention layer among the plurality of attention layers, a first attention score for the relationship between one original decomposed data and another original decomposed data or the relationship between one query decomposed data and another query decomposed data.
  15. The method of claim 9, wherein the language model derives, through a second attention layer among the plurality of attention layers, a second attention score for the relationship between the original data and the at least one original decomposed data or the relationship between the query data and the at least one query decomposed data.
  16. The method of claim 9, wherein the language model derives, through a third attention layer among the plurality of attention layers, a third attention score for the at least one original decomposed data or the at least one query decomposed data.
  17. A computer program stored on a computer-readable storage medium, comprising a sequence of instructions for generating an answer to a query, wherein, when executed by a computing device, the computer program: decomposes original data into at least one original decomposed data and decomposes query data into at least one query decomposed data; derives a correlation between the at least one original decomposed data and the at least one query decomposed data through a language model; derives a related document associated with the query based on the correlation; and generates an answer to the query data based on the related document, wherein the language model includes a plurality of attention layers and embeds the original decomposed data and the query decomposed data based on each attention score derived through the plurality of attention layers.
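
Claims 6 to 8 describe attention scores derived for decomposed units and their relationships. As an illustrative sketch only (the claims do not fix a formula; the single-head scaled dot-product form and the toy vectors below are assumptions), an attention score of one decomposed unit against a set of units can be computed as follows:

```python
import math

def attention_scores(query_vec, key_vecs):
    """Softmax-normalized scaled dot-product scores of one unit against a set of units."""
    d = len(query_vec)
    logits = [sum(q * k for q, k in zip(query_vec, kv)) / math.sqrt(d)
              for kv in key_vecs]
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings for three decomposed units of a document; the first two
# are similar, so the first unit attends to them more strongly.
units = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
scores = attention_scores(units[0], units)
```

The scores sum to one and rank the units by relevance to the queried unit; a real implementation would derive them inside the language model's attention layers rather than from fixed vectors.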

Description

Device, method, and computer program for generating an answer to a query

The present invention relates to an apparatus, a method, and a computer program for generating an answer to a query.

A Large Language Model (LLM) is a type of artificial intelligence program capable of performing tasks such as recognizing and generating text. Large language models are used to build conversational systems that provide answers to user queries. However, because large language models generate answers relying solely on their inherent knowledge, they sometimes produce answers that are not factual, a phenomenon known as hallucination. Recently, to prevent hallucinations, Retrieval-Augmented Generation (RAG) technology has emerged, which generates output by including documents containing the information needed to answer a user query in the input prompt of a large language model. In relation to such RAG technology, Korean Published Patent No. 10-2024-0076978 discloses an apparatus and method for generating a conversational model using knowledge and personas.

For RAG technology to function properly, documents capable of accurately answering user queries must be retrieved, and the large language model must accurately capture the required information from the retrieved documents. To achieve this, documents written in natural language are converted into vectors and stored; a user query is likewise converted into a vector, and the optimal documents can be retrieved by calculating the distance or similarity between the query vector and the document vectors.

However, regarding the amount of information in documents, existing RAG technology has a disadvantage: although the amount of information can be vast or negligible depending on a document's type or category, documents of similar length are not distinguished, and all documents are represented as vectors of the same dimension, making it difficult for the vectors to have sufficient representative expressive power.
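
The vector retrieval described above can be sketched minimally as follows. The bag-of-words `embed()` function is a stand-in assumption for a real embedding model; the retrieval step uses the inner product between document and query vectors, as the claims specify:

```python
def embed(text, vocab):
    """Stand-in embedding: word-count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def inner_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(query, documents, vocab):
    """Return the document whose vector has the largest inner product with the query vector."""
    q = embed(query, vocab)
    return max(documents, key=lambda d: inner_product(q, embed(d, vocab)))

vocab = ["seoul", "capital", "korea", "paris", "france"]
docs = ["seoul is the capital of korea", "paris is the capital of france"]
best = retrieve("what is the capital of korea", docs, vocab)
```

In practice the count vectors would be replaced by learned embeddings, and the linear scan over documents by an approximate nearest-neighbor index over the stored first vectors.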
In addition, regarding the types of user queries, existing RAG technology has the disadvantage of having to search for documents without distinguishing among queries that a large language model can answer without any document search, queries that can be answered by retrieving a single document, and queries that can only be answered by referencing multiple documents.

FIG. 1 is a configuration diagram of an answer generation device according to an embodiment of the present invention. FIG. 2 is a diagram illustrating a language model including a conventional attention layer. FIGS. 3A to 3E are exemplary drawings illustrating a plurality of attention layers included in a language model according to an embodiment of the present invention. FIG. 4 is a flowchart of a method for generating an answer to a query performed by an answer generation device according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the attached drawings so that those skilled in the art can easily implement the invention. However, the present invention may be embodied in various different forms and is not limited to the embodiments described herein. In order to explain the present invention clearly, parts unrelated to the description have been omitted from the drawings, and similar parts are denoted by similar reference numerals throughout the specification. Throughout the specification, when a part is described as being "connected" to another part, this includes not only cases where it is "directly connected" but also cases where it is "electrically connected" with other elements interposed between them.
Furthermore, when a part is described as "including" a component, this means that, unless specifically stated otherwise, it may include additional components rather than excluding them, and it should be understood that this does not preclude the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. In this specification, the term "unit" includes a unit realized by hardware, a unit realized by software, and a unit realized using both; one unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware. Some of the operations or functions described in this specification as being performed by a terminal or device may instead be performed by a server connected to that terminal or device; likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

An embodiment of the present invention will now be described in detail with reference to the attached drawings.

FIG. 1 is a configuration diagram of an answer generation device according to an embodiment of the present invention. Referring to FIG. 1, the answer generation device (100) may include a decomposition unit (110)