KR-20260066369-A - Explainable post-training method and apparatus for reducing hallucination of text-based generative language model

KR-20260066369-A

Abstract

A post-training method and apparatus for mitigating hallucination phenomena in a text-based generative language model are provided. A post-training method for a generative language model according to one embodiment of the present invention comprises: a verification question generation step of collecting online data reflecting the latest content from online sources and generating a verification question for verifying a hallucination phenomenon from said online data; an answer generation step of inputting said verification question into a large language model to generate an answer to the verification question; a hallucination verification step of comparing the answer to said verification question with said online data to determine whether a hallucination phenomenon exists; and a post-extended learning step of determining hallucination occurrence based on the verification result of said hallucination verification step and, if it is determined that a hallucination has occurred, training said large language model using said online data.

Inventors

  • 배용진
  • 배경만
  • 김민호
  • 김현기
  • 노지현
  • 이형직
  • 장명길

Assignees

  • 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)

Dates

Publication Date
2026-05-12
Application Date
2024-11-04

Claims (20)

  1. A post-training method for a generative language model, comprising: a verification question generation step of collecting online data reflecting the latest content from online sources and generating a verification question for verifying a hallucination phenomenon from said online data; an answer generation step of inputting said verification question into a large language model to generate an answer to the verification question; a hallucination verification step of comparing the answer to said verification question with said online data to determine whether a hallucination phenomenon exists; and a post-extended learning step of determining hallucination occurrence based on the verification result of said hallucination verification step and, if it is determined that a hallucination has occurred, training said large language model using said online data.
  2. The post-training method of claim 1, wherein, in the verification question generation step, a prompt for generating a verification question suited to the hallucination phenomenon to be verified is input into said large language model together with said online data, and the output of said large language model is taken as the verification question.
  3. The post-training method of claim 2, wherein the verification question generation step generates a plurality of verification questions for said online data; the answer generation step generates a plurality of answers to said plurality of verification questions; the hallucination verification step determines whether a hallucination phenomenon exists for each of said plurality of answers; and the post-extended learning step evaluates hallucination occurrence using some of the verification results for the plurality of answers, according to the learning direction of said large language model.
  4. The post-training method of claim 1, wherein the post-extended learning step comprises: generating training data using said online data; retraining said large language model using the generated training data; and re-verifying the hallucination behavior of the retrained large language model using said verification question, wherein the retraining is repeated if a hallucination occurs and the procedure for the collected online data is terminated if no hallucination occurs.
  5. The post-training method of claim 4, wherein the retraining trains said large language model using a reinforcement learning algorithm by forming pairs of incorrect responses from the large language model and regenerated correct data.
  6. The post-training method of any one of claims 1 to 5, wherein the verification question generation step generates, from said online data, four verification questions for four types of hallucination: factual hallucination, relatedness hallucination, consistency hallucination, and completeness hallucination; the answer generation step generates four answers to said four verification questions; the hallucination verification step determines whether a hallucination phenomenon exists for each of said four answers; and the post-extended learning step evaluates hallucination occurrence using some or all of the verification results for said four answers, according to the learning direction of the large language model.
  7. The post-training method of claim 6, wherein the verification question generation step generates a fact verification question suited to factual hallucination so as to output a question-answer pair, and the hallucination verification step performs factual hallucination verification by comparing the answer to said fact verification question against said question-answer pair.
  8. The post-training method of claim 6, wherein the verification question generation step generates a context verification question suited to relatedness hallucination, producing questions and answers that require relevant context or background information to verify the accuracy of information, and the hallucination verification step divides the answer to said context verification question into paragraph or sentence units to form relatedness hallucination verification units, and verifies the suitability of the response by pairing the question with the hallucination verification units.
  9. The post-training method of claim 6, wherein the verification question generation step generates cross-verification questions suited to consistency hallucination by asking for the same information in different ways so as to produce a plurality of questions, and the hallucination verification step verifies whether a consistency hallucination exists based on whether the answers to said plurality of questions agree.
  10. The post-training method of claim 6, wherein the verification question generation step generates a comprehensive information verification question suited to completeness hallucination and also extracts, from said online data, keywords indicating the essential information, and the hallucination verification step verifies completeness hallucination by comparing the response content of the large language model with said keywords to determine whether all of the keywords are included in the response.
  11. A post-training apparatus for a generative language model, comprising: a verification question generation unit that collects online data reflecting the latest content from online sources and generates a verification question for verifying a hallucination phenomenon from said online data; an answer generation unit that inputs said verification question into a large language model to generate an answer to the verification question; a hallucination verification unit that compares the answer to said verification question with said online data to determine whether a hallucination phenomenon exists; and a post-extended learning unit that determines hallucination occurrence based on the verification result of said hallucination verification unit and, if it is determined that a hallucination has occurred, trains said large language model using said online data.
  12. The post-training apparatus of claim 11, wherein the verification question generation unit inputs a prompt for generating a verification question suited to the hallucination phenomenon to be verified into said large language model together with said online data, and takes the output of said large language model as the verification question.
  13. The post-training apparatus of claim 12, wherein the verification question generation unit generates a plurality of verification questions for said online data; the answer generation unit generates a plurality of answers to said plurality of verification questions; the hallucination verification unit determines whether a hallucination phenomenon exists for each of said plurality of answers; and the post-extended learning unit evaluates hallucination occurrence using some of the verification results of the hallucination verification unit for the plurality of answers, according to the learning direction of the large language model.
  14. The post-training apparatus of claim 11, wherein the post-extended learning unit generates training data using said online data, retrains said large language model using the generated training data, re-verifies the hallucination behavior of the retrained large language model using said verification question, repeats the retraining if a hallucination occurs, and terminates the procedure for the collected online data if no hallucination occurs.
  15. The post-training apparatus of claim 14, wherein the post-extended learning unit retrains the large language model using a reinforcement learning algorithm by forming pairs of incorrect responses from the large language model and regenerated correct data.
  16. The post-training apparatus of any one of claims 11 to 15, wherein the verification question generation unit generates, from said online data, four verification questions for four types of hallucination: factual hallucination, relatedness hallucination, consistency hallucination, and completeness hallucination; the answer generation unit generates four answers to said four verification questions; the hallucination verification unit determines whether a hallucination phenomenon exists for each of said four answers; and the post-extended learning unit evaluates hallucination occurrence using some or all of the verification results for said four answers, according to the learning direction of the large language model.
  17. The post-training apparatus of claim 16, wherein the verification question generation unit generates a fact verification question suited to factual hallucination that outputs a question-answer pair, and the hallucination verification unit performs factual hallucination verification by comparing the answer to said fact verification question against said question-answer pair.
  18. The post-training apparatus of claim 16, wherein the verification question generation unit generates a context verification question suited to relatedness hallucination, producing questions and answers that require relevant context or background information to verify the accuracy of information, and the hallucination verification unit divides the answer to said context verification question into paragraph or sentence units to form relatedness hallucination verification units, and verifies the suitability of the response by pairing the question with the hallucination verification units.
  19. The post-training apparatus of claim 16, wherein the verification question generation unit generates cross-verification questions suited to consistency hallucination by asking for the same information in different ways so as to produce a plurality of questions, and the hallucination verification unit verifies whether a consistency hallucination exists based on whether the answers to said plurality of questions agree.
  20. The post-training apparatus of claim 16, wherein the verification question generation unit generates a comprehensive information verification question suited to completeness hallucination and also extracts, from said online data, keywords indicating the essential information, and the hallucination verification unit verifies completeness hallucination by comparing the response content of the large language model with said keywords to determine whether all of the keywords are included in the response.
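The per-type checks in claims 9 and 10 reduce to simple comparisons, which the following minimal Python sketch illustrates. The function names and the string-matching heuristics are assumptions for illustration only; the patent does not specify an implementation, and in practice the exact-match and substring tests would be replaced by semantic comparison using a language model:

```python
def verify_completeness(response: str, keywords: list[str]) -> bool:
    """Completeness-hallucination check in the style of claim 10: the
    response passes only if every essential keyword extracted from the
    online source data appears in it. Case-insensitive substring
    matching is a deliberate simplification."""
    return all(kw.lower() in response.lower() for kw in keywords)


def verify_consistency(answers: list[str]) -> bool:
    """Consistency-hallucination check in the style of claim 9: answers
    to the same question asked in different ways should agree. Exact
    match after normalization stands in for semantic equivalence."""
    normalized = {a.strip().lower() for a in answers}
    return len(normalized) == 1
```

A response that omits any extracted keyword, or a pair of rephrased questions that yields disagreeing answers, would be flagged as a hallucination and routed to the retraining step of claim 4.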

Description

The present invention relates to an explainable post-training method and apparatus for mitigating hallucination in text-based generative language models.

Owing to advances in artificial intelligence and natural language processing, Large Language Models (LLMs) are used in a wide range of application fields. These models are pre-trained on large datasets and then fine-tuned for specific tasks. However, they sometimes exhibit hallucinations, generating inaccurate or fictitious information, which undermines user trust and limits the practical usability of the models. Conventional language models rely on pre-trained data, which limits their ability to generate accurate responses for new information or specific situations. Retrieval-Augmented Generation (RAG) attempts to mitigate hallucination, but it does not address the underlying problem in the language model itself and incurs additional costs for consulting external knowledge. The hallucinations resulting from these limitations degrade the user experience and undermine reliability in real-world applications. Consequently, there is a growing need for methods of further training models, and for apparatus implementing such methods, in order to reduce hallucination.

FIG. 1 is a block diagram showing the configuration of a post-training apparatus for a text-based generative language model according to one embodiment of the present invention. FIG. 2 is a flowchart showing the operation flow of a post-training method for a text-based generative language model according to an embodiment of the present invention. FIG. 3 shows the flow of output data produced by each component for the four major hallucination phenomena in one embodiment of the present invention.
The aforementioned objectives of the present invention, as well as other objectives, advantages, and features, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present invention, however, is not limited to the embodiments disclosed below and can be implemented in various forms; the following embodiments are provided merely to convey the purpose, structure, and effects of the invention to those skilled in the art, and the scope of the invention is defined by the claims. The terms used in this specification describe the embodiments and are not intended to limit the invention. In this specification, the singular includes the plural unless the context clearly indicates otherwise, and "comprises" and/or "comprising" do not exclude the presence or addition of one or more other components, steps, operations, and/or elements. The present invention trains a model by using the capabilities of a large language model, without human intervention, to perform data collection, evaluation, and retraining, thereby generating questions and answers and verifying and correcting hallucinations. In this regard, the present invention offers cost advantages and can generate more accurate and reliable responses, because it continuously collects data reflecting the latest content from online sources to incorporate up-to-date information. The present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of a post-training apparatus for a text-based generative language model according to one embodiment of the present invention.
The training apparatus of the present invention is an apparatus for performing post-training of a Large Language Model (LLM) (500), and comprises: a verification question generation unit (100) that generates questions for verifying a hallucination phenomenon; an answer generation unit (200) that generates answers to the verification questions produced by the verification question generation unit (100); a hallucination verification unit (300) that determines whether a hallucination phenomenon exists by comparing the answers generated by the answer generation unit (200) with the online data; and a post-extension learning unit (400) that performs post-training of the generative language model by correcting the hallucinated content based on the online data when a hallucination phenomenon is determined to exist. The verification question generation unit (100) collects data reflecting the latest content from online sources and generates various questions for verifying hallucination phenomena. To generate verification questions, the verification question generation unit (100) inputs a prompt for generating verification questions suited to the hallucination phenomenon to be verified, together with the online data, into the large language model.
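The retrain-and-reverify loop carried out by the post-extension learning unit (described in claims 4 and 14) can be sketched as follows. This is a hedged illustration, not the patented implementation: the `llm`, `retrain`, and `make_llm` callables and the exact-match comparison are hypothetical stand-ins for the model, the reinforcement-learning update of claim 5, and the hallucination verification unit, respectively:

```python
from typing import Callable

Model = Callable[[str], str]  # maps a verification question to an answer


def post_training_loop(
    llm: Model,
    retrain: Callable[[list[tuple[str, str]]], Model],
    questions_and_refs: list[tuple[str, str]],  # (question, reference from online data)
    max_rounds: int = 3,
) -> Model:
    """Answer the verification questions, compare each answer with the
    reference derived from the online data, retrain on the failures,
    and repeat until no hallucination remains or the round budget is
    exhausted (sketch of claims 4 and 14)."""
    for _ in range(max_rounds):
        failures = [
            (q, ref)
            for q, ref in questions_and_refs
            if llm(q).strip().lower() != ref.strip().lower()
        ]
        if not failures:  # no hallucination detected: terminate for this data
            break
        # In the patent, retraining pairs the incorrect responses with
        # regenerated correct data under a reinforcement learning algorithm.
        llm = retrain(failures)
    return llm
```

The loop terminates either because re-verification finds no remaining hallucination or because the retry budget runs out, mirroring the "repeat retraining if hallucination occurs, otherwise terminate" procedure in the claims.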