KR-20260062794-A - An LLM-based high-quality language translation platform using a feedback loop reflecting expert opinions
Abstract
The present invention relates to a high-quality language translation platform based on a Large Language Model (LLM) with a feedback loop that reflects expert opinions. More specifically, evaluation results for translations produced by an LLM-based language translation system using an LLM and Retrieval Augmented Generation (RAG) are provided to the platform, expert opinions on those translations are fed back, and the feedback is reflected in the translations to produce high-quality output. The platform according to the present invention is characterized by comprising: a document processing unit that preprocesses source documents so that various types of files containing documents (or sentences) requiring translation can be managed collectively; a prompt unit that generates translation requirements as prompts corresponding to the classification type and required translation level of the preprocessed documents; a RAG system unit that retrieves multiple pieces of translation-related information in view of the translation requirements or the translation context of the generated prompts; an LLM that generates translation results using the information retrieved by the RAG system unit; and a feedback unit that evaluates the translation results and feeds the evaluation results back to the RAG system unit.
Inventors
- 김성렬
Assignees
- 주식회사 스터닝박스
Dates
- Publication Date
- 2026-05-07
- Application Date
- 2025-05-01
- Priority Date
- 2024-10-25
Claims (11)
- An LLM-based high-quality language translation platform having a feedback loop that incorporates expert opinions for high-quality document translation, the platform comprising: a document processing unit that preprocesses source documents according to classification type or required translation level; a prompt unit that generates a translation requirement as a prompt corresponding to the classification type and required translation level of the preprocessed document; a RAG system unit that retrieves multiple pieces of translation-related information in view of the translation requirements or the translation context of the generated prompt; an LLM that generates translation results using the information retrieved by the RAG system unit; and a feedback unit that evaluates the translation results and feeds the evaluation results back to the RAG system unit.
- The platform of claim 1, wherein the document processing unit comprises: a document structure verification module that verifies the source language and internal structure of the original document to be translated; a document classification module that classifies documents based on the results of the verification module; and a translation level judgment module that determines whether a document classified by the classification module is at a plain linguistic level or involves vocabulary ambiguity, cultural differences, linguistic or cultural humor, or requires imagination.
- The platform of claim 1, wherein the prompt unit comprises: a prompt input module that identifies the document type into which the source document was classified by the document classification module of the document processing unit and the required translation level determined by the translation level judgment module; a prompt generation module that defines specific requirements corresponding to the classification type and translation level identified through the prompt input module; a prompt quality evaluation module that evaluates whether the generated prompt is appropriate; and a prompt regeneration module that updates the generated prompt based on the evaluation result.
- The platform of claim 1, wherein the RAG system unit comprises: an input encoder that interprets the input requirements and encodes them in vector form when a generated prompt is received; an information retrieval module that searches an external knowledge database for multiple pieces of relevant information based on the encoded input; and a translation knowledge augmentation module that enables the LLM to generate more accurate and richer translation results by utilizing the retrieved information.
- The platform of claim 1, wherein the feedback unit causes the LLM to back-translate its own translation result, determines whether the back-translation matches the original source document, and, according to the match ratio, evaluates the usage priority or usefulness of the plurality of documents selected by the RAG system unit.
- The platform of claim 1, wherein the feedback unit evaluates the translation result by having an external group of experts or the requester of the translation directly review, modify, or assign evaluation scores to the translation result.
- The platform of claim 1, wherein the feedback unit evaluates the multiple documents selected by the RAG system unit by having an external expert group or the translation requester assign priorities to them.
- The platform of claim 1, wherein the feedback unit tracks how the feedback history of a specific document selected by the information retrieval module has changed in order to provide high-quality translation results, and readjusts weights such as the document's priority by comparing the previously assigned priority or evaluation grade with the current status.
- An LLM-based high-quality language translation platform having a feedback loop that incorporates expert opinions for high-quality document translation, the platform comprising: a document processing unit that preprocesses source documents according to classification type or required translation level; a RAG system unit that retrieves K source texts similar to the source document together with the K translation results corresponding to those source texts; a prompt unit that generates a translation requirement as a prompt corresponding to the classification type and required translation level of the preprocessed document, and provides the K translation results retrieved by the RAG system unit to the LLM; an LLM that generates translation results using the query in the prompt and the K translation results provided with it; and a feedback unit that evaluates the translation results and feeds the evaluation results back to the RAG system unit.
- The platform of claim 9, wherein the RAG system unit vector-embeds the K source texts and stores the corresponding K translation results as metadata.
- The platform of claim 10, wherein, when a translation request for a source document is received, the RAG system unit vector-embeds the document to find the K most similar source texts and retrieves their translation results by referring to the metadata.
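The retrieval scheme of claims 9 to 11 — embedding source texts, attaching their translations as metadata, and recovering the K most similar source/translation pairs for a new request — can be sketched as follows. This is an illustrative sketch only, not part of the claimed subject matter: the hashed bag-of-words embedding is a toy stand-in for the learned encoder the patent assumes, and `TranslationMemory` is a hypothetical name.

```python
# Sketch of claims 9-11: vector-embed source texts, keep translations as
# metadata, and look up the K nearest source/translation pairs by cosine
# similarity. The embedding below is a toy hashed bag-of-words, NOT the
# learned encoder a real system would use.
import math
from collections import Counter

DIM = 64  # toy embedding dimensionality

def embed(text: str) -> list[float]:
    """Toy embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token, count in Counter(text.lower().split()).items():
        vec[hash(token) % DIM] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class TranslationMemory:
    """Vector store: embedded source texts with translations as metadata."""

    def __init__(self):
        self.entries = []  # (embedding, {"source": ..., "translation": ...})

    def add(self, source: str, translation: str) -> None:
        # Claim 10: embed the source text, store its translation as metadata.
        self.entries.append(
            (embed(source), {"source": source, "translation": translation}))

    def top_k(self, query: str, k: int = 2):
        # Claim 11: embed the requested document, rank stored sources by
        # similarity, and return the K best pairs via their metadata.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]),
                        reverse=True)
        return [meta for _, meta in ranked[:k]]

memory = TranslationMemory()
memory.add("the weather is nice today", "오늘 날씨가 좋다")
memory.add("please review the attached contract", "첨부된 계약서를 검토해 주세요")
memory.add("the meeting starts at noon", "회의는 정오에 시작한다")

# The K retrieved pairs would then be placed into the prompt handed to
# the LLM, as in claim 9.
for hit in memory.top_k("the weather is very nice", k=2):
    print(hit["source"], "->", hit["translation"])
```

A production system would replace the toy `embed` with a sentence encoder and the linear scan with an approximate nearest-neighbor index; the metadata pattern stays the same.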
Description
An LLM-based high-quality language translation platform using a feedback loop that reflects expert opinions

The present invention relates to a high-quality language translation platform based on a Large Language Model (LLM) with a feedback loop that reflects expert opinions. More specifically, evaluation results for translations produced by an LLM-based language translation system using an LLM and Retrieval Augmented Generation (RAG) are provided to the platform, expert opinions on those translations are fed back, and the feedback is reflected in the translations to produce high-quality output.

Traditional document translation is not simply a matter of changing the language but a process of accurately conveying the meaning and nuances of the original text. It therefore faces various difficulties: differences in grammar and structure between languages, vocabulary ambiguity, cultural differences, the lack of vocabulary matching a precise meaning, the translation of technical terms and technology, changes in language over time, and the translation of linguistic and cultural humor. Because of these characteristics, translation results meeting a user's standards can generally be produced only by a professional translator who is fluent in both the source language and the target language and who also understands their cultural differences. Recently, there have been attempts to build translation systems using Large Language Models (LLMs) to overcome the limitations of traditional translation approaches.
LLMs perform translation by considering not only the context within a sentence but also paragraphs or larger contexts, and they can achieve high performance by transferring knowledge between the languages they were trained on. Unlike rule-based systems, LLMs learn varied sentence structures and vocabulary choices, enabling more natural translations. In addition, because an LLM is trained on vast amounts of data, it has a high potential to translate generally rare vocabulary or specialized terminology, such as in medicine or law, and it offers further advantages, including the ability to learn multiple languages simultaneously and process several languages with a single model.

However, despite these advantages, LLM-based language translation systems suffer from an inherent hallucination problem: the LLM can produce incorrect translations due to insufficient or biased training data. Since hallucinations arise from diverse and complex causes, various combined methods are being attempted to resolve them. Representative approaches include not only improving the quality of generative AI training data but also using prompt engineering, Retrieval Augmented Generation (RAG), and agents that connect multiple generative AI systems to reduce hallucinations and misinformation. Among these, RAG systems are drawing growing interest because they address hallucination issues at relatively low cost and can continuously incorporate new data in real time. Even so, because translations are still generated from retrieved information, inaccurate results may be produced if the retrieved data is wrong; and because RAG is fundamentally oriented toward factual accuracy, it remains difficult to achieve high quality for vocabulary ambiguity, cultural differences, linguistic and cultural humor, or translations requiring imagination.
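The grounding step described above — placing retrieved reference material into the prompt before the LLM translates — can be sketched as follows. This is a minimal illustration, not the patented method: `build_translation_prompt` and `call_llm` are hypothetical names, and `call_llm` is a placeholder for whatever chat-completion API a real system would use.

```python
# Sketch of retrieval-augmented prompt assembly: requirements first,
# then retrieved reference translations, then the source text. The LLM
# is told to use references only where they fit, since RAG output is
# only as good as the retrieved data.

def build_translation_prompt(source, retrieved_pairs, register="formal"):
    """Compose a translation prompt from requirements and retrieved pairs."""
    lines = [
        f"Translate the text below into Korean ({register} register).",
        "Use the reference pairs only where they genuinely match the source;",
        "ignore references that conflict with the source meaning.",
        "",
        "Reference translations:",
    ]
    for i, (src, tgt) in enumerate(retrieved_pairs, start=1):
        lines.append(f"{i}. {src} -> {tgt}")
    lines += ["", "Source text:", source]
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Placeholder for a real chat-completion API call."""
    return "<translation>"

retrieved = [("good morning", "좋은 아침입니다")]
prompt = build_translation_prompt("good morning, team", retrieved)
result = call_llm(prompt)
print(prompt)
```

The instruction to ignore conflicting references is one simple mitigation for the failure mode noted above, where incorrect retrieved data would otherwise propagate into the translation.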
Figure 1 is an overall structural diagram of Example 1 of the high-quality language translation platform based on a Large Language Model (LLM) with a feedback loop that reflects expert opinions. Figure 2 is a conceptual diagram illustrating the configuration and response process of the RAG system unit. Figure 3 is an overall structural diagram of Example 2 of the high-quality language translation platform based on a Large Language Model (LLM) with a feedback loop that reflects expert opinions.

The specific features and advantages of the present invention are described in detail below with reference to the accompanying drawings. Where a detailed description of functions and configurations related to the present invention might unnecessarily obscure the essence of the invention, that description is omitted.

Large Language Models (LLMs) are language models composed of artificial neural networks with a very large number of parameters. They are widely used in various types of language translation because they can learn from vast amounts of text data to generate diverse types of text, understand the complexity of natural language, and process and generate text in multiple languages. Nevertheless, due to the hallucination problem, there are many cases where LLM-based la