CN-121981132-A - Language model accurate translation method and device based on term evaluation

Abstract

The invention relates to the technical field of translation, and in particular to a language model accurate translation method and device based on term evaluation. The method comprises: obtaining a target domain term set and a domain corpus; extracting target sentences containing target terms and generating reference translations; replacing the target terms with candidate tokens from a candidate token set to generate perturbed sentences, and translating them to obtain perturbed translations; computing the probability distribution of the candidate tokens whose perturbed translations achieve term-level semantic matching with the reference translations; and calculating the term translation entropy from that probability distribution. The term translation entropy is then integrated into the training, decoding and/or data enhancement process of the translation language model, or the term translation uncertainty it reflects is used to select a target translation language model with strong term suitability in the field to execute the translation task. By identifying the model's uncertainty in term translation, the invention guides the model toward more accurate translation or selects a language model better adapted to the terms, thereby improving the accuracy and consistency of term translation.
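The entropy computation summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `term_translation_entropy`, the callables `translate` and `term_match`, the plain string replacement used for perturbation, and the choice of the natural logarithm are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, Iterable, Sequence


def term_translation_entropy(
    term: str,
    sentences: Iterable[str],
    candidates: Sequence[str],
    translate: Callable[[str], str],         # hypothetical: wraps the translation model
    term_match: Callable[[str, str], bool],  # hypothetical: term-level semantic match
) -> float:
    """Entropy of the candidates whose perturbed translation matches the reference."""
    counts: Counter = Counter()
    for sentence in sentences:
        reference = translate(sentence)  # reference translation of the target sentence
        for candidate in candidates:
            # Perturbed sentence: the target term swapped for a candidate token.
            perturbed = sentence.replace(term, candidate)
            if term_match(translate(perturbed), reference):
                counts[candidate] += 1
    total = sum(counts.values())
    if total == 0:
        # Assumption: no matching candidate is reported as zero entropy.
        return 0.0
    probs = [n / total for n in counts.values()]
    # Shannon entropy with the natural logarithm (the base is an assumption).
    return sum(-p * math.log(p) for p in probs)
```

Under these assumptions, a value near zero means that a single candidate dominates the matches, while a larger value means that several candidates lead the model to the same term translation.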

Inventors

CAI LINGYU
FU QUANMING
GAO PENG
LIU MINJIE
Kong Kunkun
Zhou Landong

Assignees

山东百舜信息技术有限公司

Dates

Publication Date: 20260505
Application Date: 20260408

Claims (10)

1. A language model accurate translation method based on term evaluation, comprising: calculating the term translation entropy of a translation language model for a target term, so as to quantify the translation uncertainty of the translation language model on the target term; when the translation language model is trained for term translation, fusing the term translation uncertainty reflected by the term translation entropy into the training, decoding and/or data enhancement process of the translation language model, so as to guide the translation language model to improve term translation precision; or, when a translation language model is selected for translation, using the term translation uncertainty reflected by the term translation entropy to select a target translation language model with strong term suitability in the field, and executing the translation task.
2. The language model accurate translation method based on term evaluation according to claim 1, wherein the process of calculating the term translation entropy of the translation language model for the target term comprises: obtaining a target domain term set and a domain corpus, and constructing a domain-adapted candidate token set; extracting from the domain corpus a plurality of sentences containing any target term in the target domain term set to form a target term sentence set; replacing the target term in each sentence with a candidate token from the preset candidate token set to generate a plurality of perturbed sentences, and translating the perturbed sentences through the translation language model to obtain a plurality of perturbed translations; collecting the candidate tokens whose perturbed translations match any reference translation to obtain a corresponding candidate token subset; and calculating the term translation entropy of the target term based on the frequency distribution of each candidate token across all candidate token subsets.
3. The language model accurate translation method based on term evaluation according to claim 2, wherein the candidate token set is a domain-adapted candidate set that collects the semantic neighbor words of any target term in the target term set, and the semantic neighbor words are obtained by selecting them from a domain term library and/or a domain synonym dictionary, and/or by computing them based on word embeddings.
4. The language model accurate translation method based on term evaluation according to claim 2, wherein the perturbed translations matching any reference translation are counted according to a term-level semantic matching standard, specifically by judging whether the translated content of the perturbed translation and of the reference translation at the target term position meets a preset semantic similarity threshold, belongs to a preset synonym mapping relation, or is identical.
5. The term evaluation-based language model accurate translation method according to claim 2, wherein the process of calculating the term translation entropy of the target term based on the occurrence frequency of each candidate token in all the candidate token subsets comprises: aggregating the candidate token subsets of all target sentences of any target term, counting the total number of occurrences of each candidate token $c_i$ in all candidate token subsets, and then calculating the probability of occurrence of each candidate token; and calculating the term translation entropy value from the probability distribution of the candidate tokens corresponding to the target term: $H(w) = -\sum_{i} P(c_i \mid w) \log P(c_i \mid w)$; wherein $H(w)$ denotes the term translation entropy of the target term $w$, and $P(c_i \mid w)$ denotes the probability that candidate token $c_i$ of the target term $w$ occurs in all of the candidate token subsets.
6. The language model accurate translation method based on term evaluation according to claim 1, wherein, in the fine-tuning stage of the translation language model, the target term set contained in the parallel sentence pair training samples used for fine-tuning is identified; or, for a translation language model trained through contrastive learning, positive and negative sample pairs are constructed using the term translation entropy during contrastive learning, wherein for a source sentence containing a high-translation-entropy term, the correct target translation is the positive example and a negative example is obtained by replacing the term with a candidate token, so that the contrastive loss makes the translation language model more sensitive to the correct translation of the term.
7. The term evaluation-based language model accurate translation method according to claim 1, wherein, in the decoding stage of a translation language model that adopts beam search, the candidate path scores of the beam search are dynamically adjusted using the term translation entropy to avoid selecting translations that may cause term uncertainty, and the specific process comprises: during decoding, when a token that may belong to a term is generated, calculating in real time, based on cached historical statistics, the term translation entropy value $H(w)$ of the term $w$ in the current context; and introducing a term translation entropy penalty into the scoring function of the beam search: $\mathrm{score}(y_{1:t}) = \mathrm{score}(y_{1:t-1}) + \log P(y_t \mid x, y_{1:t-1}) - \beta \cdot H(w)$; wherein $\mathrm{score}(y_{1:t})$ is the beam search score of the translation result $y_{1:t}$ for time steps 1 to t, $\mathrm{score}(y_{1:t-1})$ is the beam search score of the translation result $y_{1:t-1}$ for time steps 1 to t-1, $P(y_t \mid x, y_{1:t-1})$ is the probability of producing the t-th translation result $y_t$ given the source language text $x$ and the translation result $y_{1:t-1}$ for time steps 1 to t-1, $\beta \cdot H(w)$ is the term translation entropy penalty term, and β is the penalty coefficient; the higher the term translation entropy, the greater the penalty, thus suppressing paths that may lead to term uncertainty.
8. The term evaluation-based language model accurate translation method according to claim 1, wherein the term translation entropy is used to identify weak domains of the translation language model in term translation, and corresponding training data are collected or generated in a targeted manner for those weak domains to augment the training data used to train the translation language model.
9. The language model accurate translation method based on term evaluation according to claim 1, wherein the term translation entropy of a plurality of candidate translation language models is calculated on a term benchmark set of the translation field, the candidate translation language models are ranked according to a term translation evaluation index that includes the term translation entropy, and the translation language model with the lowest term translation entropy is selected as the target translation language model to translate in that field.
10. A language model accurate translation device based on term evaluation, comprising at least one processing unit connected to a storage unit through a bus unit, wherein the storage unit stores a computer program executable on a processor, and the processing unit implements the language model accurate translation method based on term evaluation according to any one of claims 1-9 by running the computer program stored in the storage unit.
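The entropy-penalized beam search of claim 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the score is taken to be the prefix score plus the log-probability of the new token minus β times the term translation entropy, the cached statistics are modeled as a plain dictionary `entropy_cache` keyed by token, tokens absent from the cache receive no penalty, and the names `penalized_score`, `expand_beam` and `next_token_probs` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[Tuple[str, ...], float]  # (tokens generated so far, score)


def penalized_score(prev_score: float, token_prob: float,
                    term_entropy: float, beta: float) -> float:
    """Prefix score + log-probability of the new token - beta * term entropy."""
    return prev_score + math.log(token_prob) - beta * term_entropy


def expand_beam(
    beam: List[Hypothesis],
    next_token_probs: Callable[[Tuple[str, ...]], Dict[str, float]],  # hypothetical model hook
    entropy_cache: Dict[str, float],  # assumed cache: token -> term translation entropy
    beta: float = 0.5,
    beam_width: int = 2,
) -> List[Hypothesis]:
    """Run one decoding step and keep the beam_width best-scoring hypotheses."""
    expanded: List[Hypothesis] = []
    for tokens, score in beam:
        for token, prob in next_token_probs(tokens).items():
            # Tokens that are not cached as terms get zero penalty (assumption).
            entropy = entropy_cache.get(token, 0.0)
            expanded.append(
                (tokens + (token,), penalized_score(score, prob, entropy, beta))
            )
    expanded.sort(key=lambda hyp: hyp[1], reverse=True)
    return expanded[:beam_width]
```

With `beta` set to 0 the step reduces to ordinary log-probability beam search; raising `beta` pushes high-entropy term renderings down the ranking, which is the suppression effect the claim describes.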

Description

Language model accurate translation method and device based on term evaluation

Technical Field

The invention relates to the technical field of translation, in particular to a language model accurate translation method and device based on term evaluation.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

With the acceleration of globalization, machine translation is increasingly widely applied across industries, especially in professional fields such as medicine, law, finance and patents, where the accuracy of term translation directly determines the quality and usability of the translation result.

When machine translation is performed with a large language model, term translation faces several problems. Some terms have different meanings in different fields, and the language model easily selects the wrong word sense, especially when the context is ambiguous; when domain terms occur infrequently in the model's translation training corpus, its sense selection is highly random. In a long document such as a technical manual, the same term must be translated the same way throughout, yet machine translation may render the same term as several different words within one paragraph because of slight differences in context. For new terms that emerge as a field develops rapidly, models often fabricate a rendering, producing erroneous translations.

To solve the term translation problem, one scheme is to set up a source-language and target-language term library and, at translation time, force the model to translate each specified source-language term into the target-language term defined in the term library.
However, this scheme requires maintaining term libraries across different fields and languages, which is costly; it requires matching during translation; and hard term replacement can produce errors and unnatural translations.

Another scheme is to fine-tune a large language model so that its parameters better fit the term alignment between the source and target languages in a specific field. Domain terms are then translated accurately, and the sentence patterns and tone of the translation better suit the professional conventions of the field. The model can understand the context in which terms appear and handle vocabulary variants and synonyms; it is more flexible than hard dictionary replacement and can cover expressions that are common in the field but not recorded in a term library. But fine-tuning requires secondary training of a general-purpose translation large language model on high-quality bilingual parallel corpora from the specific field (e.g., medical papers, legal contracts).

A third approach uses in-context learning and prompt engineering: when a translation request is entered, a small-scale glossary, translation specification or examples are supplied as a prompt, and the strong comprehension ability of the large language model is used to guide it to translate according to the specified terms. This method can switch term requirements at any time for different documents and clients without retraining the model, preserves the fluency and grammar of the whole sentence while ensuring term accuracy, and can correctly use forms of a term other than its base form according to the context.
However, large models are stochastic and may sometimes follow the glossary and sometimes not. Because the glossary is supplied to the model as a prompt with every translation request, an overly large glossary can push the content to be translated beyond the context window and cause information loss; in addition, the added prompt increases the translation cost.

Existing term-oriented translation optimization schemes for language models rely mainly on manual inspection and evaluation, and existing evaluation metrics such as BLEU and COMET mainly assess the overall similarity between the generated translation and the reference translation; they cannot quantify in depth the "certainty" or "confidence" of the model in translating a specific term. There is therefore a need for a language model accurate translation method based on term evaluation.

Disclosure of Invention

In order to solve the above technical problems, or at least partially solve them, the invention provides a language model accurate translation method and device based on term evaluation.

The invention provides a language model accurate translation method based on term evaluation, which comprises the following steps: calculating the term translation entropy of the translation language model for a target term, so as to quantify the translation uncertainty of the translation language model