CN-115359509-B - Model training method, natural language translation method, device, equipment and storage medium

Abstract

The present disclosure relates to a model training method, a natural language translation method, an apparatus, a device, and a storage medium. The method comprises obtaining a standard first sign language text and at least one nonstandard second sign language text corresponding to a natural language sample text, and outputting a predicted sign language text corresponding to the natural language sample text through a translation model to be trained. A first loss value of the translation model is calculated from the predicted sign language text and the first sign language text, and at least one second loss value of the translation model is calculated from the predicted sign language text and the at least one second sign language text. Because a smaller first loss value is better and a larger second loss value is better, the model parameters of the translation model can be adjusted accurately according to the first loss value and the at least one second loss value, so the trained translation model is more accurate. The trained translation model can therefore produce accurate sign language text when translating a natural language text to be translated.
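
The training signal described in the abstract can be written compactly as follows. This is a sketch only: the abstract does not state how the loss values are combined, so the weighted difference and the weight $\lambda$ are assumptions, as is all of the notation.

```latex
% Assumed notation: \hat{y} is the predicted sign language text, y^{+} the first
% (standard) sign language text, y^{-}_{1}, \dots, y^{-}_{K} the second
% (nonstandard) sign language texts, and \ell a sequence-level loss.
\mathcal{L}_{1} = \ell\big(\hat{y},\, y^{+}\big), \qquad
\mathcal{L}_{2}^{(k)} = \ell\big(\hat{y},\, y^{-}_{k}\big), \quad k = 1, \dots, K

% Hypothetical combined objective: minimizing it pushes \mathcal{L}_{1} down
% and each \mathcal{L}_{2}^{(k)} up.
\mathcal{L} = \mathcal{L}_{1} - \lambda \sum_{k=1}^{K} \mathcal{L}_{2}^{(k)}, \qquad \lambda > 0
```

An unbounded subtracted term can dominate the objective, so a cap or margin on each second loss is a common safeguard; the source does not say whether one is used.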

Inventors

  • ZHANG JIASHUO
  • ZU XINXING
  • ZHAO ZHONGZHOU
  • LI JIAHUI
  • WANG QI
  • WU SHUMING
  • HAN YUJIE
  • LIN MIAO

Assignees

  • Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)

Dates

Publication Date
20260508
Application Date
20220722

Claims (14)

  1. A model training method, wherein the method comprises: acquiring a first sign language text and at least one second sign language text corresponding to a natural language sample text, wherein the accuracy of the second sign language text is lower than that of the first sign language text; inputting the natural language sample text into a translation model to be trained, and outputting a predicted sign language text corresponding to the natural language sample text through the translation model; calculating a first loss value of the translation model according to the predicted sign language text and the first sign language text; calculating at least one second loss value of the translation model according to the predicted sign language text and the at least one second sign language text; and adjusting model parameters of the translation model by controlling the first loss value to decrease while controlling each second loss value to increase, so as to train the translation model.
  2. The method of claim 1, wherein the first sign language text comprises a plurality of sign language words; and the second sign language text is obtained by adjusting the order of at least some of the plurality of sign language words, or the second sign language text is obtained by replacing at least some of the plurality of sign language words, or the second sign language text is a historical sign language text output by the translation model in a historical training process.
  3. A natural language translation method, wherein the method comprises: acquiring a target natural language text to be translated; and inputting the target natural language text into a pre-trained translation model, and outputting a target sign language text corresponding to the target natural language text through the translation model, wherein the translation model is obtained according to the model training method of claim 1 or 2.
  4. The method of claim 3, wherein inputting the target natural language text into a pre-trained translation model and outputting a target sign language text corresponding to the target natural language text through the translation model comprises: inputting the target natural language text into the pre-trained translation model, and outputting, through the translation model, the target sign language text corresponding to the target natural language text and a target action identifier corresponding to a polysemous word in the target sign language text.
  5. The method of claim 3, wherein after outputting, through the translation model, the target sign language text corresponding to the target natural language text, the method further comprises: correcting a target word in the target sign language text that does not belong to the sign language vocabulary into one sign language word or a combination of at least two sign language words; and selecting a target action identifier from a plurality of action identifiers corresponding to a polysemous word in the target sign language text according to the context of the polysemous word.
  6. The method of claim 5, wherein after correcting the target word in the target sign language text that does not belong to the sign language vocabulary into one sign language word or a combination of at least two sign language words, the method further comprises: if the one sign language word or the combination of at least two sign language words comprises a polysemous word, selecting a target action identifier from a plurality of action identifiers corresponding to the polysemous word.
  7. The method of claim 3, wherein inputting the target natural language text into a pre-trained translation model comprises: determining whether a preset natural language text matching the target natural language text exists in a correspondence between preset natural language texts and preset sign language texts; and if no preset natural language text matching the target natural language text exists in the correspondence, inputting the target natural language text into the pre-trained translation model.
  8. The method of claim 7, wherein, if no preset natural language text matching the target natural language text exists in the correspondence, inputting the target natural language text into the pre-trained translation model comprises: if no preset natural language text matching the target natural language text exists in the correspondence, determining whether the intention of the target natural language text is a preset intention; and if the intention of the target natural language text is not the preset intention, inputting the target natural language text into the pre-trained translation model.
  9. The method of claim 7, wherein the method further comprises: if a preset natural language text matching the target natural language text exists in the correspondence, taking the preset sign language text corresponding to the matched preset natural language text as the target sign language text corresponding to the target natural language text.
  10. The method of claim 8, wherein the method further comprises: if the intention of the target natural language text is the preset intention, extracting keywords from the target natural language text according to a preset slot template corresponding to the preset intention; and generating the target sign language text corresponding to the target natural language text according to the keywords.
  11. A model training apparatus, comprising: an acquisition module, configured to acquire a first sign language text and at least one second sign language text corresponding to a natural language sample text, wherein the accuracy of the second sign language text is lower than that of the first sign language text; an input module, configured to input the natural language sample text into a translation model to be trained, and to output a predicted sign language text corresponding to the natural language sample text through the translation model; a calculation module, configured to calculate a first loss value of the translation model according to the predicted sign language text and the first sign language text, and to calculate at least one second loss value of the translation model according to the predicted sign language text and the at least one second sign language text; and a training module, configured to adjust model parameters of the translation model by controlling the first loss value to decrease while controlling each second loss value to increase, so as to train the translation model.
  12. A natural language translation device, comprising: an acquisition module, configured to acquire a target natural language text to be translated; and an input module, configured to input the target natural language text into a pre-trained translation model and to output a target sign language text corresponding to the target natural language text through the translation model, wherein the translation model is obtained according to the model training method of claim 1 or 2.
  13. An electronic device, comprising: a memory; a processor; and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-10.
  14. A computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-10.
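
The routing order set out in claims 7 to 10 (preset correspondence first, then preset intention with a slot template, then the translation model) can be sketched in Python as below. Every name here is hypothetical: `PRESET_PAIRS`, `SLOT_TEMPLATES`, `detect_preset_intent`, `fill_slots` and `translate` are illustrative stand-ins, the keyword-based intent check is a toy, and the claims do not specify how matching or intent recognition is implemented.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical correspondence between preset natural language texts and
# preset sign language texts (claims 7 and 9).
PRESET_PAIRS: Dict[str, str] = {"hello": "HELLO"}

# Hypothetical slot templates (claim 10): for each preset intention, the
# keywords to extract, listed in the order they should be emitted.
SLOT_TEMPLATES: Dict[str, List[str]] = {
    "weather": ["today", "tomorrow", "weather"],
}


def detect_preset_intent(text: str) -> Optional[str]:
    """Toy intent check (assumption): only recognizes weather queries."""
    return "weather" if "weather" in text.lower() else None


def fill_slots(text: str, intent: str) -> str:
    """Extract the template keywords present in the text, in template order."""
    words = text.lower().replace("?", "").split()
    keywords = [k for k in SLOT_TEMPLATES[intent] if k in words]
    return " ".join(k.upper() for k in keywords)


def translate(text: str, model: Callable[[str], str]) -> str:
    """Route a target natural language text through the three stages."""
    key = text.strip().lower()
    # Claim 9: a matching preset text exists, so use its preset sign text.
    if key in PRESET_PAIRS:
        return PRESET_PAIRS[key]
    # Claims 8 and 10: no match, so check for a preset intention.
    intent = detect_preset_intent(text)
    if intent is not None:
        return fill_slots(text, intent)
    # Claims 7 and 8: no match and no preset intention, so use the model.
    return model(text)
```

Only texts that miss both the preset correspondence and the preset intentions reach the model, which keeps frequent or templated sentences deterministic.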

Description

Model training method, natural language translation method, device, equipment and storage medium

Technical Field

The disclosure relates to the field of information technology, and in particular to a model training method, a natural language translation method, a device, equipment and a storage medium.

Background

In the hearing world, information is usually carried by natural language, but for hearing-impaired people sign language is their first language, so natural language needs to be translated into sign language. However, the inventors of the present application found that, because sign language is an independent language whose vocabulary and grammar differ greatly from those of natural language, how to accurately translate natural language into sign language is a problem that currently needs to be solved.

Disclosure of Invention

In order to solve the above technical problem, or at least partially solve it, the present disclosure provides a model training method, a natural language translation method, an apparatus, a device, and a storage medium, so that a trained translation model can produce accurate sign language text when translating a natural language text to be translated.
In a first aspect, an embodiment of the present disclosure provides a model training method, including: acquiring a first sign language text and at least one second sign language text corresponding to a natural language sample text, wherein the accuracy of the second sign language text is lower than that of the first sign language text; inputting the natural language sample text into a translation model to be trained, and outputting a predicted sign language text corresponding to the natural language sample text through the translation model; calculating a first loss value of the translation model according to the predicted sign language text and the first sign language text; calculating at least one second loss value of the translation model according to the predicted sign language text and the at least one second sign language text; and training the translation model according to the first loss value and the at least one second loss value.

In a second aspect, an embodiment of the present disclosure provides a natural language translation method, including: acquiring a target natural language text to be translated; and inputting the target natural language text into a pre-trained translation model, and outputting a target sign language text corresponding to the target natural language text through the translation model, wherein the translation model is obtained according to the model training method described above.
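
As a minimal sketch of the loss calculation in the first aspect, the Python below scores a prediction against the first sign language text and against each second sign language text, then combines the results so that a lower value means "closer to the first text, further from the second texts". The source does not name a loss function or a combination rule, so the mean negative log-likelihood, the `weight` and `cap` parameters, and the function names `sequence_loss` and `combined_loss` are all assumptions.

```python
import math
from typing import Dict, List

# Assumed representation: the model emits one token distribution per
# output position, mapping sign language words to probabilities.
Distribution = Dict[str, float]


def sequence_loss(predicted: List[Distribution], reference: List[str]) -> float:
    """Mean negative log-likelihood of `reference` under `predicted`.

    Positions beyond the shorter of the two sequences are ignored.
    """
    eps = 1e-9  # avoids log(0) for words the model gave no probability
    n = min(len(predicted), len(reference))
    if n == 0:
        return 0.0
    total = 0.0
    for dist, word in zip(predicted, reference):
        total += -math.log(dist.get(word, 0.0) + eps)
    return total / n


def combined_loss(
    predicted: List[Distribution],
    first_text: List[str],
    second_texts: List[List[str]],
    weight: float = 0.1,  # hypothetical weight on the second loss values
    cap: float = 5.0,     # hypothetical cap so the subtracted term stays bounded
) -> float:
    """First loss minus a weighted, capped sum of the second losses."""
    first_loss = sequence_loss(predicted, first_text)
    second_losses = [sequence_loss(predicted, t) for t in second_texts]
    # Minimizing the result lowers the first loss and raises each second loss.
    return first_loss - weight * sum(min(loss, cap) for loss in second_losses)
```

In a real training loop the same combination would be computed on differentiable tensors and minimized by gradient descent; the plain-Python version only illustrates the direction of each term.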
In a third aspect, an embodiment of the present disclosure provides a model training apparatus, including: an acquisition module, configured to acquire a first sign language text and at least one second sign language text corresponding to a natural language sample text, wherein the accuracy of the second sign language text is lower than that of the first sign language text; an input module, configured to input the natural language sample text into a translation model to be trained, and to output a predicted sign language text corresponding to the natural language sample text through the translation model; a calculation module, configured to calculate a first loss value of the translation model according to the predicted sign language text and the first sign language text, and to calculate at least one second loss value of the translation model according to the predicted sign language text and the at least one second sign language text; and a training module, configured to train the translation model according to the first loss value and the at least one second loss value.

In a fourth aspect, an embodiment of the present disclosure provides a natural language translation device, including: an acquisition module, configured to acquire a target natural language text to be translated; and an input module, configured to input the target natural language text into a pre-trained translation model and to output a target sign language text corresponding to the target natural language text through the translation model, wherein the translation model is obtained according to the model training method described above.

In a fifth aspect, embodiments of the present disclosure provide an electronic device, including: a memory; a processor; and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method according to the first or second aspect.
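
The acquisition module of the third aspect needs second sign language texts that are less accurate than the first. The claims name three sources for them (reordering words, replacing words, and reusing historical model outputs), and the Python below sketches one way to produce each. The function names, the `substitutes` table, and the rule of discarding candidates identical to the first text are assumptions for illustration, not details taken from the source.

```python
import random
from typing import Dict, List


def reorder_negative(first_text: List[str], rng: random.Random) -> List[str]:
    """Swap word positions until the order differs from the first text."""
    if len(set(first_text)) < 2:
        return list(first_text)  # no distinct words to reorder
    words = list(first_text)
    while words == first_text:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def replace_negative(first_text: List[str], substitutes: Dict[str, str]) -> List[str]:
    """Replace each word that has an entry in the hypothetical `substitutes` table."""
    return [substitutes.get(word, word) for word in first_text]


def build_second_texts(
    first_text: List[str],
    substitutes: Dict[str, str],
    history: List[List[str]],
    seed: int = 0,
) -> List[List[str]]:
    """Collect reordered, replaced, and historical texts as second texts."""
    rng = random.Random(seed)
    candidates = [
        reorder_negative(first_text, rng),
        replace_negative(first_text, substitutes),
    ]
    candidates.extend(history)  # earlier outputs of the model being trained
    second_texts: List[List[str]] = []
    for candidate in candidates:
        # Assumption: drop duplicates and anything equal to the first text.
        if candidate != first_text and candidate not in second_texts:
            second_texts.append(list(candidate))
    return second_texts
```

For example, with a first text of `["I", "TEA", "LIKE"]` and a substitute table mapping `"TEA"` to `"COFFEE"`, the replaced candidate is `["I", "COFFEE", "LIKE"]`, while the reordered candidate keeps the same words in a different order.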
In a sixth aspect, embodiments of the present disclosure provide a computer readable storage medium having stored thereon a computer program for execution by a processor to implement the method of the first or second aspect. The embodiment of the disclosure provides a model training method, a natural language transl