CN-116245080-B - Method, device, equipment and medium for converting spoken language to written language based on reinforcement learning

Abstract

The invention provides a method, device, equipment and medium for converting spoken language to written language based on reinforcement learning. The method comprises: obtaining a spoken text; and inputting the spoken text into a conversion model to obtain a written text output by the conversion model. The conversion model is obtained through reinforcement learning in which the editing operation applied to each word segment of a sample spoken text serves as the action, and the reward is the degree of semantic consistency between the sample spoken text and the sample written text produced by executing the editing operations, and/or the degree of written style of the sample written text. The reinforcement learning process overcomes the limitation of insufficient annotated data, and the semantic consistency degree and the degree of written style provide high-level, interpretable rewards. Performing text conversion with the conversion model obtained in this way ensures the reliability and interpretability of the conversion from spoken text to written text.

Inventors

ZHAO YUNLONG
XU SHUANG
XU BO

Assignees

Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)

Dates

Publication Date: 20260508
Application Date: 20221212

Claims (7)

1. A method for converting spoken language to written language based on reinforcement learning, comprising: acquiring a spoken text; and inputting the spoken text into a conversion model to obtain a written text output by the conversion model; wherein the conversion model is obtained through reinforcement learning, taking the editing operation of each word segment in a sample spoken text as an action, and taking the degree of semantic consistency between a sample written text and the sample spoken text and/or the degree of written style of the sample written text as a reward; the step of obtaining the conversion model comprises: inputting the editing operation of the previous word segment in the sample spoken text and the semantic features of the current word segment in the sample spoken text into a policy model to obtain the editing operation of the current word segment output by the policy model, then taking the word segment following the current word segment as the new current word segment and repeating this step until the editing operation of each word segment in the sample spoken text is obtained; determining the sample written text based on the editing operations of the word segments in the sample spoken text; determining a first reward based on the editing operations of the word segments in the sample spoken text and the editing labels of the sample spoken text; determining the reward based on the first reward and on the degree of semantic consistency between the sample written text and the sample spoken text and/or the degree of written style of the sample written text; and performing reinforcement learning on the policy model based on the reward to obtain the conversion model; the step of obtaining the semantic features of each word segment in the sample spoken text comprises: inputting the sample spoken text into a language model to obtain the semantic features of each word segment in the sample spoken text output by the language model; wherein the language model is either obtained by fine-tuning through supervised classification training on the editing labels of the word segments in a preset spoken text, or is an unsupervised pre-trained language model.
2. The reinforcement learning-based spoken-to-written language conversion method of claim 1, wherein the step of obtaining the degree of semantic consistency comprises: inputting the sample spoken text and the sample written text into a consistency scoring model to obtain the degree of semantic consistency output by the consistency scoring model; wherein the consistency scoring model is trained on positive sample pairs and negative sample pairs, a positive sample pair comprising a preset spoken text and a preset written text that are semantically consistent, and a negative sample pair comprising the preset spoken text and a perturbed text obtained by perturbing the preset spoken text.
3. The reinforcement learning-based spoken-to-written language conversion method of claim 1, wherein the step of obtaining the degree of written style comprises: inputting the sample written text into a written-style scoring model to obtain the degree of written style output by the written-style scoring model; wherein the written-style scoring model is a classification model trained on preset spoken text and preset written text.
4. A reinforcement learning-based spoken-to-written language conversion device, comprising: an acquisition unit configured to acquire a spoken text; and a conversion unit configured to input the spoken text into a conversion model to obtain a written text output by the conversion model; wherein the conversion model is obtained through reinforcement learning, taking the editing operation of each word segment in a sample spoken text as an action, and taking the degree of semantic consistency between a sample written text and the sample spoken text and/or the degree of written style of the sample written text as a reward; the device further comprises: a prediction unit configured to input the editing operation of the previous word segment in the sample spoken text and the semantic features of the current word segment in the sample spoken text into a policy model to obtain the editing operation of the current word segment output by the policy model, then take the word segment following the current word segment as the new current word segment and repeat this step until the editing operation of each word segment in the sample spoken text is obtained; an editing unit configured to determine the sample written text based on the editing operations of the word segments in the sample spoken text; determine a first reward based on the editing operations of the word segments in the sample spoken text and the editing labels of the sample spoken text, and determine the reward based on the first reward and on the degree of semantic consistency between the sample written text and the sample spoken text and/or the degree of written style of the sample written text; and a reinforcement learning unit configured to perform reinforcement learning on the policy model based on the reward to obtain the conversion model; wherein the prediction unit is further configured to input the sample spoken text into a language model to obtain the semantic features of each word segment in the sample spoken text output by the language model; and the language model is either obtained by fine-tuning through supervised classification training on the editing labels of the word segments in a preset spoken text, or is an unsupervised pre-trained language model.
5. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the reinforcement learning-based spoken-to-written language conversion method of any one of claims 1 to 3.
6. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the reinforcement learning-based spoken-to-written language conversion method of any one of claims 1 to 3.
7. A computer program product comprising a computer program which, when executed by a processor, implements the reinforcement learning-based spoken-to-written language conversion method of any one of claims 1 to 3.
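
As an illustration of the training pairs described in claim 2, the following is a minimal sketch of how positive and negative pairs for a consistency scoring model might be assembled. The claims do not say how the spoken text is perturbed; the token drop and token swap used here, along with the function names `perturb` and `build_pairs`, are assumptions made only for this example.

```python
import random
from typing import List, Tuple

Pair = Tuple[List[str], List[str], int]  # (text_a, text_b, label)

def perturb(tokens: List[str], rng: random.Random) -> List[str]:
    """Return a perturbed copy: drop one random token, then swap two positions.

    This perturbation scheme is an assumption; the source only states that
    the spoken text is perturbed.
    """
    out = list(tokens)
    if len(out) > 1:
        del out[rng.randrange(len(out))]
    if len(out) > 1:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def build_pairs(spoken: List[str], written: List[str],
                rng: random.Random) -> List[Pair]:
    """Build one positive pair (label 1) and one negative pair (label 0)."""
    positive = (spoken, written, 1)               # semantically consistent
    negative = (spoken, perturb(spoken, rng), 0)  # spoken text vs. its perturbation
    return [positive, negative]

rng = random.Random(0)
spoken = ["um", "I", "wanna", "go", "home"]
written = ["I", "want", "to", "go", "home"]
pairs = build_pairs(spoken, written, rng)
```

A scoring model trained on such pairs would learn to score the positive pair higher than the negative one; the model itself is outside the scope of this sketch.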

Description

Method, device, equipment and medium for converting spoken language to written language based on reinforcement learning

Technical Field

The invention relates to the technical field of natural language processing, and in particular to a method, device, equipment and medium for converting spoken language to written language based on reinforcement learning.

Background

People use language differently when speaking than when writing. Syntactic and grammatical errors, unfavorable conditions during speaking, and noise carried in voice recordings all affect the accessibility and readability of spoken text obtained by speech recognition. Converting spoken text into written text is therefore important for reducing the difficulty of understanding text content. In research on converting spoken text to written text, insufficient annotated data and poor interpretability are currently major difficulties.

Disclosure of Invention

The invention provides a method, device, electronic equipment and storage medium for converting spoken language to written language based on reinforcement learning, to address the shortcomings of insufficient annotated data for text conversion and poor interpretability in the prior art.

The invention provides a method for converting spoken language to written language based on reinforcement learning, comprising: acquiring a spoken text; and inputting the spoken text into a conversion model to obtain a written text output by the conversion model; wherein the conversion model is obtained through reinforcement learning, taking the editing operation of each word segment in a sample spoken text as an action, and taking the degree of semantic consistency between the sample spoken text and the sample written text obtained by executing the editing operations, and/or the degree of written style of the sample written text, as a reward.
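
To make the idea of per-word editing operations concrete, here is a minimal sketch of applying one operation to each word segment to produce the written text. The source does not enumerate the editing operations; the `KEEP`, `DELETE` and `REPLACE` set and the `apply_edits` helper are assumptions introduced only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical edit operation set; the source does not list the operations.
KEEP, DELETE, REPLACE = "KEEP", "DELETE", "REPLACE"

Edit = Tuple[str, Optional[str]]  # (operation, replacement text or None)

def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply exactly one edit operation per token and return the edited tokens."""
    if len(tokens) != len(edits):
        raise ValueError("expected exactly one edit operation per token")
    out: List[str] = []
    for token, (op, arg) in zip(tokens, edits):
        if op == KEEP:
            out.append(token)
        elif op == DELETE:
            continue  # drop fillers, repetitions and similar spoken artifacts
        elif op == REPLACE:
            if arg is None:
                raise ValueError("REPLACE needs replacement text")
            out.append(arg)
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return out

spoken = ["um", "I", "wanna", "go", "go", "home"]
edits = [(DELETE, None), (KEEP, None), (REPLACE, "want to"),
         (KEEP, None), (DELETE, None), (KEEP, None)]
written = " ".join(apply_edits(spoken, edits))
print(written)  # I want to go home
```

Because every output word traces back to an explicit operation on an input word, the edit sequence itself documents why the written text differs from the spoken text.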
According to the method for converting spoken language to written language based on reinforcement learning provided by the invention, the step of obtaining the conversion model comprises: inputting the editing operation of the previous word segment in the sample spoken text and the semantic features of the current word segment in the sample spoken text into a policy model to obtain the editing operation of the current word segment output by the policy model, then taking the word segment following the current word segment as the new current word segment and repeating this step until the editing operation of each word segment in the sample spoken text is obtained; determining the sample written text based on the editing operations of the word segments in the sample spoken text; determining the reward based on the degree of semantic consistency between the sample written text and the sample spoken text and/or the degree of written style of the sample written text; and performing reinforcement learning on the policy model based on the reward to obtain the conversion model.

According to the method for converting spoken language to written language based on reinforcement learning provided by the invention, determining the reward based on the degree of semantic consistency between the sample written text and the sample spoken text and/or the degree of written style of the sample written text comprises: determining a first reward based on the editing operations of the word segments in the sample spoken text and the editing labels of the sample spoken text; and determining the reward based on the first reward and on the degree of semantic consistency between the sample written text and the sample spoken text and/or the degree of written style of the sample written text.
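
The sequential prediction and reward steps above can be sketched as follows. This is a rough illustration under several assumptions that the source does not confirm: a two-operation set, a single scalar feature per word segment standing in for the semantic features, a first reward equal to the fraction of operations matching the editing labels, a weighted sum to combine the reward terms, and a REINFORCE-style loss as the reinforcement learning objective. All function names are hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

OPS = ["KEEP", "DELETE"]  # hypothetical operation set, not taken from the source

Policy = Callable[[str, float], Sequence[float]]  # (previous op, feature) -> probabilities

def toy_policy(prev_op: str, feature: float) -> List[float]:
    """Stand-in policy: P(DELETE) grows with a made-up per-segment feature."""
    bump = 0.1 if prev_op == "DELETE" else 0.0
    p_delete = min(0.95, max(0.05, feature + bump))
    return [1.0 - p_delete, p_delete]

def rollout(features: Sequence[float], policy: Policy,
            rng: random.Random) -> Tuple[List[str], List[float]]:
    """Sample one operation per segment, feeding the previous operation back in."""
    prev_op = "START"
    ops: List[str] = []
    log_probs: List[float] = []
    for feature in features:
        probs = policy(prev_op, feature)
        idx = rng.choices(range(len(OPS)), weights=probs)[0]
        ops.append(OPS[idx])
        log_probs.append(math.log(probs[idx]))
        prev_op = OPS[idx]
    return ops, log_probs

def first_reward(ops: Sequence[str], edit_labels: Sequence[str]) -> float:
    """Assumed form: fraction of sampled operations that match the editing labels."""
    return sum(o == l for o, l in zip(ops, edit_labels)) / len(edit_labels)

def total_reward(r_first: float, consistency: float, written_style: float,
                 weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed combination: weighted sum of the three reward terms."""
    w1, w2, w3 = weights
    return w1 * r_first + w2 * consistency + w3 * written_style

def reinforce_loss(log_probs: Sequence[float], reward: float,
                   baseline: float = 0.0) -> float:
    """REINFORCE-style loss: minimizing it raises the probability of rewarded actions."""
    return -(reward - baseline) * sum(log_probs)

rng = random.Random(0)
features = [0.9, 0.1, 0.2, 0.8]  # one hypothetical feature per word segment
labels = ["DELETE", "KEEP", "KEEP", "DELETE"]
ops, log_probs = rollout(features, toy_policy, rng)
# The consistency and written-style scores would come from the scoring models;
# the values below are placeholders.
reward = total_reward(first_reward(ops, labels), consistency=0.8, written_style=0.6)
loss = reinforce_loss(log_probs, reward)
```

In a real system the policy would be a trainable network and the loss would be minimized by gradient descent; the sketch only shows how the previous operation, the per-segment features and the combined reward fit together.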
According to the method for converting spoken language to written language based on reinforcement learning provided by the invention, the step of obtaining the semantic features of each word segment in the sample spoken text comprises: inputting the sample spoken text into a language model to obtain the semantic features of each word segment in the sample spoken text output by the language model; wherein the language model is either obtained by fine-tuning through supervised classification training on the editing labels of the word segments of a spoken text, or is an unsupervised pre-trained language model.

According to the method for converting spoken language to written language based on reinforcement learning provided by the invention, the step of obtaining the degree of semantic consistency comprises: inputting the sample spoken text and the sample written text into a consistency scoring model to obtain the degree of semantic consistency output by the consistency scoring model; wherein the consistency scoring model is trained on positive sample pairs and negative sample pairs, wherein the positive sample pair comprises preset sp