CN-116150351-B - Training method of text classification model, text classification processing method and device

Abstract

The embodiment of the application discloses a training method of a text classification model, a text classification processing method and a device. The training method comprises: converting N-way voice samples into N text samples, wherein each text sample comprises at least M sentences, and M and N are integers larger than 1; selecting at least one sentence from each text sample as a noise sample to form a noise sample set; constructing a positive sample data set by using at least one noise sample in the noise sample set and a plurality of continuous sentences in at least one text sample; constructing a negative sample data set by using the plurality of continuous sentences in the at least one text sample; and iteratively training the text classification model to be trained by using the positive sample data set and the negative sample data set. The embodiment of the application can solve the problem of incoherent semantics.

Inventors

LI CHANGLIN
XIAO BING
CAO LEI
Luo Qishuai

Assignees

马上消费金融股份有限公司

Dates

Publication Date: 20260512
Application Date: 20220711

Claims (10)

1. A method for training a text classification model, the method comprising: Converting N-way voice samples into N text samples, wherein each text sample comprises at least M sentences, and M and N are integers greater than 1; Selecting at least one sentence from each text sample as a noise sample to form a noise sample set; Constructing a positive sample data set by utilizing at least one noise sample in the noise sample set and a plurality of continuous sentences in at least one text sample, and constructing a negative sample data set by utilizing the plurality of continuous sentences in the at least one text sample, wherein the negative samples in the negative sample data set are formed by splicing the plurality of continuous sentences end to end, and the positive samples in the positive sample data set are formed by splicing the plurality of continuous sentences end to end with the noise samples in the noise sample set; and carrying out iterative training on the text classification model to be trained by utilizing the positive sample data set and the negative sample data set.
2. The method of training a text classification model of claim 1, wherein constructing a positive sample dataset using at least one noise sample in the noise sample set and a plurality of consecutive sentences in the at least one text sample comprises: Constructing n1 positive samples, respectively performing first labeling on the n1 positive samples to obtain n1 positive samples with first labeling, wherein the n1 positive samples with first labeling form the positive sample data set; The specific implementation mode of constructing each positive sample comprises the steps of randomly selecting continuous m sentences from at least M sentences in any one text sample in at least one text sample, randomly selecting at least one noise sample from the noise sample set, splicing the continuous m sentences end to end, and then splicing the continuous m sentences end to end with the at least one noise sample to obtain the positive sample, wherein the continuous m sentences are in front, the at least one noise sample is behind, and m is smaller than M.
3. The method of claim 1, wherein constructing a negative sample dataset from a plurality of sentences in the at least one text sample comprises: Constructing n2 negative samples, respectively performing second labeling on the n2 negative samples to obtain n2 negative samples with second labeling, wherein the n2 negative samples with second labeling are constructed into the negative sample data set; The specific implementation mode of constructing each negative sample comprises the steps of randomly selecting continuous m+1 sentences from at least M sentences in any one text sample in at least one text sample, and splicing the continuous m+1 sentences end to end to obtain the negative sample, wherein m+1 is smaller than M.
4. The method of training a text classification model of claim 1, wherein constructing a positive sample dataset using at least one noise sample in the set of noise samples and a plurality of sentences in the at least one text sample, and constructing a negative sample dataset using a plurality of sentences in the at least one text sample comprises: Selecting a continuous plurality of sentences from at least M sentences in the at least one text sample, and constructing a positive sample dataset using at least one noise sample in the noise sample set and the continuous plurality of sentences; selecting a continuous plurality of sentences from at least M sentences in the at least one text sample, and constructing a negative sample data set using the continuous plurality of sentences; Wherein the number of times of selecting a plurality of consecutive sentences from at least M sentences in each of the text samples is determined based on the number of sentences in each of the text samples; under the condition that the number of sentences of the text sample is larger than the average value of the number of sentences of N text samples, the selection times take a first numerical value; and under the condition that the number of sentences of the text sample is not more than the average value of the number of sentences of the N text samples, the selection times take a second numerical value, and the first numerical value is larger than the second numerical value.
5. A text classification processing method, comprising: Acquiring voice data to be recognized; Converting the voice data into text data, wherein the text data comprises at least M sentences, and M is an integer greater than 1; Inputting a plurality of continuous sentences in the M sentences in the text data into a text classification model for classification processing to obtain a classification result output by the text classification model, wherein the plurality of continuous sentences comprise sentences to be recognized, the classification processing is used for classifying the sentences to be recognized in the text data, the text classification model is trained by utilizing a positive sample data set and a negative sample data set, negative samples in the negative sample data set are formed by splicing a plurality of continuous sentences in at least one text sample end to end, and positive samples in the positive sample data set are formed by splicing a plurality of continuous sentences in the at least one text sample end to end and noise samples in the noise sample set; and determining the category of the sentence to be recognized in the text data according to the classification result, wherein the category comprises a noise category or a non-noise category.
6. The text classification processing method according to claim 5, wherein said inputting successive ones of said M sentences in said text data into a text classification model for classification processing comprises: determining m consecutive sentences before the sentence to be recognized in the M sentences, wherein m is smaller than M; Splicing the continuous m sentences and the sentence to be recognized end to end to obtain a spliced sentence, wherein the continuous m sentences are positioned in front, and the sentence to be recognized is positioned behind; and inputting the spliced sentence into the text classification model for classification processing to obtain the classification result output by the text classification model.
7. A training device for a text classification model, comprising: The conversion module is used for converting the N-way voice samples into N text samples, each text sample comprises at least M sentences, and M and N are integers larger than 1; The selecting module is used for selecting at least one sentence from each text sample as a noise sample to form a noise sample set; A construction module, configured to construct a positive sample data set using at least one noise sample in the noise sample set and a plurality of consecutive sentences in at least one text sample, and construct a negative sample data set using a plurality of consecutive sentences in the at least one text sample, wherein negative samples in the negative sample data set are formed by splicing the plurality of consecutive sentences end to end, and positive samples in the positive sample data set are formed by splicing the plurality of consecutive sentences end to end with the noise samples in the noise sample set; And the training module is used for carrying out iterative training on the text classification model to be trained by utilizing the positive sample data set and the negative sample data set.
8. A text classification processing apparatus, comprising: The acquisition module is used for acquiring voice data to be identified; The conversion module is used for converting the voice data into text data, wherein the text data comprises at least M sentences, and M is an integer greater than 1; The processing module is used for inputting sentences to be identified in the M sentences in the text data into a text classification model for classification processing to obtain classification results output by the text classification model, wherein the classification processing is used for classifying the sentences to be identified in the text data, the text classification model is trained by utilizing a positive sample data set and a negative sample data set, negative samples in the negative sample data set are formed by splicing a plurality of continuous sentences in at least one text sample end to end, and positive samples in the positive sample data set are formed by splicing a plurality of continuous sentences in the at least one text sample end to end and noise samples in the noise sample set; and the determining module is used for determining the category of the sentence to be recognized in the text data according to the classification result, wherein the category comprises a noise category or a non-noise category.
9. An electronic device, comprising: A processor; a memory for storing the processor-executable instructions; Wherein the processor is configured to execute the instructions to implement the training method of the text classification model of any of claims 1 to 4 or the text classification processing method of claim 5 or 6.
10. A computer readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the training method of a text classification model according to any one of claims 1 to 4 or the text classification processing method of claim 5 or 6.
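
The selection-count rule recited in claim 4 can be sketched as follows. This is an illustrative sketch only: the function name `selection_counts` and the default values 3 and 1 are assumptions, since the claim requires only that the first numerical value be larger than the second.

```python
from statistics import mean

def selection_counts(text_samples, first_value=3, second_value=1):
    """Return, for each text sample, how many times consecutive sentences are drawn.

    first_value and second_value are hypothetical; the claim only requires
    first_value > second_value.
    """
    if first_value <= second_value:
        raise ValueError("first_value must be larger than second_value")
    # Average number of sentences over the N text samples.
    average = mean(len(sentences) for sentences in text_samples)
    # Longer-than-average samples are drawn from more often.
    return [
        first_value if len(sentences) > average else second_value
        for sentences in text_samples
    ]

# Three text samples with 5, 3 and 4 sentences; the average is 4.
samples = [["s"] * 5, ["s"] * 3, ["s"] * 4]
counts = selection_counts(samples)
```

Under this rule a text sample with more sentences than average contributes more training samples than a shorter one.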

Description

Training method of text classification model, text classification processing method and device

Technical Field

The present application relates to the field of natural language processing technologies, and in particular to a training method for a text classification model, a text classification processing method and a device.

Background

Automatic Speech Recognition (ASR) is a technology that converts human speech into text. Speech recognition is an interdisciplinary field closely tied to acoustics, phonetics, linguistics and digital signal processing theory. During voice quality inspection, noise is often mixed into the call data recorded when an agent robot talks with a customer. When ASR converts the recording into text, the noise is transcribed as well, which disturbs the transcript of the call and makes its semantics incoherent.

Disclosure of Invention

The application provides a training method of a text classification model, a text classification processing method and a text classification processing device, which are used for solving the problem of incoherent semantics.
In a first aspect, the application provides a training method of a text classification model. The method comprises: converting N-way voice samples into N text samples, wherein each text sample comprises at least M sentences, and M and N are integers larger than 1; selecting at least one sentence from each text sample as a noise sample to form a noise sample set; constructing a positive sample data set by using at least one noise sample in the noise sample set and a plurality of continuous sentences in at least one text sample; constructing a negative sample data set by using a plurality of continuous sentences in the at least one text sample; and inputting the training samples in the positive sample data set and the training samples in the negative sample data set into the text classification model to be trained for iterative training.

It can be seen that the embodiment of the present application trains the text classification model at the text level. Each negative sample in the negative sample data set consists of a plurality of consecutive sentences and is therefore semantically coherent, whereas each positive sample in the positive sample data set consists of a plurality of consecutive sentences plus noise and is therefore not semantically coherent.
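
The sample construction described above can be sketched as follows, using the m and m+1 sentence counts from claims 2 and 3. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the space separator, the numeric labels 1 and 0, and drawing exactly one random noise sentence per text sample are all illustrative choices.

```python
import random

def build_noise_set(text_samples, rng):
    """Pick one sentence from each text sample as a noise sample (assumed: at random)."""
    return [rng.choice(sentences) for sentences in text_samples]

def pick_consecutive(sentences, count, rng):
    """Return `count` consecutive sentences starting at a random offset."""
    start = rng.randrange(len(sentences) - count + 1)
    return sentences[start:start + count]

def build_positive(sentences, noise_set, m, rng, sep=" "):
    """m consecutive sentences in front, one noise sample behind; label 1 (assumed)."""
    context = pick_consecutive(sentences, m, rng)
    noise = rng.choice(noise_set)
    return sep.join(context + [noise]), 1

def build_negative(sentences, m, rng, sep=" "):
    """m+1 consecutive sentences spliced end to end; label 0 (assumed)."""
    return sep.join(pick_consecutive(sentences, m + 1, rng)), 0

# Two toy text samples with M = 4 sentences each; m = 2, so m + 1 < M.
text_samples = [
    ["hello", "i am calling about my bill", "can you confirm your name", "yes it is li"],
    ["good morning", "your order has shipped", "it arrives on friday", "thank you"],
]
rng = random.Random(0)
noise_set = build_noise_set(text_samples, rng)
positive = build_positive(text_samples[0], noise_set, m=2, rng=rng)
negative = build_negative(text_samples[0], m=2, rng=rng)
```

Both kinds of sample contain m+1 sentences, so the classifier cannot separate them by length and has to rely on whether the last sentence fits the preceding context.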
During training, the text classification model learns from the positive sample data set what text containing noise looks like, and learns from the negative sample data set what coherent text without noise looks like. The trained text classification model can therefore identify noise data according to semantic coherence, which solves the problem of incoherent semantics in the transcribed call text.

In a second aspect, the application provides a text classification processing method. The method comprises: obtaining voice data to be recognized; converting the voice data into text data, wherein the text data comprises at least M sentences and M is an integer larger than 1; inputting a sentence to be recognized among the M sentences in the text data into a text classification model for classification processing to obtain a classification result output by the text classification model, wherein the classification processing is used for classifying the sentence to be recognized in the text data; and determining the category of the sentence to be recognized in the text data according to the classification result, wherein the category comprises a noise category or a non-noise category.

It can be seen that, when the embodiment of the application uses the text classification model to identify noise data, it relies on the semantic coherence of the text data to identify, at the text level, the noise data that makes the semantics incoherent. The identified noise data can then be removed in subsequent processing, which avoids the incoherent semantics caused by noise data disturbing the transcript of the call.
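
The classification step can be sketched as follows, splicing the m preceding sentences in front of the sentence to be recognized as recited in claim 6. This is a sketch under stated assumptions: `build_model_input`, `remove_noise` and `toy_classifier` are hypothetical names, the toy classifier merely stands in for a trained text classification model, and keeping the first m sentences unclassified is an assumption because they lack m preceding sentences.

```python
def build_model_input(sentences, index, m, sep=" "):
    """Splice the m sentences before `index` in front of the sentence at `index`."""
    if index < m:
        raise ValueError("fewer than m sentences precede the sentence to be recognized")
    return sep.join(sentences[index - m:index] + [sentences[index]])

def remove_noise(sentences, m, classify):
    """Drop every sentence that `classify` assigns to the noise category."""
    kept = list(sentences[:m])  # assumption: too little context, so kept as-is
    for index in range(m, len(sentences)):
        if classify(build_model_input(sentences, index, m)) != "noise":
            kept.append(sentences[index])
    return kept

def toy_classifier(spliced_text):
    """Stand-in for the trained model: flags one hard-coded noise sentence."""
    return "noise" if spliced_text.endswith("buy one get one free") else "non-noise"

transcript = [
    "hello",
    "i am calling about my bill",
    "buy one get one free",
    "can you confirm your name",
]
cleaned = remove_noise(transcript, m=2, classify=toy_classifier)
```

Here the context is always taken from the original transcript rather than from the cleaned output, which matches the wording of claim 6; taking it from the already cleaned sentences would be a different design.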
In a third aspect, the application provides a training device of a text classification model, which comprises a conversion module, a selection module, a construction module and a training module, wherein the conversion module is used for converting N-way voice samples into N text samples, each text sample comprises at least M sentences, M and N are integers larger than 1, the selection module is used for selecting at least one sentence from each text sample as a noise