CN-117370542-B - Model training method, device and storage medium
Abstract
The application provides a model training method, a model training device and a storage medium, relates to the field of artificial intelligence, and is used to solve the problem in the prior art that data classification results generated by a model are inaccurate. The method comprises: obtaining target data, wherein the target data comprise historical data; inputting the historical data into a first model to generate first summary texts corresponding to the historical data; determining a preset number of second summary texts from the first summary texts; and training a second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain a summary model, wherein the summary model is used to determine the summary text of data, and the data generating capacity of the first model is greater than that of the second model.
Inventors
- ZHOU XUYAN
- ZHANG AIBIN
Assignees
- 中国联合网络通信集团有限公司 (China United Network Communications Group Co., Ltd.)
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-09-12
Claims (9)
- 1. A model training method, comprising: obtaining target data, wherein the target data comprises historical data and training data; inputting the historical data into a first model to generate first summary texts corresponding to the historical data; determining a preset number of second summary texts from the first summary texts; training a second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain a summary model, wherein the summary model is used to determine the summary text of data, and the data generating capacity of the first model is greater than that of the second model; inputting the training data into the summary model and determining third summary texts of the training data; clustering the third summary texts based on the similarity between the third summary texts and determining a plurality of first clusters; determining a first label of each first cluster; and determining a label of each third summary text based on the first label of each first cluster and the third summary texts in each first cluster (see the sketch following the claims).
- 2. The method of claim 1, wherein determining the label of each third summary text based on the first label of each first cluster and the third summary texts in each first cluster comprises: receiving first indication information, wherein the first indication information is used to adjust the first clusters and the labels of the first clusters; adjusting the first clusters and the labels of the first clusters in response to the first indication information, and determining second clusters and second labels of the second clusters; and generating a mapping relation table based on the third summary texts in each second cluster and the second label of each second cluster, wherein the mapping relation table is used to represent the mapping relation between each third summary text and the second labels.
- 3. The method of claim 2, wherein the target data further comprises real-time data, and wherein the method further comprises, after generating the mapping relation table: inputting the real-time data into the summary model and determining a fourth summary text of the real-time data; determining the second label corresponding to the real-time data based on the fourth summary text of the real-time data and the mapping relation table; and determining a data classification result corresponding to the real-time data based on the second label.
- 4. The method of claim 1, wherein inputting the historical data into the first model to generate the first summary texts corresponding to the historical data comprises: determining, based on the historical data, a prompt template corresponding to the historical data, wherein the prompt template is a format template of the first summary texts; and inputting the historical data and the prompt template into the first model to generate the first summary texts.
- 5. The method according to any one of claims 1-4, wherein training the second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain the summary model further comprises: determining a training set, a verification set and a test set based on the second summary texts and the historical data corresponding to the second summary texts; step 1, training a current model based on the second summary texts in the training set and the historical data corresponding to the second summary texts in the training set to obtain a training summary model, wherein the current model is the second model or a model determined in the previous iteration of training; step 2, verifying the training summary model based on the second summary texts in the verification set and the historical data corresponding to the second summary texts in the verification set, and determining whether a loss function value between the fourth summary texts generated by the training summary model and the second summary texts meets a convergence condition; step 3, if not, taking the training summary model as the current model and iteratively executing steps 1 to 3 until the loss function value meets the convergence condition; step 4, if yes, testing the training summary model based on the second summary texts in the test set and the historical data corresponding to the second summary texts in the test set, and determining whether the similarity ratio between the fifth summary texts generated by the training summary model and the second summary texts is greater than or equal to a preset value; step 5, if the similarity ratio is smaller than the preset value, taking the training summary model as the current model and iteratively executing steps 1 to 5 until the similarity ratio is greater than or equal to the preset value; and step 6, if the similarity ratio is greater than or equal to the preset value, determining that the training summary model is the summary model.
- 6. The method as recited in claim 1, further comprising: acquiring the text length of each first summary text; determining the first summary texts whose text length is greater than or equal to a preset length as sixth summary texts, wherein a sixth summary text is a valid summary text whose text length meets the preset length; and determining the preset number of second summary texts from the sixth summary texts.
- 7. A model training device, comprising an acquisition unit, a generation unit, a determination unit and a training unit, wherein: the acquisition unit is configured to obtain target data, wherein the target data comprises historical data and training data; the generation unit is configured to input the historical data into a first model and generate first summary texts corresponding to the historical data; the determination unit is configured to determine a preset number of second summary texts from the first summary texts; the training unit is configured to train a second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain a summary model, wherein the summary model is used to determine the summary text of data; the determination unit is further configured to input the training data into the summary model and determine third summary texts of the training data; the determination unit is further configured to cluster the third summary texts based on the similarity between the third summary texts and determine a plurality of first clusters; the determination unit is further configured to determine a first label of each first cluster; and the determination unit is further configured to determine a label of each third summary text based on the first label of each first cluster and the third summary texts in each first cluster.
- 8. A model training device comprising a memory and a processor, wherein the memory is configured to store computer-executable instructions and the processor is coupled to the memory via a bus; when the model training device is in operation, the processor executes the computer-executable instructions stored in the memory to cause the model training device to perform the model training method of any one of claims 1-6.
- 9. A computer-readable storage medium comprising computer-executable instructions which, when run on a computer, cause the computer to perform the model training method of any one of claims 1-6.
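A minimal Python sketch of the summary-generation and selection steps recited in claims 1, 4 and 6 follows. The patent names no concrete models or threshold values, so `first_model_summarize`, the prompt template, and the `MIN_LENGTH` and `PRESET_NUMBER` constants are placeholders assumed for illustration.

```python
# Sketch of the generation and selection steps of claims 1, 4 and 6.
# `first_model_summarize` stands in for the larger "first model"; the
# patent does not name a concrete model, prompt or threshold values.

PROMPT_TEMPLATE = "Summarize the following customer-service record:\n{record}"
MIN_LENGTH = 20      # preset length threshold of claim 6 (value assumed)
PRESET_NUMBER = 100  # preset number of second summary texts (value assumed)

def first_model_summarize(prompt: str) -> str:
    """Placeholder for the first (larger) model."""
    return prompt[-80:]  # stand-in: echo the tail of the prompt

def build_training_pairs(history_records: list[str]) -> list[tuple[str, str]]:
    # Claim 4: wrap each record in the prompt template before calling
    # the first model to obtain the first summary texts.
    first_summaries = [
        first_model_summarize(PROMPT_TEMPLATE.format(record=r))
        for r in history_records
    ]
    # Claim 6: keep only valid (sixth) summary texts whose length meets
    # the preset length.
    valid = [
        (record, summary)
        for record, summary in zip(history_records, first_summaries)
        if len(summary) >= MIN_LENGTH
    ]
    # Claim 1: take a preset number of second summary texts, paired with
    # their source records, as training data for the second (smaller)
    # model. How the preset number is chosen is left open, so the first
    # PRESET_NUMBER pairs are taken here.
    return valid[:PRESET_NUMBER]
```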
Description
Model training method, device and storage medium

Technical Field

The present application relates to the field of artificial intelligence, and in particular to a model training method, apparatus, and storage medium.

Background

With the continuous expansion of the customer service industry, the volume of customer service communication records it generates keeps increasing, and topic classification of these records is becoming more important. In the prior art, a text clustering method is generally used: a customer service communication record is converted into corresponding text data, text features of the text data are extracted, a clustering model corresponding to the customer service communication records is generated from the text features of a plurality of records, and anti-interference processing is then performed on the clustering models according to the similarity among them, so as to obtain the topic classification corresponding to the customer service communication records. However, when this method processes a large number of large-scale customer service communication records, or records with long text, both its processing efficiency and its accuracy are low.

Disclosure of Invention

The application provides a model training method, a model training device and a storage medium, which are used to solve the problem in the prior art that data classification results generated by a model are inaccurate. To this end, the application adopts the following technical scheme: obtaining target data, wherein the target data comprise historical data; inputting the historical data into a first model to generate first summary texts corresponding to the historical data; determining a preset number of second summary texts from the first summary texts; and training a second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain a summary model, wherein the summary model is used to determine the summary texts of data, and the data generating capacity of the first model is greater than that of the second model.

Optionally, the target data further comprise training data, and after the summary model is generated, the model training method further comprises: inputting the training data into the summary model and determining third summary texts of the training data; clustering the third summary texts based on the similarity among the third summary texts and determining a plurality of first clusters; determining a first label of each first cluster; and determining a label of each third summary text based on the first label of each first cluster and the third summary texts in each first cluster. A sketch of this clustering and labeling step follows.
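Below is a minimal sketch of this optional clustering and labeling step, assuming scikit-learn. TF-IDF features with k-means, and labeling each first cluster by its dominant term, are illustrative choices only; the patent prescribes neither a similarity measure, nor a clustering algorithm, nor how first labels are determined.

```python
from collections import defaultdict

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_and_label(third_summaries: list[str], n_clusters: int = 5):
    # Vectorize the third summary texts; similarity over TF-IDF vectors
    # is one conventional choice (the patent fixes no measure).
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(third_summaries)

    # Group the third summary texts into first clusters by similarity.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    cluster_ids = km.fit_predict(features)

    # First label of each first cluster: here, the highest-weight TF-IDF
    # term of the cluster centroid (a heuristic assumed for illustration).
    terms = vectorizer.get_feature_names_out()
    first_labels = {cid: terms[km.cluster_centers_[cid].argmax()]
                    for cid in range(n_clusters)}

    # Each third summary text inherits the first label of its cluster.
    summary_labels = {s: first_labels[cid]
                      for s, cid in zip(third_summaries, cluster_ids)}

    clusters = defaultdict(list)
    for s, cid in zip(third_summaries, cluster_ids):
        clusters[cid].append(s)
    return clusters, first_labels, summary_labels
```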
Optionally, determining the label of each third summary text based on the first label of each first cluster and the third summary texts in each first cluster comprises: receiving first indication information, wherein the first indication information is used to adjust the first clusters and the labels of the first clusters; adjusting the first clusters and the labels of the first clusters in response to the first indication information, and determining second clusters and second labels of the second clusters; and generating a mapping relation table based on the third summary texts in each second cluster and the second label of each second cluster, wherein the mapping relation table is used to represent the mapping relation between each third summary text and the second labels.

Optionally, the target data further comprise real-time data, and after the mapping relation table is generated, the model training method further comprises: inputting the real-time data into the summary model and determining a fourth summary text of the real-time data; determining the second label corresponding to the real-time data based on the fourth summary text of the real-time data and the mapping relation table; and determining a data classification result corresponding to the real-time data based on the second label. A sketch of this lookup is given at the end of this section.

Optionally, inputting the historical data into the first model to generate the first summary texts corresponding to the historical data comprises: determining, based on the historical data, a prompt template corresponding to the historical data, wherein the prompt template is a format template of the first summary texts; and inputting the historical data and the prompt template into the first model to generate the first summary texts.

Optionally, training the second model based on the second summary texts and the historical data corresponding to the second summary texts to obtain the summary model further comprises determining a training set, a verification set and a test set based on the second summary texts and the historical data corresponding to the second summary texts, followed by the iterative training, verification and testing set forth in claim 5.
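The iterative procedure of claim 5 reduces to the control flow sketched below: verify until the loss function value converges, then test until the similarity ratio reaches the preset value. The training, loss and similarity functions and both thresholds are placeholders, since the patent specifies the loop structure but not the model, loss function or similarity measure.

```python
LOSS_THRESHOLD = 0.01       # convergence condition of step 2 (value assumed)
SIMILARITY_THRESHOLD = 0.9  # preset similarity ratio of step 4 (value assumed)

def train_one_round(model, train_set):
    """Placeholder for one round of fine-tuning on (record, summary) pairs."""
    return model

def verification_loss(model, verification_set) -> float:
    """Placeholder loss between generated fourth summaries and targets."""
    return 0.0

def similarity_ratio(model, test_set) -> float:
    """Placeholder similarity between generated fifth summaries and targets."""
    return 1.0

def train_summary_model(model, train_set, verification_set, test_set):
    while True:
        # Step 1: train the current model on the training set.
        model = train_one_round(model, train_set)
        # Steps 2-3: verify; if the loss function value has not converged,
        # iterate steps 1-3 again with the training summary model.
        if verification_loss(model, verification_set) >= LOSS_THRESHOLD:
            continue
        # Steps 4-5: test; if the similarity ratio is below the preset
        # value, keep training.
        if similarity_ratio(model, test_set) < SIMILARITY_THRESHOLD:
            continue
        # Step 6: the training summary model becomes the summary model.
        return model
```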
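Finally, a sketch of how the mapping relation table of claim 2 supports the real-time classification of claim 3. The exact-match lookup and the `summary_model_summarize` placeholder are assumptions; the patent does not state how a fourth summary text is matched against the table.

```python
def summary_model_summarize(record: str) -> str:
    """Placeholder for the trained summary model."""
    return record[-80:]  # stand-in: echo the tail of the record

def build_mapping_table(second_clusters: dict[str, list[str]]) -> dict[str, str]:
    """Map each third summary text to the second label of its second cluster."""
    return {summary: label
            for label, summaries in second_clusters.items()
            for summary in summaries}

def classify_realtime(record: str, mapping_table: dict[str, str]) -> str:
    # Claim 3: summarize the real-time record with the summary model, then
    # look up the second label via the mapping relation table; the label
    # gives the data classification result.
    fourth_summary = summary_model_summarize(record)
    return mapping_table.get(fourth_summary, "unclassified")

if __name__ == "__main__":
    table = build_mapping_table({"billing": ["refund request summary"]})
    print(classify_realtime("refund request summary", table))  # -> billing
```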