CN-121997056-A - Model training method, device, storage medium and program product

Abstract

The embodiments of the disclosure provide a model training method, a device, a storage medium and a program product, relate to the technical field of machine learning, and address the low answer accuracy of content generation models in the related art. The method comprises: obtaining training data of a content generation model, wherein the training data comprise a first type of question, a standard answer to the first type of question, and a second type of question; determining a loss function of the content generation model based on a first parameter and a second parameter, wherein the first parameter is used to improve the similarity between the model's output when answering the first type of question and the standard answer, and the second parameter is used to improve the stability of the model's output when answering the second type of question; and iteratively training an initial model based on the training data and the loss function to obtain the content generation model.

Inventors

DONG HANG
XU LEI
TAO YE
XU JISEN

Assignees

China United Network Communications Group Co., Ltd. (中国联合网络通信集团有限公司)

Dates

Publication Date: 2026-05-08
Application Date: 2024-11-05

Claims (11)

1. A model training method, comprising: acquiring training data of a content generation model, wherein the training data comprises a first type of question, a standard answer to the first type of question, and a second type of question; determining a loss function of the content generation model based on a first parameter and a second parameter, wherein the first parameter is used to improve the similarity between the model's output when answering the first type of question and the standard answer, and the second parameter is used to improve the stability of the model's output when answering the second type of question; and iteratively training an initial model based on the training data and the loss function to obtain the content generation model.
2. The method of claim 1, wherein the first parameter is determined based on the similarity between a first model's answer to the first type of question and the standard answer to the first type of question, the first model being a question-answering model whose model parameters are iteratively updated.
3. The method of claim 1, wherein the second parameter is determined based on the similarity between a first model's answer to the second type of question and a second model's answer to the second type of question, wherein the first model is a question-answering model whose model parameters are iteratively updated, and the second model is a question-answering model trained in advance on the second type of question and a standard answer to the second type of question.
4. The method according to any one of claims 1-3, wherein the loss function Loss of the content generation model satisfies the following formula: [formula not reproduced], wherein $T = \alpha \times N + \beta \times M$, N is the first parameter, M is the second parameter, and α and β are constants.
5. The method of claim 4, wherein the first parameter N satisfies the following formula: $N = \log \Pi(y_{safe} \mid x_{uf}) - \log \Pi(y_{update} \mid x_{uf})$, wherein $y_{safe}$ characterizes the standard answer to the first type of question $x_{uf}$, $y_{update}$ characterizes the first model's answer to the first type of question $x_{uf}$, $y_{safe} \mid x_{uf}$ characterizes the probability distribution of generating the standard answer given the first type of question $x_{uf}$, and $y_{update} \mid x_{uf}$ characterizes the probability distribution of the answer generated by the first model given the first type of question $x_{uf}$.
6. The method of claim 4, wherein the second parameter M satisfies the following formula: $M = \log \Pi(y_i \mid x_{sf}) - \log \Pi(y_{base} \mid x_{sf})$, wherein $y_i$ characterizes the first model's answer to the second type of question $x_{sf}$, $y_{base}$ characterizes the second model's answer to the second type of question $x_{sf}$, $y_i \mid x_{sf}$ characterizes the probability distribution of the answer generated by the first model given the second type of question $x_{sf}$, and $y_{base} \mid x_{sf}$ characterizes the probability distribution of the answer generated by the second model given the second type of question $x_{sf}$.
7. The method according to any one of claims 1-3, wherein the first type of question is an unsafe question and the second type of question is a safe question.
8. A model training device, comprising an acquisition unit and a processing unit, wherein: the acquisition unit is configured to acquire training data of a content generation model, the training data comprising a first type of question, a standard answer to the first type of question, and a second type of question; the processing unit is configured to determine a loss function of the content generation model based on a first parameter and a second parameter, wherein the first parameter is used to improve the similarity between the model's output when answering the first type of question and the standard answer, and the second parameter is used to improve the stability of the model's output when answering the second type of question; and the processing unit is further configured to iteratively train an initial model based on the training data and the loss function to obtain the content generation model.
9. A model training apparatus comprising a memory and a processor, the memory being coupled to the processor and storing instructions executable by the processor, wherein the processor, when executing the instructions, performs the method of any one of claims 1-7.
10. A computer readable storage medium having stored thereon computer instructions which, when run on a computer, cause the computer to perform the method of any of claims 1-7.
11. A computer program product comprising computer program instructions which, when executed by a processor, implement the method of any of claims 1-7.
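
The quantities in claims 4 to 6 can be illustrated with a minimal Python sketch. It assumes the four sequence-level log-probabilities are already available as plain floats; the `LogProbs` container, the function names, and the numeric values are hypothetical and do not come from the claims. The mapping from T to the final Loss is not reproduced in the text, so the sketch stops at T.

```python
from dataclasses import dataclass


@dataclass
class LogProbs:
    """Hypothetical container for the four log-probabilities used in claims 5 and 6."""
    standard_first: float  # log Π(y_safe | x_uf): standard answer, first-type question
    update_first: float    # log Π(y_update | x_uf): first model's answer, first-type question
    update_second: float   # log Π(y_i | x_sf): first model's answer, second-type question
    base_second: float     # log Π(y_base | x_sf): second model's answer, second-type question


def first_parameter(lp: LogProbs) -> float:
    """N = log Π(y_safe | x_uf) - log Π(y_update | x_uf), as written in claim 5."""
    return lp.standard_first - lp.update_first


def second_parameter(lp: LogProbs) -> float:
    """M = log Π(y_i | x_sf) - log Π(y_base | x_sf), as written in claim 6."""
    return lp.update_second - lp.base_second


def combined_term(lp: LogProbs, alpha: float, beta: float) -> float:
    """T = alpha * N + beta * M, as written in claim 4 (alpha and beta are constants)."""
    return alpha * first_parameter(lp) + beta * second_parameter(lp)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the patent.
    lp = LogProbs(standard_first=-2.0, update_first=-3.5,
                  update_second=-1.0, base_second=-1.25)
    print(first_parameter(lp), second_parameter(lp), combined_term(lp, 1.0, 0.5))
```

Keeping N and M as separate functions mirrors the claim structure: each term depends on one question type only, and α and β set their relative weight.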

Description

Model training method, device, storage medium and program product

Technical Field

The present disclosure relates to the field of machine learning technologies, and in particular to a model training method, a model training device, a storage medium, and a program product.

Background

When a content generation model (also referred to as a large model, for example an Artificial Intelligence Generated Content (AIGC) model) is applied, it may generate inappropriate content for certain questions (such as unsafe questions), which causes adverse effects and compromises the safe use of the model. If a conventional trusted-enhancement approach is applied to the content generation model, the security of the model improves, but its overall performance is affected, so the accuracy of the content it generates decreases.

Disclosure of Invention

The embodiments of the disclosure provide a model training method, a device, a storage medium and a program product, which solve the technical problem that the accuracy of generated content is low when a content generation model generates content.

In a first aspect, a model training method is provided, including the following steps.

Training data of a content generation model is obtained, wherein the training data comprise a first type of question, a standard answer to the first type of question, and a second type of question.

A loss function of the content generation model is determined based on a first parameter and a second parameter, wherein the first parameter is used to improve the similarity between the model's output when answering the first type of question and the standard answer, and the second parameter is used to improve the stability of the model's output when answering the second type of question.
Iterative training is performed on an initial model based on the training data and the loss function to obtain the content generation model.

In the model training method provided by the embodiments of the disclosure, the questions are handled by type during training: for the first type of question, the similarity between the model's answer and the standard answer is determined, and for the second type of question, the stability of the model's answer is determined. The model is then iteratively updated according to the resulting loss function. Because the answering requirements and the answering quality for the different question types are both considered during training, the method can maintain the accuracy of the model's answers to each type of question, thereby improving the accuracy of the content the model generates.

With reference to the first aspect, in one possible implementation, the first parameter is determined based on the similarity between the first model's answer to the first type of question and the standard answer to the first type of question, and the first model is a question-answering model whose model parameters are iteratively updated.

With reference to the first aspect, in one possible implementation, the second parameter is determined based on the similarity between the first model's answer to the second type of question and the second model's answer to the second type of question, where the first model is a question-answering model whose model parameters are iteratively updated, and the second model is a question-answering model trained in advance on the second type of question and a standard answer to the second type of question.
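
The two-part training data described above can be sketched as a small Python structure. This is an illustration under assumptions, not the disclosed implementation: the `TrainingSample` class, the `"first"`/`"second"` labels, and `split_by_type` are hypothetical names. The sketch only encodes what the text states, namely that first-type questions carry a standard answer (used for the first parameter), while second-type questions are compared against the second model's answer (used for the second parameter).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TrainingSample:
    """One training question (hypothetical structure, not from the disclosure)."""
    question: str
    question_type: str                     # "first" or "second"
    standard_answer: Optional[str] = None  # required for first-type questions


def split_by_type(
    samples: List[TrainingSample],
) -> Tuple[List[TrainingSample], List[TrainingSample]]:
    """Route samples to the term of the loss they feed.

    First-type samples feed the first parameter and must carry a standard
    answer; second-type samples feed the second parameter.
    """
    first: List[TrainingSample] = []
    second: List[TrainingSample] = []
    for sample in samples:
        if sample.question_type == "first":
            if sample.standard_answer is None:
                raise ValueError("first-type question needs a standard answer")
            first.append(sample)
        elif sample.question_type == "second":
            second.append(sample)
        else:
            raise ValueError(f"unknown question type: {sample.question_type!r}")
    return first, second


if __name__ == "__main__":
    data = [
        TrainingSample("question A", "first", standard_answer="reference answer A"),
        TrainingSample("question B", "second"),
    ]
    first, second = split_by_type(data)
    print(len(first), len(second))
```

Validating the standard answer at split time surfaces malformed data before any loss term is computed.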
With reference to the first aspect, in one possible implementation, the loss function Loss of the content generation model satisfies the following formula: [formula not reproduced], where $T = \alpha \times N + \beta \times M$, N is the first parameter, M is the second parameter, and α and β are constants.

With reference to the first aspect, in one possible implementation, the first parameter N satisfies the following formula:

$N = \log \Pi(y_{safe} \mid x_{uf}) - \log \Pi(y_{update} \mid x_{uf})$

where $y_{safe}$ characterizes the standard answer to the first type of question $x_{uf}$, $y_{update}$ characterizes the first model's answer to the first type of question $x_{uf}$, $y_{safe} \mid x_{uf}$ characterizes the probability distribution of the standard answer generated given the first type of question $x_{uf}$, and $y_{update} \mid x_{uf}$ characterizes the probability distribution of the answer generated by the first model given the first type of question $x_{uf}$.

With reference to the first aspect, in one possible implementation, the second parameter M satisfies the following formula:

$M = \log \Pi(y_i \mid x_{sf}) - \log \Pi(y_{base} \mid x_{sf})$

where $y_i$ characterizes the first model's answer to the second type of que