
CN-122023962-A - Model training method, device, equipment and medium

CN122023962A

Abstract

The application discloses a model training method, apparatus, device, and medium for training an old model for image processing to obtain a new model when only new samples are available, so that the new model for image processing has both the new function and the old function, improving the training efficiency and training effect of the image processing model. The model training method comprises: inputting a new sample into a first model, wherein the first model is an existing model for image processing; generating a first model label based on the new sample and the first model; establishing a training sample for training a second model based on the new sample and the first model label, wherein the second model is a new model for image processing to be obtained through training; establishing, based on the training sample, a loss function for training the second model; and training the second model with the loss function.

Inventors

  • YU KEQIANG
  • WANG SONG
  • HAO DEJUN

Assignees

  • Zhejiang Dahua Technology Co., Ltd.

Dates

Publication Date
2026-05-12
Application Date
2026-01-19

Claims (10)

  1. A model training method, wherein the model is used for image processing, the method comprising: inputting a new sample into a first model, wherein the first model is an existing model for image processing; generating a first model label based on the new sample and the first model; establishing a training sample for training a second model based on the new sample and the first model label, wherein the second model is a new model for image processing to be obtained through training; establishing, based on the training sample, a loss function for training the second model; and training the second model with the loss function.
  2. The method of claim 1, wherein the first model and the second model each comprise a plurality of intermediate layers, and the method further comprises: inputting the same training data to the first model and the second model respectively; determining multi-scale intermediate layer operation results for at least one intermediate layer operation result of each of the first model and the second model; and adopting a preset constraint mechanism, using the multi-scale intermediate layer operation results of the first model to constrain the multi-scale intermediate layer operation results of the second model to obtain a constraint result.
  3. The method of claim 2, wherein determining the multi-scale intermediate layer operation results comprises, for at least one intermediate layer operation result of the first model and the second model: establishing multi-scale intermediate layer operation results scaled in the C, H, and W directions of the intermediate layer operation result, wherein C is the number of channels, and H and W respectively represent the height and width of the image; and obtaining a multi-scale intermediate layer operation result by cropping the intermediate layer operation result.
  4. The method of claim 2, wherein establishing a loss function for training the second model based on the training sample comprises: establishing the loss function for training the second model based on the training sample and the constraint result.
  5. The method of claim 1, wherein generating a first model label based on the new sample and the first model comprises: inputting the new sample into the first model to generate an initial first model label; and obtaining an optimized first model label by performing the following processing on each element in the initial first model label: determining a distance weight and a prediction weight of the current element; establishing a first model energy function of the current element based on the distance weight and the prediction weight of the current element; and optimizing the first model energy function of the current element to obtain the optimized current element.
  6. The method of claim 5, wherein determining the distance weight of the current element comprises: binarizing pixel values in the current element, wherein the target is represented by a first preset pixel value and the background by a second preset pixel value; accumulating the coordinates of the pixel points whose binarized pixel value equals the first preset pixel value, and averaging the accumulated result to obtain the coordinate average of those pixel points; calculating the distance between the coordinates of each such pixel point and the coordinate average; and converting each distance into a distance confidence, the distance confidence serving as the distance weight.
  7. The method of claim 6, wherein determining the prediction weight of the current element comprises: determining the prediction weight corresponding to the current element based on the binarized current element and a preset lowest prediction confidence.
  8. A model training apparatus, comprising: a sample input unit, configured to input a new sample into a first model, wherein the first model is an existing model for image processing; a first model label generating unit, configured to generate a first model label based on the new sample and the first model; a training sample establishing unit, configured to establish a training sample for training a second model based on the new sample and the first model label, wherein the second model is a new model for image processing to be obtained through training; a loss function establishing unit, configured to establish a loss function for training the second model based on the training sample; and a second model training unit, configured to train the second model with the loss function.
  9. An electronic device, comprising: a memory for storing program instructions; and a processor for invoking the program instructions stored in the memory to perform the method of any one of claims 1 to 7 in accordance with the obtained program.
  10. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 7.
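The distance-weight computation of claim 6 can be sketched as follows. This is a minimal illustration only: the function name, the binarization threshold, and the inverse-distance conversion `1/(1+d)` are assumptions, since the claims specify that distances are converted to confidences but not the exact conversion formula.

```python
import numpy as np

def distance_weight(label_map, fg_value=1, threshold=0.5):
    """Sketch of claim 6: binarize the element, find the centroid of the
    target pixels, measure each target pixel's distance to the centroid,
    and convert the distances into confidences used as distance weights."""
    # Binarize: target -> first preset pixel value, background -> 0
    binary = (label_map >= threshold).astype(np.uint8) * fg_value

    # Coordinates of pixel points equal to the first preset pixel value
    coords = np.argwhere(binary == fg_value)      # shape (N, 2): (row, col)
    if coords.size == 0:
        return np.zeros_like(label_map, dtype=float)

    # Accumulate coordinates and average them -> coordinate average (centroid)
    centroid = coords.mean(axis=0)

    # Distance from each target pixel to the coordinate average
    dists = np.linalg.norm(coords - centroid, axis=1)

    # Convert each distance to a confidence in (0, 1]; the exact conversion
    # is not given in the claims, so 1 / (1 + d) is assumed here
    conf = 1.0 / (1.0 + dists)

    weights = np.zeros_like(label_map, dtype=float)
    weights[coords[:, 0], coords[:, 1]] = conf
    return weights
```

Pixels at the centroid of the target region receive the highest distance confidence, and pixels far from it receive lower weights, which matches the intent of down-weighting outlying predictions in the old-model label.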

Description

Model training method, device, equipment and medium

Technical Field

The present application relates to the field of computer technologies, and in particular to a model training method, apparatus, device, and medium.

Background

Images are used in a wide range of fields, and image quality directly influences their effectiveness in those fields. However, images are inevitably disturbed by noise during acquisition, processing, and transmission, so filtering noise from an image is of great importance. By training a model for image processing, the trained image processing model can be used to improve image generation quality and reduce image noise.

Disclosure of Invention

The embodiment of the application provides a model training method, apparatus, device, and medium for training an old model for image processing to obtain a new model when only new samples are available, so that the new model for image processing has both the new function and the old function, improving the training efficiency and training effect of the image processing model. In the model training method provided by the embodiment of the application, the model is used for image processing, and the method comprises the following steps: inputting a new sample into a first model, wherein the first model is an existing model for image processing; generating a first model label based on the new sample and the first model; establishing a training sample for training a second model based on the new sample and the first model label, wherein the second model is a new model for image processing to be obtained through training; establishing, based on the training sample, a loss function for training the second model; and training the second model with the loss function.
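The steps above can be sketched as a pseudo-labeling pipeline in which the existing first model labels the new samples and the loss combines a new-function term with an old-function term. Everything in this sketch is illustrative, not from the patent: the models are stood in for by plain callables, the losses are mean squared errors, and the weighting factor `alpha` is an assumed hyperparameter.

```python
import numpy as np

def build_training_samples(new_samples, first_model):
    """Run each new sample through the existing (first) model to generate
    a first-model label, then pair the sample with its ground-truth
    new-function label and the generated old-function label."""
    return [(x, y_new, first_model(x)) for x, y_new in new_samples]

def loss_fn(pred, y_new, y_old, alpha=0.5):
    """Loss for training the second model: fit the new-function label
    while staying close to the first model's label, so the old function
    is not forgotten. MSE terms and alpha are assumptions."""
    new_term = np.mean((pred - y_new) ** 2)   # learn the new function
    old_term = np.mean((pred - y_old) ** 2)   # preserve the old function
    return new_term + alpha * old_term
```

In a real training loop, `loss_fn` would be evaluated on the second model's output for each training sample and minimized by gradient descent; the sketch only shows how the two label sources enter the objective.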
It can be seen that the embodiment of the application provides a training method for an image processing model: a new sample is input into the existing first model, a more accurate first model label (that is, a high-precision old model label) is generated based on the new sample and the first model, a training sample for training the second model is then established based on the new sample and the first model label, wherein the second model is the new model for image processing to be obtained through training, and a loss function for training the second model is established based on the training sample, so that the second model can be trained with the loss function. Because generating a high-precision old model label makes the label output by the old model more accurate, forgetting of the old model's function by the new model can be suppressed more effectively when only new samples are available during training. In other words, the embodiment of the application trains an old model for image processing into a new model using only new samples, so that the new model for image processing has both the new function and the old function, improving the training efficiency and training effect of the image processing model.
In some embodiments, the first model and the second model each include a plurality of intermediate layers, and the method further comprises: inputting the same training data to the first model and the second model respectively; determining multi-scale intermediate layer operation results for at least one intermediate layer operation result of each of the first model and the second model; and adopting a preset constraint mechanism, using the multi-scale intermediate layer operation results of the first model to constrain the multi-scale intermediate layer operation results of the second model to obtain a constraint result. The embodiment of the application thus further provides a multi-scale constraint mechanism on the intermediate layer operation results of the image processing model, so that the training of the new model is constrained and the new model is less likely to forget the function of the old model. In some embodiments, determining the multi-scale intermediate layer operation results comprises, for at least one intermediate layer operation result of the first model and the second model: establishing multi-scale intermediate layer operation results scaled in the C, H, and W directions of the intermediate layer operation result, wherein C is the number of channels and H and W respectively represent the height and width of the image; and obtaining a multi-scale intermediate layer operation result by cropping the intermediate layer operation result. In some embodiments, est
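The multi-scale intermediate-layer constraint described above can be sketched as follows, assuming an intermediate layer result of shape (C, H, W). The scale factors and the use of a mean-squared difference as the "preset constraint mechanism" are assumptions for illustration; the text only states that scaled results are obtained by cropping and that the first model's results constrain the second model's.

```python
import numpy as np

def multiscale_crops(feat, scales=(1.0, 0.75, 0.5)):
    """Build multi-scale versions of one intermediate layer operation
    result of shape (C, H, W) by cropping along the C, H and W directions,
    where C is the channel count and H, W are image height and width."""
    C, H, W = feat.shape
    crops = []
    for s in scales:
        c = max(1, int(C * s))
        h = max(1, int(H * s))
        w = max(1, int(W * s))
        crops.append(feat[:c, :h, :w])  # cropping yields the scaled result
    return crops

def constraint_loss(feat_old, feat_new, scales=(1.0, 0.75, 0.5)):
    """Preset constraint mechanism, sketched as the mean squared
    difference between matching scales of the first (old) and second
    (new) models' intermediate results; the exact form is not fixed
    by the text."""
    losses = [np.mean((a - b) ** 2)
              for a, b in zip(multiscale_crops(feat_old, scales),
                              multiscale_crops(feat_new, scales))]
    return sum(losses) / len(losses)
```

Identical features at every scale give a zero constraint result, while divergence between the new model's intermediate outputs and the old model's is penalized at each scale, which is how the mechanism discourages forgetting.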