CN-116580261-B - Model training method, image processing method, device, equipment and storage medium

Abstract

The disclosure relates to a model training method, an image processing method, a device, equipment and a storage medium. A model is trained on a first data set containing multi-modal information, such as image information and text information, to obtain the model after the first round of iterative training. Taking that model as a baseline, multiple further rounds of iterative training are carried out to obtain a pre-training model, and before each round the target data set used in the previous round is updated. Because the model from the previous round is trained in the current round on the updated target data set, the model after the current round can be more accurate than the model after the previous round. The pre-training model serves as the initial model for a downstream task, such as a scrap steel grade determination task. After the pre-training model is fine-tuned on sample data from the downstream task, the fine-tuned model determines the scrap steel grade accurately, so that scrap steel resources are effectively saved.

Inventors

YANG ZHAO
XU HAIHUA
WEI XIHAN
CHEN WEIXUAN

Assignees

阿里巴巴达摩院(杭州)科技有限公司

Dates

Publication Date: 20260512
Application Date: 20230329

Claims (14)

1. A model training method, wherein the method comprises: obtaining a model after a previous round of iterative training and a target data set used in the previous round of iterative training, wherein the target data set comprises at least a first data set, the first data set comprises a plurality of first samples, each first sample comprises a first image and text information of a target object in the first image, the first data set is used for model training to obtain a model after a first round of iterative training, and the model after the previous round of iterative training is either the model after the first round of iterative training or a model obtained by at least one further round of iterative training based on the model after the first round of iterative training; updating the target data set by means of the model after the previous round of iterative training; and performing iterative training on the model after the previous round of iterative training according to the updated target data set to obtain a model after the current round of iterative training, a pre-training model being obtained after multiple rounds of iterative training; wherein the method further comprises: selecting a target subset from an unlabeled second data set by means of the model after the previous round of iterative training, an evaluation index of each second image in the target subset being determined from a first granularity score and a second granularity score of that second image; labeling each second image in the target subset by means of the model after the previous round of iterative training to obtain a labeled target subset; and updating the target data set according to the labeled target subset, the target data set and the labeled target subset together forming the updated target data set.
2. The method of claim 1, wherein selecting the target subset from the unlabeled second data set by means of the model after the previous round of iterative training comprises: for each second image in the unlabeled second data set, scoring the second image at a first granularity and at a second granularity by means of the model after the previous round of iterative training to obtain a first granularity score and a second granularity score of the second image, wherein the first granularity is larger than the second granularity; determining an evaluation index for each second image according to the first granularity score and the second granularity score of that second image; and forming the target subset from the second images in the second data set whose evaluation indexes meet a preset condition.
3. The method of claim 2, wherein labeling each second image in the target subset by means of the model after the previous round of iterative training comprises: for each second image in the target subset, detecting, by means of the model after the previous round of iterative training, at least one target area in the second image and the category of the target object in each target area; and labeling the second image according to the category of the target object in each target area of the second image.
4. A model training method, wherein the method comprises: Acquiring a first image and a second image in a sample data set; Detecting the first image and the second image respectively according to a machine learning model to be trained, so as to obtain a first detection result of the first image and a second detection result of the second image, wherein initial parameters of the machine learning model to be trained are parameters of a pre-training model, and the pre-training model is obtained according to the method of any one of claims 1-3; Performing fusion processing on a plurality of first candidate areas in the first image and a plurality of second candidate areas in the second image to obtain a plurality of third candidate areas; Obtaining a third detection result according to the plurality of third candidate areas, wherein the first detection result, the second detection result and the third detection result respectively comprise at least one target area and the category of the target object in each target area; and training the machine learning model to be trained according to the first detection result, the second detection result and the third detection result to obtain a trained machine learning model.
5. The method of claim 4, wherein training the machine learning model to be trained based on the first detection result, the second detection result, and the third detection result, results in a trained machine learning model, comprising: Constructing a loss function according to the difference value between the labeling result of the first image and the first detection result and the difference value between the second detection result and the third detection result; And training the machine learning model to be trained according to the loss function to obtain a trained machine learning model.
6. The method of claim 4, wherein training the machine learning model to be trained based on the first detection result, the second detection result, and the third detection result, results in a trained machine learning model, comprising: Constructing a loss function according to the difference value between the feature vector corresponding to the target area in the labeling result of the first image and the feature vector corresponding to the target area in the first detection result and the difference value between the feature vector corresponding to the target area in the second detection result and the feature vector corresponding to the target area in the third detection result; And training the machine learning model to be trained according to the loss function to obtain a trained machine learning model.
7. The method of claim 4, wherein the machine learning model to be trained comprises a first gradient decoupling layer, a second gradient decoupling layer, a region generation network, and a feature extractor, the first gradient decoupling layer and the region generation network being connected, the second gradient decoupling layer and the feature extractor being connected, the first gradient decoupling layer and the second gradient decoupling layer being in parallel, the first gradient decoupling layer and the second gradient decoupling layer being configured to decouple a parameter iteration process of the region generation network and a parameter iteration process of the feature extractor.
8. An image processing method, wherein the method comprises: Acquiring a target image to be processed; Inputting the target image into a trained machine learning model, such that the trained machine learning model outputs a confidence level of at least one target region in the target image and a category to which a target object in each target region belongs, the trained machine learning model being obtained according to the method of any one of claims 4-7; and determining the level of the target object in the target image according to the confidence level of the category to which the target object in each target area belongs.
9. The method of claim 8, wherein determining the level of the target object in the target image based on the confidence of the class to which the target object belongs in each target region comprises: Acquiring a first image set and a second image set associated with the target image; Correcting the confidence coefficient of the category to which the target object belongs in each target area according to the first image set and the second image set to obtain corrected confidence coefficient; And determining the category of the target object in the target image according to the corrected confidence.
10. A scrap steel grade determining method, wherein the method comprises: acquiring images of multiple levels of whole-vehicle scrap steel during unloading; inputting the image of each level into a trained machine learning model, such that the trained machine learning model outputs at least one target area in the image of each level and the category of the target object in each target area, the trained machine learning model being obtained according to the method of any one of claims 4-7; and determining the grade of the whole-vehicle scrap steel according to the at least one target area in the image of each level and the category of the target object in each target area.
11. The method of claim 10, wherein the method further comprises: determining the proportion of impurities in the whole-vehicle scrap steel according to the at least one target area in the image of each level and the category of the target object in each target area.
12. A model training apparatus, comprising: an acquisition module configured to acquire a model after a previous round of iterative training and a target data set used in the previous round of iterative training, wherein the target data set comprises at least a first data set, the first data set comprises a plurality of first samples, each first sample comprises a first image and text information of a target object in the first image, the first data set is used for model training to obtain a model after a first round of iterative training, and the model after the previous round of iterative training is either the model after the first round of iterative training or a model obtained by at least one further round of iterative training based on the model after the first round of iterative training; an updating module configured to update the target data set by means of the model after the previous round of iterative training; and an iterative training module configured to perform iterative training on the model after the previous round of iterative training according to the updated target data set to obtain a model after the current round of iterative training, a pre-training model being obtained after multiple rounds of iterative training; wherein the updating module is further configured to select a target subset from an unlabeled second data set by means of the model after the previous round of iterative training, wherein an evaluation index of each second image in the target subset meets a preset condition, the evaluation index being determined from a first granularity score and a second granularity score of that second image; to label each second image in the target subset by means of the model after the previous round of iterative training to obtain a labeled target subset; and to update the target data set according to the labeled target subset, the target data set and the labeled target subset together forming the updated target data set.
13. An electronic device, comprising: a memory; a processor; and a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-11.
14. A computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the method of any of claims 1-11.
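
The selection step of claims 1 and 2 can be illustrated with a minimal Python sketch. The claims do not state how the two granularity scores are combined or what the preset condition is, so the weighted mean, the threshold of 0.7, and the names `ScoredImage`, `evaluation_index` and `select_target_subset` are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScoredImage:
    image_id: str
    coarse_score: float  # first-granularity score (the larger granularity)
    fine_score: float    # second-granularity score (the smaller granularity)


def evaluation_index(img, coarse_weight=0.5):
    # Assumption: the evaluation index is a weighted mean of the two scores.
    # The source only says it is determined from both scores.
    return coarse_weight * img.coarse_score + (1.0 - coarse_weight) * img.fine_score


def select_target_subset(images, threshold=0.7):
    # Assumption: the "preset condition" is a simple lower bound on the index.
    return [img for img in images if evaluation_index(img) >= threshold]


candidates = [
    ScoredImage("img_a", 0.9, 0.8),
    ScoredImage("img_b", 0.9, 0.3),
    ScoredImage("img_c", 0.4, 0.9),
]
subset = select_target_subset(candidates)
print([img.image_id for img in subset])  # → ['img_a']
```

Under these assumptions only images that score well at both granularities are kept, which is one plausible reading of why two scores are used.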

Description

Model training method, image processing method, device, equipment and storage medium

Technical Field

The disclosure relates to the field of information technology, and in particular to a model training method, an image processing method, a device, equipment and a storage medium.

Background

Image processing techniques are now widely used in many fields. For example, the grade of whole-vehicle scrap steel can be determined by capturing images of the scrap steel and then applying image processing. However, grade determination by current image processing techniques is not accurate enough, so scrap steel resources are not used effectively.

Disclosure of Invention

In order to solve the above technical problem, or at least partially solve it, the present disclosure provides a model training method, an image processing method, a device, equipment and a storage medium, so as to determine the scrap steel grade accurately and save scrap steel resources.
In a first aspect, an embodiment of the present disclosure provides a model training method, including: obtaining a model after a previous round of iterative training and a target data set used in the previous round of iterative training, wherein the target data set comprises at least a first data set, the first data set comprises a plurality of first samples, each first sample comprises a first image and text information of a target object in the first image, the first data set is used for model training to obtain a model after a first round of iterative training, and the model after the previous round of iterative training is either the model after the first round of iterative training or a model obtained by at least one further round of iterative training based on the model after the first round of iterative training; updating the target data set by means of the model after the previous round of iterative training; and performing iterative training on the model after the previous round of iterative training according to the updated target data set to obtain a model after the current round of iterative training, a pre-training model being obtained after multiple rounds of iterative training.
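
The round structure of the first aspect, combined with the data set update described in claim 1, can be sketched as a short loop. This is a toy illustration, not the disclosed implementation: `iterative_pretrain`, `train_fn`, `select_fn` and `label_fn` are hypothetical names, and the stand-in "model" is just an integer counting the samples it was last trained on.

```python
def iterative_pretrain(model, first_dataset, unlabeled_pool,
                       train_fn, select_fn, label_fn, rounds):
    # First round: train on the first data set (image-text samples).
    target_dataset = list(first_dataset)
    model = train_fn(model, target_dataset)
    pool = list(unlabeled_pool)
    for _ in range(rounds):
        # Update the target data set using the previous round's model.
        chosen = select_fn(model, pool)
        labeled_subset = [(image, label_fn(model, image)) for image in chosen]
        target_dataset = target_dataset + labeled_subset
        pool = [image for image in pool if image not in chosen]
        # Train the previous round's model on the updated target data set.
        model = train_fn(model, target_dataset)
    return model, target_dataset


# Toy stand-ins (assumptions): a real system would train and run a detector.
def toy_train(model, dataset):
    return len(dataset)

def toy_select(model, pool):
    return pool[:2]

def toy_label(model, image):
    return "pseudo-label"

# Hypothetical sample identifiers and text labels, for illustration only.
first = [("img_1", "heavy scrap"), ("img_2", "light scrap")]
unlabeled = ["img_3", "img_4", "img_5", "img_6", "img_7"]
final_model, final_dataset = iterative_pretrain(
    0, first, unlabeled, toy_train, toy_select, toy_label, rounds=2)
print(final_model, len(final_dataset))  # → 6 6
```

Each pass through the loop grows the target data set with newly labeled images before retraining, which is the mechanism the disclosure relies on for the current round's model to improve on the previous one.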
In a second aspect, an embodiment of the present disclosure provides a model training method, including: acquiring a first image and a second image in a sample data set; detecting the first image and the second image respectively with a machine learning model to be trained, so as to obtain a first detection result of the first image and a second detection result of the second image, wherein the initial parameters of the machine learning model to be trained are the parameters of a pre-training model, and the pre-training model is obtained by the method of the first aspect; performing fusion processing on a plurality of first candidate areas in the first image and a plurality of second candidate areas in the second image to obtain a plurality of third candidate areas; obtaining a third detection result according to the plurality of third candidate areas, wherein the first detection result, the second detection result and the third detection result each comprise at least one target area and the category of the target object in each target area; and training the machine learning model to be trained according to the first detection result, the second detection result and the third detection result to obtain a trained machine learning model.
In a third aspect, an embodiment of the present disclosure provides an image processing method, including: acquiring a target image to be processed; inputting the target image into a trained machine learning model, such that the trained machine learning model outputs a confidence level of at least one target area in the target image and the category to which the target object in each target area belongs, the trained machine learning model being obtained by the method of the second aspect; and determining the level of the target object in the target image according to the confidence level of the category to which the target object in each target area belongs.
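
The fusion step of the second aspect is not specified in detail here. One possible reading, sketched below in Python, pools the candidate areas of both images and drops near-duplicates by intersection over union (IoU). This assumes the two images share a coordinate frame (for example, two views of the same scene); the names `iou` and `fuse_candidates` and the threshold of 0.5 are hypothetical.

```python
def iou(a, b):
    # Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_candidates(first, second, iou_threshold=0.5):
    # Assumption: keep every first candidate area, then add each second
    # candidate area that does not overlap strongly with one already kept.
    fused = list(first)
    for box in second:
        if all(iou(box, kept) < iou_threshold for kept in fused):
            fused.append(box)
    return fused


first_candidates = [(0.0, 0.0, 10.0, 10.0)]
second_candidates = [(1.0, 1.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
third_candidates = fuse_candidates(first_candidates, second_candidates)
print(third_candidates)
# → [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
```

Here the second image's first box overlaps the first image's box with an IoU of 0.81 and is treated as a duplicate, while the non-overlapping box is added to the third candidate areas.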
In a fourth aspect, an embodiment of the present disclosure provides a scrap steel grade determining method, including: acquiring images of multiple levels of whole-vehicle scrap steel during unloading; inputting the image of each level into a trained machine learning model, such that the trained machine learning model outputs at least one target area in the image of each level and the category of the target object in each target area, the trained machine learning model being obtained by the method of the second aspect; and determining the grade of the whole-vehicle scrap steel according to at least one target area in the image of each level and the