CN-114299363-B - Training method of image processing model, image classification method and device


Abstract

The application discloses a training method for an image processing model, an image classification method, and a device. The training method crops a plurality of original images, which effectively expands the number of training samples and improves the performance of the trained image processing model. In addition, the two training samples in each positive sample pair used to train the image processing model belong to the same category, while the two training samples in each negative sample pair belong to different categories. The trained image processing model can therefore learn the features of images of different categories well, which further improves its performance.

Inventors

LIU TONG
ShangGuan Zeyu

Assignees

BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)

Dates

Publication Date: 20260505
Application Date: 20211229

Claims (11)

1. A method of training an image processing model, the method comprising: acquiring a plurality of original image sets, wherein each original image set comprises a plurality of original images of the same category, and the categories of the original images included in different original image sets are different; cropping a plurality of original images in the plurality of original image sets, wherein each original image in each original image set is cropped and cropping one original image yields more than one sub-image, so as to obtain a training sample set, wherein the training sample set comprises a plurality of training samples, and each training sample is an original image or a sub-image obtained by cropping an original image; determining a plurality of positive sample pairs and a plurality of negative sample pairs from the training sample set, wherein each positive sample pair comprises two training samples obtained based on different original images in the same original image set, and each negative sample pair comprises two training samples obtained based on original images in different original image sets; and training an image processing model by using the plurality of positive sample pairs and the plurality of negative sample pairs, wherein training the image processing model comprises marking the true value of each positive sample pair as 1 and marking the true value of each negative sample pair as 0; wherein cropping the plurality of original images in the plurality of original image sets comprises: for each original image to be cropped among the plurality of original images, randomly generating a cropping size within a target size range, wherein if the cropping region is a rectangular region, the target size range comprises a width range and a height range, and the cropping size comprises a width within the width range and a height within the height range; determining a reference point of the cropping region based on the size of the original image and the cropping size, wherein if the cropping region is a rectangular region, the reference point of the cropping region is a vertex of the rectangular region or a center point of the rectangular region; and determining the cropping region in the original image based on the cropping size and the reference point, and cropping the cropping region; or cropping a fixed cropping region in the original image.
2. The method of claim 1, wherein determining a plurality of positive sample pairs from the training sample set comprises: determining a plurality of candidate sample pairs from the training sample set, each candidate sample pair comprising two training samples obtained based on different original images in the same original image set; determining the similarity of each of the candidate sample pairs; and determining each candidate sample pair whose similarity is greater than a similarity threshold as a positive sample pair.
3. The method of claim 2, wherein determining the similarity of each of the candidate sample pairs comprises: extracting the feature vector of each training sample in each candidate sample pair by using a convolutional neural network; and, for each candidate sample pair, processing the feature vectors of the two training samples in the candidate sample pair with a similarity measurement algorithm to obtain the similarity of the candidate sample pair.
4. The method of claim 1, wherein determining a plurality of negative sample pairs from the training sample set comprises: determining, from the training sample set, a number of negative sample pairs equal to the number of positive sample pairs.
5. A method of classifying images, the method comprising: acquiring a target image to be classified; and inputting the target image into an image classification model to obtain the category of the target image output by the image classification model; wherein the image classification model is trained by the method of any one of claims 1 to 4.
6. The method of claim 5, wherein inputting the target image into the image classification model to obtain the category of the target image output by the image classification model comprises: inputting the target image into the image classification model to obtain the similarity, output by the image classification model, between the target image and reference images of different categories; and determining the category of the reference image with the highest similarity to the target image, among the reference images of different categories, as the category of the target image.
7. The method of claim 5, wherein inputting the target image into the image classification model to obtain the category of the target image output by the image classification model comprises: inputting the target image into the image classification model to obtain the similarity, output by the image classification model, between the target image and image features of different categories; and determining, from the image features of different categories, the category of the image feature with the highest similarity to the target image as the category of the target image; wherein the image features of each category are obtained by extracting features from a plurality of training samples of the category.
8. A training apparatus for an image processing model, the apparatus comprising: an acquisition module configured to acquire a plurality of original image sets, wherein each original image set comprises a plurality of original images of the same category, and the categories of the original images included in different original image sets are different; a cropping module configured to crop a plurality of original images in the plurality of original image sets, wherein each original image in each original image set is cropped and cropping one original image yields more than one sub-image, so as to obtain a training sample set, wherein the training sample set comprises a plurality of training samples, and each training sample is an original image or a sub-image obtained by cropping an original image; a determining module configured to determine a plurality of positive sample pairs and a plurality of negative sample pairs from the training sample set, wherein each positive sample pair comprises two training samples obtained based on different original images in the same original image set, and each negative sample pair comprises two training samples obtained based on original images in different original image sets; and a training module configured to train an image processing model by using the plurality of positive sample pairs and the plurality of negative sample pairs, including marking the true value of each positive sample pair as 1 and marking the true value of each negative sample pair as 0; wherein cropping the plurality of original images in the plurality of original image sets comprises: for each original image to be cropped among the plurality of original images, randomly generating a cropping size within a target size range, wherein if the cropping region is a rectangular region, the target size range comprises a width range and a height range, and the cropping size comprises a width within the width range and a height within the height range; determining a reference point of the cropping region based on the size of the original image and the cropping size, wherein if the cropping region is a rectangular region, the reference point of the cropping region is a vertex of the rectangular region or a center point of the rectangular region; and determining the cropping region in the original image based on the cropping size and the reference point, and cropping the cropping region; or cropping a fixed cropping region in the original image.
9. An image classification apparatus, the apparatus comprising: an acquisition module configured to acquire a target image to be classified; and a classification module configured to input the target image into an image classification model to obtain the category of the target image output by the image classification model; wherein the image classification model is trained by the training apparatus for an image processing model according to claim 8.
10. An image processing apparatus, characterized in that it comprises a processor and a memory in which instructions are stored, which instructions are loaded and executed by the processor to implement the training method of an image processing model according to any one of claims 1 to 4 or the image classification method according to any one of claims 5 to 7.
11. A computer readable storage medium having instructions stored therein, the instructions being loaded and executed by a processor to implement the method of training an image processing model according to any one of claims 1 to 4 or the method of classifying images according to any one of claims 5 to 7.
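
The pair construction recited in claims 1 and 4 can be illustrated with a minimal Python sketch. This is not the patented implementation: the function name `build_pairs`, the dictionary layout of the training sample set, and the choice to subsample both pair lists to a common size are all assumptions made for illustration. The similarity-threshold filtering of claims 2 and 3 is omitted.

```python
import itertools
import random


def build_pairs(sample_sets, rng=random):
    """Build labeled sample pairs.

    sample_sets (hypothetical layout): {category: [(source_image_id, sample), ...]}
    Returns a list of (sample_a, sample_b, label) with label 1 or 0.
    """
    # Positive pairs: same category, but derived from different original images.
    positives = []
    for samples in sample_sets.values():
        for (src_a, a), (src_b, b) in itertools.combinations(samples, 2):
            if src_a != src_b:
                positives.append((a, b, 1))  # true value marked as 1

    # Negative pairs: samples derived from images in different categories.
    negatives = []
    for cat_a, cat_b in itertools.combinations(sample_sets, 2):
        for _, a in sample_sets[cat_a]:
            for _, b in sample_sets[cat_b]:
                negatives.append((a, b, 0))  # true value marked as 0

    # Equal numbers of positive and negative pairs (claim 4).
    # Assumption: both lists are randomly subsampled to the smaller size.
    k = min(len(positives), len(negatives))
    return rng.sample(positives, k) + rng.sample(negatives, k)
```

For example, with three samples in one category (two of them derived from the same original image) and two samples in another category, the sketch produces three positive and three negative pairs, and never pairs the two samples that share an original image.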

Description

Training method of image processing model, image classification method and device
Technical Field
The present application relates to the field of machine learning, and in particular to a training method for an image processing model, an image classification method, and a device.
Background
In the field of machine learning, an image processing model can be trained on a large number of training samples so that the trained model performs well. For example, to train an image classification model, a large number of images of different categories must be acquired as training samples. In some scenarios, however, the number of available training samples is limited (for example, only a few images exist for some categories), so the trained image processing model performs poorly.
Disclosure of Invention
The application provides a training method for an image processing model, an image classification method, and a device, which can solve the problem in the related art that the trained image processing model performs poorly.
The technical scheme is as follows:
In one aspect, a training method of an image processing model is provided, the method including: acquiring a plurality of original image sets, wherein each original image set comprises a plurality of original images of the same category, and the categories of the original images included in different original image sets are different; cropping a plurality of original images in the plurality of original image sets to obtain a training sample set, wherein the training sample set comprises a plurality of training samples, and each training sample is an original image or a sub-image obtained by cropping an original image; determining a plurality of positive sample pairs and a plurality of negative sample pairs from the training sample set, wherein each positive sample pair comprises two training samples obtained based on different original images in the same original image set, and each negative sample pair comprises two training samples obtained based on original images in different original image sets; and training the image processing model using the plurality of positive sample pairs and the plurality of negative sample pairs.
Optionally, for each original image to be cropped among the plurality of original images, a cropping size is randomly generated within a target size range, a reference point of a cropping region is determined based on the size of the original image and the cropping size, the cropping region is determined in the original image based on the cropping size and the reference point, and the cropping region is cropped.
Optionally, the target size range comprises a width range and a height range, the cropping size comprises a width within the width range and a height within the height range, the cropping region is a rectangular region, and the reference point of the cropping region is one vertex of the rectangular region or the center point of the rectangular region.
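
The optional random cropping described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function names are hypothetical, the reference point is taken to be the top-left vertex, it is drawn uniformly so that the rectangle stays inside the image, and each size range is clamped to the image size. Images are represented as plain nested lists to keep the example free of dependencies.

```python
import random


def random_crop_box(img_w, img_h, width_range, height_range, rng=random):
    """Return a (left, top, right, bottom) crop box that lies inside the image."""
    # Randomly generate the crop size within the target size range.
    # Assumption: each range is clamped to the image size so the crop fits.
    w = rng.randint(min(width_range[0], img_w), min(width_range[1], img_w))
    h = rng.randint(min(height_range[0], img_h), min(height_range[1], img_h))
    # Determine the reference point from the image size and the crop size.
    # Assumption: the reference point is the top-left vertex, drawn uniformly
    # from the positions that keep the whole rectangle inside the image.
    left = rng.randint(0, img_w - w)
    top = rng.randint(0, img_h - h)
    return left, top, left + w, top + h


def crop(image, box):
    """Crop a row-major nested list (image[y][x]) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def make_training_samples(image, n_crops, width_range, height_range, rng=random):
    """Return the original image followed by n_crops random sub-images of it."""
    img_h, img_w = len(image), len(image[0])
    boxes = [
        random_crop_box(img_w, img_h, width_range, height_range, rng)
        for _ in range(n_crops)
    ]
    return [image] + [crop(image, box) for box in boxes]
```

Calling `make_training_samples(image, 3, (16, 32), (8, 24))` on a 64 by 48 image would yield four training samples: the original image and three sub-images whose widths lie in [16, 32] and heights in [8, 24].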
Optionally, a plurality of candidate sample pairs are determined from the training sample set, each candidate sample pair comprising two training samples obtained based on different original images in the same original image set; the similarity of each candidate sample pair is determined; and each candidate sample pair whose similarity is greater than a similarity threshold is determined to be a positive sample pair.
Optionally, for each candidate sample pair, the feature vectors of the two training samples in the candidate sample pair are processed with a similarity measurement algorithm to obtain the similarity of the candidate sample pair.
Optionally, a number of negative sample pairs equal to the number of positive sample pairs is determined from the training sample set.
Optionally, the true value of each positive sample pair is marked as 1 and the true value of each negative sample pair is marked as 0, and the image processing model is trained using the marked positive sample pairs and the marked negative sample pairs.
In another aspect, an image classification method is provided, the method comprising: acquiring a target image to be classified; and inputting the target image into an image classification model to obtain the category of the target image output by the image classification model, wherein the image classification model is trained by the training method of the image processing model according to the above aspect.
Optionally, the target image is input into the image classification model to obtain the similarity, output by the image classification model, between the target image and reference images of different categories, and the category of the reference image with the highest similarity to the target image, among the reference images of different categories, is determined as the category of the target image.
Optionally, inputting a target image into an image classifi