US-12620208-B2 - Facial beauty prediction method, apparatus, device and storage medium

Abstract

Embodiments of the disclosure disclose a method and an apparatus for facial beauty prediction, a device and a storage medium. The method includes the steps of: classifying a training set into a plurality of first images with noise labels and a plurality of second images with non-noise labels; performing a re-weighting processing on the plurality of the second images; training a first model by using a target training set to obtain a second model; labeling image data by using the second model to obtain first data with labels and second data without labels; generating third data with pseudo labels through a classifier by using the second data; training the classifier according to the first data, the second data and the third data; and processing an image to be predicted through a model with a target classifier to obtain a facial beauty prediction result.
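The labeling and pseudo-labeling steps summarized above can be illustrated with a minimal sketch. It assumes that the second model outputs class probabilities and that a confidence threshold separates "first data with labels" from "second data without labels"; the source does not say how that split is made, so the threshold, the function names `split_by_confidence` and `pseudo_label`, and the use of NumPy are all hypothetical.

```python
import numpy as np


def split_by_confidence(probs, threshold=0.9):
    """Split model outputs into labeled and unlabeled samples.

    Assumption (not stated in the source): a sample counts as labeled
    when the model's top class probability reaches `threshold`.
    Returns (labeled indices, their labels, unlabeled indices).
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)   # top probability per sample
    labels = probs.argmax(axis=1)    # most likely class per sample
    labeled_idx = np.flatnonzero(confidence >= threshold)
    unlabeled_idx = np.flatnonzero(confidence < threshold)
    return labeled_idx, labels[labeled_idx], unlabeled_idx


def pseudo_label(classifier_probs):
    """Assign each unlabeled sample the classifier's most likely class.

    Assumption: the pseudo label is a plain argmax over the
    classifier's class probabilities.
    """
    return np.asarray(classifier_probs, dtype=float).argmax(axis=1)
```

In this sketch the classifier would then be trained on the confidently labeled samples together with the unlabeled samples and their pseudo labels, repeating until it converges.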

Inventors

Junying GAN
Jianqiang Liu
Huicong LI
Junling XIONG
Xiaoshan XIE
Heng Luo

Assignees

WUYI UNIVERSITY

Dates

Publication Date: 2026-05-05
Application Date: 2023-05-22
Priority Date: 2023-05-10

Claims (20)

1 . A method for facial beauty prediction, comprising: classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels; performing a re-weighting processing on the second images; forming a target training set by the plurality of the first images and the plurality of the re-weighted second images, and training a first model by using the target training set to obtain a second model; performing a labeling processing on image data by using the second model to obtain first data with labels and second data without labels; training the second data through a classifier to generate third data with pseudo labels; training the classifier according to the first data, the second data and the third data to obtain a target classifier; and performing facial beauty prediction on an image to be predicted through a facial beauty prediction model with the target classifier to obtain a facial beauty prediction result.
2 . The method for facial beauty prediction according to claim 1 , wherein, the classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels comprises: performing a probability calculation according to the training set of the facial beauty prediction images to obtain a plurality of probability values each of which indicates whether a corresponding facial beauty prediction image has a noise label; obtaining a joint distribution of the training set according to the plurality of probability values; and classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
3 . The method for facial beauty prediction according to claim 1 , wherein, the classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels comprises: obtaining an expected risk of classification according to the joint distribution of the training set, a real distribution of the training set, a weight of the facial beauty prediction image and a loss function; and classifying the training set according to the expected risk to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
4 . The method for facial beauty prediction according to claim 3 , wherein, the weight of the facial beauty prediction image is determined by the joint distribution of the training set and a noise rate.
5 . The method for facial beauty prediction according to claim 4 , wherein, the noise rate is a minimum value of the joint distribution of the training set within a preset range.
6 . The method for facial beauty prediction according to claim 4 , wherein, the weight is not negative in response to a value of the joint distribution of the training set being not equal to 0, and the weight is equal to 0 in response to the value of the joint distribution of the training set being equal to 0.
7 . The method for facial beauty prediction according to claim 1 , wherein, the training the classifier according to the first data, the second data and the third data to obtain a target classifier comprises: training the classifier according to the first data, the second data and the third data until the classifier converges to the target classifier.
8 . An electronic device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, wherein, the computer program, when executed by the processor, causes the processor to perform a method for facial beauty prediction comprising: classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels; performing a re-weighting processing on the second images; forming a target training set by the plurality of the first images and the plurality of the re-weighted second images, and training a first model by using the target training set to obtain a second model; performing a labeling processing on image data by using the second model to obtain first data with labels and second data without labels; training the second data through a classifier to generate third data with pseudo labels; training the classifier according to the first data, the second data and the third data to obtain a target classifier; and performing facial beauty prediction on an image to be predicted through a facial beauty prediction model with the target classifier to obtain a facial beauty prediction result.
9 . The electronic device according to claim 8 , wherein, the classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels comprises: performing a probability calculation according to the training set of the facial beauty prediction images to obtain a plurality of probability values each of which indicates whether a corresponding facial beauty prediction image has a noise label; obtaining a joint distribution of the training set according to the plurality of probability values; and classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
10 . The electronic device according to claim 8 , wherein, the classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels comprises: obtaining an expected risk of classification according to the joint distribution of the training set, a real distribution of the training set, a weight of the facial beauty prediction image and a loss function; and classifying the training set according to the expected risk to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
11 . The electronic device according to claim 10 , wherein, the weight of the facial beauty prediction image is determined by the joint distribution of the training set and a noise rate.
12 . The electronic device according to claim 11 , wherein, the noise rate is a minimum value of the joint distribution of the training set within a preset range.
13 . The electronic device according to claim 11 , wherein, the weight is not negative in response to a value of the joint distribution of the training set being not equal to 0, and the weight is equal to 0 in response to the value of the joint distribution of the training set being equal to 0.
14 . The electronic device according to claim 8 , wherein, the training the classifier according to the first data, the second data and the third data to obtain a target classifier comprises: training the classifier according to the first data, the second data and the third data until the classifier converges to the target classifier.
15 . A non-transitory computer-readable storage medium storing computer-executable instructions which, when executed by a processor, causes the processor to perform a method for facial beauty prediction comprising: classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels; performing a re-weighting processing on the second images; forming a target training set by the plurality of the first images and the plurality of the re-weighted second images, and training a first model by using the target training set to obtain a second model; performing a labeling processing on image data by using the second model to obtain first data with labels and second data without labels; training the second data through a classifier to generate third data with pseudo labels; training the classifier according to the first data, the second data and the third data to obtain a target classifier; and performing facial beauty prediction on an image to be predicted through a facial beauty prediction model with the target classifier to obtain a facial beauty prediction result.
16 . The non-transitory computer-readable storage medium according to claim 15 , wherein, the classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels comprises: performing a probability calculation according to the training set of the facial beauty prediction images to obtain a plurality of probability values each of which indicates whether a corresponding facial beauty prediction image has a noise label; obtaining a joint distribution of the training set according to the plurality of probability values; and classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
17 . The non-transitory computer-readable storage medium according to claim 15 , wherein, the classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels comprises: obtaining an expected risk of classification according to the joint distribution of the training set, a real distribution of the training set, a weight of the facial beauty prediction image and a loss function; and classifying the training set according to the expected risk to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.
18 . The non-transitory computer-readable storage medium according to claim 17 , wherein, the weight of the facial beauty prediction image is determined by the joint distribution of the training set and a noise rate.
19 . The non-transitory computer-readable storage medium according to claim 18 , wherein, the noise rate is a minimum value of the joint distribution of the training set within a preset range.
20 . The non-transitory computer-readable storage medium according to claim 18 , wherein, the weight is not negative in response to a value of the joint distribution of the training set being not equal to 0, and the weight is equal to 0 in response to the value of the joint distribution of the training set being equal to 0.
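The weighting conditions recited in claims 4 to 6 can be illustrated with a small sketch. The claims state only that the weight depends on the joint distribution and a noise rate, that it is non-negative where the distribution value is non-zero, and that it is 0 where the value is 0. They give no formula, so the ratio used below is an assumption borrowed from the importance-reweighting form commonly used for noisy labels in binary classification. The names `estimate_noise_rate` and `importance_weights`, the parameters `rho_flip` and `rho_total`, and the reading of "preset range" as an interval of probability values are all hypothetical.

```python
import numpy as np


def estimate_noise_rate(p, low=0.0, high=1.0):
    """Return the smallest probability value inside [low, high].

    Assumption: "minimum value ... within a preset range" is read as
    the minimum over the values that fall in a preset interval.
    """
    p = np.asarray(p, dtype=float)
    in_range = p[(p >= low) & (p <= high)]
    if in_range.size == 0:
        raise ValueError("no probability values inside the preset range")
    return float(in_range.min())


def importance_weights(p, rho_flip, rho_total):
    """Compute per-image weights from noisy-label probabilities.

    p         : probability of the observed label for each image
    rho_flip  : noise rate of flips into the observed label
    rho_total : sum of both class noise rates, required to be < 1

    Assumed form: w = max(p - rho_flip, 0) / ((1 - rho_total) * p)
    where p > 0, and w = 0 where p == 0.
    """
    p = np.asarray(p, dtype=float)
    if not 0.0 <= rho_total < 1.0:
        raise ValueError("rho_total must lie in [0, 1)")
    weights = np.zeros_like(p)
    nonzero = p > 0
    numerator = np.maximum(p[nonzero] - rho_flip, 0.0)
    weights[nonzero] = numerator / ((1.0 - rho_total) * p[nonzero])
    return weights
```

Clamping the numerator at zero keeps every weight non-negative, and the mask leaves the weight at 0 wherever the probability is 0, which matches the two conditions in claim 6.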

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a national stage filing under 35 U.S.C. § 371 of international application number PCT/CN2023/095555, filed May 22, 2023, which claims priority to Chinese patent application No. 202310525485.5 filed May 10, 2023. The entire contents of these applications are incorporated herein by reference.

FIELD OF THE INVENTION

Embodiments of the disclosure relate to, but are not limited to, the field of image recognition, in particular to a method and an apparatus for facial beauty prediction, a device and a storage medium.

BACKGROUND OF THE INVENTION

Related facial beauty prediction methods usually require a large amount of labeled data for model training. Meanwhile, when a sample is labeled, label quality is affected by subjective factors in manual or machine data labeling, by the tools used, and by other factors, which introduces label noise. Label noise can greatly reduce the accuracy of a model and thereby degrade the facial beauty prediction effect.

SUMMARY OF THE INVENTION

The following is a summary of the subject matter described in detail herein. The summary is not intended to limit the scope of protection of the claims.

The embodiments of the disclosure disclose a method and an apparatus for facial beauty prediction, a device and a storage medium, which can weaken the dependence of a model on noise labels and improve the utilization of unlabeled data.

According to an embodiment of a first aspect of the present disclosure, a facial beauty prediction method may include the following steps. A training set of facial beauty prediction images is classified to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels. A re-weighting processing is performed on the plurality of the second images.
A target training set is formed by the plurality of the first images and the plurality of the re-weighted second images, and a first model is trained by using the target training set to obtain a second model. A labeling processing is performed on image data by using the second model to obtain first data with labels and second data without labels. The second data is trained through a classifier to generate third data with pseudo labels. The classifier is trained according to the first data, the second data and the third data to obtain a target classifier. A facial beauty prediction is performed on an image to be predicted through a facial beauty prediction model with the target classifier to obtain a facial beauty prediction result.

In some embodiments of the first aspect of the present disclosure, the classifying a training set of facial beauty prediction images to obtain a plurality of first images with noise labels and a plurality of second images with non-noise labels may include the following sub-steps. A probability calculation is performed according to the training set of the facial beauty prediction images to obtain a plurality of probability values, each of which indicates whether a corresponding facial beauty prediction image has a noise label. A joint distribution of the training set is obtained according to the plurality of probability values. The training set is classified according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.

In some embodiments of the first aspect of the present disclosure, the classifying the training set according to the joint distribution to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels may include the following sub-steps.
An expected risk of classification is obtained according to the joint distribution of the training set, a real distribution of the training set, a weight of the facial beauty prediction image and a loss function. The training set is classified according to the expected risk to obtain the plurality of first images with noise labels and the plurality of second images with non-noise labels.

In some embodiments of the first aspect of the present disclosure, the weight of the facial beauty prediction image is determined by the joint distribution of the training set and a noise rate.

In some embodiments of the first aspect of the present disclosure, the noise rate is a minimum value of the joint distribution of the training set within a preset range.

In some embodiments of the first aspect of the present disclosure, when a value of the joint distribution of the training set is not equal to 0, the weight is not negative; when the value of the joint distribution of the training set is equal to 0, the weight is equal to 0.

In some embodiments of the first aspect of the present disclosure, the training the classifier according to the first data, the second data and the third data to obtain a target classifier may include the following sub-step. The classifier is trained according to the first data, the secon