CN-122023568-A - Image face changing method, device and equipment
Abstract
The application provides an image face-changing method, device and equipment, which include: obtaining a source image and a target image, wherein the source image comprises a first face of a first object and the target image comprises a second face of a second object; performing three-dimensional reconstruction on the first face to obtain three-dimensional facial features of the first face; extracting expression features of the first face and the second face respectively, to correspondingly obtain a first expression feature of the first face and a second expression feature of the second face; performing feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature; and decoding the three-dimensional expression driving feature to obtain a face-changing image. The application can improve the output quality of the face-changing image.
Inventors
- Wang Yibo
- Liu Gongye
- Cai Chengfei
- Tang Shiyu
- Pan Heng
- Zhang Shixue
Assignees
- Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2024-11-01
Claims (15)
- 1. An image face-changing method, the method comprising: acquiring a source image and a target image, wherein the source image comprises a first face of a first object, and the target image comprises a second face of a second object; performing three-dimensional reconstruction on the first face to obtain three-dimensional facial features of the first face; extracting expression features of the first face and the second face respectively, and correspondingly obtaining a first expression feature of the first face and a second expression feature of the second face; performing feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature; and decoding the three-dimensional expression driving feature to obtain a face-changing image.
- 2. The method according to claim 1, wherein the performing feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature comprises: extracting driving parameters of the first expression feature to obtain first driving parameters of the first face; extracting driving parameters of the second expression feature to obtain second driving parameters of the second face; and performing feature mapping on the three-dimensional facial features based on the first driving parameters and the second driving parameters to obtain the three-dimensional expression driving feature.
- 3. The method according to claim 2, wherein the performing feature mapping on the three-dimensional facial features based on the first driving parameters and the second driving parameters to obtain the three-dimensional expression driving feature comprises: performing first feature mapping on the three-dimensional facial features through the first driving parameters to obtain three-dimensional facial features with the first face in an original state; and performing second feature mapping on the three-dimensional facial features in the original state through the second driving parameters to obtain the three-dimensional expression driving feature.
- 4. The method according to claim 3, wherein the decoding the three-dimensional expression driving feature to obtain a face-changing image comprises: decoding the three-dimensional expression driving feature to obtain face-changing features; performing background coding on a target background to obtain background features; and performing feature fusion on the face-changing features and the background features to obtain the face-changing image.
- 5. The method according to any one of claims 1 to 4, wherein the image face-changing method is implemented by an image face-changing model, the image face-changing model being trained by: obtaining sample data, wherein the sample data comprises a first sample source image, a first sample target image and a truth image; performing first data enhancement on the first sample source image to obtain a second sample source image; performing second data enhancement on the first sample target image to obtain a second sample target image; performing face-changing processing on the second sample source image according to the second sample target image through an image face-changing model to be trained, to obtain a sample face-changing image; performing loss calculation based on the truth image and the sample face-changing image to obtain a loss result; and updating model parameters in the image face-changing model based on the loss result to obtain a trained image face-changing model.
- 6. The method of claim 5, wherein the first sample source image comprises a first sample face of a first sample object and the first sample target image comprises a second sample face of a second sample object; and the performing first data enhancement on the first sample source image to obtain a second sample source image comprises: extracting facial attributes of the first sample face and the second sample face respectively, and correspondingly obtaining a first facial attribute set and a second facial attribute set; and performing first image transformation on the first sample source image based on the first facial attribute set and the second facial attribute set to obtain the second sample source image.
- 7. The method of claim 6, wherein the performing second data enhancement on the first sample target image to obtain a second sample target image comprises: extracting face key points of the first sample face and the second sample face respectively, and correspondingly obtaining a first face key point set and a second face key point set; and performing second image transformation on the first sample target image based on the first face key point set and the second face key point set to obtain the second sample target image.
- 8. The method of claim 7, wherein the image face-changing model comprises a three-dimensional feature encoder, an expression encoder, a driving encoder, and a decoder, the second sample source image comprises a third sample face, and the second sample target image comprises a fourth sample face; and the performing face-changing processing on the second sample source image according to the second sample target image through the image face-changing model to be trained to obtain a sample face-changing image comprises: performing three-dimensional reconstruction on the third sample face through the three-dimensional feature encoder to obtain three-dimensional sample facial features of the third sample face; extracting expression features of the third sample face and the fourth sample face through the expression encoder respectively, and correspondingly obtaining a first sample expression feature of the third sample face and a second sample expression feature of the fourth sample face; performing feature conversion on the three-dimensional sample facial features based on the first sample expression feature and the second sample expression feature through the driving encoder to obtain a three-dimensional sample expression driving feature; and decoding the three-dimensional sample expression driving feature through the decoder to obtain the sample face-changing image.
- 9. The method according to claim 8, wherein the performing loss calculation based on the truth image and the sample face-changing image to obtain a loss result comprises: performing generative-adversarial loss calculation based on the truth image and the sample face-changing image to obtain a first loss value; performing line-of-sight loss calculation based on the truth image and the sample face-changing image to obtain a second loss value; performing pixel loss calculation based on the truth image and the sample face-changing image to obtain a third loss value; performing expression loss calculation based on M third sample source images and N third sample target images to obtain a fourth loss value, wherein M and N are integers greater than 1; and performing loss value fusion on the first loss value, the second loss value, the third loss value and the fourth loss value to obtain the loss result.
- 10. The method according to claim 9, wherein the performing generative-adversarial loss calculation based on the truth image and the sample face-changing image to obtain a first loss value comprises: performing feature mapping on the sample face-changing image to obtain a first discrimination probability representing that the sample face-changing image is a natural image; performing feature mapping on the truth image to obtain a second discrimination probability representing that the truth image is a natural image; and performing generative-adversarial loss calculation according to the first discrimination probability and the second discrimination probability to obtain the first loss value.
- 11. The method according to claim 9, wherein the performing line-of-sight loss calculation based on the truth image and the sample face-changing image to obtain a second loss value comprises: extracting line-of-sight features of the sample face-changing image to obtain a first line-of-sight feature of the sample face-changing image; extracting line-of-sight features of the truth image to obtain a second line-of-sight feature of the truth image; and performing line-of-sight loss calculation based on the first line-of-sight feature and the second line-of-sight feature to obtain the second loss value.
- 12. The method according to claim 9, wherein the performing pixel loss calculation based on the truth image and the sample face-changing image to obtain a third loss value comprises: extracting pixel features of the sample face-changing image to obtain a first pixel set of the sample face-changing image; extracting pixel features of the truth image to obtain a second pixel set of the truth image; and performing pixel loss calculation based on the first pixel set and the second pixel set to obtain the third loss value.
- 13. The method according to claim 9, wherein the performing expression loss calculation based on the M third sample source images and the N third sample target images to obtain a fourth loss value comprises: performing data enhancement on the M third sample source images and the N third sample target images respectively, to correspondingly obtain M fourth sample source images and N fourth sample target images, wherein the M fourth sample source images comprise M fifth sample faces respectively corresponding to M third sample objects, and the N fourth sample target images comprise N sixth sample faces respectively corresponding to N fourth sample objects; performing three-dimensional reconstruction on the M fifth sample faces to obtain three-dimensional sample facial features of each of the M fifth sample faces; extracting expression features of the M fifth sample faces and the N sixth sample faces respectively, and correspondingly obtaining third sample expression features of each of the M fifth sample faces and fourth sample expression features of each of the N sixth sample faces; extracting driving parameters of the M third sample expression features and the N fourth sample expression features respectively, to correspondingly obtain M third driving parameters and N fourth driving parameters; and performing expression loss calculation based on the M three-dimensional sample facial features, the M third driving parameters and the N fourth driving parameters to obtain the fourth loss value.
- 14. An image face-changing apparatus, comprising: an acquisition module, configured to acquire a source image and a target image, wherein the source image comprises a first face of a first object, and the target image comprises a second face of a second object; a three-dimensional reconstruction module, configured to perform three-dimensional reconstruction on the first face to obtain three-dimensional facial features of the first face; an expression feature extraction module, configured to extract expression features of the first face and the second face respectively, and correspondingly obtain a first expression feature of the first face and a second expression feature of the second face; a feature conversion module, configured to perform feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature; and a decoding module, configured to decode the three-dimensional expression driving feature to obtain a face-changing image.
- 15. An electronic device, comprising: a memory for storing computer-executable instructions; and a processor for implementing the image face-changing method of any one of claims 1 to 13 when executing the computer-executable instructions stored in the memory.
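The training losses of claims 9 to 12 amount to four scalar terms fused by a weighting step. The following is a minimal illustrative sketch only: the patent does not disclose its discriminator, gaze encoder, or fusion weights, so every function body below is a hypothetical stand-in (a non-saturating log loss, mean-squared gaze distance, and mean-absolute pixel error), not the disclosed implementation.

```python
import numpy as np

def adversarial_loss(p_fake: float, p_real: float) -> float:
    """Claim 10 sketch: loss built from the probability that the swapped
    image (p_fake) and the truth image (p_real) are judged natural."""
    eps = 1e-8  # numerical guard against log(0)
    return float(-(np.log(p_real + eps) + np.log(1.0 - p_fake + eps)))

def gaze_loss(g_fake: np.ndarray, g_true: np.ndarray) -> float:
    """Claim 11 sketch: distance between the two line-of-sight features."""
    return float(np.mean((g_fake - g_true) ** 2))

def pixel_loss(img_fake: np.ndarray, img_true: np.ndarray) -> float:
    """Claim 12 sketch: mean absolute error over the two pixel sets."""
    return float(np.mean(np.abs(img_fake - img_true)))

def fuse_losses(losses, weights) -> float:
    """Claim 9 sketch: loss-value fusion as a plain weighted sum."""
    return float(sum(w * l for w, l in zip(weights, losses)))
```

A call such as `fuse_losses([l_adv, l_gaze, l_pix, l_expr], [1.0, 0.1, 10.0, 1.0])` would produce the fused loss result; the weight values here are arbitrary illustrations.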
Description
Image face changing method, device and equipment

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, and a device for changing a face in an image.

Background

The image face-changing technique refers to a technique of replacing a face in one image with a face in another image by algorithmic processing, thereby visually effecting an "identity transformation". Image face-changing technology is widely used in various application scenarios, such as film and television production, game character design, virtual avatars, and privacy protection. For example, in film and television production, when an actor cannot complete a stunt action, a stunt professional can perform the action first, and the professional's face can later be automatically replaced with the actor's face by using a face-changing algorithm. With a virtual avatar, a user can change the face of a virtual character by using a face-changing algorithm, which improves the interest of live broadcasting and protects personal privacy. However, in the related art, image face-changing technology mainly focuses on two-dimensional image processing and elementary three-dimensional modeling, and although these methods perform acceptably in some application scenarios, they have obvious limitations in real-time face-changing scenarios and in dynamic expression capture. In particular, in highly dynamic interactive environments, such as video telephony and virtual reality applications, the related art cannot achieve highly natural and realistic effects.

Disclosure of Invention

The embodiment of the application provides an image face-changing method, device and equipment, which can improve the output quality of a face-changing image.
The technical scheme of the embodiment of the application is realized as follows. The embodiment of the application provides an image face-changing method, which comprises: obtaining a source image and a target image, wherein the source image comprises a first face of a first object and the target image comprises a second face of a second object; performing three-dimensional reconstruction on the first face to obtain three-dimensional facial features of the first face; performing expression feature extraction on the first face and the second face respectively, to correspondingly obtain a first expression feature of the first face and a second expression feature of the second face; performing feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature; and decoding the three-dimensional expression driving feature to obtain a face-changing image.
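The inference pipeline summarized above can be sketched end to end. This is a toy sketch under stated assumptions: every encoder and decoder below is a placeholder linear map, and the names `reconstruct_3d`, `expression_features`, `drive`, and `decode` are illustrative inventions, not identifiers from the patent.

```python
import numpy as np

# Placeholder 16-dimensional feature space with random linear "networks".
rng = np.random.default_rng(0)
W3D, WEXPR, WDRIVE, WDEC = (rng.standard_normal((16, 16)) * 0.1 for _ in range(4))

def reconstruct_3d(face: np.ndarray) -> np.ndarray:
    """Three-dimensional reconstruction of the source face (placeholder)."""
    return face @ W3D

def expression_features(face: np.ndarray) -> np.ndarray:
    """Expression feature extraction for either face (placeholder)."""
    return face @ WEXPR

def drive(feat3d, expr_src, expr_tgt) -> np.ndarray:
    """Feature conversion: remove the source expression's contribution and
    add the target expression's, yielding the expression driving feature."""
    return (feat3d - expr_src @ WDRIVE) + expr_tgt @ WDRIVE

def decode(driven: np.ndarray) -> np.ndarray:
    """Decode the driving feature into the face-changing output."""
    return driven @ WDEC

source_face = rng.standard_normal(16)
target_face = rng.standard_normal(16)
feat3d = reconstruct_3d(source_face)
swapped = decode(drive(feat3d,
                       expression_features(source_face),
                       expression_features(target_face)))
```

One property of this additive sketch: driving a face with its own expression features leaves the decoded output unchanged, which matches the intuition that only the expression, not the identity, is swapped.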
The embodiment of the application provides an image face-changing device, which comprises an acquisition module, a three-dimensional reconstruction module, an expression feature extraction module, a feature conversion module, and a decoding module. The acquisition module acquires a source image and a target image, wherein the source image comprises a first face of a first object and the target image comprises a second face of a second object; the three-dimensional reconstruction module is used for performing three-dimensional reconstruction on the first face to obtain three-dimensional facial features of the first face; the expression feature extraction module is used for performing expression feature extraction on the first face and the second face respectively, to correspondingly obtain a first expression feature of the first face and a second expression feature of the second face; the feature conversion module is used for performing feature conversion on the three-dimensional facial features through the first expression feature and the second expression feature to obtain a three-dimensional expression driving feature; and the decoding module is used for decoding the three-dimensional expression driving feature to obtain a face-changing image. In the above scheme, the feature conversion module is further used for extracting driving parameters of the first expression feature to obtain first driving parameters of the first face, extracting driving parameters of the second expression feature to obtain second driving parameters of the second face, and performing feature mapping on the three-dimensional facial features based on the first driving parameters and the second driving parameters to obtain the three-dimensional expression driving feature.
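The driving-parameter step above compresses each expression feature into a small parameter vector before mapping the three-dimensional facial features. A minimal sketch follows; the dimensions (`FEAT_DIM`, `PARAM_DIM`) and the linear maps are assumptions chosen for illustration and are not disclosed in the patent.

```python
import numpy as np

FEAT_DIM, PARAM_DIM = 16, 4
rng = np.random.default_rng(1)
W_PARAM = rng.standard_normal((FEAT_DIM, PARAM_DIM)) * 0.1   # parameter extractor
W_MOD = rng.standard_normal((2 * PARAM_DIM, FEAT_DIM)) * 0.1  # mapping weights

def driving_parameters(expr_feature: np.ndarray) -> np.ndarray:
    """Extract a compact driving-parameter vector from one expression feature."""
    return expr_feature @ W_PARAM

def map_features(feat3d: np.ndarray,
                 params_first: np.ndarray,
                 params_second: np.ndarray) -> np.ndarray:
    """Map the 3D facial feature with both driving-parameter sets to
    obtain the 3D expression driving feature (additive offset model)."""
    offset = np.concatenate([params_first, params_second]) @ W_MOD
    return feat3d + offset
```

Compressing to a low-dimensional parameter vector before mapping is a common design choice for expression control; whether the patented model uses this exact factorization is not stated.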
In the above scheme, the feature conversion module is further configured to perform first feature mapping on the three-dimensional facial features through the first driving parameters to obtain three-dimensional facial features with the first face in an original state, and to perform second feature mapping on the three-dimensional facial features with the first face in the original state through the second driving parameters to obtain the three-dimensional expression driving feature.
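The two-stage mapping of claim 3 can be sketched as: the first mapping removes the source expression, returning the face to an "original" (neutral) state, and the second mapping applies the target's driving parameters. The additive-offset model and the expression basis matrix below are illustrative assumptions, not the disclosed mappings.

```python
import numpy as np

rng = np.random.default_rng(2)
BASIS = rng.standard_normal((4, 16)) * 0.1  # hypothetical expression basis

def to_original_state(feat3d: np.ndarray, params_src: np.ndarray) -> np.ndarray:
    """First feature mapping: subtract the source expression offset so the
    face is in the original (expression-free) state."""
    return feat3d - params_src @ BASIS

def apply_target_expression(neutral_feat: np.ndarray,
                            params_tgt: np.ndarray) -> np.ndarray:
    """Second feature mapping: add the target expression offset to obtain
    the 3D expression driving feature."""
    return neutral_feat + params_tgt @ BASIS
```

In this sketch, running both stages with the same face's parameters recovers the input feature, which mirrors how the "original state" is defined as the expression-neutral version of the source face.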