
CN-115375850-B - 3D dense face alignment model construction method, system and readable storage medium

CN115375850B

Abstract

The invention discloses a 3D dense face alignment model construction method, a system and a readable storage medium. The method comprises: preprocessing the face images in a face data set to obtain face deformation information; establishing a multi-scale dual-attention basic block and a similar-information enhancement module, and using the face deformation information to build a 3D face regression model, wherein the 3D face regression model comprises at least the multi-scale dual-attention basic block and the similar-information enhancement module; and supervising and training the 3D face regression model with a preset loss function to obtain the 3D dense face alignment model. Building on the strong fitting capability of deep learning and on face deformation information enhancement, the invention provides an efficient, end-to-end trainable model that further improves the accuracy of the 3D dense face alignment task; the face deformation information is used to enhance the 3D face corresponding to the face in the input image, yielding a more accurate 3D dense face alignment model framework.

Inventors

  • LUO YU
  • YANG CHAOLIN
  • LING JIE

Assignees

  • Guangdong University of Technology (广东工业大学)

Dates

Publication Date
2026-05-08
Application Date
2022-08-29

Claims (4)

  1. A 3D dense face alignment model construction method, characterized by comprising the following steps:
     cropping the face image according to the position information of the face feature points and converting it to a uniform size;
     calculating the three-dimensional vertex information of the face based on target parameters in the face data set and storing it in a UV map to obtain a UV position map;
     performing a difference operation between the UV position map and the projection-transformed average face model to obtain the projection-transformed face deformation information, and storing it in UV-map form, according to the formula:
     D′ = (f·R·S + t) − (f·R·S̄ + t) = f·R·ΔS
     wherein D′ represents the face deformation information after projection transformation, S represents the three-dimensional vertex set of the face, f is a scaling factor, R and t are the rotation matrix and translation vector computed from the target parameters in the face data set, S̄ is the vertex set of the average face model, and ΔS = S − S̄ is the deformation information of the face shape in the face image;
     establishing a multi-scale dual-attention basic block and a similar-information enhancement module, and building a 3D face regression model using the face deformation information, wherein the 3D face regression model consists of two Encoder-Decoder networks in parallel;
     generating supervision data for network training based on the three-dimensional vertex information of the face;
     using a loss function L1 to supervise the regression result of the 3D face branch network and a loss function L2 to supervise the regression result of the face deformation information branch network:
     Lk = Σ(u,v) ‖P(u, v) − P̃(u, v)‖ · ω(u, v),  k ∈ {1, 2}
     wherein P represents the output of the 3D face branch or of the face deformation information branch, H and W denote the height and width of the UV map over which (u, v) ranges, P(u, v) is the model-predicted three-dimensional coordinate at UV position (u, v), the tilde (˜) denotes the supervision data of the corresponding item, and ω(u, v) is the preset weight at position (u, v);
     using a loss function L3 to supervise the alignment result of the 3D face on the face alignment task;
     and supervising and training the 3D face regression model to obtain the 3D dense face alignment model, wherein the total loss function L is computed as:
     L = λ1·L1 + λ2·L2 + λ3·L3, with loss factors λ1 = 1, λ2 = 1, λ3 = 0.5;
     and the constraint term L3 applied to the key feature points (landmarks) of the regressed 3D face is expressed as:
     L3 = (1/N) Σ(i = 1…N) ‖P(ui, vi) − P̃(ui, vi)‖,  N = 68
     wherein P(ui, vi) represents the model-predicted three-dimensional coordinates of the landmark points on the 3D face at UV space position (ui, vi), and P̃(ui, vi) represents the three-dimensional coordinates of the landmark points on the 3D face at the corresponding UV space position in the supervision data.
  2. The method for constructing a 3D dense face alignment model according to claim 1, wherein establishing the multi-scale dual-attention basic block and the similar-information enhancement module specifically includes: the multi-scale dual-attention basic block is formed by combining a 1×1 convolution block, a 3×3 convolution block, a 5×5 convolution block, an SE channel attention module and an SGE spatial grouped attention module; after the projection-transformed face deformation information is stored in UV-map form, establishing the similar-information enhancement module includes the following steps: multiplying the projection-transformed face deformation information with the three-dimensional vertex set of the face to obtain an inner product, and feeding the inner-product result to a Sigmoid activation module to obtain a similarity probability distribution; taking the inner product of the similarity probability distribution and the three-dimensional vertex set of the face to obtain similar information points on the three-dimensional vertex set, and adding these similar information points to the three-dimensional vertex set of the face to complete the establishment of the similar-information enhancement module.
  3. A 3D dense face alignment model construction system, characterized by comprising a memory and a processor, wherein the memory contains a 3D dense face alignment model construction method program which, when executed by the processor, implements the following steps:
     cropping the face image according to the position information of the face feature points and converting it to a uniform size;
     calculating the three-dimensional vertex information of the face based on target parameters in the face data set and storing it in a UV map to obtain a UV position map;
     performing a difference operation between the UV position map and the projection-transformed average face model to obtain the projection-transformed face deformation information, and storing it in UV-map form, according to the formula:
     D′ = (f·R·S + t) − (f·R·S̄ + t) = f·R·ΔS
     wherein D′ represents the face deformation information after projection transformation, S represents the three-dimensional vertex set of the face, f is a scaling factor, R and t are the rotation matrix and translation vector computed from the target parameters in the face data set, S̄ is the vertex set of the average face model, and ΔS = S − S̄ is the deformation information of the face shape in the face image;
     establishing a multi-scale dual-attention basic block and a similar-information enhancement module, and building a 3D face regression model using the face deformation information, wherein the 3D face regression model consists of two Encoder-Decoder networks in parallel;
     generating supervision data for network training based on the three-dimensional vertex information of the face;
     using a loss function L1 to supervise the regression result of the 3D face branch network and a loss function L2 to supervise the regression result of the face deformation information branch network:
     Lk = Σ(u,v) ‖P(u, v) − P̃(u, v)‖ · ω(u, v),  k ∈ {1, 2}
     wherein P represents the output of the 3D face branch or of the face deformation information branch, H and W denote the height and width of the UV map over which (u, v) ranges, P(u, v) is the model-predicted three-dimensional coordinate at UV position (u, v), the tilde (˜) denotes the supervision data of the corresponding item, and ω(u, v) is the preset weight at position (u, v);
     using a loss function L3 to supervise the alignment result of the 3D face on the face alignment task;
     and supervising and training the 3D face regression model to obtain the 3D dense face alignment model, wherein the total loss function L is computed as:
     L = λ1·L1 + λ2·L2 + λ3·L3, with loss factors λ1 = 1, λ2 = 1, λ3 = 0.5;
     and the constraint term L3 applied to the key feature points (landmarks) of the regressed 3D face is expressed as:
     L3 = (1/N) Σ(i = 1…N) ‖P(ui, vi) − P̃(ui, vi)‖,  N = 68
     wherein P(ui, vi) represents the model-predicted three-dimensional coordinates of the landmark points on the 3D face at UV space position (ui, vi), and P̃(ui, vi) represents the three-dimensional coordinates of the landmark points on the 3D face at the corresponding UV space position in the supervision data.
  4. A computer-readable storage medium, characterized in that the computer-readable storage medium contains a 3D dense face alignment model construction method program which, when executed by a processor, implements the steps of the 3D dense face alignment model construction method according to any one of claims 1 and 2.
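The loss terms in claims 1 and 3 can be sketched in a few lines. This is a minimal illustration, assuming the predictions and supervision data are stored as H × W × 3 UV maps and that the per-pixel weight map and the 68 landmark UV indices are provided; the function and variable names are illustrative, not taken from the patent:

```python
import numpy as np

def weighted_uv_loss(pred, gt, weight):
    """L1/L2-style term: mean weighted Euclidean error over an H x W UV map.

    pred, gt: (H, W, 3) predicted / ground-truth 3D coordinates stored as a UV map.
    weight:   (H, W) preset per-pixel weight omega(u, v).
    """
    err = np.linalg.norm(pred - gt, axis=-1)  # (H, W) per-pixel distance
    return float(np.mean(err * weight))

def landmark_loss(pred_uv, gt_uv, landmark_uv):
    """L3: average error over the N = 68 landmark positions in UV space."""
    pts = np.array(landmark_uv)               # (N, 2) integer (u, v) indices
    diff = pred_uv[pts[:, 1], pts[:, 0]] - gt_uv[pts[:, 1], pts[:, 0]]
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

def total_loss(l1, l2, l3, lambdas=(1.0, 1.0, 0.5)):
    """L = lambda1*L1 + lambda2*L2 + lambda3*L3 with the claimed factors 1, 1, 0.5."""
    return lambdas[0] * l1 + lambdas[1] * l2 + lambdas[2] * l3
```

The weight map lets the supervision emphasize semantically important regions (e.g. eyes, nose, mouth) over the rest of the UV map, which is why it appears as a separate preset input rather than being learned.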

Description

3D dense face alignment model construction method, system and readable storage medium

Technical Field

The invention relates to the technical field of computer vision, and in particular to a 3D dense face alignment model construction method, system and readable storage medium.

Background

The initial face alignment task generally refers to locating the key 2D facial feature points (landmarks) of an input face image. However, because 2D face alignment methods cannot detect landmarks that are invisible in the input image, researchers have increasingly shifted their attention beyond them. With the development of 3D face alignment, the task is no longer limited to detecting landmarks on the face but extends to detecting all characteristic points on the whole face, that is, establishing a dense point correspondence on the 3D face. 3D dense face alignment plays a major role in many fields such as face recognition, face tracking, face animation, and human-computer interaction. Existing 3D dense face alignment methods can be roughly divided into two types. The first type obtains the corresponding geometric information by fitting a pre-registered 3D face model such as 3DMM; these are also called model-based methods. The 3DMM summarizes the whole face as a linear combination of face shape and face texture obtained through PCA dimensionality reduction, yielding a 3D face template. The original traditional methods achieved dense correspondence between the 3D face and the input face image by computing the loss between the 3D face template and the input image in an analysis-by-synthesis manner. The rapid development of deep learning in recent years has greatly optimized the fitting pipeline of model-based methods.
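The 3DMM-style linear model described above can be sketched as follows: a face shape is the mean shape plus linear combinations of PCA bases, which is then placed in the image by a weak-perspective transform. This is a minimal sketch; the basis and coefficient names are illustrative, not taken from the patent:

```python
import numpy as np

def morphable_shape(mean_shape, id_basis, exp_basis, alpha, beta):
    """3DMM-style linear model: S = S_mean + A_id @ alpha + A_exp @ beta.

    mean_shape: (3n,) flattened mean face vertices.
    id_basis:   (3n, k_id) identity PCA basis.
    exp_basis:  (3n, k_exp) expression PCA basis.
    """
    return mean_shape + id_basis @ alpha + exp_basis @ beta

def weak_perspective(shape, f, R, t):
    """Weak-perspective placement of the shape: f * R @ S + t, per vertex."""
    pts = shape.reshape(-1, 3).T                   # (3, n) column-vertex layout
    return (f * (R @ pts) + t.reshape(3, 1)).T     # back to (n, 3)
```

Fitting such a model means searching for (alpha, beta, f, R, t) that best explain the input image, which is exactly the latent-space limitation the model-free methods below try to escape.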
For example, the 3DDFA method uses a cascaded CNN to directly regress the 3DMM face parameters to solve the 3D dense face alignment problem, and achieves quite good results. However, model-based methods are limited by the linear combination of 3D face templates predefined in the latent space, which restricts their expressive power. Many researchers therefore began using deep neural networks to directly predict concrete representations of the 3D information corresponding to the input face image, such as UV maps and volumes; these are known as model-free methods. PRN is a representative recent model-free method: it uses a ResNet-based encoder-decoder network to predict the UV position map corresponding to the input face image. The UV position map stores the 3D position information of the input face as its RGB values; by regressing the UV position map, the spatial positions of the corresponding 3D vertices of the input face can be trained end-to-end, enabling efficient 3D dense face alignment.

Disclosure of the Invention

The invention aims to provide a 3D dense face alignment model construction method, system and readable storage medium, and a model framework that achieves more accurate 3D dense face alignment by using face deformation information to enhance the 3D face corresponding to the face in the input image.
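The UV position map idea above can be sketched as writing each vertex's (x, y, z) position into the three channels of an image at that vertex's UV coordinate. A real pipeline rasterizes the face's UV triangulation; this minimal sketch uses a nearest-pixel scatter with assumed normalized UV coordinates, so the names and the scatter scheme are illustrative only:

```python
import numpy as np

def build_uv_position_map(vertices, uv_coords, size=256):
    """Write each vertex's 3D position into an H x W x 3 map at its UV coordinate.

    vertices:  (n, 3) 3D vertex positions.
    uv_coords: (n, 2) texture coordinates in [0, 1].
    """
    uv_map = np.zeros((size, size, 3), dtype=np.float32)
    px = np.clip((uv_coords * (size - 1)).round().astype(int), 0, size - 1)
    uv_map[px[:, 1], px[:, 0]] = vertices          # store (x, y, z) as "RGB"
    return uv_map

def read_vertex(uv_map, u, v):
    """Recover the stored 3D position at normalized UV coordinate (u, v)."""
    size = uv_map.shape[0]
    return uv_map[int(round(v * (size - 1))), int(round(u * (size - 1)))]
```

Because the map is a fixed-size image, a standard encoder-decoder can regress it directly, which is what makes end-to-end training of dense 3D vertex positions practical.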
The invention provides a 3D dense face alignment model construction method comprising the following steps: preprocessing a face image in a face data set to obtain face deformation information; establishing a multi-scale dual-attention basic block and a similar-information enhancement module, and using the face deformation information to build a 3D face regression model, wherein the 3D face regression model comprises at least the multi-scale dual-attention basic block and the similar-information enhancement module; and supervising and training the 3D face regression model with a preset loss function to obtain the 3D dense face alignment model. In this scheme, the preprocessing of the face image in the face data set to obtain face deformation information specifically includes: cropping the face image according to the position information of the face feature points so as to convert it to a uniform size; calculating the three-dimensional vertex information of the face based on target parameters in the face data set and storing it in a UV map to obtain a UV position map; and performing projection transformation of the average face model in the face data set according to the target parameters to obtain the projection-transformed face deformation information, and storing the projection-transformed face deformation information in UV-map form. In this scheme, the difference operation is performed between the UV position map and the projection-transformed average face model to obtain the projection-transformed face deformation information, and the