CN-122023669-A - Three-dimensional human body model reconstruction method, system, terminal and storage medium

Abstract

The application discloses a three-dimensional human body model reconstruction method, system, terminal and storage medium. Parameters of a human body parameterized model, parameters of a facial parameterized model and camera parameters are acquired; two-dimensional face key points are extracted to optimize the parameters of the facial parameterized model and construct a fine face mesh; an initial three-dimensional human body mesh is constructed by combining the fine face mesh with the parameters of the human body parameterized model; and second two-dimensional key points and a two-dimensional segmentation mask are extracted. The initial three-dimensional human body mesh is taken as the current three-dimensional human body mesh, and a human body orientation label and its confidence are determined, from which the key point weights of the second two-dimensional key points are determined. The current three-dimensional human body mesh is projected according to the camera parameters to obtain two-dimensional projection key points and a projection contour mask, a loss value is calculated, and an updated three-dimensional human body mesh is obtained through parameter optimization and taken as the current three-dimensional human body mesh; the method then returns to the step of determining the human body orientation label and its confidence until the iteration terminates. In this way, the accuracy of three-dimensional human body model reconstruction can be improved.

Inventors

WANG RONGGANG
ZHENG XIAOYUN
WU JIAHAO
LIAO LIWEI
MENG XIANDONG
GAO WEN

Assignees

Peng Cheng Laboratory (鹏城实验室)

Dates

Publication Date: 2026-05-12
Application Date: 2026-04-03

Claims (10)

1. A three-dimensional human body model reconstruction method, the method comprising: acquiring parameters of a human body parameterized model, parameters of a facial parameterized model and camera parameters according to an input image; extracting, through a first preset model, first two-dimensional key points corresponding to the input image, wherein the first two-dimensional key points comprise two-dimensional face key points; optimizing the parameters of the facial parameterized model according to the two-dimensional face key points, and constructing a fine face mesh according to the optimized parameters; constructing an initial three-dimensional human body mesh according to the fine face mesh and the parameters of the human body parameterized model; extracting, through a second preset model, second two-dimensional key points corresponding to the input image and a two-dimensional segmentation mask indicating a human body region in the input image; taking the initial three-dimensional human body mesh as a current three-dimensional human body mesh, and determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh; determining, according to the human body orientation label and the confidence, the key point weight corresponding to each second two-dimensional key point; projecting the current three-dimensional human body mesh onto an image plane corresponding to the input image according to the camera parameters to obtain two-dimensional projection key points and a projection contour mask; and determining a loss value corresponding to the current three-dimensional human body mesh according to the projection contour mask, the two-dimensional segmentation mask, the two-dimensional projection key points, the second two-dimensional key points and the key point weights, performing parameter optimization on the current three-dimensional human body mesh based on the loss value to obtain an updated three-dimensional human body mesh, taking the updated three-dimensional human body mesh as the current three-dimensional human body mesh, and returning to the step of determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh, until a preset iterative optimization termination condition is met, and then outputting a target three-dimensional human body mesh.
2. The three-dimensional human body model reconstruction method according to claim 1, wherein the first two-dimensional key points further comprise two-dimensional human body key points; and the first preset model comprises a face key point detector for extracting the two-dimensional face key points and a pose estimator for extracting the two-dimensional human body key points.
3. The three-dimensional human body model reconstruction method according to claim 1, wherein optimizing the parameters of the facial parameterized model according to the two-dimensional face key points and constructing a fine face mesh according to the optimized parameters comprises: constructing a three-dimensional face mesh based on the parameters of the facial parameterized model; projecting the three-dimensional face mesh onto the image plane corresponding to the input image according to the camera parameters to obtain projected face key points; and optimizing the parameters of the facial parameterized model according to the position deviation between the projected face key points and the two-dimensional face key points, and returning to the step of constructing the three-dimensional face mesh based on the parameters of the facial parameterized model, until a preset face optimization iteration termination condition is met, to obtain the optimized parameters, and constructing the fine face mesh based on the optimized parameters.
4. The three-dimensional human body model reconstruction method according to claim 2, wherein the human body parameterized model is an SMPL-X model; the parameters of the human body parameterized model comprise body shape parameters, whole-body pose parameters and hand pose parameters; and constructing an initial three-dimensional human body mesh according to the fine face mesh and the parameters of the human body parameterized model comprises: inputting the body shape parameters into an SMPL-X generator to generate an initial mesh; replacing a head region in the initial mesh with the fine face mesh to obtain a static-pose combined mesh; and performing linear blend skinning on the static-pose combined mesh according to the whole-body pose parameters and the hand pose parameters to obtain the initial three-dimensional human body mesh.
5. The three-dimensional human body model reconstruction method according to claim 2, wherein determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh comprises: extracting coordinates of a head center joint point, a left shoulder joint point and a right shoulder joint point in the current three-dimensional human body mesh; calculating shoulder midpoint coordinates according to the coordinates of the left shoulder joint point and the coordinates of the right shoulder joint point; determining a head-to-shoulder depth difference in the current three-dimensional human body mesh according to the shoulder midpoint coordinates and the coordinates of the head center joint point; and determining the human body orientation label from a plurality of preset orientation labels according to the head-to-shoulder depth difference and a preset depth difference threshold, and obtaining the confidence corresponding to the human body orientation label, wherein the plurality of preset orientation labels comprise front, side and back.
6. The three-dimensional human body model reconstruction method according to claim 5, wherein determining, according to the human body orientation label and the confidence, the key point weight corresponding to each second two-dimensional key point comprises: acquiring a weight adjustment factor matched with the human body orientation label; and determining a key point weight coefficient according to the product of the weight adjustment factor and the confidence, and determining the key point weight corresponding to each second two-dimensional key point according to the key point weight coefficient; wherein, if the human body orientation label is front, every second two-dimensional key point corresponds to the same first weight adjustment factor; if the human body orientation label is side, every second two-dimensional key point corresponds to the same second weight adjustment factor, the second weight adjustment factor being smaller than the first weight adjustment factor; and if the human body orientation label is back, the body contour key points among the second two-dimensional key points correspond to a third weight adjustment factor and the remaining key points correspond to a fourth weight adjustment factor, the fourth weight adjustment factor being smaller than the third weight adjustment factor, and the third weight adjustment factor being smaller than the second weight adjustment factor.
7. The three-dimensional human body model reconstruction method according to claim 5 or 6, wherein determining the loss value corresponding to the current three-dimensional human body mesh according to the projection contour mask, the two-dimensional segmentation mask, the two-dimensional projection key points, the second two-dimensional key points and the key point weights comprises: calculating a contour alignment loss according to the projection contour mask and the two-dimensional segmentation mask; calculating a key point alignment loss according to the two-dimensional projection key points, the second two-dimensional key points and the key point weights; if the human body orientation label is front or side, determining the loss value corresponding to the current three-dimensional human body mesh according to the contour alignment loss and the key point alignment loss; and if the human body orientation label is back, calculating a back geometric prior loss according to the current three-dimensional human body mesh, and determining the loss value corresponding to the current three-dimensional human body mesh according to the back geometric prior loss, the contour alignment loss and the key point alignment loss, wherein the back geometric prior loss comprises a head symmetry loss and a body contour smoothness loss, the head symmetry loss is determined according to the mirror symmetry error of the head vertices in the current three-dimensional human body mesh, and the body contour smoothness loss is determined based on the variance of the Euclidean distances between adjacent contour joint points in the current three-dimensional human body mesh.
8. A three-dimensional human body model reconstruction system, the system comprising: a parameter acquisition module, used for acquiring parameters of a human body parameterized model, parameters of a facial parameterized model and camera parameters according to an input image; a first extraction module, used for extracting, through a first preset model, first two-dimensional key points corresponding to the input image, wherein the first two-dimensional key points comprise two-dimensional face key points; a first optimization module, used for optimizing the parameters of the facial parameterized model according to the two-dimensional face key points and constructing a fine face mesh according to the optimized parameters; an initial mesh construction module, used for constructing an initial three-dimensional human body mesh according to the fine face mesh and the parameters of the human body parameterized model; a second extraction module, used for extracting, through a second preset model, second two-dimensional key points corresponding to the input image and a two-dimensional segmentation mask indicating a human body region in the input image; an orientation determination module, used for taking the initial three-dimensional human body mesh as a current three-dimensional human body mesh, and determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh; a key point weight determination module, used for determining the key point weight corresponding to each second two-dimensional key point according to the human body orientation label and the confidence; a projection module, used for projecting the current three-dimensional human body mesh onto an image plane corresponding to the input image according to the camera parameters to obtain two-dimensional projection key points and a projection contour mask; and a second optimization module, used for determining a loss value corresponding to the current three-dimensional human body mesh according to the projection contour mask, the two-dimensional segmentation mask, the two-dimensional projection key points, the second two-dimensional key points and the key point weights, performing parameter optimization on the current three-dimensional human body mesh based on the loss value to obtain an updated three-dimensional human body mesh, taking the updated three-dimensional human body mesh as the current three-dimensional human body mesh, and triggering the orientation determination module again to execute the step of determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh, until a preset iterative optimization termination condition is met, and then outputting a target three-dimensional human body mesh.
9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the three-dimensional human body model reconstruction method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the three-dimensional human body model reconstruction method according to any one of claims 1 to 7.
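
Illustrative sketch for claims 5 and 6 (not part of the claims). The claims fix only the structure: a head-to-shoulder depth difference compared against a threshold, three labels, and key point weights equal to a label-dependent adjustment factor multiplied by the confidence, with the factors ordered first > second > third > fourth. Everything else below is an assumption made for illustration: the function names, the z axis as camera depth, the sign convention that maps a nearer head to "front", the threshold value, the confidence formula, and the numeric factor values.

```python
import numpy as np

FRONT, SIDE, BACK = "front", "side", "back"


def orientation_from_joints(head, left_shoulder, right_shoulder, threshold=0.05):
    """Return (label, confidence) from the head-to-shoulder depth difference."""
    head = np.asarray(head, dtype=float)
    shoulder_mid = (np.asarray(left_shoulder, dtype=float)
                    + np.asarray(right_shoulder, dtype=float)) / 2.0
    # Assumption: z is camera depth, larger z means farther from the camera.
    depth_diff = head[2] - shoulder_mid[2]
    if depth_diff < -threshold:      # assumed convention: head nearer => front
        label = FRONT
    elif depth_diff > threshold:     # assumed convention: head farther => back
        label = BACK
    else:
        label = SIDE
    ratio = abs(depth_diff) / threshold
    # Assumed confidence: 0.5 at a threshold boundary, 1.0 far away from it.
    if label == SIDE:
        confidence = 1.0 - 0.5 * ratio
    else:
        confidence = min(1.0, 0.5 * ratio)
    return label, float(confidence)


def keypoint_weights(label, confidence, num_keypoints, contour_idx,
                     factors=(1.0, 0.7, 0.5, 0.2)):
    """Per-key-point weight = adjustment factor * confidence."""
    first, second, third, fourth = factors  # hypothetical values
    if not first > second > third > fourth:
        raise ValueError("factors must satisfy first > second > third > fourth")
    if label == FRONT:
        adjust = np.full(num_keypoints, first)
    elif label == SIDE:
        adjust = np.full(num_keypoints, second)
    else:
        # Back view: body contour key points keep more weight than the rest.
        adjust = np.full(num_keypoints, fourth)
        adjust[list(contour_idx)] = third
    return adjust * confidence
```

Because the label and confidence are recomputed from the current mesh on every iteration, the weights returned here would change as the optimization proceeds.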
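
Illustrative sketch for claim 7 (not part of the claims). The claim names the loss terms and says what each is derived from, but gives no formulas. The concrete forms below are assumptions chosen for illustration: one minus intersection-over-union for the contour alignment loss, a weighted mean of squared distances for the key point alignment loss, mirroring across the plane through the mean x coordinate for the head symmetry loss, and an unweighted sum of all terms. The function names and the paired index lists `left_idx` and `right_idx` are hypothetical.

```python
import numpy as np


def contour_alignment_loss(proj_mask, seg_mask, eps=1e-8):
    """Assumed form: 1 - IoU between projected and segmented masks."""
    proj_mask = np.asarray(proj_mask, dtype=float)
    seg_mask = np.asarray(seg_mask, dtype=float)
    inter = (proj_mask * seg_mask).sum()
    union = proj_mask.sum() + seg_mask.sum() - inter
    return float(1.0 - inter / (union + eps))


def keypoint_alignment_loss(proj_kp, det_kp, weights):
    """Assumed form: mean of weighted squared 2D distances."""
    diff = np.asarray(proj_kp, dtype=float) - np.asarray(det_kp, dtype=float)
    sq_dist = (diff ** 2).sum(axis=1)
    return float(np.mean(np.asarray(weights, dtype=float) * sq_dist))


def head_symmetry_loss(head_verts, left_idx, right_idx):
    """Mirror left vertices across x = mean(x) and compare to their pairs."""
    head_verts = np.asarray(head_verts, dtype=float)
    center_x = head_verts[:, 0].mean()
    mirrored = head_verts[left_idx].copy()
    mirrored[:, 0] = 2.0 * center_x - mirrored[:, 0]
    err = mirrored - head_verts[right_idx]
    return float(np.mean((err ** 2).sum(axis=1)))


def contour_smoothness_loss(contour_joints):
    """Variance of Euclidean distances between adjacent contour joints."""
    joints = np.asarray(contour_joints, dtype=float)
    seg_len = np.linalg.norm(np.diff(joints, axis=0), axis=1)
    return float(np.var(seg_len))


def total_loss(label, proj_mask, seg_mask, proj_kp, det_kp, weights,
               head_verts=None, left_idx=None, right_idx=None,
               contour_joints=None):
    """Front/side: contour + key point terms; back: add the geometric prior."""
    loss = contour_alignment_loss(proj_mask, seg_mask)
    loss += keypoint_alignment_loss(proj_kp, det_kp, weights)
    if label == "back":
        loss += head_symmetry_loss(head_verts, left_idx, right_idx)
        loss += contour_smoothness_loss(contour_joints)
    return loss
```

In an actual optimizer these terms would need to be differentiable with respect to the mesh parameters, which this NumPy version is not; it only shows how the terms are combined per orientation label.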

Description

Three-dimensional human body model reconstruction method, system, terminal and storage medium

Technical Field

The application relates to the technical field of computer vision, and in particular to a three-dimensional human body model reconstruction method, system, terminal and storage medium.

Background

With the development of science and technology, and of computer vision in particular, three-dimensional reconstruction technology is being applied ever more widely. Reconstructing a three-dimensional human body model requires recovering the three-dimensional pose, hand motion and facial expression of a human body from a monocular image or video with high fidelity. At present, the Skinned Multi-Person Linear model (SMPL), or the SMPL-X model, which adds articulated hands and an expressive face to the body, is generally adopted as the parameterized human body model, and three-dimensional human body model reconstruction is achieved by controlling body shape, pose and basic expression through low-dimensional parameters. However, such parameterized human body models have a parameter space with limited degrees of freedom and insufficient expressive power in highly dynamic, detailed regions such as the face, which hinders improving the accuracy of three-dimensional human body model reconstruction.

Accordingly, the related art has yet to be improved and developed.

Disclosure of Invention

The main purpose of the application is to provide a three-dimensional human body model reconstruction method, system, terminal and storage medium, so as to solve the technical problem in the related art that, when a three-dimensional human body model is reconstructed based on a parameterized human body model such as the SMPL model or the SMPL-X model, expressive power is insufficient in highly dynamic, detailed regions such as the face, which hinders improving the accuracy of three-dimensional human body model reconstruction.

To achieve the above object, a first aspect of the present application provides a three-dimensional human body model reconstruction method, wherein the method includes:

acquiring parameters of a human body parameterized model, parameters of a facial parameterized model and camera parameters according to an input image;

extracting, through a first preset model, first two-dimensional key points corresponding to the input image, wherein the first two-dimensional key points comprise two-dimensional face key points;

optimizing the parameters of the facial parameterized model according to the two-dimensional face key points, and constructing a fine face mesh according to the optimized parameters;

constructing an initial three-dimensional human body mesh according to the fine face mesh and the parameters of the human body parameterized model;

extracting, through a second preset model, second two-dimensional key points corresponding to the input image and a two-dimensional segmentation mask indicating a human body region in the input image;

taking the initial three-dimensional human body mesh as a current three-dimensional human body mesh, and determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh;

determining the key point weight corresponding to each second two-dimensional key point according to the human body orientation label and the confidence;

projecting the current three-dimensional human body mesh onto an image plane corresponding to the input image according to the camera parameters to obtain two-dimensional projection key points and a projection contour mask;

determining a loss value corresponding to the current three-dimensional human body mesh according to the projection contour mask, the two-dimensional segmentation mask, the two-dimensional projection key points, the second two-dimensional key points and the key point weights, performing parameter optimization on the current three-dimensional human body mesh based on the loss value to obtain an updated three-dimensional human body mesh, taking the updated three-dimensional human body mesh as the current three-dimensional human body mesh, and returning to the step of determining a human body orientation label and a confidence corresponding to the human body orientation label according to the current three-dimensional human body mesh, until a preset iterative optimization termination condition is met, and then outputting a target three-dimensional human body mesh.

Optionally, the first two-dimensional key points further include two-dimensional human body key points; the first preset model comprises a face key point detector for extracting two-dimensional face key points and a pose estimator for extracting two-dimensional human b