
CN-121998814-A - Three-dimensional character skeleton-free pose transfer method based on point cloud

CN121998814A

Abstract

A point-cloud-based skeleton-free pose transfer method for three-dimensional characters comprises the steps of: extracting a global pose feature from the source pose point cloud with a pose style encoder; using the global pose feature to modulate the normalization-layer parameters of an adaptive pose fusion module, which fuses the target T-pose point cloud with the global pose feature and predicts an initial rotation and displacement; computing self-attention scores over the target T-pose mesh with an attention-based deformation module, which refines the initial rotations and displacements to obtain the transferred target pose point cloud; and combining this point cloud with the target T-pose mesh to generate the posed target character mesh. The invention completes the transfer from only the source pose point cloud and the target T-pose mesh, without relying on a skeleton, skinning weights, a source T-pose mesh, or identical mesh topology. It requires no additional annotation at either the training or inference stage, achieves high-fidelity pose transfer for a wide range of stylized three-dimensional characters, and offers strong generalization capability and practicality.

Inventors

  • TANG JIE
  • YAN JIAQI
  • LIU JIE
  • WU GANGSHAN

Assignees

  • Nanjing University (南京大学)

Dates

Publication Date
2026-05-08
Application Date
2026-01-15

Claims (7)

  1. A point-cloud-based skeleton-free pose transfer method for three-dimensional characters, characterized by transferring the pose of a source character to a target character, comprising the following steps: acquiring a source pose point cloud and a target T-pose mesh; extracting features from the source pose point cloud with a pose style encoder to obtain a pose feature representation; inputting the pose feature representation, together with the target T-pose point cloud formed by all vertex coordinates of the target T-pose mesh, into an adaptive pose fusion module, wherein the module places an adaptive normalization layer after each convolution layer, dynamically adjusts the mean and variance of the target T-pose point cloud features using the pose feature, thereby injecting the pose feature representation into the spatial features of the target character mesh, and predicts an initial quaternion and displacement for each vertex of the target character mesh; feeding the output of the adaptive pose fusion module and the target T-pose mesh into an attention-based deformation module, which uses a self-attention mechanism to model the correlation between vertex pairs of the target T-pose mesh, computes self-attention scores according to the topological and spatial relationships of the mesh, and corrects the initial quaternions and displacements; and outputting a target character pose point cloud from the corrected result and applying it to the vertices of the target T-pose mesh to obtain the target character mesh in the source pose.
  2. The point-cloud-based skeleton-free pose transfer method of claim 1, wherein a deep neural network LPS is constructed from three modules: a pose style encoder, an adaptive pose fusion module, and an attention-based deformation module. The source character is represented as a point cloud in the pose p2; the target character is represented as a mesh in the T-pose p1, defined by a set of triangular faces encoding the mesh connectivity and a set of vertex coordinates. The goal of the network LPS is to generate a mesh with the source pose p2 and the target character shape s1 whose face structure is identical to that of the input target mesh, with only the vertex coordinates changed. The pose style encoder extracts a pose style feature of a preset feature dimension from the source pose point cloud. The adaptive pose fusion module takes the target mesh vertices and the pose feature as input and predicts an initial quaternion rotation and displacement for each vertex. The attention-based deformation module refines this initial transformation through a self-attention mechanism, using the vertex coordinates and face structure of the target mesh, to obtain the final predicted vertex coordinates in the new pose.
  3. The point-cloud-based skeleton-free pose transfer method of claim 2, wherein the pose style encoder is built from Set Abstraction modules based on the PVCNN++ model and extracts local and global features of the source point cloud.
  4. The point-cloud-based skeleton-free pose transfer method of claim 2, wherein the backbone of the adaptive pose fusion module is based on the PVCNN++ architecture; after the shared multi-layer perceptron (SharedMLP) and point-voxel convolution (PVConv) layers of each Set Abstraction and Feature Propagation block, an adaptive group normalization (AdaGN) layer is introduced. Each AdaGN layer comprises a linear mapping layer and a group normalization layer, and the linear mapping parameters are not shared between AdaGN layers. The global pose feature is first mapped by the linear layer into normalization modulation parameters, and the transformed feature vector is then used to dynamically adjust the mean μ and variance σ parameters of the group normalization layer.
  5. The point-cloud-based skeleton-free pose transfer method of claim 2, wherein the attention-based deformation module first computes key vectors directly from the vertex coordinates through a multi-layer perceptron; then constructs a query network based on edge convolution, which takes the vertex coordinates and the mesh face structure as input and, by aggregating neighborhood information, computes query vectors of a preset latent dimension that incorporate geometric and topological features; computes the dot product of the query and key vectors, scales it by a scaling factor, and applies a softmax function to obtain an attention score matrix, which characterizes the geometry- and topology-based interaction between vertices; performs a weighted sum of the initial quaternions and displacements using the attention score matrix to obtain the refined transformation; and finally converts each weighted quaternion to a rotation matrix and applies it, together with the weighted displacement, to the initial target mesh vertices to obtain the final predicted vertex coordinates in the new pose.
  6. The point-cloud-based skeleton-free pose transfer method of claim 2, wherein during training of the deep neural network LPS, a weighted sum of a reconstruction loss, an edge-length loss, and an edge-direction loss is used as the total loss function, constraining the deviation between the predicted mesh and the ground-truth mesh in vertex position, edge length, and edge direction.
  7. The point-cloud-based skeleton-free pose transfer method of claim 6, wherein the loss function specifically comprises: (1) a reconstruction loss, computed as the L1 distance between the predicted and ground-truth vertices, constraining absolute vertex positions; (2) an edge-length loss, computed as the sum of absolute differences between the lengths of corresponding edges of the predicted and ground-truth meshes, taken over the set of all edges extracted from the face set, preserving the overall proportions and local details of the mesh; (3) an edge-direction loss, computed as the cosine similarity between predicted and ground-truth edge vectors, constraining edge-direction consistency and improving the naturalness of the deformation and the structural integrity of the mesh. The total loss function is a weighted sum of the three losses.
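The adaptive group normalization of claim 4 can be illustrated with a minimal pure-Python sketch. All names, shapes, and the toy dimensions here are hypothetical; the patent's actual AdaGN layers sit inside a PVCNN++ backbone, and only the core idea is shown: a linear mapping turns the global pose feature into per-channel scale/shift parameters that modulate a group normalization of the point features.

```python
import math

def ada_group_norm(features, pose_feat, weight, bias, num_groups=2, eps=1e-5):
    """Hypothetical sketch of an AdaGN layer.

    features : per-point feature lists, shape (N, C)
    pose_feat: global pose style feature, length D
    weight   : (2*C, D) matrix of the linear mapping layer
    bias     : length-2*C bias of the linear mapping layer
    """
    n, c = len(features), len(features[0])
    # Linear mapping: pose feature -> [gamma | beta] modulation parameters.
    mod = [sum(w * p for w, p in zip(row, pose_feat)) + b
           for row, b in zip(weight, bias)]
    gamma, beta = mod[:c], mod[c:]

    out = [[0.0] * c for _ in range(n)]
    gsize = c // num_groups
    for g in range(num_groups):
        chs = range(g * gsize, (g + 1) * gsize)
        vals = [features[i][j] for i in range(n) for j in chs]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        std = math.sqrt(var + eps)
        for i in range(n):
            for j in chs:
                # Normalize within the group, then apply the pose-driven
                # per-channel scale (gamma) and shift (beta).
                out[i][j] = gamma[j] * (features[i][j] - mean) / std + beta[j]
    return out
```

With a zero weight matrix and a bias encoding gamma = 1, beta = 0, the layer reduces to plain group normalization, which makes the modulation path easy to sanity-check.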
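The three loss terms of claim 7 translate directly into code. The sketch below is a hypothetical implementation over plain Python lists; the relative weights of the total loss are unspecified hyperparameters in the claims and are therefore omitted here.

```python
import math

def pose_transfer_losses(pred_verts, gt_verts, edges):
    """Reconstruction (L1), edge-length, and edge-direction losses as
    described in claim 7. `edges` is the set of vertex-index pairs
    extracted from the mesh faces."""
    # (1) Reconstruction loss: mean L1 distance between vertices.
    l_rec = sum(abs(p - g) for pv, gv in zip(pred_verts, gt_verts)
                for p, g in zip(pv, gv)) / len(pred_verts)

    def edge_vec(verts, i, j):
        return [verts[j][k] - verts[i][k] for k in range(3)]

    def norm(v):
        return math.sqrt(sum(c * c for c in v))

    l_len, l_dir = 0.0, 0.0
    for i, j in edges:
        ep = edge_vec(pred_verts, i, j)
        eg = edge_vec(gt_verts, i, j)
        # (2) Edge-length loss: | ||e_pred|| - ||e_gt|| |.
        l_len += abs(norm(ep) - norm(eg))
        # (3) Edge-direction loss: 1 - cosine similarity of edge vectors.
        l_dir += 1.0 - (sum(a * b for a, b in zip(ep, eg))
                        / (norm(ep) * norm(eg) + 1e-8))
    l_len /= len(edges)
    l_dir /= len(edges)
    return l_rec, l_len, l_dir
```

All three terms vanish when the predicted mesh matches the ground truth, and each penalizes a distinct failure mode: drifting vertices, stretched edges, and twisted edges, respectively.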

Description

Three-dimensional character skeleton-free pose transfer method based on point cloud

Technical Field

The invention belongs to the technical field of three-dimensional computer vision and graphics, relates to point-cloud-based human pose modeling and three-dimensional character animation, and in particular relates to a point-cloud-based skeleton-free pose transfer method for three-dimensional characters.

Background

Pose transfer for three-dimensional characters is a core task in computer graphics and animation. Its goal is to assign the pose of one character (the source character) to another character (the target character), which may have a different shape. Conventional methods typically rely on predefined skeleton structures and skinning weights, driving mesh deformation by controlling the motion of the skeleton. However, for stylized, non-humanoid, or topologically complex characters, manually creating skeletons and skinning weights is a complex, time-consuming task that demands expertise, which greatly limits the versatility and efficiency of such methods.

A source pose point cloud (Source Point Cloud) is a point cloud representation of the source character in a given pose, providing the pose-driving information. The target T-pose point cloud (Target T-pose Point Cloud) is a point cloud representation of the target character in its binding pose (T-pose), composed of all vertex coordinates of the target T-pose mesh; it represents the initial shape of the target character and serves as network input. The method extracts and fuses pose features to predict deformation parameters for the target mesh vertices, transferring the target character from the T-pose to the pose of the source character.

To overcome the dependence on skeletons, the prior art has proposed several skeleton-free pose transfer methods.
For example, reference [2] is a neural pose transfer method (NPT) based on spatially adaptive instance normalization; reference [3] is a skeleton-free pose transfer method (SPT) for stylized three-dimensional characters; reference [4] is a keypoint-based weakly supervised three-dimensional pose transfer method (WS3DPT); reference [5] is an unsupervised three-dimensional pose transfer method (X-DualNet) based on cross consistency and dual reconstruction; and reference [6] is a neural blend shape pose transfer method (NBS). However, these methods still have significant limitations. One class imposes constraints on the mesh topology and vertex ordering of the source and target characters that are hard to satisfy in practice; for example, NPT is only applicable to transfer between characters with identical topology. Another class additionally requires the source character's posed mesh and corresponding T-pose mesh as input, increasing the complexity of data preparation; for example, the pose input of SPT consists of a source pose mesh and a source T-pose mesh, and its training stage also requires supervision such as skinning weights, making it hard to apply when T-pose data for the source character is unavailable. A third class relies on human keypoint or body-shape priors; for example, WS3DPT is supervised by joint keypoints of the SMPL parametric human body model and has limited applicability for stylized characters or accessories.
On the other hand, directly obtaining a high-quality, complete posed three-dimensional mesh from real-world sources such as depth cameras and LiDAR is computationally expensive and error-prone, whereas point cloud data can capture and faithfully express three-dimensional pose information more quickly and conveniently. Therefore, developing a method that completes high-fidelity pose transfer using only a source pose point cloud and a target character mesh has important theoretical research significance and practical application value.

References

[1] Liu, Zhijian, et al. "Point-Voxel CNN for efficient 3D deep learning." Advances in Neural Information Processing Systems 32 (2019).
[2] Wang, Jiashun, et al. "Neural pose transfer by spatially adaptive instance normalization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
[3] Liao, Zhouyingcheng, et al. "Skeleton-free pose transfer for stylized 3D characters." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022.
[4] Chen, Jinnan, Chen Li, and Gim Hee Lee. "Weakly-supervised 3D pose transfer with keypoints." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
[5] Song, Chaoyue, et al. "Unsupervised 3d pose transfer with cross consistency an