CN-121989231-A - Humanoid robot whole-body interaction control method and system based on physical perception redirection and space-time decoupling strategy

Abstract

The invention discloses a humanoid robot whole-body interaction control method and system based on physical perception redirection and space-time decoupling strategies. The method first converts large-scale human-human interaction data, through a physical perception interaction redirection module, into human-humanoid robot interaction trajectories that preserve contact semantics and physical consistency, with accurate contact maintenance achieved by a multi-constraint objective function and two-stage optimization. It then constructs a structured data set with stage labels and uses a space-time decoupling action inference network, which fuses long-short time sequence encoding, phase attention and multi-scale space encoding, to generate high-level action anchor points. Finally, a whole-body controller converts the high-level action anchor points into executable commands. The method addresses problems of conventional approaches such as damaged contact semantics, insufficient space-time reasoning and difficult simulation-to-real transfer, significantly improves the naturalness, robustness and practicality of interaction, and is suitable for social interaction scenarios such as hugging and handshaking.
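The multi-constraint objective and two-stage optimization mentioned above are detailed in claims 2 to 4. The following is a minimal sketch of how such an objective could be evaluated, not the patented implementation. The squared-error forms, the unnormalized sums, the function names and every weight value are assumptions chosen for illustration.

```python
import numpy as np


def pairwise_distances(a, b):
    """Distance matrix between point sets a (N, 3) and b (M, 3)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def kinematic_loss(q, q_ref):
    """Squared joint-angle error between robot and reference, shape (T, D)."""
    return float(np.sum((q - q_ref) ** 2))


def contact_loss(dist_orig, dist_new):
    """Squared Frobenius error between original and retargeted distances."""
    return float(np.sum((dist_new - dist_orig) ** 2))


def fidelity_loss(p_orig, p_new):
    """Squared displacement of the human upper-limb joints."""
    return float(np.sum((p_new - p_orig) ** 2))


def pose_regularizer(q, alpha=1.0, beta=0.1, gamma=0.5):
    """alpha * smoothness + beta * amplitude (hyperparameter values assumed)."""
    v = np.diff(q, axis=0)        # discrete velocity
    a = np.diff(q, n=2, axis=0)   # discrete acceleration
    smooth = np.sum(v ** 2) + gamma * np.sum(a ** 2)
    amplitude = np.sum(q ** 2)
    return float(alpha * smooth + beta * amplitude)


def total_objective(terms, weights):
    """Weighted sum of the four loss terms."""
    return sum(weights[name] * terms[name] for name in weights)


# Hypothetical two-stage schedule: only the contact weight changes.
STAGE_WEIGHTS = (
    {"kin": 1.0, "con": 1.0, "hum": 1.0, "reg": 0.1},    # stage 1: moderate
    {"kin": 1.0, "con": 10.0, "hum": 1.0, "reg": 0.1},   # stage 2: raised
)

# Toy trajectory: T = 3 frames, D = 2 joints.
q = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
q_ref = np.zeros_like(q)
hand = np.array([[0.0, 0.0, 0.0]])
partner_hand = np.array([[0.0, 0.0, 0.0]])   # original contact: distance 0
robot_hand = np.array([[3.0, 4.0, 0.0]])     # retargeted hand is 5 units away

terms = {
    "kin": kinematic_loss(q, q_ref),
    "con": contact_loss(pairwise_distances(hand, partner_hand),
                        pairwise_distances(hand, robot_hand)),
    "hum": fidelity_loss(hand, hand),
    "reg": pose_regularizer(q),
}
stage_costs = [total_objective(terms, w) for w in STAGE_WEIGHTS]
```

In this toy case the broken contact dominates the second-stage cost, which is the effect the raised contact weight is meant to have. A real optimizer would run the first stage to an initial solution and then refine it for a few iterations under the second weight set.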

Inventors

ZHENG WEISHI
HUANG WEIJIN
ZHANG YUEYI
XIA ZHIWEI
WEI YILIN

Assignees

Sun Yat-sen University (中山大学)

Dates

Publication Date: 2026-05-08
Application Date: 2025-12-31

Claims (10)

1. A humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies, characterized by comprising the following steps: obtaining large-scale human-human interaction data, and converting the human-human interaction data, through a physical perception interaction redirection module, into human-humanoid robot interaction trajectories that preserve contact semantics and physical consistency, wherein the trajectories are obtained by constructing a total objective function comprising a kinematic similarity loss, a contact semantic retention loss, a human-side motion fidelity loss and a pose regularization term, and solving the total objective function with a two-stage optimization strategy; performing task division and stage labeling on the obtained human-humanoid robot interaction trajectories, wherein the labeled stages comprise a preparation stage, an execution stage and an ending stage, and constructing a structured interaction data set for strategy learning; learning a multi-task whole-body interaction strategy on the structured interaction data set with a space-time decoupling action inference network, wherein the network encodes the observation history with a long-short time sequence encoder to obtain time features, generates stage perception features with a phase attention module, encodes the human-robot geometric relationship with a multi-scale space module to obtain space features, fuses the time features, the stage perception features, the space features and a text task instruction embedding vector, and feeds the fused result into a diffusion-type planning head to generate a high-level action anchor point sequence; and densifying the high-level action anchor point sequence into a reference trajectory, and converting the reference trajectory, through a whole-body controller, into executable control commands for a simulated or real humanoid robot, thereby realizing whole-body interaction of the humanoid robot.
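2. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 1, wherein the total objective function is: $\mathcal{L}_{total} = \lambda_{kin}\mathcal{L}_{kin} + \lambda_{con}\mathcal{L}_{con} + \lambda_{hum}\mathcal{L}_{hum} + \lambda_{reg}\mathcal{L}_{reg}$; wherein $\mathcal{L}_{kin}$ is the kinematic similarity loss, $\mathcal{L}_{con}$ is the contact semantic retention loss, $\mathcal{L}_{hum}$ is the human-side motion fidelity loss, $\mathcal{L}_{reg}$ is the pose regularization term, and $\lambda_{kin}$, $\lambda_{con}$, $\lambda_{hum}$, $\lambda_{reg}$ are the weights of the respective loss terms.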
3. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 2, wherein the kinematic similarity loss is defined as: $\mathcal{L}_{kin} = \sum_{t=1}^{T} \lVert q_t - q_t^{ref} \rVert_2^2$; wherein $T$ is the time length of the interaction sequence, $D$ is the number of degrees of freedom of the robot, $q_t \in \mathbb{R}^{D}$ is the joint angle of the robot at time $t$, and $q_t^{ref} \in \mathbb{R}^{D}$ is the joint angle at time $t$ of the human reference pose sequence after matching to the robot skeleton lengths; the contact semantic retention loss is defined as: $\mathcal{L}_{con} = \sum_{t=1}^{T} \lVert M_t^{HR} - M_t^{HH} \rVert_F^2$; wherein $M_t^{HH}$ is the pairwise distance matrix at time $t$ in the original human-human interaction, $M_t^{HR}$ is the pairwise distance matrix at time $t$ in the optimized human-robot interaction, and $\lVert\cdot\rVert_F$ is the Frobenius norm; the human-side motion fidelity loss is defined as: $\mathcal{L}_{hum} = \sum_{t=1}^{T} \sum_{j \in \mathcal{J}_{up}} \lVert \tilde{p}_{t,j} - p_{t,j} \rVert_2^2$; wherein $p_{t,j}$ is the original position of the human upper-limb joint $j$ at time $t$, $\tilde{p}_{t,j}$ is the position after optimization, and $\mathcal{J}_{up}$ is the index set of the upper-limb joints; the pose regularization term is $\mathcal{L}_{reg} = \alpha\mathcal{L}_{smooth} + \beta\mathcal{L}_{amp}$, wherein $\mathcal{L}_{smooth}$ is the time smoothing loss, $\mathcal{L}_{amp}$ is the pose amplitude regularization, and $\alpha$ and $\beta$ are hyperparameters; the time smoothing loss is defined as: $\mathcal{L}_{smooth} = \sum_{t} \lVert v_t \rVert_2^2 + \gamma \sum_{t} \lVert a_t \rVert_2^2$, wherein $v_t$ is the discrete velocity, $a_t$ is the discrete acceleration, and $\gamma$ is a balance weight coefficient; the pose amplitude regularization is defined as: $\mathcal{L}_{amp} = \sum_{t=1}^{T} \sum_{d=1}^{D} q_{t,d}^{2}$, wherein $q_{t,d}$ is the joint angle of the $d$-th degree of freedom of the robot at time $t$.
4. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 1, wherein the two-stage optimization strategy is specifically: in the first stage, the total objective function is globally optimized with a moderate contact weight to obtain an initial solution; and in the second stage, the weight of the contact semantic retention loss is increased, and a small number of further optimization iterations are performed on the trajectory.
5. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 1, wherein the long-short time sequence encoder, taking the current time $t$ as the center, encodes a long window and a short window respectively to obtain long-term features $h_t^{long}$ and short-term features $h_t^{short}$, which are fused into a unified time feature through a learned mapping $\Phi(\cdot)$: $h_t = \Phi(h_t^{long}, h_t^{short})$; the phase attention module outputs phase probabilities $p_t(\phi)$, $\phi \in P$, through a phase classifier, wherein $P = \{\text{preparation}, \text{execution}, \text{ending}\}$ is the set of interaction stages, processes the time feature through the expert network corresponding to each stage, and obtains the stage perception feature by weighted fusion: $z_t = \sum_{\phi \in P} p_t(\phi)\, E_{\phi}(h_t)$, wherein $E_{\phi}(h_t)$ is the expert output for stage $\phi$; the multi-scale space module divides the space between the human and the robot into three regions, near field, mid field and far field, according to distance, encodes information such as relative position, distance and orientation for each region to obtain multi-scale space features, and fuses them into: $s_t = \mathrm{Fuse}\big(f_{near}(o_t),\, f_{mid}(o_t),\, f_{far}(o_t)\big)$, wherein $f_{near}$, $f_{mid}$ and $f_{far}$ are the space encoding functions for the respective distance scales and $o_t$ is the observation vector at time $t$; the observation vector is expressed as: $o_t = [\tau_t,\ x_t,\ b_t]$, wherein $\tau_t$ is the translation of the human SMPL root, $x_t$ is the three-dimensional joint positions, and $b_t$ is the robot body state.
6. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 5, wherein the input of the diffusion-type planning head is represented as a condition vector: $c_t = [h_t;\ z_t;\ s_t;\ e_{text}]$, wherein $h_t$, $z_t$ and $s_t$ are the time feature, the stage perception feature and the space feature respectively, and $e_{text}$ is the embedding vector of the text task instruction; the high-level action anchor point sequence is expressed as $A_t = \{a_k\}_{k=1}^{K}$, wherein each anchor point $a_k$ includes a root pose increment $\Delta p_k$, a root orientation quaternion $r_k$ and a whole-body joint angle vector $q_k$.
7. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 6, wherein the total training objective of the space-time decoupling action inference network is: $\mathcal{L}_{train} = \lambda_{act}\mathcal{L}_{act} + \lambda_{phase}\mathcal{L}_{phase} + \lambda_{geo}\mathcal{L}_{geo}$; wherein $\mathcal{L}_{act}$ is the high-level action loss, $\mathcal{L}_{phase}$ is the phase supervision loss, $\mathcal{L}_{geo}$ is the geometric auxiliary loss, and $\lambda_{act}$, $\lambda_{phase}$, $\lambda_{geo}$ are adjustable weights.
8. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 7, wherein the high-level action loss is defined as: $\mathcal{L}_{act} = w_p \lVert \Delta\hat{p} - \Delta p^{*} \rVert_2^2 + w_r \lVert \hat{r} - r^{*} \rVert_2^2 + w_q \lVert \hat{q} - q^{*} \rVert_2^2$; wherein $\Delta p^{*}$, $r^{*}$ and $q^{*}$ are the teacher action anchor points sampled from the redirection data set, $\Delta\hat{p}$, $\hat{r}$ and $\hat{q}$ are the predicted action anchor points, and $w_p$, $w_r$ and $w_q$ are weight coefficients; the geometric auxiliary loss includes a human orientation loss and a keypoint position loss, the human orientation loss being defined as: $\mathcal{L}_{face} = 1 - \dfrac{f^{\top} d}{\lVert f \rVert_2 \lVert d \rVert_2}$; wherein $f$ is the current forward vector of the robot and $d$ is the vector pointing from the robot to the human; the keypoint position loss is defined as: $\mathcal{L}_{kp} = \sum_{k} \lVert \hat{x}_k - x_k^{*} \rVert_2^2$, wherein $\hat{x}_k$ and $x_k^{*}$ are the three-dimensional positions of keypoint $k$ in the predicted and target trajectories, respectively.
9. The humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to claim 1, wherein the densification uses time interpolation and stride fusion, and specifically comprises: for a real humanoid robot, converting the root pose prediction into a planar root velocity command, combined with active braking logic based on the remaining distance; and smoothly interpolating the upper-body pose in the preparation stage and the ending stage, while keeping the key interaction action unchanged in the execution stage.
10. A humanoid robot system, characterized by comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores computer program instructions executable by the at least one processor to enable the at least one processor to perform the humanoid robot whole-body interaction control method based on physical perception redirection and space-time decoupling strategies according to any one of claims 1-9.
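The phase attention and multi-scale space modules of claim 5 can be pictured with a small sketch. This is an illustration only: linear maps stand in for the phase classifier and the per-stage expert networks, and the function names and the near/far distance limits are hypothetical values that do not come from the patent.

```python
import numpy as np

PHASES = ("preparation", "execution", "ending")


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = np.exp(logits - np.max(logits))
    return shifted / shifted.sum()


def phase_aware_feature(h, classifier_w, expert_ws):
    """Fuse per-phase expert outputs, weighted by phase probability.

    h: time feature (F,); classifier_w: (3, F); expert_ws: three (G, F)
    matrices. Linear layers stand in for the real networks.
    """
    probs = softmax(classifier_w @ h)
    expert_out = np.stack([w @ h for w in expert_ws])   # shape (3, G)
    return probs, probs @ expert_out


def distance_band(distance, near_limit=0.5, far_limit=2.0):
    """Assign a human-robot distance in metres to a band (limits assumed)."""
    if distance < near_limit:
        return "near"
    if distance < far_limit:
        return "mid"
    return "far"


h = np.array([1.0, 2.0])
classifier_w = np.zeros((3, 2))                  # untrained: uniform belief
expert_ws = [np.eye(2) * k for k in (1.0, 2.0, 3.0)]
probs, fused = phase_aware_feature(h, classifier_w, expert_ws)
```

With a uniform phase belief the fused feature is simply the average of the three expert outputs. As the classifier becomes confident about one stage, that stage's expert dominates, which is how a soft weighting can avoid abrupt switches at stage boundaries.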

Description

Humanoid robot whole-body interaction control method and system based on physical perception redirection and space-time decoupling strategy

Technical Field

The invention belongs to the technical field of robot control, and particularly relates to a humanoid robot whole-body interaction control method and system based on physical perception redirection and space-time decoupling strategies.

Background

With the rapid development of humanoid robot technology, humanoid robots with human-like structures and high-degree-of-freedom joints are moving beyond standalone operation and becoming integrated into human life and work scenarios, where they need to achieve natural cooperation and social interaction with humans through whole-body physical interaction behaviors such as hugging, handshaking, clapping and deliberation. The core requirements of human-humanoid robot interaction (HHoI) are naturalness, safety and generalization capability; the key to meeting them is to acquire large-scale, diverse interaction data and to train a stable and reliable whole-body interaction strategy on that data.

Currently, there are two technical paths for acquiring humanoid robot interaction data. The first path collects human-robot interaction data by teleoperation or teaching: an operator directly controls the humanoid robot through tools such as exoskeleton equipment or control handles and records the corresponding interaction actions. Although this approach can ensure a certain degree of action precision, it has obvious drawbacks: it requires specialized hardware and a great deal of manual operating time, so data acquisition is costly and inefficient; the operation carries safety risks such as human-robot collision; and it is difficult to scale to a large data set covering many interaction types and human body shapes.

The second path uses existing large-scale human-human interaction (HHI) data and maps human actions onto the humanoid robot through inverse kinematics or learning-based redirection methods, thereby obtaining human-robot interaction training data indirectly. However, humans and humanoid robots differ inherently in physiological and structural characteristics such as height, limb proportions and joint range of motion, and conventional redirection methods usually take only the similarity of joint trajectories or poses as the optimization target, without using the contact relationship as an explicit constraint. In scenarios that demand high contact precision, such as handshaking and hugging, the robot often exhibits problems such as "hands that fail to meet", "a hug that turns into a pat on the shoulder" or even "waving at empty air"; the interaction semantics are severely damaged and practical application requirements cannot be met.

In addition, existing strategy learning methods still have obvious shortcomings: most perform imitation learning only in an average sense over the high-dimensional state-action space, and their algorithm structure does not explicitly distinguish the two core sub-problems of "when to initiate interaction" and "in which spatial region to make contact". Without an effective representation of the interaction stage, the strategy struggles to time active contact and stable withdrawal; without multi-scale space encoding, the strategy cannot reasonably plan an approach path at long range or precisely align the contact position at short range. High-quality interaction therefore cannot be achieved stably in either the time or the space dimension, and the strategy fails easily when human body shape or action rhythm changes.

Meanwhile, there are inherent differences between the simulation environment and the real robot platform, notably in the dynamics model, the control interface (such as the adaptation between joint angles and root velocity), sensing noise and delay. Without a layered control and robust control mechanism designed for real-robot conditions, a whole-body control strategy that performs well only in simulation often exhibits action jitter, overshoot, lag or even instability when transferred to a real robot, making reliable deployment difficult.

In summary, how to convert large-scale HHI data into high-quality data which can be directly used for HHoI strategy training on the premise of fully guaranteeing physical consistency and contact semantics, and construct a whole-body interaction strategy with space-time decoupling reasoning capability, and meanwhile solve the migration adaptation problem from sim