US-20260124750-A1 - BIPEDAL ACTION MODEL FOR HUMANOID ROBOT

US-20260124750-A1

Abstract

The present disclosure provides a control system for a humanoid robot comprising a bipedal action model (BAM) with hierarchical architecture including a beta model executing cognitive tasks at lower frequency, ingesting multimodal sensory inputs including visual data and natural language instructions, and an alpha model executing reactive tasks at higher frequency, communicatively coupled to the beta model. The BAM is trained on retargeted robot training data derived from robot-free training data. At runtime, the BAM outputs continuous control commands as parallel-generated action chunks controlling at least 18 degrees of freedom. The system includes a wearable collection apparatus capturing movement data from a human operator without physical connection to the robot, and a retargeting module translating robot-free training data into robot training data by solving embodiment mismatches between human and robot kinematic structures.

Inventors

  • Corey Lynch
  • Toki Migimatsu
  • Yevgen Chebotar
  • Michael Ahn
  • Ivan Babushkin

Assignees

  • FIGURE AI INC.

Dates

Publication Date
2026-05-07
Application Date
2025-11-03

Claims (20)

  1. A control system for a humanoid robot, the system comprising: a bipedal action model (BAM) comprising a hierarchical architecture including: a beta model configured to execute on one or more processors to perform cognitive tasks at a first, lower frequency, the beta model ingesting multimodal sensory inputs including visual data and natural language instructions; and an alpha model configured to execute on one or more processors to perform reactive tasks at a second, higher frequency, the alpha model being communicatively coupled to receive a task-conditioning representation from the beta model; wherein the BAM is trained on a dataset comprising retargeted robot training data derived from robot-free training data; and wherein the BAM is configured to, at runtime, output a sequence of continuous control commands as parallel-generated action chunks to control motion of at least 18 degrees of freedom.
  2. The system of claim 1, wherein the beta model has a larger number of parameters and a lower operating frequency than the alpha model.
  3. The system of claim 1, wherein the BAM is deployed in a split configuration, wherein the beta model is executed on a remote AI system and the alpha model is executed on a local AI system physically integrated within the humanoid robot.
  4. The system of claim 1, wherein the BAM is deployed in a fully local configuration, wherein both the beta model and the alpha model are executed on a local AI system physically integrated within the humanoid robot.
  5. The system of claim 1, wherein the BAM is deployed in a fully remote configuration, wherein both the beta model and the alpha model are executed on a remote AI system, and wherein the humanoid robot operates as a thin client.
  6. The system of claim 1, wherein the beta model is configured to output a latent vector, and the alpha model is configured to ingest the latent vector via a cross-attention mechanism to produce the continuous control commands.
  7. The system of claim 1, wherein the continuous control commands are output as floating-point action vectors and are not selected from a discrete set of binned values.
  8. The system of claim 1, wherein the robot-free training data was collected using a wearable collection apparatus comprising articulated arms and gloves with integrated sensors.
  9. The system of claim 1, wherein the robot-free training data was retargeted using a kinematic mapping methodology that enforced a dynamic stability constraint to ensure a center of mass of the humanoid robot remained within a support polygon.
  10. A system for generating a bipedal action model (BAM) for a humanoid robot, the system comprising: a data collection system configured to generate robot-free training data, said data collection system comprising a wearable collection apparatus configured to be worn by a human operator, wherein the wearable collection apparatus includes a plurality of sensors configured to capture movement data of the human operator while the operator performs tasks without a physical or kinematic connection to the humanoid robot; a retargeting module communicatively coupled to the data collection system, the retargeting module comprising one or more processors configured to: receive the robot-free training data; and translate the robot-free training data into retargeted robot training data by applying a motion retargeting methodology to solve an embodiment mismatch between a kinematic structure of the human operator and a kinematic structure of the humanoid robot; and a training subsystem configured to train the bipedal action model (BAM) using the retargeted robot training data, wherein the trained BAM is configured to ingest multimodal sensory inputs and output continuous control commands to control a plurality of degrees of freedom of the humanoid robot.
  11. The system of claim 10, wherein the wearable collection apparatus comprises: a base mount configured to be worn on a torso of the human operator; a pair of articulated arms pivotably attached to the base mount, each articulated arm comprising a plurality of rigid links coupled by sensor joints; and a pair of gloves, each glove coupled to a distal end of one of the articulated arms.
  12. The system of claim 11, wherein the sensor joints of the articulated arms (S1-S7) are configured to substantially correspond with a relative location and orientation of actuators (J1-J7) of an arm assembly of the humanoid robot.
  13. The system of claim 11, wherein each glove includes a plurality of hand position sensors configured to capture kinematic data of the operator's fingers, said hand position sensors comprising a plurality of mechanical linkages, wherein each mechanical linkage couples a fingertip receptacle to a respective finger encoder.
  14. The system of claim 13, wherein each mechanical linkage comprises a deformable member configured to bend more easily in a first curling direction than in a second lateral direction.
  15. The system of claim 11, wherein each glove includes a plurality of hand position sensors comprising an electromagnetic field (EMF) source configured to generate a controlled magnetic field and a plurality of magnetic sensors configured to detect the magnetic field, and wherein the system is configured to determine a position and rotation of the operator's fingers by analyzing signal strength attenuation or phase difference.
  16. The system of claim 11, wherein each glove further comprises a plurality of motors configured to provide haptic feedback to the human operator.
  17. The system of claim 10, wherein the retargeting module is configured to translate the robot-free training data using a kinematic mapping methodology by solving an inverse kinematics (IK) problem to match task-space positions of the human operator's end-effectors to corresponding end-effectors of the humanoid robot.
  18. The system of claim 17, wherein the kinematic mapping methodology solves the IK problem subject to a plurality of constraints, said constraints including joint angle limits, self-collision avoidance, and a dynamic stability constraint operative to ensure a center of mass (CoM) of the humanoid robot remains within a support polygon.
  19. The system of claim 10, wherein the retargeting module is configured to translate the robot-free training data using a learning-based methodology, said methodology comprising an encoder-decoder neural network trained to disentangle domain-invariant motion information from domain-specific performer information.
  20. The system of claim 10, wherein the continuous control commands are output as floating-point action vectors, and wherein said commands control at least 18 degrees of freedom of the humanoid robot.
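The claims above describe a split-frequency hierarchy: a large, slow beta model fuses multimodal inputs into a task-conditioning latent (claims 1, 6), and a small, fast alpha model cross-attends to that latent and emits parallel chunks of continuous floating-point joint commands (claims 1, 7). The following is a minimal, illustrative sketch of that control-flow pattern only, not the patented implementation; the dimensions, chunk size, and random-projection stand-ins for the learned models are all assumptions.

```python
import numpy as np

DOF = 18          # minimum whole-body degrees of freedom per claim 1
CHUNK = 8         # actions generated in parallel per alpha step (assumed)
LATENT_DIM = 32   # size of the beta model's task-conditioning latent (assumed)

rng = np.random.default_rng(0)

def beta_model(image_feat: np.ndarray, text_feat: np.ndarray) -> np.ndarray:
    """Low-frequency cognitive model: fuses multimodal inputs into a latent vector."""
    fused = np.concatenate([image_feat, text_feat])
    W = rng.standard_normal((LATENT_DIM, fused.shape[0])) * 0.1  # stand-in weights
    return np.tanh(W @ fused)

def alpha_model(proprio: np.ndarray, latent: np.ndarray) -> np.ndarray:
    """High-frequency reactive model: cross-attends to the beta latent and emits
    a chunk of continuous (floating-point) joint commands, not binned values."""
    queries = rng.standard_normal((CHUNK, LATENT_DIM)) * 0.1  # stand-in for learned queries
    scores = queries @ latent                                 # (CHUNK,) attention scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                  # softmax over the chunk
    ctx = np.concatenate([latent, proprio])
    W_out = rng.standard_normal((CHUNK, DOF, ctx.shape[0])) * 0.1
    return np.einsum("cdk,k->cd", W_out, ctx) * weights[:, None]  # (CHUNK, DOF)

# One slow beta step conditions several fast alpha steps (split-frequency operation).
latent = beta_model(rng.standard_normal(64), rng.standard_normal(16))
actions = alpha_model(rng.standard_normal(DOF), latent)  # continuous action chunk
```

In the split deployment of claim 3, `beta_model` would run on a remote system and only the compact `latent` would cross the network, which is why the latent-plus-cross-attention interface keeps the on-robot alpha loop fast.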

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of and priority to U.S. Provisional Patent Application Nos. 63/715,270, filed Nov. 1, 2024; 63/722,057, filed Nov. 18, 2024; 63/725,279, filed Nov. 26, 2024; 63/753,670, filed Feb. 4, 2025; 63/760,617, filed Feb. 19, 2025; 63/776,429, filed Mar. 24, 2025; 63/801,451, filed May 7, 2025; 63/819,533, filed Jun. 6, 2025; 63/860,403, filed Aug. 8, 2025; 63/860,580, filed Aug. 8, 2025; 63/905,666, filed Oct. 26, 2025; and 63/905,711, filed Oct. 26, 2025, each of which is expressly incorporated by reference herein in its entirety.

TECHNICAL FIELD

This disclosure relates to systems, methods, and techniques for developing and deploying a bipedal action model (BAM) to control a humanoid robot. The humanoid robot includes a plurality of hardware and software components that are configured to substantially mimic the movements, functionality, and capabilities of a human.

BACKGROUND

The field of robotics has long pursued the goal of creating humanoid robots capable of performing complex tasks in unstructured, human-centric environments. A significant challenge in this pursuit is the development of control systems that can manage the vast number of degrees of freedom (DoF) inherent in a humanoid form. Conventional robotic control systems have traditionally been limited in scope and capability. Many existing models are narrowly focused, designed to control only a specific part of the robot, such as a 7-DoF end-effector or arm. This approach effectively treats the robot as a disembodied limb, failing to coordinate the entire body. As a result, such systems cannot perform actions that require dynamic balance, postural adjustments, or the use of the torso and legs to extend reach and navigate obstacles. The movements produced are often rigid and limited to a constrained set of pre-programmed motions.
Furthermore, a common deficiency in conventional systems is their reliance on generating discrete, or “binned,” action outputs. This method breaks continuous motion into a finite set of poses or commands. The result is often jerky, imprecise, and unnatural movement, akin to a video with a low frame rate. This discretization introduces compounding errors over time, causing the robot to deviate from its intended path and struggle with tasks requiring fluid, continuous adjustments. These systems lack the temporal consistency needed for smooth, long-horizon tasks and are not robust enough to adapt to the unpredictable nature of real-world environments.

Therefore, a significant need exists for a more advanced control architecture that can overcome these fundamental limitations. There is a demand for a system that can provide comprehensive, whole-body control over a high-degree-of-freedom humanoid robot and generate continuous, real-time control outputs to produce fluid, human-like motion, thereby enabling more effective and reliable performance in complex, dynamic settings.

SUMMARY

The presently disclosed subject matter is directed to a control system for a humanoid robot. The system comprises a bipedal action model (BAM) comprising a hierarchical architecture including a beta model configured to execute on one or more processors to perform cognitive tasks at a first, lower frequency, the beta model ingesting multimodal sensory inputs including visual data and natural language instructions, and an alpha model configured to execute on one or more processors to perform reactive tasks at a second, higher frequency, the alpha model being communicatively coupled to the beta model. The BAM is trained on a dataset comprising retargeted robot training data derived from robot-free training data.
The BAM is configured to, at runtime, output a sequence of continuous control commands as parallel-generated action chunks to control a full-body motion of the humanoid robot, said full-body motion comprising at least 18 degrees of freedom.

The presently disclosed subject matter is also directed to a system for generating a bipedal action model (BAM) for a humanoid robot. The system comprises a data collection system configured to generate robot-free training data, said data collection system comprising a wearable collection apparatus configured to be worn by a human operator, wherein the wearable collection apparatus includes a plurality of sensors configured to capture movement data of the human operator while the operator performs tasks without a physical or kinematic connection to the humanoid robot. The system comprises a retargeting module communicatively coupled to the data collection system, the retargeting module comprising one or more processors configured to receive the robot-free training data and translate the robot-free training data into retargeted robot training data by applying a motion retargeting methodology to solve an embodiment mismatch between a kinematic structure of the human operator and a kinematic structure of the humanoid robot. The system comprises a training subsystem configured to train the BAM using the retargeted robot training data, wherein the trained BAM is configured to ingest multimodal sensory inputs and output continuous control commands to control a plurality of degrees of freedom of the humanoid robot.
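The retargeting step described above (and in claims 17-18) solves an inverse kinematics problem matching the human operator's end-effector positions, subject to joint-angle limits and a dynamic stability constraint keeping the center of mass inside the support polygon. A drastically simplified planar sketch follows; the 2-link arm, the closed-form IK, and the CoM proxy are all illustrative assumptions, not the disclosed method.

```python
import numpy as np

def retarget_step(human_wrist_xy, link_lens, joint_limits, com_x_bounds):
    """Toy 2-link planar IK retargeting step (an assumption, not the patent's
    solver): match the human end-effector position subject to joint limits and
    a center-of-mass-within-support-polygon stability check."""
    l1, l2 = link_lens
    x, y = human_wrist_xy
    # Closed-form 2-link IK, elbow-down solution.
    c2 = np.clip((x * x + y * y - l1**2 - l2**2) / (2 * l1 * l2), -1.0, 1.0)
    q2 = np.arccos(c2)
    q1 = np.arctan2(y, x) - np.arctan2(l2 * np.sin(q2), l1 + l2 * np.cos(q2))
    # Enforce joint-angle limits (one of the constraints from claim 18).
    q1, q2 = (np.clip(q, lo, hi) for q, (lo, hi) in zip((q1, q2), joint_limits))
    # Crude CoM proxy: mean x of the two link midpoints; in 2D the "support
    # polygon" degenerates to an x interval under the feet.
    p1 = np.array([l1 * np.cos(q1), l1 * np.sin(q1)])          # elbow position
    p2 = p1 + [l2 * np.cos(q1 + q2), l2 * np.sin(q1 + q2)]     # wrist position
    com_x = (p1[0] / 2 + (p1[0] + p2[0]) / 2) / 2
    stable = com_x_bounds[0] <= com_x <= com_x_bounds[1]
    return (q1, q2), stable

# A single retargeted frame: human wrist at (0.8, 0.4), 0.5 m links (assumed).
(q1, q2), stable = retarget_step((0.8, 0.4), (0.5, 0.5),
                                 [(-np.pi, np.pi), (0.0, np.pi)],
                                 com_x_bounds=(-0.3, 0.6))
```

A real whole-body retargeter would solve this as a constrained optimization over all joints at once (with self-collision avoidance, the third constraint of claim 18); frames whose stability check fails would be rejected or projected back onto the feasible set before entering the training dataset.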