CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation

Notice: This research summary and analysis were automatically generated using AI. For absolute accuracy, please refer to the original arXiv source.

Long-horizon whole-body humanoid teleoperation remains challenging due to accumulated global pose drift, particularly on full-sized humanoids. Although recent learning-based tracking methods enable agile and coordinated motions, they typically operate in the robot’s local frame and neglect global pose feedback, leading to drift and instability during extended execution. In this work, we present CLOT, a real-time whole-body humanoid teleoperation system that achieves closed-loop global motion tracking via high-frequency localization feedback. CLOT synchronizes operator and robot poses in a closed loop, enabling drift-free human-to-humanoid mimicry over long time horizons. However, directly imposing global tracking rewards in reinforcement learning often results in aggressive and brittle corrections. To address this, we propose a data-driven randomization strategy that decouples observation trajectories from reward evaluation, enabling smooth and stable global corrections. We further regularize the policy with an adversarial motion prior to suppress unnatural behaviors. To support CLOT, we collect 20 hours of carefully curated human motion data for training the humanoid teleoperation policy. We design a transformer-based policy and train it for over 1300 GPU hours. The policy is deployed on a full-sized humanoid with 31 DoF (excluding hands). Both simulation and real-world experiments verify high-dynamic motion, high-precision tracking, and strong robustness in sim-to-real humanoid teleoperation. Motion data, demos, and code can be found on our website.


💡 Research Summary

The paper introduces CLOT, a closed‑loop whole‑body teleoperation system for full‑size humanoid robots that eliminates accumulated global pose drift by incorporating high‑frequency localization feedback. Traditional learning‑based tracking methods operate in the robot’s local frame, ignoring global pose information, which leads to drift and instability over long horizons, especially for large robots with high centers of mass. CLOT solves this by using an optical motion‑capture system to obtain real‑time global pose estimates of both the human operator and the robot, feeding this data back into the control policy.
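The core of such a feedback loop is computing the robot's global pose error relative to the operator's target and expressing it in the robot's own frame. A minimal sketch of that computation is shown below; the function name and the (x, y, yaw) pose representation are illustrative assumptions, not details from the paper.

```python
import numpy as np

def global_pose_error(robot_pose, target_pose):
    """Position and heading error of the robot w.r.t. a global target,
    expressed in the robot's yaw-aligned local frame.

    Both poses are (x, y, yaw) tuples; this planar representation is an
    illustrative simplification of a full 6-DoF mocap pose."""
    dx = target_pose[0] - robot_pose[0]
    dy = target_pose[1] - robot_pose[1]
    yaw = robot_pose[2]
    # Rotate the world-frame position error into the robot's heading frame.
    local = np.array([np.cos(-yaw) * dx - np.sin(-yaw) * dy,
                      np.sin(-yaw) * dx + np.cos(-yaw) * dy])
    # Wrap the heading error to (-pi, pi].
    dyaw = (target_pose[2] - robot_pose[2] + np.pi) % (2 * np.pi) - np.pi
    return local, dyaw
```

Feeding this error into the policy's observation at the localization rate is what closes the loop: any drift shows up as a nonzero error the policy can correct.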

Directly adding a global‑tracking reward to reinforcement learning, however, tends to produce aggressive corrections and unsafe behaviors. To mitigate this, the authors propose an “Observation Pre‑shift” randomization technique: during training the observation window is randomly shifted forward in time while the reward continues to be evaluated against the current target. This decouples the observation trajectory from the reward trajectory, forcing the policy to implicitly learn smooth motion interpolation and stable global corrections.
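The decoupling described above can be sketched in a few lines: the policy observes a reference frame shifted randomly forward in time, while the tracking reward is still evaluated against the current frame. The function below is an illustrative reconstruction of this idea; the names, the shift range, and the frame representation are assumptions, not the paper's implementation.

```python
import random

def preshift_observation(ref_motion, t, max_shift=5):
    """Observation Pre-shift (sketch): shift the policy's reference
    observation forward by a random offset while the reward target
    stays at the current frame t. This forces the policy to learn
    smooth interpolation toward the (shifted) observed goal."""
    shift = random.randint(0, max_shift)
    obs_idx = min(t + shift, len(ref_motion) - 1)  # clamp at clip end
    obs_frame = ref_motion[obs_idx]    # what the policy sees
    reward_target = ref_motion[t]      # what the reward tracks
    return obs_frame, reward_target
```

Because the observed reference and the rewarded reference disagree by a small random offset, the policy cannot simply snap to the observation; it learns corrections that stay stable under mismatch.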

The policy is built on a Transformer encoder that processes a tokenized sequence of proprioceptive states, historical observations, and global pose goals, enabling the capture of long‑range temporal dependencies required for complex loco‑manipulation. An adversarial motion prior (AMP) is added as an auxiliary reward to suppress unnatural joint accelerations and ensure realistic motion.
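One common way to feed such heterogeneous inputs to a Transformer encoder is to project each modality (proprioception, history frames, global goal) to a shared embedding width and stack the results into a token sequence. The sketch below illustrates that tokenization pattern only; the random projection matrices stand in for learned parameters, and all dimensions are illustrative, not the paper's.

```python
import numpy as np

def build_token_sequence(proprio, history, global_goal, d_model=64, rng=None):
    """Sketch: tokenize proprioceptive state, historical observations,
    and a global pose goal into a (seq_len, d_model) sequence for a
    Transformer encoder. Random matrices stand in for learned
    per-modality linear projections (illustrative only)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    tokens = []
    for vec in (proprio, *history, global_goal):
        W = rng.standard_normal((len(vec), d_model)) / np.sqrt(len(vec))
        tokens.append(np.asarray(vec) @ W)  # one token per modality/frame
    return np.stack(tokens)                 # shape: (seq_len, d_model)
```

Self-attention over this sequence is what lets the policy relate the current state to both its recent history and the global goal in a single pass.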

A bespoke dataset of 20 hours of human motion was collected with strict protocols to avoid unstable behaviors and to respect humanoid kinematic limits. The dataset covers basic locomotion, high‑dynamic whole‑body actions, balancing, martial arts, and dance, providing diverse training material.

Training employs Proximal Policy Optimization (PPO) with extensive domain randomization (mass, inertia, friction, sensor noise) and curriculum learning to bridge the sim‑to‑real gap. Over 1300 GPU‑hours were spent training the Transformer‑based actor‑critic network.
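Per-episode domain randomization of the kind listed above is typically implemented by resampling physics and sensor parameters at each environment reset. The sketch below shows that pattern; the specific ranges are illustrative placeholders, not values reported in the paper.

```python
import random

def sample_domain_randomization(seed=None):
    """Sketch: sample per-episode physics/sensor parameters for
    sim-to-real training. Ranges are illustrative placeholders."""
    rng = random.Random(seed)
    return {
        "mass_scale":    rng.uniform(0.8, 1.2),   # link mass multiplier
        "inertia_scale": rng.uniform(0.8, 1.2),   # link inertia multiplier
        "friction":      rng.uniform(0.4, 1.2),   # ground friction coefficient
        "obs_noise_std": rng.uniform(0.0, 0.03),  # joint sensor noise std (rad)
    }
```

Resampling these at every reset exposes the policy to a distribution of dynamics rather than a single simulator instance, which is what makes the learned behavior transferable to hardware.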

The resulting policy was deployed on the Adam Pro robot (31 DoF, excluding hands). Both simulation and real‑world experiments demonstrate high‑dynamic motions, precise tracking of the operator’s global trajectory, and robust disturbance rejection over extended time horizons. The system maintains drift‑free mimicry, enabling safe, long‑duration teleoperation for complex contact‑rich tasks.

In summary, CLOT contributes (1) a real‑time global pose feedback loop for whole‑body teleoperation, (2) the Observation Pre‑shift strategy for stable global tracking, (3) a high‑quality human motion dataset tailored to humanoid dynamics, and (4) a Transformer‑based policy regularized by an adversarial motion prior, collectively advancing the state of the art in long‑horizon, high‑performance humanoid teleoperation.

