Real-is-Sim: Bridging the Sim-to-Real Gap with a Dynamic Digital Twin

The real-is-sim framework operating with an always-in-the-loop simulator. The learnt PushT policy acts on the simulator, the simulator continuously synchronizes itself with the real world, and the real robot mimics the simulated robot.

Abstract

We introduce real-is-sim, a new approach to integrating simulation into behavior cloning pipelines. In contrast to real-only methods, which offer no safe way to test policies before deployment, and sim-to-real methods, which require complex adaptation to cross the sim-to-real gap, our framework allows policies to seamlessly switch between running on real hardware and running in parallelized virtual environments. At the center of real-is-sim is a dynamic digital twin, powered by the Embodied Gaussians simulator, that synchronizes with the real world at 60Hz. This twin acts as a mediator between the behavior cloning policy and the real robot. Policies are trained using representations derived from simulator states and always act on the simulated robot, never the real one. During deployment, the real robot simply follows the simulated robot's joint states, and the simulation is continuously corrected with real-world measurements. This setup, where the simulator drives all policy execution and maintains real-time synchronization with the physical world, shifts the responsibility of crossing the sim-to-real gap to the digital twin's synchronization mechanisms rather than to the policy itself. We demonstrate real-is-sim on a long-horizon manipulation task (PushT), showing that virtual evaluations are consistent with real-world results. We further show how real-world data can be augmented with virtual rollouts, and we compare policies trained on different representations derived from the simulator state, including object poses and rendered images from both static and robot-mounted cameras. Our results highlight the flexibility of the real-is-sim framework across training, evaluation, and deployment stages.
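To make the execution model concrete, here is a minimal sketch of the control loop, assuming hypothetical sim, real_robot, and policy interfaces (none of these names come from the paper's actual API):

```python
# Minimal sketch of the real-is-sim control loop. The policy only ever
# observes and commands the simulated robot; the real robot mirrors it,
# and real-world measurements flow back into the simulator. All object
# interfaces here are illustrative assumptions.

SYNC_HZ = 60  # the digital twin synchronizes with the real world at 60Hz

def control_loop(sim, real_robot, policy):
    dt = 1.0 / SYNC_HZ
    while True:
        # 1. The policy acts on a representation of the simulator state
        #    (e.g. object poses or rendered images), never on raw real data.
        action = policy(sim.get_representation())
        sim.apply_action(action)  # command the simulated robot only
        sim.step(dt)

        # 2. The real robot follows the simulated robot's joint states.
        real_robot.track_joints(sim.robot_joint_positions())

        # 3. Real-world measurements continuously correct the simulator,
        #    which is where the sim-to-real gap is absorbed.
        sim.correct(real_robot.sense())
```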

Framework

Framework illustration

The real-is-sim framework seamlessly transitions between online and offline modes because it uses an always-in-the-loop simulator as a mediator.
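Because the policy's interface is always the simulator, switching modes amounts to attaching or detaching the real hardware. A hedged sketch, reusing the hypothetical interfaces from above:

```python
def run(sim, policy, real_robot=None, steps=1000, dt=1.0 / 60.0):
    """Offline mode when real_robot is None; online mode otherwise."""
    for _ in range(steps):
        action = policy(sim.get_representation())
        sim.apply_action(action)
        sim.step(dt)
        if real_robot is not None:  # online mode: hardware in the loop
            real_robot.track_joints(sim.robot_joint_positions())
            sim.correct(real_robot.sense())  # keep the twin synchronized
```

The policy code is identical in both branches, which is what makes offline evaluations predictive of online behaviour.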

Learnt Policies

Policies learnt from different representations extracted from the simulator state.
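As a sketch of what these representations might look like for PushT, assuming illustrative accessor names on the simulator state (not the actual API):

```python
import numpy as np

def state_representation(sim):
    # Low-dimensional state: planar T-block pose (x, y, yaw) plus the
    # end-effector position, all read from the simulator, not the real world.
    return np.concatenate([sim.object_pose_2d(), sim.end_effector_xy()])

def static_camera_representation(sim):
    # Image rendered from a fixed virtual camera in the simulated scene.
    return sim.render(camera="static")

def gripper_camera_representation(sim):
    # Image rendered from a virtual camera mounted on the end effector.
    return sim.render(camera="gripper")
```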

Online Mode

Our system operating in online mode, with Embodied Gaussians acting as a mediator between the policy and the real world.
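Embodied Gaussians corrects the simulation directly from camera images; as a simplified stand-in for that visual correction mechanism, the sketch below illustrates the shape of the synchronization step as a proportional pull of simulated object poses toward measured estimates (the gain and interfaces are assumptions):

```python
import numpy as np

K_CORRECT = 5.0  # illustrative correction gain

def correct_poses(sim_poses, measured_poses, dt=1.0 / 60.0):
    """Blend simulated planar poses (x, y, yaw) toward real measurements.

    sim_poses, measured_poses: (N, 3) arrays. Returns the corrected
    poses after one 60Hz synchronization step.
    """
    error = measured_poses - sim_poses
    # wrap the yaw error to [-pi, pi] before applying the correction
    error[:, 2] = (error[:, 2] + np.pi) % (2 * np.pi) - np.pi
    return sim_poses + K_CORRECT * dt * error
```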

Offline Mode

Our system operating in offline mode, evaluating an imitation learning policy. Offline mode can render virtual camera views for multiple scenes simultaneously and can evaluate both vision-based and state-based policies.
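A hedged sketch of how such a batched evaluation could look, assuming a hypothetical make_env factory and a policy that accepts batched observations:

```python
import numpy as np

def evaluate_offline(make_env, policy, num_envs=64, horizon=300):
    # Roll out one policy across many simulated scenes in parallel and
    # report the fraction of successful episodes (e.g. PushT success rate).
    envs = [make_env(seed=i) for i in range(num_envs)]
    for _ in range(horizon):
        # Batched observations: rendered images or state vectors per scene.
        obs = np.stack([env.get_representation() for env in envs])
        actions = policy(obs)  # one forward pass for the whole batch
        for env, action in zip(envs, actions):
            env.apply_action(action)
            env.step()
    return float(np.mean([env.task_success() for env in envs]))
```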

Gripper Virtual Camera

An imitation learning policy trained on images rendered from a virtual camera mounted on the end effector.
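One way to realize such a camera, sketched under the assumption of 4x4 homogeneous transforms: the virtual camera's world pose is the simulated end-effector pose composed with a fixed mounting offset (the 5cm offset below is illustrative):

```python
import numpy as np

T_EE_CAM = np.eye(4)
T_EE_CAM[:3, 3] = [0.0, 0.0, 0.05]  # camera 5 cm along the gripper z-axis

def gripper_camera_world_pose(T_world_ee: np.ndarray) -> np.ndarray:
    """Compose the end-effector pose with the fixed camera mount offset."""
    return T_world_ee @ T_EE_CAM
```

Because the camera is virtual, its viewpoint tracks the simulated end effector without requiring any physical camera to be mounted there.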

Static Virtual Camera

An imitation learning policy trained on images rendered from a static virtual camera.

Interesting Exploration Behaviour

When the T-block is not in view of its virtual gripper camera, the imitation learning policy explores the workspace to find it.

Desynchronization

The imitation learning policy produces actions that actively prevent the visual correction mechanism from resynchronizing the simulator with the real world.