Dynamics Randomization Revisited: A Case Study for Quadrupedal Locomotion


Zhaoming Xie, Xingye Da, Michiel van de Panne, Buck Babich, Animesh Garg

arXiv Paper


Understanding the gap between simulation and reality is critical for reinforcement learning with legged robots, where policies are largely trained in simulation. However, recent work has reached sometimes-conflicting conclusions about which factors are important for success, including the role of dynamics randomization. In this paper, we aim to provide clarity and understanding on the role of dynamics randomization in learning robust locomotion policies for the Laikago quadruped robot. Surprisingly, in contrast to prior work with the same robot model, we find that direct sim-to-real transfer is possible without dynamics randomization or on-robot adaptation schemes. We conduct extensive ablation studies in a sim-to-sim setting to understand the key issues underlying successful policy transfer, including other design decisions that can impact policy robustness. We further ground our conclusions via sim-to-real experiments with various gaits, speeds, and stepping frequencies.
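For readers unfamiliar with the technique under study: dynamics randomization perturbs simulator parameters at the start of each training episode so the policy cannot overfit one fixed model. A minimal sketch is below; the parameter names and ranges are illustrative placeholders, not the values used in the paper.

```python
import random

# Hypothetical parameter ranges, for illustration only; the paper's
# actual randomization setup is not reproduced here.
PARAM_RANGES = {
    "ground_friction": (0.5, 1.25),
    "link_mass_scale": (0.8, 1.2),
    "motor_strength_scale": (0.8, 1.2),
    "latency_s": (0.0, 0.04),
}

def sample_dynamics():
    """Draw one set of dynamics parameters for a training episode."""
    return {name: random.uniform(lo, hi)
            for name, (lo, hi) in PARAM_RANGES.items()}

# At the start of each episode, resample and apply to the simulator:
episode_params = sample_dynamics()
```

A policy trained this way must work across the whole sampled family of dynamics; the paper's finding is that, with the right design choices, this extra machinery is not needed for transfer.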

Main Video

Dynamics Randomization is Not Necessary

The following policies are trained without dynamics randomization and can transfer directly to the physical robot.


Dynamics Randomization is Not Sufficient

We apply alternative design choices, such as removing velocity feedback or using a stiff proportional gain. We find that in these scenarios dynamics randomization is not sufficient for successful transfer.

No Velocity Feedback + Rand
kp=160 + Rand
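To make the two ablations concrete: the low-level controller is a joint-space PD law, so "no velocity feedback" corresponds to dropping the damping term (kd = 0), and "kp=160" corresponds to a stiff proportional gain. A minimal sketch, with the kd default chosen for illustration only:

```python
def pd_torque(q_des, q, qd, kp=160.0, kd=4.0):
    """Joint-level PD control: tau = kp * (q_des - q) - kd * qd.

    Setting kd = 0 removes velocity feedback; a large kp
    (e.g. 160, as in the ablation above) gives stiff tracking.
    """
    return kp * (q_des - q) - kd * qd

# Without velocity feedback, the damping term vanishes entirely:
tau_no_vel = pd_torque(q_des=0.0, q=0.0, qd=1.0, kd=0.0)  # -> 0.0
```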

Robustness Tests

We conduct extensive robustness tests. Most of the policies are trained without dynamics randomization.

Payload Test: Trot
Payload Test: Pace
Slope Test
Push Test: Trot
Latency Test: No Rand + 32ms Delay
Latency Test: Rand + 32ms Delay


This project is supported by the following.