Motion Priors Reimagined: Adapting Flat-Terrain Skills for Complex Quadruped Mobility

Abstract

Reinforcement learning (RL)-based legged locomotion controllers often require meticulous reward tuning to track velocities or goal positions while preserving smooth motion on various terrains. Motion imitation methods that train RL policies on demonstration data reduce reward engineering but fail to generalize to novel environments. We address this by proposing a hierarchical RL framework in which a low-level policy is first pre-trained to imitate animal motions on flat ground, thereby establishing motion priors. A subsequent high-level, goal-conditioned policy then builds on these priors, learning residual corrections that enable perceptive locomotion, local obstacle avoidance, and goal-directed navigation across diverse and rugged terrains. Simulation experiments show that the learned residuals adapt to progressively more challenging uneven terrains while preserving the locomotion characteristics provided by the motion priors. Furthermore, our results demonstrate improved motion regularization over baseline models trained without motion priors under comparable reward setups. Real-world experiments with an ANYmal-D quadruped robot confirm that our policy generalizes animal-like locomotion skills to complex terrains, delivering smooth, efficient locomotion and local navigation amid obstacle-rich terrain.
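
A minimal sketch of how such a prior-plus-residual composition could look in code is given below. The module sizes, observation split (proprioception, height scan, goal command), residual scale, and use of joint-position targets are illustrative assumptions for a quadruped, not the paper's implementation.

    import torch
    import torch.nn as nn

    class MLP(nn.Module):
        """Small feed-forward network used as a stand-in for both policies."""
        def __init__(self, in_dim, out_dim, hidden=(256, 128)):
            super().__init__()
            layers, d = [], in_dim
            for h in hidden:
                layers += [nn.Linear(d, h), nn.ELU()]
                d = h
            layers.append(nn.Linear(d, out_dim))
            self.net = nn.Sequential(*layers)

        def forward(self, x):
            return self.net(x)

    # Assumed dimensions: 48-D proprioception, 187-D height scan,
    # 3-D goal command, 12 actuated joints on a quadruped.
    PROPRIO_DIM, SCAN_DIM, GOAL_DIM, NUM_JOINTS = 48, 187, 3, 12
    RESIDUAL_SCALE = 0.3  # assumed cap on how far residuals may push the prior

    # Low-level policy: pre-trained on flat-ground motion imitation, then frozen.
    low_level = MLP(PROPRIO_DIM, NUM_JOINTS)
    # High-level policy: goal-conditioned and perceptive, trained with task rewards.
    high_level = MLP(PROPRIO_DIM + SCAN_DIM + GOAL_DIM, NUM_JOINTS)

    def act(proprio, height_scan, goal):
        """Compose final joint targets as motion prior plus bounded residual."""
        with torch.no_grad():  # the motion prior stays fixed during high-level training
            prior_action = low_level(proprio)
        residual = high_level(torch.cat([proprio, height_scan, goal], dim=-1))
        return prior_action + RESIDUAL_SCALE * torch.tanh(residual)

    # Example step with random observations.
    a = act(torch.randn(1, PROPRIO_DIM), torch.randn(1, SCAN_DIM), torch.randn(1, GOAL_DIM))
    print(a.shape)  # torch.Size([1, 12])

Bounding or penalizing the residual magnitude is what keeps the high-level corrections from overriding the animal-like prior; the "With Zero Residual Penalty" comparison below presumably ablates exactly this constraint.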

Motion Priors from Low-Level Policy

Walk

Pace

Canter


High-Level Policy

Flat Ground


Uneven Terrain


Stairs


Local Navigation


Comparative Results

With Baseline

With Zero Residual Penalty


Extended Results

Training with Only Five Reward Terms


Training with RL-Based Low-Level Motions on Overhanging Obstacle-Rich Terrains


The low-level actions (bottom left) are recorded during testing and later replayed through the low-level policy.
(Slight timing mismatches occur due to frame-rate inconsistencies across different terrains in IsaacGym.)
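
A rough sketch of one way such a record-and-replay setup could work is shown below. The environment interface, file name, control rate, and open-loop replay are assumptions for illustration, not the authors' tooling; the timing drift mentioned above appears when the per-terrain simulation rate differs from the rate assumed at replay time.

    import numpy as np

    CONTROL_DT = 0.02  # assumed 50 Hz control rate; the sim frame rate may differ per terrain

    def record(policy, env, steps, path="low_level_actions.npy"):
        """Roll out the low-level policy and store its raw actions for later replay."""
        obs, actions = env.reset(), []
        for _ in range(steps):
            a = policy(obs)
            actions.append(np.asarray(a))
            obs = env.step(a)
        np.save(path, np.stack(actions))

    def replay(env, path="low_level_actions.npy"):
        """Feed the stored actions back open-loop; timing drifts if the sim dt != CONTROL_DT."""
        env.reset()
        for a in np.load(path):
            env.step(a)

    # Minimal dummy environment and policy so the sketch runs stand-alone.
    class _DummyEnv:
        def reset(self):
            return np.zeros(48)
        def step(self, a):
            return np.zeros(48)

    if __name__ == "__main__":
        record(lambda obs: np.zeros(12), _DummyEnv(), steps=10)
        replay(_DummyEnv())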