unitreerobotics/unitree_rl_mjlab

Unitree RL Mjlab

✳️ Overview

Unitree RL Mjlab is a reinforcement learning project built on mjlab, with MuJoCo as its physics simulation backend. It currently supports the Unitree Go2, Unitree G1, and Unitree H1_2.

Mjlab combines Isaac Lab's proven API with best-in-class MuJoCo physics to provide lightweight, modular abstractions for RL robotics research and sim-to-real deployment.

📦 Installation and Configuration

Please refer to setup.md for installation and configuration steps.

🔁 Process Overview

The basic workflow for using reinforcement learning to achieve motion control is:

Train → Play → Sim2Real

  • Train: The agent interacts with the MuJoCo simulation and optimizes policies through reward maximization.
  • Play: Replay trained policies to verify expected behavior.
  • Sim2Real: Deploy trained policies to physical Unitree robots for real-world execution.
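The Train and Play stages above follow the standard RL episode loop. As a purely illustrative sketch (a toy environment and policy, not mjlab's actual API), the loop looks like this:

```python
# Toy stand-in for the Train/Play loop; mjlab's real env and policy classes differ.
class ToyEnv:
    """Minimal stand-in for a MuJoCo-backed RL environment."""
    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0.0  # observation

    def step(self, action):
        self.steps += 1
        reward = 1.0 - abs(action)  # reward peaks when action == 0
        done = self.steps >= 10     # fixed-length toy episode
        return 0.0, reward, done

def rollout(env, policy):
    """One episode; during training the agent maximizes the summed reward."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total

# "Play" amounts to replaying a fixed policy and checking the return.
print(rollout(ToyEnv(), lambda obs: 0.0))  # 10.0
```

During real training the policy is a neural network updated by the RL algorithm (rsl_rl's PPO here); Play simply reloads the trained weights and runs this same loop without updates.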

🛠️ Usage Guide

1. Velocity Tracking Training

Run the following command to train a velocity tracking policy:

python scripts/train.py Unitree-G1-Flat --env.scene.num-envs=4096

Multi-GPU Training: Scale to multiple GPUs using --gpu-ids:

python scripts/train.py Unitree-G1-Flat \
  --gpu-ids 0 1 \
  --env.scene.num-envs=4096

  • The first positional argument (e.g., Unitree-G1-Flat) specifies the training task. Available velocity tracking tasks:
    • Unitree-Go2-Flat
    • Unitree-G1-Flat
    • Unitree-G1-23Dof-Flat
    • Unitree-H1_2-Flat
    • Unitree-A2-Flat
    • Unitree-R1-Flat
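Conceptually, the task ID is a key into a registry that selects the robot and task configuration. The sketch below is hypothetical (mjlab's real registry and config structures differ) and only illustrates the lookup:

```python
# Hypothetical task registry; names mirror the README's task IDs,
# the config dicts are invented for illustration.
TASK_REGISTRY = {
    "Unitree-Go2-Flat":      {"robot": "go2",  "task": "velocity"},
    "Unitree-G1-Flat":       {"robot": "g1",   "task": "velocity"},
    "Unitree-G1-23Dof-Flat": {"robot": "g1",   "task": "velocity"},
    "Unitree-H1_2-Flat":     {"robot": "h1_2", "task": "velocity"},
    "Unitree-G1-Tracking":   {"robot": "g1",   "task": "tracking"},
}

def resolve_task(task_id: str) -> dict:
    """Map a CLI task ID to its configuration, failing loudly on typos."""
    try:
        return TASK_REGISTRY[task_id]
    except KeyError:
        raise SystemExit(f"Unknown task '{task_id}'. Known: {sorted(TASK_REGISTRY)}")

print(resolve_task("Unitree-G1-Flat"))
```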

Note

For more details, refer to the mjlab documentation.

2. Motion Imitation Training

Train a Unitree G1 to mimic reference motion sequences.

2.1 Prepare Motion Files

Place CSV motion files in src/assets/motions/g1/ and convert them to NPZ format:

python scripts/csv_to_npz.py \
  --input-file src/assets/motions/g1/dance1_subject2.csv \
  --output-name dance1_subject2.npz \
  --input-fps 30 \
  --output-fps 50

NPZ files will be stored at: src/motions/g1/...
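Conceptually, this conversion resamples the 30 fps reference motion to the 50 Hz control rate. A minimal numpy sketch of that idea follows; the actual column layout, interpolation scheme, and NPZ keys used by scripts/csv_to_npz.py may differ:

```python
import numpy as np

def resample_motion(frames: np.ndarray, input_fps: float, output_fps: float) -> np.ndarray:
    """Linearly resample motion frames of shape (T, D) from input_fps to output_fps."""
    t_in = np.arange(len(frames)) / input_fps
    t_out = np.arange(0.0, t_in[-1], 1.0 / output_fps)
    return np.stack(
        [np.interp(t_out, t_in, frames[:, d]) for d in range(frames.shape[1])],
        axis=1,
    )

motion = np.random.default_rng(0).normal(size=(90, 29))  # 3 s at 30 fps, 29 channels
resampled = resample_motion(motion, input_fps=30, output_fps=50)

# Hypothetical NPZ key layout; check csv_to_npz.py for the real one.
np.savez("dance1_subject2_demo.npz", motion=resampled, fps=50)
print(resampled.shape)  # (149, 29)
```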

2.2 Training

After generating the NPZ file, launch imitation training:

python scripts/train.py Unitree-G1-Tracking --motion_file=src/assets/motions/g1/dance1_subject2.npz --env.scene.num-envs=4096

Note

For detailed motion imitation instructions, refer to the BeyondMimic documentation.

⚙️ Parameter Description

  • --env.scene: simulation scene configuration (e.g., num_envs, dt, ground type, gravity, disturbances)
  • --env.observations: observation space configuration (e.g., joint state, IMU, commands, etc.)
  • --env.rewards: reward terms used for policy optimization
  • --env.commands: task commands (e.g., velocity, pose, or motion targets)
  • --env.terminations: termination conditions for each episode
  • --agent.seed: random seed for reproducibility
  • --agent.resume: resume from the last saved checkpoint when enabled
  • --agent.policy: policy network architecture configuration
  • --agent.algorithm: reinforcement learning algorithm configuration (PPO, hyperparameters, etc.)
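Flags such as --env.scene.num-envs address nested fields of the task configuration. The sketch below illustrates that dotted-key mapping in plain Python; mjlab itself uses a typed CLI parser, so this is only a conceptual model:

```python
def apply_override(config: dict, dotted_key: str, value):
    """Set a nested config field from a dotted CLI key, e.g. 'env.scene.num-envs'."""
    *path, leaf = dotted_key.replace("-", "_").split(".")
    node = config
    for part in path:
        node = node.setdefault(part, {})  # walk/create intermediate levels
    node[leaf] = value
    return config

# Invented default config for illustration only.
cfg = {"env": {"scene": {"num_envs": 1024}}, "agent": {"seed": 0}}
apply_override(cfg, "env.scene.num-envs", 4096)
apply_override(cfg, "agent.seed", 42)
print(cfg["env"]["scene"]["num_envs"])  # 4096
```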

Training results are stored at logs/rsl_rl/<robot>_(velocity | tracking)/<date_time>/model_<iteration>.pt

3. Simulation Validation

To visualize policy behavior in MuJoCo:

Velocity tracking:

python scripts/play.py Unitree-G1-Flat --checkpoint_file=logs/rsl_rl/g1_velocity/2026-xx-xx_xx-xx-xx/model_xx.pt

Motion imitation:

python scripts/play.py Unitree-G1-Tracking --motion_file=src/assets/motions/g1/dance1_subject2.npz --checkpoint_file=logs/rsl_rl/g1_tracking/2026-xx-xx_xx-xx-xx/model_xx.pt

Note

  • During training, policy.onnx and policy.onnx.data are also exported for deployment onto physical robots.

Visualization

(demo videos: Go2, G1, H1_2, G1 mimic)

4. Real Deployment

Before deployment, install the required communication tools.

4.1 Power On the Robot

Start the robot in a suspended state and wait until it enters zero-torque mode.

4.2 Enable Debug Mode

While in zero-torque mode, press L2 + R2 on the controller. The robot will enter debug mode with joint damping enabled.

4.3 Connect to the Robot

Connect your PC to the robot via Ethernet. Configure the network as:

  • Address: 192.168.123.222
  • Netmask: 255.255.255.0

Use ifconfig to determine the Ethernet device name for deployment.
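The static address above places your PC on the robot's 192.168.123.0/24 subnet. A quick stdlib check that a configured interface address actually lands on that subnet (the mismatched example address is invented):

```python
import ipaddress

ROBOT_SUBNET = ipaddress.ip_network("192.168.123.0/24")

def on_robot_subnet(addr: str, netmask: str = "255.255.255.0") -> bool:
    """True when addr/netmask puts the interface on the robot's subnet."""
    iface = ipaddress.ip_interface(f"{addr}/{netmask}")
    return iface.network == ROBOT_SUBNET

print(on_robot_subnet("192.168.123.222"))  # True  (address from this guide)
print(on_robot_subnet("192.168.1.50"))     # False (wrong subnet, example)
```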

4.4 Compilation

Example: Unitree G1 velocity control. Place policy.onnx and policy.onnx.data into: deploy/robots/g1/config/policy/velocity/v0/exported. Then compile:

cd deploy/robots/g1
mkdir build && cd build
cmake .. && make

4.5 Deployment

After compilation, run:

cd deploy/robots/g1/build
./g1_ctrl --network=enp5s0

Arguments

  • network: Ethernet interface name (e.g., enp5s0)

Deployment Results

(demo videos: Go2, G1, H1_2, G1 mimic)

🎉 Acknowledgements

This project would not be possible without the contributions of the following repositories:

  • mjlab: training and execution framework
  • whole_body_tracking: versatile humanoid motion tracking framework
  • rsl_rl: reinforcement learning algorithm implementation
  • mujoco_warp: GPU-accelerated rendering and simulation interface
  • mujoco: high-fidelity rigid-body physics engine
