This repository implements the papers *When to Trust Your Model: Model-Based Policy Optimization* (MBPO) and *Value Gradient weighted Model-Based Reinforcement Learning* (VaGraM) in JAX.
The underlying SAC code is built on jaxrl2.
This is a work in progress and builds on newer environment versions, so I cannot guarantee results exactly equivalent to those of the original papers.
The repository uses uv to make running experiments simple. Make sure you have a GPU with CUDA 12 installed.
If you want to run this on CPU, TPU, or an ARM machine, you will have to change the relevant jax package in the `pyproject.toml` file.
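For example, switching to the CPU build could look like the following. This is a sketch assuming the dependency is declared as `jax` with a CUDA extra; check the JAX installation docs for the correct extra for TPU or ARM builds:

```bash
# swap the CUDA 12 JAX build for the CPU-only build
# (the exact dependency name and extras in pyproject.toml are assumptions)
uv remove jax
uv add "jax[cpu]"
```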
The necessary Python packages are installed automatically when you execute the run script. If you need a separate installation, you can also create a virtualenv and run `pip install -e .`, which installs all necessary dependencies.
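A manual setup could look like this (a standard virtualenv workflow, nothing repo-specific):

```bash
# create and activate a fresh virtual environment
python -m venv .venv
source .venv/bin/activate
# install the package in editable mode together with its dependencies
pip install -e .
```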
Logging is handled via Weights & Biases (wandb). Please make sure you have a wandb account. If your system is not yet set up for wandb but you have an account, you will automatically be prompted to generate an API key; simply follow the instructions.
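If you would rather authenticate ahead of time, the standard wandb CLI login works (run it inside the project environment, e.g. via uv):

```bash
# authenticate with Weights & Biases; paste your API key when prompted
uv run wandb login
```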
To run the experiment, simply execute `uv run mbpo/runner/train_online.py`.
The config is handled via Hydra. The default config can be found in `config/main.yaml`.
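Because Hydra parses the command line, individual config values can be overridden directly. The keys below (`seed`, `env_name`) are hypothetical placeholders; check `config/main.yaml` for the actual option names:

```bash
# override config values on the command line (key names are illustrative only)
uv run mbpo/runner/train_online.py seed=1 env_name=HalfCheetah-v4
```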
Planned features (TODO):

- Hydra Submitit integration
- set default configs to the paper values for each environment
- saving and loading of models, and resuming interrupted training
- random distractions from the paper
- a run script for the paper experiments
- modern SAC architectures
- parallel multi-seed training
  - difficult due to variable-length training times in MBPO