Code for the paper: "Learning Reward Machines From Partially Observed Policies"

mlshehab/learning_reward_machines

Learning Reward Machines From Partially Observed Policies

This repository contains code for our paper: Learning Reward Machines From Partially Observed Policies, In Review, 2025.

Create a virtual environment and activate it using:

cd ./lrm_fd
conda env create -f lrm.yml -n lrm
conda activate lrm

Each experiment from Section 5 of the paper is implemented in its own folder.

  • Section 5.1 --> ./gridworld_env
  • Section 5.2 --> ./blockworld_env
  • Section 5.3 --> ./reacher_env
  • Section 5.4 --> ./labyrinth_env

Generally, to run an experiment with the default hyperparameters, run:

cd ./{world}_env # world = {gridworld,blockworld,reacher,labyrinth}
python main.py
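
The same pattern can be scripted when several experiments need to run back to back. The sketch below is a hypothetical helper, not a file in this repository and not the contents of run_experiments.sh. It assumes it is launched from the directory that contains the four experiment folders, and that each folder's main.py runs with default hyperparameters when called without arguments. The names WORLDS, experiment_dirs, and run_all are illustrative.

```python
import subprocess
import sys
from pathlib import Path

# Folder prefixes taken from the experiment list above.
WORLDS = ["gridworld", "blockworld", "reacher", "labyrinth"]


def experiment_dirs(root="."):
    """Return the path of each experiment folder under `root`."""
    return [Path(root) / f"{world}_env" for world in WORLDS]


def run_all(root=".", dry_run=True):
    """Run main.py in every experiment folder and return the folder names.

    With dry_run=True (the default) the commands are only printed,
    so the helper can be checked without starting a long computation.
    """
    handled = []
    for folder in experiment_dirs(root):
        if dry_run:
            print(f"would run: cd {folder} && python main.py")
        else:
            # Assumption: main.py takes no required arguments.
            # check=True stops the loop at the first failing experiment.
            subprocess.run([sys.executable, "main.py"], cwd=folder, check=True)
        handled.append(folder.name)
    return handled


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling run_all(dry_run=False) would launch the experiments one after another using the Python interpreter of the active environment.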

To reproduce all the results in the paper, run:

chmod +x run_experiments.sh
./run_experiments.sh

This will run all the experiments and save the output to results.

Warning

Running all the experiments can take a long time (about 5 hours, depending on the available compute resources). Some scripts (e.g. ./old_experiments/patrol_hallway.py) take longer, so please be patient. :)
