Learning Reward Machines From Partially Observed Policies

This repository contains the code for our paper "Learning Reward Machines From Partially Observed Policies" (in review, 2025).

Create the conda environment and activate it:

cd ./lrm_fd
conda env create -f lrm.yml -n lrm
conda activate lrm

Each experiment from Section 5 of the paper is implemented in its own folder.

Section 5.1 --> ./gridworld_env
Section 5.2 --> ./blockworld_env
Section 5.3 --> ./reacher_env
Section 5.4 --> ./labyrinth_env

To run a single experiment with its default hyperparameters:

cd ./{world}_env # world = {gridworld,blockworld,reacher,labyrinth}
python main.py
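
The per-folder pattern above can also be driven from a small Python helper. The following is only a minimal sketch and is not the contents of run_experiments.sh: the names EXPERIMENT_DIRS, plan_runs, and run_all are hypothetical, and it assumes each folder's main.py runs without extra arguments, as described above.

```python
import subprocess
import sys
from pathlib import Path

# Section-to-folder mapping copied from the list in this README.
EXPERIMENT_DIRS = {
    "5.1": "gridworld_env",
    "5.2": "blockworld_env",
    "5.3": "reacher_env",
    "5.4": "labyrinth_env",
}


def plan_runs(repo_root=".", sections=None):
    """Return (working_dir, command) pairs, one per requested section."""
    sections = list(EXPERIMENT_DIRS) if sections is None else sections
    root = Path(repo_root)
    return [
        (root / EXPERIMENT_DIRS[s], [sys.executable, "main.py"])
        for s in sections
    ]


def run_all(repo_root=".", sections=None, dry_run=False):
    """Run `python main.py` inside each experiment folder, in order."""
    for workdir, cmd in plan_runs(repo_root, sections):
        print(f"[{workdir}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops at the first experiment that fails.
            subprocess.run(cmd, cwd=workdir, check=True)


if __name__ == "__main__":
    # Dry run by default, so the sketch only prints what it would execute.
    run_all(dry_run=True)
```

Calling run_all(sections=["5.1"]) from the repository root would run only the gridworld experiment; passing dry_run=True prints the planned commands without executing them.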

To reproduce all the results in the paper, run:

chmod +x run_experiments.sh
./run_experiments.sh

This will run all the experiments and save the output to results.

Warning

Running all the experiments can take a long time (roughly 5 hours, depending on the available compute resources). Some scripts (e.g. ./old_experiments/patrol_hallway.py) take longer, so please be patient. :)

