See https://www.cse.unsw.edu.au/~cs9417ml/RL1/algorithms.html
If I add this alternative, then the project should probably have a more generic name, like bb4-reinforcement-learning.