Releases · Stable-Baselines-Team/stable-baselines3-contrib

05 Dec 11:32

araffin

v2.7.1

55dede1

Latest

v2.7.1: Fix tensorboard log name

Warning

Stable-Baselines3 (SB3) v2.7.1 will be the last release supporting Python 3.9 (end of life in October 2025).
We highly recommend upgrading to Python >= 3.10.

Bug fixes

Fix typo: change MaskablePPO tb_log_name from "PPO" to "MaskablePPO" by @Copilot in #307
Fix CI badge link in README.md by @araffin in #312

New Contributors

@Copilot made their first contribution in #307

Full Changelog: v2.7.0...v2.7.1

Contributors

araffin


25 Jul 13:03

araffin

v2.7.0

33889db

v2.7.0: Added support for n-step returns for off-policy algorithms

Breaking Changes

Upgraded to SB3 >= 2.7.0

New features

Add n-step returns support with n_steps parameter
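As a rough illustration of what an n-step return is, the sketch below sums n discounted rewards and then bootstraps from a value estimate. It is a standalone example, not the SB3 implementation: the function name and its arguments are made up here, and episode termination inside the n steps is deliberately ignored.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Discounted sum of `rewards` plus a discounted bootstrap value.

    Hypothetical helper for illustration only. `rewards` holds the n rewards
    collected after the sampled transition, and `bootstrap_value` is the
    critic's estimate for the state reached after those n steps.
    """
    target = 0.0
    for k, reward in enumerate(rewards):
        target += (gamma ** k) * reward
    # Bootstrap from the value estimate n steps ahead.
    return target + (gamma ** len(rewards)) * bootstrap_value


# With a single reward this reduces to the usual one-step TD target.
one_step = n_step_target([1.0], bootstrap_value=10.0, gamma=0.5)
three_step = n_step_target([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
print(one_step, three_step)
```

Larger n relies more on observed rewards and less on the bootstrap estimate, which is the trade-off the n_steps parameter exposes.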

Bug fixes

Use the FloatSchedule and LinearSchedule classes instead of lambdas in the ARS, PPO, and QRDQN implementations to improve model portability across different operating systems
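One way to see why a schedule class can be more portable than a lambda: the state of a class instance is plain data, whereas a lambda only exists as code. The sketch below uses a hypothetical LinearDecay class (a stand-in, not SB3's LinearSchedule) and JSON purely to show that its state round-trips as numbers; SB3's own save format is not reproduced here.

```python
import json


class LinearDecay:
    """Hypothetical schedule class, for illustration only.

    Maps `progress_remaining` (1.0 at the start of training, 0.0 at the end)
    to a value interpolated between `start` and `end`.
    """

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __call__(self, progress_remaining):
        return self.end + progress_remaining * (self.start - self.end)


schedule = LinearDecay(start=1.0, end=0.5)

# Only two numbers need to be stored to rebuild the schedule elsewhere.
payload = json.dumps(vars(schedule))
restored = LinearDecay(**json.loads(payload))

print(payload)
print(restored(1.0), restored(0.5), restored(0.0))
```

An equivalent lambda has no such data to store, so saving it means saving the function itself.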

New Contributors

@akanto made their first contribution in #294

Full Changelog: v2.6.0...v2.7.0

Contributors

akanto


24 Mar 15:01

araffin

v2.6.0

00a401d

v2.6.0: Fix for `MaskablePPO` with `SubprocVecEnv`, add Gymnasium v1.1 support

Breaking Changes:

Upgraded to Stable-Baselines3 >= 2.6.0
Renamed _dump_logs() to dump_logs()

New Features:

Added support for Gymnasium v1.1.0

Bug Fixes:

Fixed issues with SubprocVecEnv and MaskablePPO by using vec_env.has_attr() (pickling issues, mask function not present)
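The idea behind the fix can be sketched with toy classes: ask the vectorized environment by name whether every sub-environment has the mask method, then call it by name, so that no function object has to be sent between processes. ToyEnv, ToyVecEnv, and get_masks below are hypothetical stand-ins written for this example; they only mimic the has_attr() and env_method() calling style and are not the SB3 classes.

```python
class ToyEnv:
    """Hypothetical environment that exposes an action mask."""

    def action_masks(self):
        return [True, False, True]


class ToyVecEnv:
    """Toy stand-in for a vectorized env (not SB3's VecEnv).

    Only attribute names and results cross this interface, never function
    objects, which is what a subprocess-backed implementation needs.
    """

    def __init__(self, envs):
        self.envs = envs

    def has_attr(self, attr_name):
        # True only if every wrapped env has the attribute.
        return all(hasattr(env, attr_name) for env in self.envs)

    def env_method(self, method_name):
        # Call the named method on each env and collect the results.
        return [getattr(env, method_name)() for env in self.envs]


def get_masks(vec_env, mask_method="action_masks"):
    """Fetch masks by method name, failing clearly if the method is missing."""
    if not vec_env.has_attr(mask_method):
        raise ValueError(f"Environments do not implement {mask_method}()")
    return vec_env.env_method(mask_method)


masks = get_masks(ToyVecEnv([ToyEnv(), ToyEnv()]))
print(masks)
```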

Full Changelog: v2.5.0...v2.6.0


27 Jan 12:33

araffin

v2.5.0

c070fc2

SB3-Contrib v2.5.0: NumPy v2.0 support

Breaking changes:

Upgraded to PyTorch 2.3.0
Dropped Python 3.8 support
Upgraded to Stable-Baselines3 >= 2.5.0

New Contributors

@kplers made their first contribution in #266

Full Changelog: v2.4.0...v2.5.0

Contributors

kplers


18 Nov 10:33

araffin

v2.4.0

d5ac968

SB3-Contrib v2.4.0: New algorithm (CrossQ), Gymnasium v1.0 support

Breaking Changes:

Upgraded to Stable-Baselines3 >= 2.4.0

New Features:

Added CrossQ algorithm, from "Batch Normalization in Deep Reinforcement Learning" paper (@danielpalen)
Added BatchRenorm PyTorch layer used in CrossQ (@danielpalen)
Added support for Gymnasium v1.0

Bug Fixes:

Updated QR-DQN optimizer input to only include quantile_net parameters (@corentinlger)
Updated QR-DQN paper link in docs (@corentinlger)
Fixed a warning with PyTorch 2.4 when loading a RecurrentPPO model (You are using torch.load with weights_only=False)
Fixed loading a QRDQN model changing target_update_interval (@jak3122)

Others:

Updated PyTorch version on CI to 2.3.1
Remove unnecessary SDE noise resampling in PPO/TRPO update
Switched to uv to download packages on GitHub CI

New Contributors

@corentinlger made their first contribution in #252
@jak3122 made their first contribution in #259
@danielpalen made their first contribution in #243

Full Changelog: v2.3.0...v2.4.0

Contributors

jak3122, danielpalen, and corentinlger


31 Mar 18:41

araffin

v2.3.0

5102922

SB3-Contrib v2.3.0: New default hyperparameters for QR-DQN

Breaking Changes:

Upgraded to Stable-Baselines3 >= 2.3.0
The default learning_starts parameter of QRDQN has been changed to be consistent with the other off-policy algorithms

# SB3 < 2.3.0 default hyperparameters (50_000 corresponded to the Atari default hyperparameters)
# model = QRDQN("MlpPolicy", env, learning_starts=50_000)
# SB3 >= 2.3.0:
model = QRDQN("MlpPolicy", env, learning_starts=100)

New Features:

Added rollout_buffer_class and rollout_buffer_kwargs arguments to MaskablePPO
Log success rate (rollout/success_rate) when available for on-policy algorithms

Others:

Fixed train_freq type annotation for tqc and qrdqn (@Armandpl)
Fixed sb3_contrib/common/maskable/*.py type annotations
Fixed sb3_contrib/ppo_mask/ppo_mask.py type annotations
Fixed sb3_contrib/common/vec_env/async_eval.py type annotations

Documentation:

Add some additional notes about MaskablePPO (evaluation and multi-process) (@icheered)

Full Changelog: v2.2.1...v2.3.0

Contributors

Armandpl and icheered


17 Nov 23:37

araffin

v2.2.1

707cb0f

SB3-Contrib v2.2.1

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

Breaking Changes:

Upgraded to Stable-Baselines3 >= 2.2.1
Switched to ruff for sorting imports (isort is no longer needed); black and ruff now require a minimum version
Dropped x is False in favor of not x, which means that callbacks that wrongly returned None (instead of a boolean) will cause the training to stop (@iwishiwasaneagle)
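The effect of the last change can be shown in a few lines of plain Python. The helper names below are invented for this example and only mirror the two styles of check; they are not functions from SB3.

```python
def keep_training_old(callback_result):
    # Old-style check: only an explicit False stopped training.
    return callback_result is not False


def keep_training_new(callback_result):
    # New-style check: any falsy value, including None, stops training.
    return bool(callback_result)


def forgetful_callback():
    # Bug: no return statement, so this returns None instead of True.
    pass


result = forgetful_callback()
print(keep_training_old(result))  # the buggy callback used to be tolerated
print(keep_training_new(result))  # now the same callback stops training
```

Callbacks that should let training continue therefore need an explicit `return True`.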

New Features:

Added set_options for AsyncEval
Added rollout_buffer_class and rollout_buffer_kwargs arguments to TRPO

Others:

Fixed ActorCriticPolicy.extract_features() signature by adding an optional features_extractor argument
Update dependencies (accept newer Shimmy/Sphinx version and remove sphinx_autodoc_typehints)

Contributors

iwishiwasaneagle


20 Aug 12:15

araffin

v2.1.0

67d3eef

SB3-Contrib v2.1.0

Breaking Changes:

Removed Python 3.7 support
SB3 now requires PyTorch >= 1.13
Upgraded to Stable-Baselines3 >= 2.1.0

New Features:

Added Python 3.11 support

Bug Fixes:

Fixed MaskablePPO ignoring stats_window_size argument

Full Changelog: v2.0.0...v2.1.0


23 Jun 13:00

araffin

v2.0.0

de92025

SB3-Contrib v2.0.0: Gymnasium Support

Warning
Stable-Baselines3 (SB3) v2.0 will be the last release supporting Python 3.7 (end of life in June 2023).
We highly recommend upgrading to Python >= 3.8.

To upgrade:

pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

or simply (rl zoo depends on SB3 and SB3 contrib):

pip install rl_zoo3 --upgrade

Breaking Changes

Switched to Gymnasium as primary backend, Gym 0.21 and 0.26 are still supported via the shimmy package (@carlosluis, @arjun-kg, @tlpss)
Upgraded to Stable-Baselines3 >= 2.0.0

Bug fixes

Fixed QRDQN update interval for multi envs

Others

Fixed sb3_contrib/tqc/*.py type hints
Fixed sb3_contrib/trpo/*.py type hints
Fixed sb3_contrib/common/envs/invalid_actions_env.py type hints

Full Changelog: v1.8.0...v2.0.0

Contributors

carlosluis, arjun-kg, and tlpss


08 Apr 16:19

araffin

v1.8.0

a84ad3a

SB3-Contrib v1.8.0

Warning
Stable-Baselines3 (SB3) v1.8.0 will be the last one to use Gym as a backend.
Starting with v2.0.0, Gymnasium will be the default backend (though SB3 will have compatibility layers for Gym envs).
You can find a migration guide here.
If you want to try the SB3 v2.0 alpha version, you can take a look at PR #1327.

RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo

To upgrade:

pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

or simply (rl zoo depends on SB3 and SB3 contrib):

pip install rl_zoo3 --upgrade

Breaking Changes:

Removed shared layers in mlp_extractor (@AlexPasqua)
Upgraded to Stable-Baselines3 >= 1.8.0

New Features:

Added stats_window_size argument to control smoothing in rollout logging (@jonasreiher)
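To give a feel for what the window size does, the sketch below assumes the logged statistic is a mean over the most recent stats_window_size episodes. The function name and the sample returns are made up for the example, and the library's actual logging code is not reproduced.

```python
from collections import deque


def smoothed_returns(episode_returns, stats_window_size=100):
    """Running mean of episode returns over a fixed-size window (illustrative)."""
    window = deque(maxlen=stats_window_size)  # oldest episodes drop out
    means = []
    for ep_return in episode_returns:
        window.append(ep_return)
        means.append(sum(window) / len(window))
    return means


returns = [0.0, 10.0, 20.0, 30.0]
narrow = smoothed_returns(returns, stats_window_size=2)
wide = smoothed_returns(returns, stats_window_size=4)
print(narrow)
print(wide)
```

A smaller window tracks recent episodes more closely, while a larger one gives a smoother but slower-moving curve.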

Bug Fixes:

Deprecations:

Others:

Moved to pyproject.toml
Added github issue forms
Fixed Atari Roms download in CI
Fixed sb3_contrib/qrdqn/*.py type hints
Switched from flake8 to ruff

Documentation:

Added warning about potential crashes caused by check_env in the MaskablePPO docs (@AlexPasqua)

Contributors

AlexPasqua and jonasreiher

