Read the Changelog
The PySuperKart2 gymnasium wrapper is a Python package, so installation is straightforward:
pip install pystk2-gymnasium
Optional extras:
pip install pystk2-gymnasium[cli] # CLI race runner (tqdm, torch)
pip install pystk2-gymnasium[record] # Race video recording (moviepy, Pillow)
pip install pystk2-gymnasium[remote] # Client-server mode (pyzmq)
pip install pystk2-gymnasium[web] # Web visualization dashboard (dash, plotly)

Note that during the first run, SuperTuxKart assets are downloaded to the cache directory.
Each controlled kart is parametrized by pystk2_gymnasium.AgentSpec:
- `name` defines the name of the player (displayed on top of the kart)
- `rank_start` defines the starting position (None for random, which is the default)
- `use_ai` is a flag (False by default) to ignore actions (when calling `step`, a SuperTuxKart bot is used instead of the action)
- `camera_mode` can be set to `AUTO` (camera on for non STK bots), `ON` (camera on) or `OFF` (no camera).
- no graphics information is available (i.e. pixmap)
After importing pystk2_gymnasium, the following environments are available:
`supertuxkart/full-v0` is the main environment containing complete observations. The observation and action spaces are both dictionaries with continuous or discrete variables (see below). The exact structure can be found using `env.observation_space` and `env.action_space`. The following options can be used to modify the environment:

- `agent` is an `AgentSpec` (see above)
- `render_mode` can be None or `human`
- `track` defines the SuperTuxKart track to use (None for random). The full list can be found in `STKRaceEnv.TRACKS` once `initialize.initialize(with_graphics: bool)` has been called.
- `num_kart` defines the number of karts on the track (3 by default)
- `max_paths` is the maximum number of (nearest) paths to consider in the observation state (a track is made of paths)
- `laps` is the number of laps (1 by default)
- `difficulty` is the difficulty of the AI bots (from 0, the lowest, to 2, the highest; 2 by default)
Some environments are created using wrappers (see below for wrapper documentation):

- `supertuxkart/simple-v0` (wrappers: `ConstantSizedObservations`) is a simplified environment with a fixed number of observations for paths (controlled by `state_paths`, default 5), items (`state_items`, default 5) and karts (`state_karts`, default 5)
- `supertuxkart/flattened-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `FlattenerWrapper`) has observation and action spaces simplified as much as possible (only `discrete` and `continuous` keys)
- `supertuxkart/flattened_continuous_actions-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `OnlyContinuousActionsWrapper`, `FlattenerWrapper`) removes discrete actions (which default to 0), leaving only steering and acceleration in the continuous domain
- `supertuxkart/flattened_multidiscrete-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `DiscreteActionsWrapper`, `FlattenerWrapper`) is like the previous one, but with fully multi-discrete actions. `acceleration_steps` and `steer_steps` (default to 5) control the number of discrete values for acceleration and steering respectively.
- `supertuxkart/flattened_discrete-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `DiscreteActionsWrapper`, `FlattenerWrapper`, `FlattenMultiDiscreteActions`) is like the previous one, but with fully discretized actions
The reward
where
Wrappers can be used to modify the environment.
pystk2_gymnasium.ConstantSizedObservations( env, state_items=5, state_karts=5, state_paths=5 ) ensures that the number of observed items,
karts and paths is constant. By default, the number of observations per category
is 5.
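
To make the effect of a fixed observation count concrete, here is a minimal sketch in plain Python. The helper `fix_length` and the zero padding value are assumptions made for illustration; how the wrapper actually pads short lists is not described here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def fix_length(vectors: List[Vector], size: int, dim: int = 3) -> List[Vector]:
    """Truncate or zero-pad a list of vectors so it has exactly `size` entries.

    Hypothetical helper: the padding value (zeros) is an assumption.
    """
    kept = list(vectors[:size])
    padding = [tuple(0.0 for _ in range(dim))] * (size - len(kept))
    return kept + padding


# Two observed items become five entries (three of them are padding)
items = [(1.0, 0.0, 4.0), (-2.0, 0.0, 9.0)]
fixed = fix_length(items, size=5)

# Eight observed karts are cut down to the first five
many = fix_length([(0.0, 0.0, float(i)) for i in range(8)], size=5)
```

A constant size matters mostly for learning code, since most function approximators expect inputs of a fixed shape.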
pystk2_gymnasium.PolarObservations(env) changes Cartesian
coordinates to polar ones (angle in the horizontal plane, angle in the vertical plane, and distance) of all 3D vectors.
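
The conversion can be sketched as follows. This is an illustration only: the function name `cart2polar`, the order of the returned values and the exact angle conventions (horizontal angle measured in the z-x plane, vertical angle in the z-y plane) are assumptions, not the wrapper's documented behaviour.

```python
import math
from typing import Tuple


def cart2polar(x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Convert a kart-frame vector to (horizontal angle, vertical angle, distance).

    Assumed conventions: z points forward, x left, y up; the horizontal angle
    is measured in the z-x plane and the vertical angle in the z-y plane.
    """
    distance = math.sqrt(x * x + y * y + z * z)
    angle_zx = math.atan2(x, z)  # 0 when the point is straight ahead
    angle_zy = math.atan2(y, z)  # 0 when the point is level with the kart
    return angle_zx, angle_zy, distance


# A point straight ahead, 10 units away
ahead = cart2polar(0.0, 0.0, 10.0)
# A point lying on the x axis, 2 units away
side = cart2polar(2.0, 0.0, 0.0)
```

Under these assumptions, a point straight ahead has both angles equal to zero, which makes the horizontal angle a convenient steering signal.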
pystk2_gymnasium.DiscreteActionsWrapper(env, acceleration_steps=5, steer_steps=7) discretizes acceleration and steer actions (5 and 7 values respectively).
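
As a rough illustration of what such a discretization means, the sketch below maps a discrete index to a continuous value. The helper `index_to_value`, the even spacing and the value ranges (steer in [-1, 1], acceleration in [0, 1]) are assumptions made for this example.

```python
def index_to_value(index: int, steps: int, low: float, high: float) -> float:
    """Map a discrete index in [0, steps - 1] to an evenly spaced value in [low, high]."""
    if not 0 <= index < steps:
        raise ValueError(f"index {index} outside [0, {steps - 1}]")
    if steps == 1:
        return low
    return low + (high - low) * index / (steps - 1)


# Assumed ranges: steer in [-1, 1], acceleration in [0, 1]
steer_values = [index_to_value(i, 7, -1.0, 1.0) for i in range(7)]
acceleration_values = [index_to_value(i, 5, 0.0, 1.0) for i in range(5)]
```

With an odd number of steering steps, the middle index corresponds to driving straight.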
The flattener wrapper groups all continuous and discrete spaces together.
`pystk2_gymnasium.FlattenerWrapper(env)` flattens actions and
observations. The observation space of the base environment should be a
dictionary. The transformed space is a dictionary with two entries,
`discrete` and `continuous`, when both discrete and continuous
observations/actions are present in the initial environment; otherwise it is
a single space of whichever kind is present. `discrete` is a MultiDiscrete
space that combines all the discrete (and multi-discrete) observations, while
`continuous` is a Box space.
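
The grouping idea can be sketched without gymnasium. In this hypothetical example, integer entries are treated as discrete and float entries as continuous; the function name, the sorted key order and the classification of the sample keys are all assumptions.

```python
from typing import Dict, List, Union

Value = Union[int, float, List[int], List[float]]


def flatten_observation(obs: Dict[str, Value]) -> Dict[str, list]:
    """Group integer entries under "discrete" and float entries under "continuous".

    Keys are visited in sorted order so the layout is deterministic (an
    assumption; the wrapper's actual ordering is not documented here).
    """
    discrete: List[int] = []
    continuous: List[float] = []
    for key in sorted(obs):
        value = obs[key]
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, int):
                discrete.append(v)
            else:
                continuous.append(float(v))
    return {"discrete": discrete, "continuous": continuous}


# Sample observation; which keys are discrete or continuous is assumed here
obs = {"velocity": [0.0, 0.1, 12.5], "powerup": 3, "jumping": 0, "energy": 1.5}
flat = flatten_observation(obs)
```

A fixed key order is what lets a policy rely on each position of the flattened vectors always carrying the same quantity.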
pystk2_gymnasium.FlattenMultiDiscreteActions(env) flattens a multi-discrete
action space into a discrete one, with one action per possible unique choice of
actions. For instance, if the initial space is
supertuxkart/multi-full-v0 can be used to control multiple karts. It takes an
`agents` parameter that is a list of `AgentSpec`. Observations and actions are
dictionaries of single-kart ones, with string keys ranging from "0" to
"n-1", where n is the number of karts.
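
The string-keyed layout can be sketched as follows. The observations and policies below are placeholders, not real spaces from the environment, and `act_for_all` is a hypothetical helper.

```python
from typing import Any, Callable, Dict

Policy = Callable[[Any], Any]


def act_for_all(observations: Dict[str, Any], policies: Dict[str, Policy]) -> Dict[str, Any]:
    """Build the per-kart action dictionary from a per-kart observation dictionary."""
    return {key: policies[key](obs) for key, obs in observations.items()}


# Stand-in observations and policies for n = 3 karts (placeholders only)
n = 3
observations = {str(i): {"kart": i} for i in range(n)}
policies = {str(i): (lambda obs, i=i: {"steer": 0.0, "agent": i}) for i in range(n)}
actions = act_for_all(observations, policies)
```

Note that the keys are strings such as "0", not integers, so indexing with `actions[0]` would fail.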
To apply different gymnasium wrappers to each agent, one can use a MonoAgentWrapperAdapter.
Let's look at an example to illustrate this:
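from functools import partial

# Assumption: all of these names are importable from the package top level
from pystk2_gymnasium import (
    AgentSpec,
    CameraMode,
    ConstantSizedObservations,
    MonoAgentWrapperAdapter,
    PolarObservations,
)

# Three karts driven by SuperTuxKart bots, each with its camera on
agents = [
    AgentSpec(use_ai=True, name="Yin Team", camera_mode=CameraMode.ON),
    AgentSpec(use_ai=True, name="Yang Team", camera_mode=CameraMode.ON),
    AgentSpec(use_ai=True, name="Zen Team", camera_mode=CameraMode.ON),
]

# One wrapper chain per agent, keyed by the agent's string index
wrappers = [
    partial(MonoAgentWrapperAdapter, wrapper_factories={
        "0": lambda env: ConstantSizedObservations(env),
        "1": lambda env: PolarObservations(ConstantSizedObservations(env)),
        "2": lambda env: PolarObservations(ConstantSizedObservations(env)),
    }),
]

# make_env is an external helper that is not defined in this document
make_stkenv = partial(
    make_env,
    "supertuxkart/multi-full-v0",
    render_mode="human",
    num_kart=5,
    agents=agents,
    wrappers=wrappers,
)

Agents used with the CLI are Python modules (typically pystk_actor.py) that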
define the following:
| Name | Required | Description |
|---|---|---|
| `create_state()` | no | Returns the initial state for the agent (default: None for stateless agents) |
| `get_actor(module_dir, obs_space, act_space)` | yes | Returns an actor callable `actor(state, obs) -> action` |
| `env_name` | no | Gymnasium environment ID (default: "supertuxkart/full-v0") |
| `player_name` | no | Name displayed above the kart |
| `get_wrappers()` | no | Returns a list of additional wrapper callables |
The module_dir argument is the path to the agent's directory, which can be
used to load model weights or other resources. For stateful agents,
create_state() is called once per race and the returned state object is passed
to actor(state, obs) at every step.
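
The stateful protocol described above can be sketched as follows. The policy (nitro only during the first 100 steps) and the player name are invented for illustration; the action keys mirror the ones used elsewhere in this document, but plain lists are used instead of float32 NumPy arrays to keep the sketch dependency-free, which a real environment may not accept.

```python
env_name = "supertuxkart/simple-v0"
player_name = "Stateful"  # hypothetical name


def create_state():
    # Called once per race; a mutable object lets the actor keep memory
    return {"steps": 0}


def get_actor(module_dir, observation_space, action_space):
    def actor(state, obs):
        # The same state object is passed back at every step
        state["steps"] += 1
        # Hypothetical policy: use nitro only during the first 100 steps
        use_nitro = 1 if state["steps"] <= 100 else 0
        return {
            "acceleration": [1.0],
            "steer": [0.0],
            "brake": 0, "drift": 0, "fire": 0,
            "nitro": use_nitro, "rescue": 0,
        }

    return actor


# Simulate what a race runner would do with this module
state = create_state()
actor = get_actor(".", None, None)
first = actor(state, {})
late = actor({"steps": 100}, {})
```

Keeping the memory in the state object rather than in module-level globals means each race starts from a fresh state.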
Example (stateless heuristic agent):
import math
import numpy as np

env_name = "supertuxkart/simple-v0"
player_name = "Heuristic"

def create_state():
    # Stateless agent: nothing to remember between steps
    return None

def get_actor(module_dir, observation_space, action_space):
    def actor(state, obs):
        # Steer using the first component of the end of the current path
        paths_end = obs["paths_end"]
        if len(paths_end) > 0:
            angle_zx = float(paths_end[0][0])
            steer = np.clip(angle_zx / math.pi * 2, -1.0, 1.0)
        else:
            steer = 0.0
        return {
            "acceleration": np.array([1.0], dtype=np.float32),
            "steer": np.array([steer], dtype=np.float32),
            "brake": 0, "drift": 0,
            "fire": 1 if int(obs.get("attachment", 0)) != 0 else 0,
            "nitro": 1, "rescue": 0,
        }
    return actor

Agents can be packaged as a zip file, a directory containing pystk_actor.py,
or a Python module on the import path.
The pystk2 command-line tool provides commands for running races locally or in a distributed client-server setup.
`pystk2 race` runs a race with one or more agents loaded locally.
pystk2 race agent1.zip agent2.zip --num-karts 5 --track lighthouse --laps 2

Positional arguments:
- `agents`: one or more agent sources, each being a path to a zip file, a directory containing `pystk_actor.py`, or a Python module name. Append `@:Name` to override the player name (e.g. `agent.zip@:Alice`).
Options:
| Option | Default | Description |
|---|---|---|
| `--num-karts` | 3 | Total number of karts in the race |
| `--track` | random | Track name |
| `--laps` | 1 | Number of laps |
| `--max-paths` | unlimited | Maximum path nodes ahead in observations |
| `--output FILE` | - | Write JSON race results to file |
| `--error-handling` | raise | `raise` to propagate agent errors, `catch` to use random actions |
| `--action-timeout` | none | Per-action timeout in seconds (Unix only) |
| `--hide` | off | Run without graphics (headless) |
| `--web` | off | Enable web visualization dashboard (requires dash/plotly) |
| `--web-port` | 8050 | Port for the web dashboard |
| `--record FILE` | - | Save race video (e.g. race.mp4, race.webm) |
| `--cameras` | auto | Number of cameras (max 8) |
| `--screen-width` | 1280 | Camera width in pixels when recording |
| `--screen-height` | 720 | Camera height in pixels when recording |
| `--render-sub-steps` | 1 | Physics sub-steps per action when recording (higher = smoother video) |
| `--adapter PATH` | - | Python file providing a custom `create_actor` function |
| `--max-steps` | unlimited | Maximum steps before stopping |
`pystk2 race-server` starts a persistent server that loads agents and responds to action requests from a race client over ZMQ. The server stays alive across multiple races until interrupted with Ctrl+C.
# Serve one agent
pystk2 race-server my_agent.zip
# Serve multiple agents on a custom port
pystk2 race-server agent_a.zip agent_b/ --address tcp://*:5556

Requires pyzmq: pip install pystk2-gymnasium[remote]
Positional arguments:
- `agents`: one or more agent sources (same format as `pystk2 race`).
Options:
| Option | Default | Description |
|---|---|---|
| `--address` | tcp://*:5555 | ZMQ bind address |
| `--adapter PATH` | - | Python file providing a custom `create_actor` function |
| `--action-timeout` | none | Per-action timeout in seconds (Unix only) |
| `--threads` | half CPU cores | Number of worker threads for concurrent client sessions |
`pystk2 race-client` connects to one or more race servers, runs the STK environment locally, sends observations to each server and receives actions.
# Single server
pystk2 race-client --server tcp://localhost:5555 --num-karts 3 --track lighthouse
# Multiple servers (each student runs their own server)
pystk2 race-client \
    --server tcp://student-a:5555 \
    --server tcp://student-b:5555 \
    --num-karts 5 --track lighthouse --max-steps 500

Requires pyzmq: pip install pystk2-gymnasium[remote]
Options:
| Option | Default | Description |
|---|---|---|
| `--server ADDR` | (required, repeatable) | Server address (e.g. tcp://localhost:5555) |
| `--num-karts` | 3 | Total number of karts in the race |
| `--track` | random | Track name |
| `--laps` | 1 | Number of laps |
| `--max-paths` | unlimited | Maximum path nodes ahead in observations |
| `--output FILE` | - | Write JSON race results to file |
| `--error-handling` | raise | `raise` to propagate agent errors, `catch` to use random actions |
| `--hide` | off | Run without graphics (headless) |
| `--web` | off | Enable web visualization dashboard |
| `--web-port` | 8050 | Port for the web dashboard |
| `--record FILE` | - | Save race video |
| `--cameras` | auto | Number of cameras (max 8) |
| `--screen-width` | 1280 | Camera width in pixels when recording |
| `--screen-height` | 720 | Camera height in pixels when recording |
| `--render-sub-steps` | 1 | Physics sub-steps per action when recording (higher = smoother video) |
| `--max-steps` | unlimited | Maximum steps before stopping |
| `--max-steps-after-first` | unlimited | Maximum steps to continue after the first kart finishes |
| `--karts-finished` | all | Stop the race after this many karts have finished |
| `--timeout` | 60 | ZMQ recv timeout per request in seconds |
The client-server mode is designed for settings where each participant runs their own agent server and a race organizer runs the client:
Student A (server)        Student B (server)        Organizer (client)
pystk2 race-server        pystk2 race-server        pystk2 race-client
  agent_a.zip               agent_b.zip               --server A:5555
  --address tcp://*:5555    --address tcp://*:5555    --server B:5555
Key design points:
- Wrappers applied server-side: The client sends raw observations from the base `supertuxkart/multi-full-v0` environment. Each server builds the full wrapper chain (registered wrappers from `env_name` + agent's `get_wrappers()`) and applies them before calling actors, then un-wraps actions before returning them to the client.
- Persistent servers: Servers stay alive between races. After a client sends `CLOSE`, the server returns to waiting for the next `INIT`.
- Concurrent sessions: The server uses a thread pool (`--threads`) to handle multiple client races simultaneously.
- One server = one or more agents: A single server can load multiple agents.
- Protocol: ZMQ ROUTER/REQ over TCP with pickle serialization.

When --record is used, the race is captured to a video file. Supported
formats: .mp4, .mkv, .avi, .webm, .ogv, .mov. Frame durations are
derived from in-game timestamps for accurate timing. Each controlled agent gets
a distinct kart model and color, and an end card showing final results is
appended to the video.
Use --render-sub-steps to capture intermediate physics frames for smoother
video without changing the action rate. Requires: pip install pystk2-gymnasium[record].
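
The idea of deriving frame durations from timestamps can be sketched as follows. This is an illustration, not the recorder's actual code: the function name and the `last_duration` parameter (needed because the final frame has no successor) are assumptions.

```python
from typing import List


def frame_durations(timestamps: List[float], last_duration: float = 0.0) -> List[float]:
    """Return one duration per frame, computed from consecutive timestamps.

    The final frame has no successor, so it gets `last_duration`
    (a hypothetical parameter of this sketch).
    """
    durations = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if timestamps:
        durations.append(last_duration)
    return durations


# Unevenly spaced in-game timestamps, in seconds
durations = frame_durations([0.0, 0.5, 0.75, 1.75], last_duration=0.25)
```

Using per-frame durations instead of a fixed frame rate keeps the video in sync with game time even when frames are captured at irregular intervals.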
The --adapter PATH option loads a Python file that customizes how actors are
created. The adapter must define:
- `create_actor(get_actor, module_dir, obs_space, act_space)`: wraps the agent's `get_actor` to add custom logic (e.g. loading model weights).

It may optionally define:
- `prepare_module_dir(path)`: called on each agent's directory before importing (e.g. to create a missing `__init__.py`).
See examples/bbrl_adapter.py for a reference implementation.
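
A minimal adapter might look like the sketch below. It assumes that `create_actor` is expected to return a callable with the same `actor(state, obs)` signature as the agent's own actor, which is not stated explicitly above; the call counter is invented purely to show where custom logic would go.

```python
def create_actor(get_actor, module_dir, obs_space, act_space):
    """Wrap the agent's actor so that every call is counted."""
    inner = get_actor(module_dir, obs_space, act_space)
    stats = {"calls": 0}

    def actor(state, obs):
        # Custom logic goes here, before delegating to the agent's actor
        stats["calls"] += 1
        return inner(state, obs)

    actor.stats = stats  # exposed for inspection; not part of any documented API
    return actor


# Stand-in for an agent module's get_actor, used only to exercise the sketch
def dummy_get_actor(module_dir, obs_space, act_space):
    return lambda state, obs: {"steer": 0.0}


wrapped = create_actor(dummy_get_actor, ".", None, None)
result = wrapped(None, {})
```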
All the 3D vectors are within the kart referential (z front, x left, y
up):
- `distance_down_track`: the distance from the start
- `energy`: remaining collected energy
- `front`: front of the kart (3D vector)
- `attachment`: the item attached to the kart (bonus box, banana, nitro/big, nitro/small, bubble gum, easter egg)
- `attachment_time_left`: how much time the attachment will be kept
- `items_position`: position of the items (3D vectors)
- `items_type`: type of the item
- `jumping`: is the kart jumping
- `karts_position`: position of other karts, beginning with the ones in front
- `max_steer_angle`: the max angle of the steering (given the current speed)
- `center_path_distance`: distance to the center of the path
- `center_path`: vector to the center of the path
- `paths_start`, `paths_end`, `paths_width`: 3D vectors to the paths start and end, and vector of their widths (scalar). The paths are sorted so that the first element of the array is the current one.
- `paths_distance`: the distance of the paths starts and ends (vector of dimension 2)
- `powerup`: collected power-up
- `shield_time`
- `skeed_factor`
- `velocity`: velocity vector

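
As a small illustration of the kart frame of reference (z front, x left, y up), the sketch below picks the closest item in front of the kart from a list shaped like `items_position`. The helper name and the sample positions are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def nearest_item_ahead(items_position: List[Sequence[float]]) -> Optional[int]:
    """Return the index of the closest item with z > 0 (in front of the kart), or None."""
    best_index, best_distance = None, math.inf
    for index, (x, y, z) in enumerate(items_position):
        if z <= 0:
            continue  # behind the kart, or exactly level with it
        distance = math.sqrt(x * x + y * y + z * z)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


# x is left, y is up, z is forward
items = [(0.0, 0.0, -3.0), (1.0, 0.0, 8.0), (-2.0, 0.0, 5.0)]
target = nearest_item_ahead(items)
nothing = nearest_item_ahead([(0.0, 0.0, -1.0)])
```

Because all vectors are relative to the kart, the sign of z alone tells whether something is ahead or behind, with no need for the kart's world position.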
import gymnasium as gym
from pystk2_gymnasium import AgentSpec

# STK gymnasium uses one process
if __name__ == '__main__':
    # Use a flattened version of the observation and action spaces
    # In both cases, this corresponds to a dictionary with two keys:
    # - `continuous` is a vector corresponding to the continuous observations
    # - `discrete` is a vector (of integers) corresponding to discrete observations
    env = gym.make("supertuxkart/flattened-v0", render_mode="human", agent=AgentSpec(use_ai=False))

    ix = 0
    done = False
    state, *_ = env.reset()

    # Play random actions until the episode ends
    while not done:
        ix += 1
        action = env.action_space.sample()
        state, reward, terminated, truncated, _ = env.step(action)
        done = truncated or terminated

    # Important to stop the STK process
    env.close()