brentyi
left a comment
+100, seems super useful!
I'm wondering if any of these things are possible in a simple/not overengineered way:
Does it make sense to allow customization of how metrics are "reduced" between steps and before logging? In the extreme case it'd be nice, for example, to be able to specify logging for std of metrics, stds of per-episode means, histograms, histograms of per-episode stds, etc.
Can we implement any of the existing things that are logged (eg rewards) as default terms in the metrics manager?
Thanks for the review @brentyi!
The rsl_rl logger only supports scalars. Everything in
In principle yes. Note however that rewards use dt-scaled sums divided by |
Makes sense! To check my understanding: we wouldn't be able to reuse intermediates, right? And we could compute std across episodes within a single timestep, but not across timesteps within an episode? These are a bit annoying but fine.
Makes sense. It seems kind of nice to consolidate logic, but I don't feel strongly about this.
Adds a MetricsManager so users can log custom per-step metrics without
hacking reward functions or adding zero-weight reward terms. Metrics
terms use the same callable signature as rewards (env, **params) but
have no weight, no dt scaling, and no normalization by episode length.
Episode values are true per-step averages (sum / step_count) logged
under "Episode_Metrics/{term_name}".
Closes #584
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
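The averaging described in the commit message above can be sketched as follows. This is only an illustration, not the PR's code: the class name `PerStepMeanAccumulator`, its methods, and the use of plain Python lists in place of tensors are assumptions made so the sketch runs without dependencies. Averaging the per-environment means into a single scalar at reset is also an assumption.

```python
class PerStepMeanAccumulator:
    """Hypothetical stand-in for a metrics manager (not the PR's MetricsManager)."""

    def __init__(self, num_envs, terms):
        # terms maps a name to a callable(env) returning one float per env.
        self.terms = terms
        self.sums = {name: [0.0] * num_envs for name in terms}
        self.step_count = [0] * num_envs

    def compute(self, env):
        # Called once per environment step: accumulate raw values,
        # with no weight and no dt scaling.
        for name, func in self.terms.items():
            for i, value in enumerate(func(env)):
                self.sums[name][i] += value
        for i in range(len(self.step_count)):
            self.step_count[i] += 1

    def reset(self, env_ids):
        # Return per-step averages (sum / step_count) for the envs being
        # reset, then clear their accumulators.
        if not env_ids:
            return {}
        log = {}
        for name, sums in self.sums.items():
            means = [sums[i] / max(self.step_count[i], 1) for i in env_ids]
            # Assumption: reduce across envs to one scalar for the logger.
            log[f"Episode_Metrics/{name}"] = sum(means) / len(means)
            for i in env_ids:
                sums[i] = 0.0
        for i in env_ids:
            self.step_count[i] = 0
        return log


# A metric bounded in [0, 1] stays in [0, 1] regardless of episode length.
acc = PerStepMeanAccumulator(2, {"success": lambda env: [1.0, 0.0]})
for _ in range(4):
    acc.compute(env=None)
log = acc.reset([0, 1])
print(log)
```

Because the sum is divided by the number of steps actually taken, the logged value does not grow with episode length or depend on the simulation timestep.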
Summary
Adds a MetricsManager so users can log custom per-step metrics during training without hacking reward functions or adding zero-weight reward terms. Closes #584.
- MetricsManager, MetricsTermCfg, NullMetricsManager in managers/metrics_manager.py
- Hooks into ManagerBasedRlEnv config, step(), and _reset_idx()
- Metrics terms use the same callable signature as rewards: (env, **params) → Tensor[num_envs]
- Episode values are true per-step averages (sum / step_count), so a metric in [0,1] stays in [0,1] in wandb
- … (NullMetricsManager)
Example usage
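A minimal sketch of what defining a metric term might look like, assuming a config class with `func` and `params` fields analogous to a reward term without a weight. The `MetricsTermCfg` below is a local stand-in written for this sketch, `FakeEnv` is a toy environment, and the body of `joint_vel_mag` (mean absolute joint velocity) is a guess based only on the metric name. Plain lists replace tensors so the snippet runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class MetricsTermCfg:
    """Hypothetical stand-in for the PR's config class: no weight field."""

    func: Callable[..., list[float]]
    params: dict[str, Any] = field(default_factory=dict)


class FakeEnv:
    """Toy environment: joint_vel[i] holds the joint velocities of env i."""

    def __init__(self, joint_vel):
        self.joint_vel = joint_vel


def joint_vel_mag(env, scale=1.0):
    """Assumed metric: mean absolute joint velocity, one value per env."""
    return [scale * sum(abs(v) for v in row) / len(row) for row in env.joint_vel]


# Terms are registered by name; the name becomes the logging key suffix.
metrics = {"joint_vel_mag": MetricsTermCfg(func=joint_vel_mag)}

env = FakeEnv(joint_vel=[[1.0, -3.0], [0.0, 2.0]])
term = metrics["joint_vel_mag"]
values = term.func(env, **term.params)  # same call shape as a reward term
print(values)
```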
On episode reset, Episode_Metrics/joint_vel_mag appears in extras["log"] and flows to wandb/tensorboard automatically.
Test plan
- uv run pytest tests/test_metrics_manager.py: 6 targeted tests
- uv run pytest tests/test_rewards.py: no regression
- uv run ty check / uv run pyright: clean
- uv run ruff check && uv run ruff format: clean
🤖 Generated with Claude Code