Replies: 8 comments 4 replies
-
Thank you for your comments! You may find commit
LeRobot (at least at the time of benchmarking for the paper) indexes and stores a float timestamp for every value and synchronizes at run time. This can be bypassed in two ways: (1) an efficient, binary-friendly storage format that encodes relative times, which saves a lot of space on the timestamps mapping video frames to action data; (2) storing a fixed frequency in metadata, which saves further space; the data can then be saved in chunks with Parquet or other formats.
I haven't done further analysis, since that was the scale of the original datasets. It is probably worth redoing with new and larger datasets now, but at the time of writing the paper it was a few classic ones. I do agree with you that RLDS is efficient on large-scale datasets from what I have heard (no experimental data here, but the Octo paper seems to say so); I can't extrapolate further from the data to make any claims. I'm currently rewriting this as a Ray dataset to improve scalability.
The intuition is to keep both the memory-disk throughput and the CPU busy for RoboDM. HDF5 is definitely large and sometimes slow when loading at high throughput, which is what you observed in Fig. 4 in the previous question. RLDS seems to be a strange case for sequential access; it indeed overused a lot of CPU at the time of testing.
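A minimal sketch of the two bypasses described in (1) and (2) above, assuming a fixed recording frequency; all function and field names here are illustrative, not robodm's actual API:

```python
import struct

# Bypass (2): if spacing is regular, keep only a start time and a fixed
# frequency in metadata instead of one absolute float64 timestamp per value.
# Bypass (1): otherwise, fall back to compact relative deltas since start.
# Names are illustrative only, not robodm's actual API.

def encode_timestamps(timestamps, freq_hz):
    start = timestamps[0]
    expected = [start + i / freq_hz for i in range(len(timestamps))]
    if all(abs(a - b) < 1e-6 for a, b in zip(timestamps, expected)):
        # Regular spacing: 2 numbers + a count instead of N float64 values.
        return {"start": start, "freq_hz": freq_hz, "n": len(timestamps)}
    # Irregular spacing: microseconds since start, packed as uint32.
    deltas = [round((t - start) * 1e6) for t in timestamps]
    return {"start": start, "deltas_us": struct.pack(f"<{len(deltas)}I", *deltas)}

def decode_timestamps(meta):
    if "freq_hz" in meta:
        return [meta["start"] + i / meta["freq_hz"] for i in range(meta["n"])]
    deltas = struct.unpack(f"<{len(meta['deltas_us']) // 4}I", meta["deltas_us"])
    return [meta["start"] + d / 1e6 for d in deltas]

ts = [100.0 + i / 30.0 for i in range(300)]   # 10 s of 30 Hz frames
meta = encode_timestamps(ts, 30.0)
assert decode_timestamps(meta) == ts          # round-trips exactly
```

The fixed-frequency case collapses the entire timestamp column to constant-size metadata, which is where most of the space saving over per-value float storage comes from.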
-
@KeplerC May I ask what the difference is between commits a35a695 and 5bbb8b?
-
Thanks for your reply! I encountered another issue. Under 5bbb8b, when using https://github.com/BerkeleyAutomation/robodm/blob/5bbb8bdc8bc8e31b723bb2af319e8ac970beba45/examples/openx_loader.py to convert RLDS to the VLA format in the lossy mode, some trajectories end up with a file size of 0. Do you have an idea of the reason? Thanks!
-
Does this happen with all RLDS datasets or just a specific one? What's the minimal way to reproduce it? Have you tried the official OXE downloaded from Google Research (the loader might be hardcoded for OXE, if that's the issue)? Also, does it happen during the flushing process or before flushing?
Recently I've been maxed out by a few deadlines, so I can only be as responsive as possible, and unfortunately debugging an older commit may not be the highest priority. My priority list after the deadlines is: (1) solidify the current Ray dataset implementation, (2) connect robodm to a more general dataset converter and loader, (3) usability (e.g., partitions, multiprocess recording, ROS 2 integration, visualization) and CI/CD. Feel free to contribute or advise. If you need to use robodm as a baseline, feel free to use the main branch, as it should be more stable. If you enable all the compression options, etc., it should match what the original commit has. Data-loading throughput should also be maxed out as far as I have tested.
… On Nov 6, 2025, at 11:06 PM, Xinyu Zeng ***@***.***> wrote:
gentle ping @KeplerC <https://github.com/KeplerC>
-
I see. A quick test on my machine works. If you still want to see how the old branch behaves, send me your environment (pip/conda list or a Docker image) and I will try to reproduce it when I have time; I haven't touched the older commits for a while. Feel free to nudge me, and you can also email me privately to schedule a debugging session if that would help (after Dec 1).
Migrating the benchmark script over is a good idea. Feel free to open a PR, because I'm also interested to see how it goes with the Ray dataset.
… On Nov 6, 2025, at 11:39 PM, Xinyu Zeng ***@***.***> wrote:
Thanks for the quick response! It is the nyu_door_opening_surprising_effectiveness dataset. I will try to switch to the main branch, but it lacks the original benchmarking scripts (and perhaps also the LeRobot loader). Maybe I can cherry-pick the benchmarking scripts from the mkv branch onto the main branch. Take your time with the deadlines!
-
I made the benchmark work on the latest branch. Code is at https://github.com/XinyuZeng/robodm/tree/add_bench_to_latest; it contains a uv.lock for environment sync. The numbers I got do not match the paper, but I believe that is because of version updates in those formats, configs, etc. Aside from that:
1. Why does batch_size=8 mean loading 8 episodes in the experiment? I thought that in VLA training we should load 8 steps, not 8 full episodes.
2. When batch_size > 1, I think it makes a difference whether we (a) use ds.take(batch_size), as we can with RLDS and LeRobot, or (b) read one item at a time sequentially in a for-loop. I saw the original benchmark code contains both patterns, and I think maybe we should make them consistent.
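As an aside on the two read patterns being compared in this thread (a batched take versus a sequential for-loop), here is a generic sketch over a plain Python iterable; none of this is RLDS's or LeRobot's actual API:

```python
import itertools

# Pattern (a): ask the dataset for batch_size items in one call
# (analogous to ds.take(batch_size)); the loader can pipeline/prefetch
# behind this single request.
def load_batch_take(ds, batch_size):
    return list(itertools.islice(iter(ds), batch_size))

# Pattern (b): a plain for-loop reading one item at a time, which pays
# per-item overhead (decode, indexing) on the caller's thread each step.
def load_batch_loop(ds, batch_size):
    batch = []
    it = iter(ds)
    for _ in range(batch_size):
        batch.append(next(it))
    return batch

ds = range(100)  # stand-in dataset
assert load_batch_take(ds, 8) == load_batch_loop(ds, 8) == list(range(8))
```

Both patterns return the same data, so the benchmark concern is purely about throughput: mixing them across formats measures different per-item overheads rather than the formats themselves.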
-
Feel free to PR the benchmark script if you feel it's ready. I will try to reproduce your environment; it is likely caused by the version of AV, where the frame format doesn't match the nyu dataset, and the common symptom is that it fails silently.
On loading different trajectories: at the time of development, it was mainly inspired by the Octo paper's appendix section on data mixing (which was SOTA at the time), so a batch is a batch of trajectories drawn from different trajectories/datasets, rather than a batch of steps. For smaller-scale DP training, steps are indeed sufficient.
For the second question, just to make sure I understand your question and the context: my understanding of RLDS is that it has a prefetch buffer, so continuous reading makes sense to evaluate as long as the buffer is exhausted quickly; LeRobot (at least per my understanding at the time of benchmarking; I haven't followed it recently) extracts a single frame at a time.
On the side, I agree with you that we should also bring the evaluation up to today's data-loading conventions in policy training. We can work together to survey and figure out how to evaluate what training pipelines need. Do you think we could collaborate on a systematic benchmark for the community?
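The trajectory-versus-step distinction above can be sketched with a toy sampler (illustrative names only; this is not robodm's API, just the shape of the two granularities):

```python
import random

# `datasets` maps dataset name -> list of trajectories, where each
# trajectory is a list of steps. Names are illustrative only.

def sample_trajectory_batch(datasets, batch_size, rng):
    """Octo-style mixing: each batch element is a whole trajectory,
    drawn from a randomly chosen dataset."""
    names = list(datasets)
    return [rng.choice(datasets[rng.choice(names)]) for _ in range(batch_size)]

def sample_step_batch(datasets, batch_size, rng):
    """Step-level batching, as in smaller-scale DP training:
    each batch element is a single step from a random trajectory."""
    names = list(datasets)
    batch = []
    for _ in range(batch_size):
        traj = rng.choice(datasets[rng.choice(names)])
        batch.append(rng.choice(traj))
    return batch

rng = random.Random(0)
datasets = {
    "bridge": [[("obs", t) for t in range(5)] for _ in range(3)],
    "rt1":    [[("obs", t) for t in range(4)] for _ in range(2)],
}
traj_batch = sample_trajectory_batch(datasets, 8, rng)  # 8 whole episodes
step_batch = sample_step_batch(datasets, 8, rng)        # 8 individual steps
assert len(traj_batch) == 8 and isinstance(traj_batch[0], list)
assert len(step_batch) == 8 and isinstance(step_batch[0], tuple)
```

The key difference for a benchmark is I/O shape: trajectory batches read long sequential runs per element, while step batches issue many small random reads, which stresses formats very differently.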
…On Sat, Nov 8, 2025 at 1:07 AM Xinyu Zeng ***@***.***> wrote:
I made the benchmark work on the latest branch. Code at
https://github.com/XinyuZeng/robodm/tree/add_bench_to_latest. It contains
a uv.lock for environment sync. The numbers I got does not match the paper
but I believe it is because the versions update in those formats and
configs etc.
Aside from that,
1. Why in the experiment, batch_size=8 means loading 8 episodes? I
though in VLA training, we should load 8 steps, not 8 full episodes.
2. When batch_size>1, I think it makes a difference if we a) use
ds.take(batch_size) as we can use from RLDS and LeRobot, or b) read
one data at a time sequentially in a for-loop. I saw the original benchmark
code contains both the patterns and I think maybe we should make them
consistent.
-
I'd really appreciate it if you could help clarify those. Thanks! @KeplerC