reduce ccd/multiccd memory with nccdmax/naccdmax #1116

Open
thowell wants to merge 2 commits into google-deepmind:main from thowell:nccd

@thowell (Collaborator) commented Feb 4, 2026

this pr introduces nccdmax/naccdmax settings that reduce the memory requirements of ccd/multiccd. instead of allocating the epa and multicontact fields with nconmax/naconmax, these fields are sized with nccdmax/naccdmax, the maximum number of contacts handled by the ccd collider.

the memory savings are potentially significant for any scene that has a ccd collider and at least one other, different collider (which can be a primitive collider).
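
a rough sketch of the sizing idea, not the actual allocation code from this pr: if the ccd scratch fields (epa, multicontact) scale linearly with a slot count, sizing them with the ccd-specific maximum instead of the global contact maximum shrinks them proportionally. `scratch_bytes` and `BYTES_PER_SLOT` are hypothetical, and `naccdmax = nworld * nccdmax` is an assumption made by analogy with how `naconmax: 196608` equals `nworld * nconmax` in the runs reported in this description.

```python
# Hypothetical sizing model; names and per-slot size are assumptions,
# not the mujoco_warp implementation.
NWORLD = 8192   # nworld from the benchmark output
NCONMAX = 24    # --nconmax
NCCDMAX = 12    # --nccdmax

BYTES_PER_SLOT = 1024  # placeholder; the real per-contact scratch size is not stated


def scratch_bytes(n_slots, bytes_per_slot=BYTES_PER_SLOT):
    """Size of a buffer that scales linearly with the number of contact slots."""
    return n_slots * bytes_per_slot


naconmax = NWORLD * NCONMAX  # 196608, matches the reported naconmax
naccdmax = NWORLD * NCCDMAX  # assumed relation, by analogy with naconmax

before = scratch_bytes(naconmax)  # ccd scratch sized by all contacts
after = scratch_bytes(naccdmax)   # ccd scratch sized by ccd contacts only

print(f"slots: {naconmax} -> {naccdmax}")
print(f"scratch size ratio: {after / before:.2f}")
```

under this linear model the placeholder per-slot size cancels out, so the ratio depends only on nccdmax / nconmax.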

implementation from #966

mjwarp-testspeed benchmarks/aloha_pot/scene.xml --nconmax=24 --njmax=128 --memory
Loading model from: benchmarks/aloha_pot/scene.xml...

Model
  nq: 24 nv: 23 nu: 14 nbody: 26 ngeom: 204
Option
  integrator: EULER
  cone: ELLIPTIC
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 128

Rolling out 1000 steps at dt = 0.002...

Summary for 8192 parallel rollouts

Total JIT time: 0.61 s
Total simulation time: 3.27 s
Total steps per second: 2,502,983
Total realtime factor: 5,005.97 x
Total time per step: 399.52 ns
Total converged worlds: 8192 / 8192

Model memory 5.38 MiB (0.37% of used memory):
 (no field >= 1% of used memory)
Data memory 397.70 MiB (27.13% of used memory):
 geom_xpos: 19.12 MiB (1.30%)
 geom_xmat: 57.38 MiB (3.91%)
 qM: 18.00 MiB (1.23%)
 qLD: 16.53 MiB (1.13%)
 efc.J: 96.00 MiB (6.55%)
Other memory: 1062.91 MiB (72.50% of used memory)
Total memory: 1466.00 MiB (3.01% of total device memory)

mjwarp-testspeed benchmarks/aloha_pot/scene.xml --nconmax=24 --njmax=128 --nccdmax=12 --memory
Loading model from: benchmarks/aloha_pot/scene.xml...

Model
  nq: 24 nv: 23 nu: 14 nbody: 26 ngeom: 204
Option
  integrator: EULER
  cone: ELLIPTIC
  solver: NEWTON iterations: 100 ls_iterations: 50
  is_sparse: False
  ls_parallel: False
  broadphase: NXN broadphase_filter: PLANE|SPHERE|OBB
Data
  nworld: 8192 naconmax: 196608 njmax: 128

Rolling out 1000 steps at dt = 0.002...

Summary for 8192 parallel rollouts

Total JIT time: 0.58 s
Total simulation time: 3.27 s
Total steps per second: 2,505,272
Total realtime factor: 5,010.54 x
Total time per step: 399.16 ns
Total converged worlds: 8192 / 8192

Model memory 5.38 MiB (0.53% of used memory):
 (no field >= 1% of used memory)
Data memory 397.70 MiB (39.07% of used memory):
 geom_xpos: 19.12 MiB (1.88%)
 geom_xmat: 57.38 MiB (5.64%)
 qM: 18.00 MiB (1.77%)
 qLD: 16.53 MiB (1.62%)
 efc.J: 96.00 MiB (9.43%)
Other memory: 614.91 MiB (60.40% of used memory)
Total memory: 1018.00 MiB (2.09% of total device memory)

SPS: 2,502,983 -> 2,505,272
total memory: 1466.00 MiB -> 1018.00 MiB
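
for reference, the savings implied by the two reports can be worked out directly. the numbers are copied from the output above; the variable names are only illustrative.

```python
# Values copied from the two --memory reports (MiB) and SPS lines.
total_before, total_after = 1466.00, 1018.00
other_before, other_after = 1062.91, 614.91
data_before, data_after = 397.70, 397.70
sps_before, sps_after = 2_502_983, 2_505_272

saved_total = round(total_before - total_after, 2)
saved_other = round(other_before - other_after, 2)
saved_pct = round(100 * saved_total / total_before, 2)
sps_change_pct = round(100 * (sps_after - sps_before) / sps_before, 2)

print(f"total memory saved: {saved_total} MiB ({saved_pct}%)")
print(f"'Other memory' saved: {saved_other} MiB")
print(f"data memory change: {round(data_after - data_before, 2)} MiB")
print(f"SPS change: {sps_change_pct}%")
```

the entire reduction appears under "Other memory"; model and data memory are unchanged, and throughput is essentially the same.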
