
Use simple cache key equality check #911

Draft
nicoburns wants to merge 7 commits into DioxusLabs:main from nicoburns:correct-caching


@nicoburns
Collaborator

Objective

Ensure that Taffy correctly computes layouts when re-layouting with a warm/populated cache.
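
As a rough illustration of what a "simple cache key equality check" could look like, here is a minimal sketch. All types and names below (`Available`, `CacheKey`, `Cache`) are hypothetical stand-ins for illustration and are not Taffy's actual internals. The idea is that a cached size is only reused when every sizing input matches exactly, so a stale result is never returned for inputs that merely look compatible.

```rust
/// Hypothetical stand-in for an "available space" constraint.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Available {
    Definite(f32),
    MinContent,
    MaxContent,
}

/// Hypothetical cache key: every input the node was sized with.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CacheKey {
    known_width: Option<f32>,
    known_height: Option<f32>,
    available_width: Available,
    available_height: Available,
}

/// A cache that only returns a stored size when the key matches exactly.
#[derive(Debug, Default)]
struct Cache {
    entries: Vec<(CacheKey, (f32, f32))>,
}

impl Cache {
    /// Look up a size using plain `==` on the whole key.
    fn get(&self, key: &CacheKey) -> Option<(f32, f32)> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, size)| *size)
    }

    /// Store a size, overwriting any entry with an identical key.
    fn store(&mut self, key: CacheKey, size: (f32, f32)) {
        if let Some(i) = self.entries.iter().position(|(k, _)| *k == key) {
            self.entries[i].1 = size;
        } else {
            self.entries.push((key, size));
        }
    }
}
```

With this scheme, a lookup with an available width of 99.0 misses even if an entry for 100.0 exists. That is the correctness-over-hit-rate trade-off the benchmarks below are measuring.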

Context

Blitz is seeing bugs in the layout that only occur when incremental mode is enabled (which corresponds to having a Taffy cache populated from a previous frame). There are reports of similar bugs from Floem.

In the screenshots below, note how in incremental mode the text in the top-right ("Create Account" and "Log In") wraps, whereas in non-incremental mode it does not. It is not supposed to wrap and does not do so in other browsers.

| NON-incremental mode | Incremental mode |
| --- | --- |
| Screenshot 2026-01-31 at 16 56 46 | Screenshot 2026-01-31 at 16 56 29 |

Benchmarks

This appears to have little effect on some benchmarks, but it is a 40-50% regression on the Flexbox "Deep tree (auto size)" benchmarks and the "mixed tree" benchmarks, and a 15-25% regression on the CSS Grid "deep tree" benchmarks.

I'm also seeing perf regressions around the 50-80% mark when doing full (non-incremental) re-layouts of some websites in Blitz. https://en.wikipedia.org/wiki/Barack_Obama is ~36ms -> ~46ms. https://www.bbc.co.uk/news is ~9ms -> 16ms. Layouts with a populated cache are very fast (~100 microseconds). I don't have good numbers for the "partial cache" case.

The good news is that it doesn't seem to affect scaling behaviour. It's a ~flat perf regression regardless of tree size.
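
The website timings quoted above can be turned into relative slowdowns with a small helper. This is only arithmetic on the numbers in this comment, and the function name is made up for illustration. ~36ms -> ~46ms works out to roughly a 28% increase, and ~9ms -> 16ms to roughly 78%.

```rust
/// Relative change from `before` to `after`, as a percentage.
/// Positive values mean `after` is slower (larger) than `before`.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```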

cargo bench
   Compiling taffy v0.9.2 (/Users/nico/code/oss/taffy)
   Compiling taffy_benchmarks v0.1.0 (/Users/nico/code/oss/taffy/benches)
    Finished `bench` profile [optimized] target(s) in 6.30s
     Running unittests src/lib.rs (target/release/deps/taffy_benchmarks-4a3d205e86ae3bb9)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/flexbox.rs (target/release/deps/flexbox-cbf27e9137a1920f)
Gnuplot not found, using plotters backend
yoga 'huge nested'/Taffy 0.7 /10000
                        time:   [5.5369 ms 5.5921 ms 5.6491 ms]
                        change: [−2.3938% −0.9506% +0.5954%] (p = 0.22 > 0.05)
                        No change in performance detected.

Wide tree/Taffy 0.7 (2-level hierarchy)/10000
                        time:   [7.0380 ms 7.1307 ms 7.3113 ms]
                        change: [−4.3709% +3.0942% +9.8721%] (p = 0.43 > 0.05)
                        No change in performance detected.

Deep tree (auto size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [5.0202 ms 5.1008 ms 5.1881 ms]
                        change: [+51.424% +54.215% +57.129%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Deep tree (auto size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [12.300 ms 12.421 ms 12.552 ms]
                        change: [+33.458% +38.480% +42.594%] (p = 0.00 < 0.05)
                        Performance has regressed.

Deep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [2.4441 ms 2.4623 ms 2.4923 ms]
                        change: [−9.5346% −6.6524% −3.4755%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 10 measurements (40.00%)
  2 (20.00%) low mild
  2 (20.00%) high severe
Deep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [6.3630 ms 6.4582 ms 6.6748 ms]
                        change: [−2.3044% +1.6328% +5.5245%] (p = 0.46 > 0.05)
                        No change in performance detected.

super deep/Taffy 0.7 /100
                        time:   [430.22 µs 447.87 µs 460.40 µs]
                        change: [+39.519% +44.728% +49.049%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/grid.rs (target/release/deps/grid-fa3bd6dc95a03bd2)
Gnuplot not found, using plotters backend
grid/wide/31x31/961     time:   [1.0470 ms 1.1202 ms 1.2003 ms]
                        change: [−5.6796% +3.9343% +13.747%] (p = 0.45 > 0.05)
                        No change in performance detected.
grid/wide/100x100/10000 time:   [15.392 ms 16.370 ms 17.271 ms]
                        change: [−11.077% −7.1656% −3.3548%] (p = 0.00 < 0.05)
                        Performance has improved.
grid/wide/316x316/99856 time:   [217.80 ms 222.32 ms 227.62 ms]
                        change: [−9.5941% −7.3845% −5.1144%] (p = 0.00 < 0.05)
                        Performance has improved.

grid/deep/2x2/1024      time:   [2.9880 ms 3.0262 ms 3.0604 ms]
                        change: [+13.679% +17.257% +21.333%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/3x3/6561      time:   [15.620 ms 15.844 ms 16.097 ms]
                        change: [+21.902% +24.069% +26.205%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/2x2/16384     time:   [56.478 ms 56.713 ms 57.157 ms]
                        change: [+19.212% +21.136% +23.037%] (p = 0.00 < 0.05)
                        Performance has regressed.

Benchmarking grid/superdeep/1x1/100: Collecting 10 samples in estimated 5.0033 s (13k itera
grid/superdeep/1x1/100  time:   [357.68 µs 360.16 µs 362.37 µs]
                        change: [+41.724% +44.073% +46.613%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking grid/superdeep/1x1/1000: Collecting 10 samples in estimated 5.1381 s (1320 ite
grid/superdeep/1x1/1000 time:   [3.6229 ms 3.6789 ms 3.7103 ms]
                        change: [+22.321% +25.045% +27.599%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/mixed.rs (target/release/deps/mixed-879dff8856f2a28c)
Gnuplot not found, using plotters backend
Benchmarking mixed_flex_grid/mixed/depth_2_width_4: Collecting 100 samples in estimated 6.3
mixed_flex_grid/mixed/depth_2_width_4
                        time:   [2.5616 ms 2.5842 ms 2.6104 ms]
                        change: [+7.9924% +9.4217% +10.971%] (p = 0.00 < 0.05)
                        Performance has regressed.

@nicoburns added the bug (Something isn't working), performance (Layout go brr), and controversial (This work requires a heightened standard of review due to implementation or design complexity) labels on Jan 31, 2026
@nicoburns
Collaborator Author

@jrmoulton Could you give this a go and see if it resolves the issues you're seeing? This should rebase cleanly on top of Taffy 0.9 if you don't want to upgrade to main just to test.

@jrmoulton

This does solve the case that I had shared with you, but I still have another broken case that is only fixed by aggressively clearing the cache on the text node on every frame.

Below is a case with your fix applied but without aggressively clearing the cache. If I aggressively clear the cache, it doesn't wrap.

bad-wrap.mp4

For more context:

In this case I am aggressively clearing the cache but without your fix applied, and it doesn't get enough space. If I use your fix (even without manually clearing the cache), this case works.

did-not-wrap.mp4

@nicoburns
Collaborator Author

@jrmoulton I've pushed another update that you may wish to test. Expect terrible performance with this one. But that ought to be fixable if it solves the correctness issues.

@jrmoulton

This change does fix all issues I was having without me doing any additional clearing of the cache ❤️

@nicoburns
Collaborator Author

Hmm... I think this may only be working because it's thrashing the cache (effectively aggressively clearing it for us). When I try to get the performance back it breaks again on the layouts I'm testing.
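
To make "thrashing" concrete, here is a toy sketch, assuming a cache with a single slot keyed by exact equality. The `OneSlotCache` type is invented for illustration and is not how Taffy's cache is structured. When lookups alternate between two keys, each store evicts the entry the next lookup needs, so every lookup recomputes. That behaves much like clearing the cache on every pass.

```rust
/// Toy cache with a single slot and a miss counter (illustrative only).
#[derive(Debug, Default)]
struct OneSlotCache {
    slot: Option<(u32, f32)>,
    misses: usize,
}

impl OneSlotCache {
    /// Return the cached value if the key matches exactly,
    /// otherwise recompute and overwrite the single slot.
    fn get_or_compute(&mut self, key: u32, compute: impl FnOnce() -> f32) -> f32 {
        if let Some((cached_key, value)) = self.slot {
            if cached_key == key {
                return value;
            }
        }
        self.misses += 1;
        let value = compute();
        self.slot = Some((key, value));
        value
    }
}
```

Repeated lookups with one key miss only once, while lookups alternating between two keys miss every time. The results are always fresh, but only because nothing is effectively being cached.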

@jrmoulton Your layout looks like a particularly simple example that still breaks, and I would be keen to turn it into a test case. Would you be able to post independently runnable code (even Floem code) that reproduces the issue?
