Skip to content

i#7731: Optimize reuse distance collection using splay tree#7732

Open
artur-abd wants to merge 17 commits intoDynamoRIO:masterfrom
artur-abd:i7731-optimize-reuse-distance
Open

i#7731: Optimize reuse distance collection using splay tree#7732
artur-abd wants to merge 17 commits intoDynamoRIO:masterfrom
artur-abd:i7731-optimize-reuse-distance

Conversation

@artur-abd
Copy link

@artur-abd artur-abd commented Nov 26, 2025

Improve reuse distance using splay tree instead of linked list. Splay tree have asymptotic O(log n) amortizated time for all operations instead of O(n / const) for linked list with skip list.
Perfomance improve achives in avarage 10-40% based on different benchmarks.

Fixes #7731

@artur-abd artur-abd changed the title i#7731: Replace linked list with splay tree i#7731: Optimize reuse distance collection using splay tree Nov 26, 2025
@artur-abd artur-abd marked this pull request as draft November 27, 2025 05:35
@artur-abd artur-abd force-pushed the i7731-optimize-reuse-distance branch 2 times, most recently from 4b6bffb to 3961c95 Compare November 27, 2025 13:24
Improve reuse distance using splay tree instead of linked list. Splay tree have asymptotic O(log n) amortizated time for all operations instead of O(n / const) for linked list with skip list.
Perfomance improve achives in avarage 10-40% based on different benchmarks.

Fixes DynamoRIO#7731
@artur-abd artur-abd force-pushed the i7731-optimize-reuse-distance branch from 3961c95 to 1b8aa74 Compare November 27, 2025 13:26
@artur-abd artur-abd marked this pull request as ready for review November 27, 2025 13:37
@derekbruening
Copy link
Contributor

Thank you for contributing. Please mark for review when ready. Also, please don't force-push a shared branch: see https://dynamorio.org/page_code_reviews.html about squash-and-merge, where the final merge commit message comes from, and how force pushes break the review process. Use separate incremental commits with commit messages that describe what each one is doing (and not just repeating the merge message).

Fixes description for line_ref_splay_t and it's methods

Fixes: DynamoRIO#7731
@artur-abd
Copy link
Author

artur-abd commented Dec 1, 2025

Thanks for your comments. PR is ready for review.
Tests for ci-windows / vs2022-64 (pull_request) is failed due to a timeout in other modules. Self triggered tests passed.

@abhinav92003 abhinav92003 self-requested a review December 2, 2025 19:47
// The earlier a cache line was accessed last time, the deeper that cache line
// is in the list.
// We use a splay to keep track of the cache line reuse distance.
// The lefmost node of the splay tree is the most recently accessed cache line.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lefmost node of the splay tree is the most recently accessed cache line.

I thought splay trees have the most recently accessed node closer to the root, not to the left.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true only when inserting a new element. If you perform a splay operation on an element that is not newly inserted, this property will not hold.

, tail_(NULL)
// The splay tree is a binary search tree that using splay operatio for balancing, it
// allow to delete and insert an element at any position in O(log n) amortizated time.
struct line_ref_splay_t {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially done with reviewing the splay tree functions here. A couple high level comments:

  • I think it'll be useful to move this implementation to ext/drcontainers
  • We also need a dedicated set of unit tests for the splay tree impl. It is complex enough to warrant separate test coverage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'll be useful to move this implementation to ext/drcontainers

This splay tree implementation is written with explicit keys, so it is quite specific. Also, implementing it as a C-style container would reduce performance because it require additional dereference.

Copilot AI review requested due to automatic review settings January 23, 2026 12:30
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Optimizes drcachesim’s reuse-distance collection by replacing the linked-list (+skip list) data structure with a splay tree to reduce per-access overhead.

Changes:

  • Replaced the internal reuse-distance tracking structure with a splay tree implementation.
  • Updated reuse-distance tool/shard wiring to use the new node type and tree container.
  • Added unit tests for the splay tree and marked skip-list-related CLI options as deprecated.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
clients/drcachesim/tools/reuse_distance.h Replaces list/skip-list logic with a splay tree, adds splay operations and subtree-size tracking.
clients/drcachesim/tools/reuse_distance.cpp Updates shard initialization and reuse-distance processing to use the new node/tree types.
clients/drcachesim/tests/reuse_distance_test.cpp Adds a dedicated splay-tree correctness/invariant test.
clients/drcachesim/common/options.h Renames the skip-list distance option variable to reflect deprecation.
clients/drcachesim/common/options.cpp Deprecates skip-list-related option help text (and keeps CLI flag name).
clients/drcachesim/analyzer_multi.cpp Updates option wiring to use the renamed deprecated option variable.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@artur-abd
Copy link
Author

@abhinav92003, thanks for review.
I've added tests and marked the skip list options as deprecated.
However, I'm not sure that extracting this splay tree implementation into ext is a good idea, for the reason I mentioned above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

drcachecim/reuse_distance Optimize perfomance

3 participants