Skip to content

Conversation

@philiptaron
Copy link
Contributor

@philiptaron philiptaron commented Jan 27, 2026

Summary

This PR adds builtins.filterAttrs and introduces a tombstone system for the Bindings layer that enables memory-efficient attribute deletion through layering rather than copying.

Changes

  1. Tombstone infrastructure - Bindings now support tombstone values (nullptr) that mark attributes as deleted in layered bindings. This enables efficient removal operations where we layer deletions on top rather than copying all remaining attributes.

  2. removeAttrs optimization - For large attrsets (≥8 elements), removeAttrs now layers tombstones on top of the source instead of copying all non-removed attributes.

  3. builtins.filterAttrs - New builtin that filters attribute sets based on a predicate. Uses tombstones for large attrsets when most attributes pass the filter.

Example

builtins.filterAttrs (name: value: value > 5) { a = 3; b = 6; c = 10; }
# => { b = 6; c = 10; }

Performance

Tested with lib.filterAttrs updated to use builtins.filterAttrs or <fallback>, comparing Nix 2.33.1 (baseline) vs 2.33.1+filterAttrs (this PR).

filterAttrs microbenchmark

100 iterations of lib.filterAttrs on a 10,000-attribute set:

Version Time Memory Improvement
Baseline (no builtin) 0.55s 198 MB
This PR 0.32s 143 MB 42% faster, 28% less memory

NixOS test evaluations

Test Baseline This PR Delta
nixosTests.simple 2.16s 2.23s +3%
nixosTests.firefox 2.78s 2.74s -1%

nixpkgs ci/eval (5 chunks of 5000 attributes)

Chunk Baseline Time PR Time Time Δ Baseline Mem PR Mem Mem Δ
0 17.07s 17.79s +4.2% 2704 MB 2775 MB +2.6%
1 19.01s 19.50s +2.6% 3241 MB 3360 MB +3.7%
2 13.41s 13.62s +1.6% 2774 MB 2692 MB -2.9%
3 3.78s 3.88s +2.6% 797 MB 776 MB -2.7%
4 4.95s 5.09s +2.8% 921 MB 925 MB +0.5%
Total 58.22s 59.88s +2.9% mixed

Summary

  • filterAttrs-heavy workloads: 42% faster, 28% less memory
  • General evaluation: ~3% slower, memory neutral

The tombstone infrastructure adds overhead to general attribute operations. The wins manifest when filterAttrs is used heavily.

Implementation details

  • Tombstones are nullptr values in the Bindings Attr array
  • Bindings::get() returns nullptr when it finds a tombstone (attribute deleted)
  • Bindings::iterator skips tombstones during iteration (K-way merge)
  • hasTombstonesInChain flag enables fast-path when no tombstones exist (bit-packed into numLayers field to avoid struct size increase)
  • Threshold of 8 elements determines when to use tombstones vs copying

Acknowledgments

Inspired by DeterminateSystems#291 - thanks to the original authors. This implementation extends the concept with tombstone-based memory optimization hinted at by @xokdvium.

@philiptaron philiptaron requested a review from edolstra as a code owner January 27, 2026 17:49
@philiptaron philiptaron requested review from xokdvium and removed request for edolstra January 27, 2026 17:49
@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Jan 27, 2026
Comment on lines 3376 to 3384
for (auto & i : *args[1]->attrs()) {
Value * vName = Value::toPtr(state.symbols[i.name]);
Value * callArgs[] = {vName, i.value};
Value res;
state.callFunction(*args[0], callArgs, res, noPos);
if (state.forceBool(
res, pos, "while evaluating the return value of the filtering function passed to builtins.filterAttrs"))
attrs.insert(i.name, i.value);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, with the layering optimisation we have an opportunity to avoid copying anything at all. I did have some PoC branch that added tombstone values in case the attrset we are removing stuff from is much larger than the number attributes that get filtered out. This way we'd easily avoid the hot path when no stuff actually gets filtered out (I think that's exactly how it's used in the successful case in callPackage).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that checks out. How POC is the POC tombstone branch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I think I've implemented enough stuff for the k-way merge iterator to skip nullptr values (so upper attrset "layers" can remove attrs from the resulting bindings and without actually modifying the lower ones).

I think something like: loop over the bindings and collect the ones that should be removed until we reach a sensible count that we actually have to allocate. The primary question would be about what sort of heuristic do we apply? Is removing 8 attrs out of 128 enough to trigger this optimisation?

I can look into digging up that branch and putting that up as an alternative.

Copy link
Contributor Author

@philiptaron philiptaron Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be data driven here. As you point out, it's used in callPackage -- which is very used everywhere.

I added tracing to lib.filterAttrs and evaluated NixOS tests:

Results

Metric nixosTests.simple nixosTests.gnome
Total calls 2,910 3,882
Empty sets (0 → 0) 1,461 (50%) 2,146 (55%)
Platform cmp (117 → 112) 692 (24%) 696 (18%)
Tiny (1-5 elements) 490 (17%) 705 (18%)
Large (>100 elements) 697 (24%) 704 (18%)

What's being filtered

Sizes What it is
768 → 706 Kernel config options (8139TOO_8129, ACCESSIBILITY...)
289 → 232 Font configuration
217 → 217 NixOS users/groups (unchanged - no-op)
117 → 112 Platform/system config (lib.systems.equals) - 696 calls
0 → 0 Empty sets from module system - ~50% of all calls

Key Sources

  1. lib.systems.equals (lib/systems/default.nix:46) - 696 calls comparing platforms by filtering out 5 function-valued attributes from 117-element system configs. Same filtering repeated hundreds of times on same data.

  2. lib/types.nix:873 (module system) - filters merged attribute definitions. Generates most of the 0 → 0 empty-set calls.

  3. lib/customisation.nix:290 (callPackageWith) - checks for missing required arguments.

Thoughts

~55% of calls filter empty sets, ~24% repeat the same platform filtering. The C++ builtin helps (especially with the early empty-set bailout), but there are also nixpkgs-specific changes possible:

  • Cache filtered results in lib.systems.elaborate
  • Avoid filterAttrs in module system hot paths when input is known empty

@github-actions github-actions bot added the c api Nix as a C library with a stable interface label Jan 27, 2026
push(cursor);

// If this is a tombstone (nullptr value), skip it and all shadowed entries
if (current->value == nullptr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the invariant should be that the iterator never points to a tombstone. Then no changes are needed to downstream.

@philiptaron philiptaron marked this pull request as draft January 27, 2026 22:10
@philiptaron
Copy link
Contributor Author

Marking as draft while iterating.

Comment on lines 3549 to 3559
for (const auto & i : input) {
Value * vName = Value::toPtr(state.symbols[i.name]);
Value * callArgs[] = {vName, i.value};
Value res;
state.callFunction(*args[0], callArgs, res, noPos);
bool keep = state.forceBool(
res, pos, "while evaluating the return value of the filtering function passed to builtins.filterAttrs");
filterResults.emplace_back(i.name, keep);
if (keep)
++numKept;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like std::views::filter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should we instead count the stuff that gets erased? Not sure about the tradeoffs here, but my intuition is that the usecases tend to erase (but this needs empirical evidence from nixpkgs I suppose)

Comment on lines 3544 to 3547
// First pass: evaluate filter for all attributes and collect results
boost::container::small_vector<std::pair<Symbol, bool>, 64> filterResults;
filterResults.reserve(input.size());
size_t numKept = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems inevitable if we want to avoid over-allocation.

Comment on lines 530 to 535
if (attr.value == nullptr) {
if (base.get(attr.name))
++tombstonesShadowingBase;
else
++tombstonesNotShadowing;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, the iterator should never point to a tombstone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator doesn't. These are the internal iterators over attrs, which is the FAM above.

Comment on lines 3537 to 3540
if (input.empty()) {
v = *args[1];
return;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does change the lazyness of the function -- if attrs is empty, then the function isn't forced!

@@ -364,8 +383,12 @@ public:
const Bindings * currentChunk = this;
while (currentChunk) {
const Attr * maybeAttr = getInChunk(*currentChunk);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getInChunk is just a binary search that finds an Attr by name, regardless of whether the value is nullptr.

Comment on lines 530 to 535
if (attr.value == nullptr) {
if (base.get(attr.name))
++tombstonesShadowingBase;
else
++tombstonesNotShadowing;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator doesn't. These are the internal iterators over attrs, which is the FAM above.

@@ -484,6 +512,8 @@ private:
auto attrs = std::span(bindings->attrs, bindings->numAttrs);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iterator is over the FAM in the class above, which contains tombstones.

@philiptaron philiptaron force-pushed the builtins.filterAttrs branch 2 times, most recently from aba11ff to b3e7061 Compare January 27, 2026 23:51
@philiptaron
Copy link
Contributor Author

philiptaron commented Jan 28, 2026

FWIW, I'm seeing performance regressions with the tombstone approach that I don't see with the original approach when I'm testing against Nixpkgs. The original approach has no regressions I can find with the package and the NixOS test evals. What's in this branch right now is a consistent 3-5% slower, though it does use less memory.

This adds support for tombstone values (nullptr) in the Bindings layer
system, enabling attribute deletion through layering rather than copying.

Changes:
- Bindings::get() treats nullptr values as "attribute deleted"
- Bindings::iterator skips tombstone entries during iteration
- BindingsBuilder::finishSizeIfNecessary() properly accounts for
  tombstones when calculating the size of layered bindings

This infrastructure enables efficient filterAttrs implementation by
layering tombstones on top of the original attrset instead of copying
all kept attributes.
The tombstone infrastructure was adding ~5% overhead to general evaluation
even when no tombstones were being used, due to extra checks in hot paths.

Profiling showed BindingsBuilder::push_back went from 0.09% to 1.77% (20x!)
due to the tombstone count tracking on every attribute insertion.

This commit:
- Packs hasTombstonesInChain into numLayers field (bit 31) to avoid
  increasing Bindings struct size (stays at 24 bytes vs 32 bytes unpacked)
- Adds hasTombstonesInLayer flag to BindingsBuilder (set only by insertTombstone)
- Adds insertTombstone() method for explicit tombstone insertion
- Keeps push_back() completely clean - no tombstone checks
- Fast path in finishSizeIfNecessary() when no tombstones anywhere

The flag is propagated through layers: true if base has tombstones OR
if this layer adds tombstones via insertTombstone().

Normal evaluation (no filterAttrs) has ~4% overhead from the tombstone
checks in get() and iterator.
Rewrite builtins.removeAttrs to use the tombstone layering system
introduced in the previous commits. This avoids copying all remaining
attributes when removing from large attribute sets.

Changes:
- Add fast paths for common cases:
  - Empty removal list: return original attrset unchanged
  - Empty source attrset: return emptyBindings
  - No matching attributes: return original attrset unchanged
- For attrsets >= 8 elements with available layer capacity:
  - Layer tombstones on top instead of copying all non-removed attrs
  - Only insert tombstones for attributes that actually exist
- Fall back to copy-based set_difference for small attrsets or when
  the layer list is full (max 8 layers)

Performance results (nixpkgs firefox/chromium evaluation):
- Memory: 3-5% reduction in set allocation bytes
- CPU: Neutral (within measurement noise)
- Sets created: 1.4-1.8% fewer allocations

The tombstone approach trades a small per-lookup cost for significant
memory savings when removeAttrs is called on large attribute sets,
which is common in nixpkgs (e.g., removing meta, passthru from mkDerivation).
@philiptaron
Copy link
Contributor Author

philiptaron commented Jan 28, 2026

I think I found a way through the thicket. I'm still seeing eval times (measured by the ci/eval machinery in Nixpkgs) being consistently 2.5% higher, though. So tombstones might be a memory win but not a win on eval time.

@philiptaron philiptaron changed the title libexpr: add builtins.filterAttrs libexpr: add builtins.filterAttrs with tombstone optimization Jan 28, 2026
@philiptaron philiptaron changed the title libexpr: add builtins.filterAttrs with tombstone optimization libexpr: add builtins.filterAttrs with tombstone optimization Jan 28, 2026
@roberth
Copy link
Member

roberth commented Jan 28, 2026

Please add these as test cases for laziness:

nix-repl> builtins.attrNames (lib.filterAttrs (name: value: name == "a") { a = abort "a"; b = abort "b"; c = abort "c"; })                          
[ "a" ]

nix-repl> lib.filterAttrs (abort "f") { }
{ }

Add a new builtin function filterAttrs that filters attribute sets based
on a predicate function. The predicate receives both the attribute name
and value, returning true for attributes to keep.

Implementation uses tombstones for large attrsets (>= 8 elements) when
layer capacity is available. Instead of copying all matching attributes,
it layers tombstones for non-matching attributes on top of the source.
This is beneficial when most attributes pass the filter (common case).

Fast paths:
- Empty attrset: return unchanged
- All attributes pass: return unchanged
- No attributes pass: return emptyBindings

Falls back to copy-based approach for small attrsets or when the
layer list is full.

Example usage:
  builtins.filterAttrs (name: value: value > 5) { a = 3; b = 6; c = 10; }
  => { b = 6; c = 10; }
@xokdvium
Copy link
Contributor

FWIW, the k-way merge has some low-hanging fruit to improve CPU time like avoiding heap churn for long runs of non-intersecting symbols (should be very common with by-name structure).

@philiptaron
Copy link
Contributor Author

philiptaron commented Jan 28, 2026

Added the lazy test cases. Now I'm really going to bed instead of obsessing about why there's 2.5% difference.

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -0,0 +1,2 @@
# Test that filterAttrs doesn't call the function when the attrset is empty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could evaluate without calling, and that's basically the mistake this is supposed to catch

Suggested change
# Test that filterAttrs doesn't call the function when the attrset is empty
# Test that filterAttrs doesn't evaluate the function when the attrset is empty

Copy link
Member

@roberth roberth Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-topic: for the second time I read this code and freak out because GitHub shows it after the == "a" test case.
Maybe we should do .nix and .nix.exp so that the list starts with the code and then the result; more natural.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c api Nix as a C library with a stable interface with-tests Issues related to testing. PRs with tests have some priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants