libexpr: add builtins.filterAttrs with tombstone optimization
#15100
Conversation
src/libexpr/primops.cc
Outdated
    for (auto & i : *args[1]->attrs()) {
        Value * vName = Value::toPtr(state.symbols[i.name]);
        Value * callArgs[] = {vName, i.value};
        Value res;
        state.callFunction(*args[0], callArgs, res, noPos);
        if (state.forceBool(
                res, pos, "while evaluating the return value of the filtering function passed to builtins.filterAttrs"))
            attrs.insert(i.name, i.value);
    }
FWIW, with the layering optimisation we have an opportunity to avoid copying anything at all. I did have a PoC branch that added tombstone values in case the attrset we are removing stuff from is much larger than the number of attributes that get filtered out. This way we'd easily avoid any copying on the hot path when nothing actually gets filtered out (I think that's exactly how it's used in the successful case in callPackage).
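To make the layering idea concrete, here is a minimal, hedged sketch with simplified types (not the actual `Bindings`/`Attr` machinery from libexpr): an upper layer shadows an attribute with a tombstone entry, so lookups treat it as deleted without copying or mutating the layer below.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for a layered attrset: each layer maps names to values,
// and std::nullopt plays the role of a tombstone ("attribute deleted").
using Layer = std::map<std::string, std::optional<int>>;

struct LayeredSet
{
    std::vector<Layer> layers; // layers.back() is the topmost layer

    // Lookup walks layers top-down; a tombstone hides anything below it.
    const int * get(const std::string & name) const
    {
        for (auto it = layers.rbegin(); it != layers.rend(); ++it) {
            auto found = it->find(name);
            if (found == it->end())
                continue;           // not in this layer, keep looking
            if (!found->second)
                return nullptr;     // tombstone: attribute was deleted
            return &*found->second; // real value
        }
        return nullptr;             // absent everywhere
    }
};
```

Deleting an attribute then costs one small top layer instead of a copy of the (possibly large) base layer.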
Yeah, that checks out. How POC is the POC tombstone branch?
Hm, I think I've implemented enough for the k-way merge iterator to skip nullptr values (so upper attrset "layers" can remove attrs from the resulting bindings without actually modifying the lower ones).
I think something like: loop over the bindings and collect the ones that should be removed, until we reach a sensible count at which we actually have to allocate. The primary question is what sort of heuristic to apply: is removing 8 attrs out of 128 enough to trigger this optimisation?
I can look into digging up that branch and putting it up as an alternative.
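As a point of reference for that heuristic discussion, a minimal sketch under stated assumptions; the function name, the 8-element minimum, and the 25% removal cutoff are all illustrative, not values taken from the PoC branch.

```cpp
#include <cstddef>

// Hypothetical heuristic: only layer tombstones on top when the set is large
// enough and few enough attributes are being removed; otherwise a plain copy
// of the surviving attributes is likely cheaper overall.
bool shouldUseTombstones(std::size_t total, std::size_t removed)
{
    if (total < 8)
        return false;            // tiny sets: copying is trivially cheap
    if (removed == 0)
        return false;            // nothing removed: reuse the set as-is
    return removed * 4 <= total; // e.g. remove at most ~25% via tombstones
}
```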
We can be data-driven here. As you point out, it's used in callPackage -- which is used heavily, everywhere.
I added tracing to lib.filterAttrs and evaluated NixOS tests:
Results
| Metric | nixosTests.simple | nixosTests.gnome |
|---|---|---|
| Total calls | 2,910 | 3,882 |
| Empty sets (0 → 0) | 1,461 (50%) | 2,146 (55%) |
| Platform cmp (117 → 112) | 692 (24%) | 696 (18%) |
| Tiny (1-5 elements) | 490 (17%) | 705 (18%) |
| Large (>100 elements) | 697 (24%) | 704 (18%) |
What's being filtered
| Sizes | What it is |
|---|---|
| 768 → 706 | Kernel config options (8139TOO_8129, ACCESSIBILITY...) |
| 289 → 232 | Font configuration |
| 217 → 217 | NixOS users/groups (unchanged - no-op) |
| 117 → 112 | Platform/system config (lib.systems.equals) - 696 calls |
| 0 → 0 | Empty sets from module system - ~50% of all calls |
Key Sources
- `lib.systems.equals` (`lib/systems/default.nix:46`) - 696 calls comparing platforms by filtering out 5 function-valued attributes from 117-element system configs. Same filtering repeated hundreds of times on the same data.
- `lib/types.nix:873` (module system) - filters merged attribute definitions. Generates most of the `0 → 0` empty-set calls.
- `lib/customisation.nix:290` (`callPackageWith`) - checks for missing required arguments.
Thoughts
~55% of calls filter empty sets, ~24% repeat the same platform filtering. The C++ builtin helps (especially with the early empty-set bailout), but there are also nixpkgs-specific changes possible:
- Cache filtered results in `lib.systems.elaborate`
- Avoid `filterAttrs` in module system hot paths when the input is known to be empty
Force-pushed 95d9e27 to c0f71cc
    push(cursor);

    // If this is a tombstone (nullptr value), skip it and all shadowed entries
    if (current->value == nullptr) {
I think the invariant should be that the iterator never points to a tombstone. Then no changes are needed to downstream.
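A hedged sketch of that invariant on a flat array (not the real k-way merge iterator): both construction and increment skip tombstones, so downstream code never observes a nullptr value.

```cpp
struct Attr
{
    const char * name;
    const void * value; // nullptr means "tombstone" (attribute deleted)
};

// Minimal forward iterator over an Attr array that maintains the invariant
// "never points at a tombstone": construction and ++ both skip nullptr values.
struct AttrIterator
{
    const Attr * cur;
    const Attr * end;

    AttrIterator(const Attr * begin, const Attr * end_) : cur(begin), end(end_)
    {
        skipTombstones();
    }

    void skipTombstones()
    {
        while (cur != end && cur->value == nullptr)
            ++cur;
    }

    AttrIterator & operator++()
    {
        ++cur;
        skipTombstones();
        return *this;
    }

    const Attr & operator*() const { return *cur; }
    bool operator!=(const AttrIterator & other) const { return cur != other.cur; }
};
```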
Force-pushed c0f71cc to 33cee85
Marking as draft while iterating.
src/libexpr/primops.cc
Outdated
    for (const auto & i : input) {
        Value * vName = Value::toPtr(state.symbols[i.name]);
        Value * callArgs[] = {vName, i.value};
        Value res;
        state.callFunction(*args[0], callArgs, res, noPos);
        bool keep = state.forceBool(
            res, pos, "while evaluating the return value of the filtering function passed to builtins.filterAttrs");
        filterResults.emplace_back(i.name, keep);
        if (keep)
            ++numKept;
    }
Looks like std::views::filter?
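For context on the suggestion, `std::views::filter` from C++20 `<ranges>` expresses a keep/drop pass as a lazy view; a standalone illustration on plain ints, not the evaluator's `Value`/`Attr` types:

```cpp
#include <iostream>
#include <ranges>
#include <vector>

int main()
{
    std::vector<int> values{1, 2, 3, 4, 5, 6};

    // Lazily yields only the elements for which the predicate returns true,
    // without materializing an intermediate container.
    for (int v : values | std::views::filter([](int x) { return x % 2 == 0; }))
        std::cout << v << '\n'; // prints 2, 4, 6
}
```

In the loop above the predicate calls back into the evaluator and its result is also counted, so whether a ranges pipeline is a net win there is a separate question.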
Also, should we instead count the stuff that gets erased? Not sure about the tradeoffs here, but my intuition is that the use cases tend to erase (though this needs empirical evidence from nixpkgs, I suppose).
src/libexpr/primops.cc
Outdated
    // First pass: evaluate filter for all attributes and collect results
    boost::container::small_vector<std::pair<Symbol, bool>, 64> filterResults;
    filterResults.reserve(input.size());
    size_t numKept = 0;
Seems inevitable if we want to avoid over-allocation.
    if (attr.value == nullptr) {
        if (base.get(attr.name))
            ++tombstonesShadowingBase;
        else
            ++tombstonesNotShadowing;
    }
Ditto, the iterator should never point to a tombstone.
The iterator doesn't. These are the internal iterators over attrs, which is the FAM above.
src/libexpr/primops.cc
Outdated
    if (input.empty()) {
        v = *args[1];
        return;
    }
This does change the laziness of the function -- if attrs is empty, then the function isn't forced!
    @@ -364,8 +383,12 @@ public:
        const Bindings * currentChunk = this;
        while (currentChunk) {
            const Attr * maybeAttr = getInChunk(*currentChunk);
getInChunk is just a binary search that finds an Attr by name, regardless of whether the value is nullptr.
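For readers unfamiliar with that helper, a rough sketch of what a per-chunk, name-keyed binary search looks like; the `findInChunk` name and the integer symbol ids are illustrative, not the actual libexpr code.

```cpp
#include <algorithm>
#include <cstdint>
#include <span>

struct Attr
{
    std::uint32_t name; // symbol id; each chunk is sorted by this
    const void * value; // may be nullptr (tombstone), still returned as found
};

// Binary search by name within one chunk. Like the getInChunk described above,
// it does not care whether the value is a tombstone; callers decide what that means.
const Attr * findInChunk(std::span<const Attr> chunk, std::uint32_t name)
{
    auto it = std::lower_bound(
        chunk.begin(), chunk.end(), name,
        [](const Attr & a, std::uint32_t n) { return a.name < n; });
    if (it != chunk.end() && it->name == name)
        return &*it;
    return nullptr; // absent from this chunk (try the next layer/chunk)
}
```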
    @@ -484,6 +512,8 @@ private:
        auto attrs = std::span(bindings->attrs, bindings->numAttrs);
This iterator is over the FAM in the class above, which contains tombstones.
Force-pushed aba11ff to b3e7061
FWIW, I'm seeing performance regressions with the tombstone approach that I don't see with the original approach when I'm testing against Nixpkgs. The original approach has no regressions I can find with the package and the NixOS test evals. What's in this branch right now is a consistent 3-5% slower, though it does use less memory.
Force-pushed b3e7061 to 52a7103
This adds support for tombstone values (nullptr) in the Bindings layer system, enabling attribute deletion through layering rather than copying.
Changes:
- Bindings::get() treats nullptr values as "attribute deleted"
- Bindings::iterator skips tombstone entries during iteration
- BindingsBuilder::finishSizeIfNecessary() properly accounts for tombstones when calculating the size of layered bindings
This infrastructure enables efficient filterAttrs implementation by layering tombstones on top of the original attrset instead of copying all kept attributes.
The tombstone infrastructure was adding ~5% overhead to general evaluation even when no tombstones were being used, due to extra checks in hot paths. Profiling showed BindingsBuilder::push_back went from 0.09% to 1.77% (20x!) due to the tombstone count tracking on every attribute insertion.
This commit:
- Packs hasTombstonesInChain into the numLayers field (bit 31) to avoid increasing the Bindings struct size (stays at 24 bytes vs 32 bytes unpacked)
- Adds a hasTombstonesInLayer flag to BindingsBuilder (set only by insertTombstone)
- Adds an insertTombstone() method for explicit tombstone insertion
- Keeps push_back() completely clean - no tombstone checks
- Fast path in finishSizeIfNecessary() when no tombstones exist anywhere
The flag is propagated through layers: true if the base has tombstones OR if this layer adds tombstones via insertTombstone(). Normal evaluation (no filterAttrs) has ~4% overhead from the tombstone checks in get() and iterator.
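A hedged sketch of the bit-packing described above, packing a boolean into bit 31 of a 32-bit counter so the struct does not grow; the field and method names mirror the commit message but are not the actual libexpr definitions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative: a flag packed into bit 31 of a 32-bit counter, so adding the
// flag does not enlarge the containing struct.
struct PackedLayerInfo
{
    static constexpr std::uint32_t tombstoneBit = 1u << 31;
    std::uint32_t packed = 0; // bit 31: hasTombstonesInChain, bits 0..30: numLayers

    std::uint32_t numLayers() const { return packed & ~tombstoneBit; }
    bool hasTombstonesInChain() const { return packed & tombstoneBit; }

    void setNumLayers(std::uint32_t n)
    {
        assert(n < tombstoneBit); // must fit in 31 bits
        packed = (packed & tombstoneBit) | n;
    }

    void markTombstonesInChain() { packed |= tombstoneBit; }
};
```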
Rewrite builtins.removeAttrs to use the tombstone layering system introduced in the previous commits. This avoids copying all remaining attributes when removing from large attribute sets.
Changes:
- Add fast paths for common cases:
  - Empty removal list: return the original attrset unchanged
  - Empty source attrset: return emptyBindings
  - No matching attributes: return the original attrset unchanged
- For attrsets >= 8 elements with available layer capacity:
  - Layer tombstones on top instead of copying all non-removed attrs
  - Only insert tombstones for attributes that actually exist
- Fall back to copy-based set_difference for small attrsets or when the layer list is full (max 8 layers)
Performance results (nixpkgs firefox/chromium evaluation):
- Memory: 3-5% reduction in set allocation bytes
- CPU: neutral (within measurement noise)
- Sets created: 1.4-1.8% fewer allocations
The tombstone approach trades a small per-lookup cost for significant memory savings when removeAttrs is called on large attribute sets, which is common in nixpkgs (e.g., removing meta, passthru from mkDerivation).
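A minimal sketch of the decision structure this commit message describes for removeAttrs; the enum and function are illustrative only, assuming the thresholds named above (>= 8 elements, max 8 layers), not the actual primop code.

```cpp
#include <cstddef>

enum class RemoveStrategy
{
    ReturnOriginal,  // nothing to remove, or no names match
    ReturnEmpty,     // source attrset is empty
    LayerTombstones, // big set with layer capacity left: shadow removed attrs
    CopySurvivors,   // small set or layer list full: copy the kept attrs
};

// Illustrative: mirrors the fast paths listed above, not the real primop.
RemoveStrategy chooseRemoveStrategy(
    std::size_t sourceSize, std::size_t matchingRemovals,
    std::size_t usedLayers, std::size_t maxLayers /* e.g. 8 */)
{
    if (sourceSize == 0)
        return RemoveStrategy::ReturnEmpty;
    if (matchingRemovals == 0)
        return RemoveStrategy::ReturnOriginal;
    if (sourceSize >= 8 && usedLayers < maxLayers)
        return RemoveStrategy::LayerTombstones;
    return RemoveStrategy::CopySurvivors;
}
```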
Force-pushed 52a7103 to 11f3da7
I think I found a way through the thicket. I'm still seeing eval times (measured by the ci/eval machinery in Nixpkgs) being consistently 2.5% higher, though. So tombstones might be a memory win but not a win on eval time.
Title changed: builtins.filterAttrs → builtins.filterAttrs with tombstone optimization
Please add these as test cases for laziness:

    nix-repl> builtins.attrNames (lib.filterAttrs (name: value: name == "a") { a = abort "a"; b = abort "b"; c = abort "c"; })
    [ "a" ]

    nix-repl> lib.filterAttrs (abort "f") { }
    { }
Add a new builtin function filterAttrs that filters attribute sets based
on a predicate function. The predicate receives both the attribute name
and value, returning true for attributes to keep.
Implementation uses tombstones for large attrsets (>= 8 elements) when
layer capacity is available. Instead of copying all matching attributes,
it layers tombstones for non-matching attributes on top of the source.
This is beneficial when most attributes pass the filter (common case).
Fast paths:
- Empty attrset: return unchanged
- All attributes pass: return unchanged
- No attributes pass: return emptyBindings
Falls back to copy-based approach for small attrsets or when the
layer list is full.
Example usage:
builtins.filterAttrs (name: value: value > 5) { a = 3; b = 6; c = 10; }
=> { b = 6; c = 10; }
Force-pushed 11f3da7 to 972868c
FWIW, the k-way merge has some low-hanging fruit to improve CPU time, like avoiding heap churn for long runs of non-intersecting symbols (should be very common with the by-name structure).
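The "avoid heap churn" idea can be illustrated on a generic k-way merge; this is a standalone sketch over ints, not the Bindings iterator. After popping the winning cursor, keep draining it while its next key is still smaller than every other run's head, instead of doing a heap pop/push per element.

```cpp
#include <queue>
#include <vector>

struct Cursor
{
    const int * pos;
    const int * end;
};

struct CursorGreater
{
    bool operator()(const Cursor & a, const Cursor & b) const { return *a.pos > *b.pos; }
};

// k-way merge of sorted runs. After taking the front of the winning cursor,
// keep draining that same cursor as long as its next key is still smaller
// than the best key of every other run (the heap top), avoiding a pop/push
// round trip per element for long non-intersecting runs.
std::vector<int> kWayMerge(const std::vector<std::vector<int>> & runs)
{
    std::priority_queue<Cursor, std::vector<Cursor>, CursorGreater> heap;
    for (const auto & r : runs)
        if (!r.empty())
            heap.push({r.data(), r.data() + r.size()});

    std::vector<int> out;
    while (!heap.empty()) {
        Cursor cur = heap.top();
        heap.pop();
        // Drain this run while it stays strictly ahead of all other runs.
        do {
            out.push_back(*cur.pos++);
        } while (cur.pos != cur.end && (heap.empty() || *cur.pos < *heap.top().pos));
        if (cur.pos != cur.end)
            heap.push(cur);
    }
    return out;
}
```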
Added the lazy test cases. Now I'm really going to bed instead of obsessing about why there's a 2.5% difference.
roberth left a comment
Thanks!
    @@ -0,0 +1,2 @@
    # Test that filterAttrs doesn't call the function when the attrset is empty
It could evaluate the function without calling it, and that's basically the mistake this test is supposed to catch.
    -# Test that filterAttrs doesn't call the function when the attrset is empty
    +# Test that filterAttrs doesn't evaluate the function when the attrset is empty
Off-topic: for the second time I read this code and freak out because GitHub shows it after the == "a" test case.
Maybe we should do .nix and .nix.exp so that the list starts with the code and then the result; more natural.
Summary
This PR adds `builtins.filterAttrs` and introduces a tombstone system for the Bindings layer that enables memory-efficient attribute deletion through layering rather than copying.
Changes
- Tombstone infrastructure - Bindings now support tombstone values (nullptr) that mark attributes as deleted in layered bindings. This enables efficient removal operations where we layer deletions on top rather than copying all remaining attributes.
- `removeAttrs` optimization - For large attrsets (≥ 8 elements), `removeAttrs` now layers tombstones on top of the source instead of copying all non-removed attributes.
- `builtins.filterAttrs` - New builtin that filters attribute sets based on a predicate. Uses tombstones for large attrsets when most attributes pass the filter.
Example

    builtins.filterAttrs (name: value: value > 5) { a = 3; b = 6; c = 10; }
    => { b = 6; c = 10; }
Performance
Tested with `lib.filterAttrs` updated to use `builtins.filterAttrs or <fallback>`, comparing Nix 2.33.1 (baseline) vs 2.33.1+filterAttrs (this PR).
filterAttrs microbenchmark
100 iterations of `lib.filterAttrs` on a 10,000-attribute set.
NixOS test evaluations
Measured against `nixosTests.simple` and `nixosTests.firefox`.
nixpkgs ci/eval (5 chunks of 5000 attributes)
Summary
The tombstone infrastructure adds overhead to general attribute operations. The wins manifest when `filterAttrs` is used heavily.
Implementation details
- Tombstones are nullptr values in the `Attr` array
- `Bindings::get()` returns nullptr when it finds a tombstone (attribute deleted)
- `Bindings::iterator` skips tombstones during iteration (k-way merge)
- `hasTombstonesInChain` flag enables a fast path when no tombstones exist (bit-packed into the `numLayers` field to avoid a struct size increase)
Acknowledgments
Inspired by DeterminateSystems#291 - thanks to the original authors. This implementation extends the concept with tombstone-based memory optimization hinted at by @xokdvium.