Improvements to the array module #10578

richcarl · 2026-01-21T18:26:30Z

Takes care of a couple of old TODO notes: adds write caching, which speeds up sequential writes by upwards of 300%, and prunes the data structure when the array shrinks, so that the unused parts can be garbage collected.
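For context, here is a minimal sketch of the two workloads the description mentions, using only the public array API (array:new/0, array:set/3, array:resize/2). The module and function names are hypothetical, and the sketch makes no claim about the internal representation in this PR.

```erlang
-module(array_usage).
-export([fill/1, shrink/2]).

%% Sequential writes: indices 0..N-1 are set in increasing order,
%% which is the access pattern a write cache is meant to help.
fill(N) when is_integer(N), N >= 0 ->
    lists:foldl(fun(I, A) -> array:set(I, I * I, A) end,
                array:new(),
                lists:seq(0, N - 1)).

%% Shrinking: entries at index NewSize and above are dropped, so the
%% parts of the structure that held them are candidates for pruning.
shrink(NewSize, Array) ->
    array:resize(NewSize, Array).
```

Calling array_usage:shrink(10, array_usage:fill(1000)) should give an array whose array:size/1 is 10 and whose first ten entries are unchanged.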

github-actions · 2026-01-21T18:27:15Z

CT Test Results

3 files, 130 suites, 1h 22m 44s ⏱️
2 615 tests: 2 562 ✅, 51 💤, 2 ❌
3 115 runs: 3 057 ✅, 56 💤, 2 ❌

For more details on these failures, see this check.

Results for commit 0daddab.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.

Artifacts

// Erlang/OTP Github Action Bot

richcarl · 2026-01-22T10:26:12Z

Your committing-BEAM-file check is drunk.

kikofernandez · 2026-01-22T11:51:59Z

Yes, it detected that one of your contributions contains *.beam files that should not be committed...
It does not apply to this commit, but in the long term it will prevent external contributions that modify *.beam files.
I will try to fix it today or tomorrow; alternatively, please provide a quick fix. The lines that need to be updated are:
https://github.com/erlang/otp/blob/master/.github/workflows/main.yaml#L81-L82

richcarl · 2026-01-22T13:00:23Z

> Yes, it detected that one of your contributions contains *.beam files that should not be committed...

The diff included changes that had happened on the target branch (master), not just the ones in the contribution. I rebased and pushed a fix that I think should work for the future.

dgud · 2026-01-28T11:13:37Z

I will wait with this until the other (discussed) changes are incorporated.

Measurements show that a bucket size of 16 seems to be the best size nowadays. Measurements were done on sizes 8, 10, 16 and 32. The power-of-2 size also enables us to use bit operations instead of rem/div, which gave even more performance even when the div/rem operations were optimized. A further optimization might be to remove the bit-size storage in the nodes and only keep it in the top-level array.
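The point about bit operations can be illustrated with a small sketch. It assumes 16 slots per node, so a shift of 4. ?SHIFT appears in the diff under review, but the value used here, the ?NODESIZE and ?MASK macros, and the module and function names are assumptions made for illustration only.

```erlang
-module(bucket_index).
-export([split_divrem/1, split_bits/1]).

%% Assumed constants for illustration: 16 slots per node.
-define(SHIFT, 4).                   % log2(16)
-define(NODESIZE, (1 bsl ?SHIFT)).   % 16
-define(MASK, (?NODESIZE - 1)).      % 2#1111

%% Generic version: works for any node size, including 10.
split_divrem(I) when is_integer(I), I >= 0 ->
    {I div ?NODESIZE, I rem ?NODESIZE}.

%% Power-of-two version: the same split using a shift and a mask.
split_bits(I) when is_integer(I), I >= 0 ->
    {I bsr ?SHIFT, I band ?MASK}.
```

Both functions return {Bucket, Offset}. For index 37 each gives {2, 5}, since 37 = 2 * 16 + 5. The shift-and-mask form is only available when the node size is a power of two.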


michalmuskala · 2026-01-31T18:54:15Z

lib/stdlib/src/array.erl

-	end,
-    #array{size = Size, max = M, default = Default, elements = E}.
+    E = find_max(Size - 1, ?SHIFT),
+    C = ?NEW_CACHE(Default),


I wonder to what extent it would be worth trying to optimise the operations to return literals when the default is unchanged.

dgud · 2026-01-31

I don't think I understand.

When unchanged, the default is 'undefined', which should be an immediate. It can't be better than that, can it?
But even if the user set the default to a literal, wouldn't that just be a pointer to the literal area? What do you have in mind?

michalmuskala · 2026-01-31

I mean having macros like ?NEW_CACHE, or others that dynamically create tuples of values, instead return a literal tuple if Default == ?DEFAULT, avoiding some of the allocations.
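A rough sketch of that suggestion, under stated assumptions: the leaf size is hard-coded to 4 to keep it short, ?DEFAULT is taken to be 'undefined' as mentioned in the thread, and new_cache/1 is a hypothetical stand-in for the ?NEW_CACHE macro, whose real definition is not shown here.

```erlang
-module(cache_literal).
-export([new_cache/1]).

-define(DEFAULT, undefined).
%% Hypothetical 4-slot leaf. A tuple written out with constant
%% elements is a compile-time literal, so returning it does not
%% build a new tuple on the process heap.
-define(LITERAL_CACHE, {undefined, undefined, undefined, undefined}).

%% Unchanged default: hand back the shared literal.
new_cache(?DEFAULT) ->
    ?LITERAL_CACHE;
%% Any other default: build the tuple dynamically, as before.
new_cache(Default) ->
    erlang:make_tuple(4, Default).
```

Whether the saved allocation is measurable would need benchmarking. The sketch only shows the shape of the idea.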


michalmuskala · 2026-02-06T09:35:48Z

lib/stdlib/src/array.erl

 -record(array, {size :: non_neg_integer(),	%% number of defined entries
-		max  :: non_neg_integer(),	%% maximum number of entries
-						%% in current tree
+		fix  :: boolean(),	        %% not automatically growing
 		default,	%% the default value (usually 'undefined')
-                elements :: elements(_)         %% the tuple tree
+                cache :: cache(),               %% cached leaf tuple
+                cache_index :: non_neg_integer(),
+                elements :: elements(_),         %% the tuple tree
+                bits :: integer() %% in bits
 	       }).


Would it make sense to store some fields together to save on garbage when the array is updated? The old implementation had just 4 top-level fields; this one has 7, so every update operation creates at least 3 more words of garbage.

richcarl · 2026-02-06

I have thought about it, but most fields are needed regularly enough that you don't want to follow an extra pointer to access them. But a linear type analysis allowing in-place updates would be nice...
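The trade-off in this exchange can be sketched with two hypothetical record layouts. Neither is the record from the PR. The grouping of fix and default into one nested tuple is an assumption chosen only to show the mechanics.

```erlang
-module(record_layout).
-export([new_flat/0, new_grouped/0,
         set_size_flat/2, set_size_grouped/2, default_grouped/1]).

%% Flat layout: seven fields, all directly reachable.
-record(flat, {size = 0, fix = false, default, cache,
               cache_index = 0, elements, bits = 0}).

%% Grouped layout (hypothetical): two fields share one nested tuple.
-record(grouped, {size = 0, cache, cache_index = 0, elements,
                  bits = 0, static = {false, undefined}}).

new_flat() -> #flat{}.
new_grouped() -> #grouped{}.

%% A record update copies the whole top-level tuple, so fewer
%% top-level fields mean fewer words copied per update.
set_size_flat(N, A = #flat{}) -> A#flat{size = N}.
set_size_grouped(N, A = #grouped{}) -> A#grouped{size = N}.

%% The cost: reading a grouped field follows one extra pointer.
default_grouped(#grouped{static = {_Fix, Default}}) -> Default.
```

Here the flat record is an 8-element tuple (tag plus 7 fields) and the grouped one a 7-element tuple, so each update copies one word less, while every read of the default goes through the nested tuple.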

bjorng · 2026-02-07T05:42:17Z

pdict_SUITE:mixed/1 in the kernel application fails:

=== Location: [{erlang,tuple_to_list},
 {array,prune,492},
 {array,shrink_1,488},
 {array,resize,455},
 {pdict_SUITE,do_mixed,456},
 {pdict_SUITE,mixed,412},
 {test_server,ts_tc,1796},
 {test_server,run_test_case_eval1,1305}]
=== === Reason: bad argument
  in function  tuple_to_list/1
     called as tuple_to_list(7548)
     *** argument 1: not a tuple
  in call from array:prune/4 (array.erl:492)
  in call from array:shrink_1/4 (array.erl:488)
  in call from array:resize/2 (array.erl:455)
  in call from pdict_SUITE:do_mixed/5 (pdict_SUITE.erl:456)
  in call from pdict_SUITE:mixed/1 (pdict_SUITE.erl:412)
  in call from test_server:ts_tc/3 (test_server.erl:1796)
  in call from test_server:run_test_case_eval1/6 (test_server.erl:1305)

bjorng · 2026-02-07T07:08:08Z

The indent2_SUITE:arr/1 and opaque_SUITE:array/1 test cases for Dialyzer now also fail.


richcarl force-pushed the array-improvements branch from fa44367 to d460655 on January 22, 2026 12:58

richcarl force-pushed the array-improvements branch from d460655 to 68c7f2e on January 22, 2026 13:02

richcarl added 3 commits January 22, 2026 20:30

array: don't cache max and use a boolean for fixed/relaxed

c0c79f0

array: add caching

c606409

array: prune representation when shrinking

f1b794f

richcarl force-pushed the array-improvements branch from 68c7f2e to f1b794f on January 22, 2026 19:31

IngelaAndin added the team:PS (Assigned to OTP team PS) label Jan 23, 2026

dgud self-assigned this Jan 23, 2026

IngelaAndin added the waiting (waiting for changes/input from author) label Jan 27, 2026

dgud added 2 commits January 31, 2026 12:02

Cleanup (remove tests in source) and whitespace

8bb8b6d

Remove unused tests in source code.

richcarl force-pushed the array-improvements branch from 415894b to 8bb8b6d on January 31, 2026 11:12

michalmuskala reviewed Jan 31, 2026


lib/stdlib/src/array.erl (outdated)

Remove the size field in nodes

fa06f5b

Keep the size information in meta-data only, and forward recursively. Reduces memory and removes one bit shift on recursions. Fixed coverage and removed some dead code.

richcarl force-pushed the array-improvements branch from afbd032 to fa06f5b on February 5, 2026 20:48

array_SUITE: whitespace; use standard asserts

1df37cf

dgud added the testing (currently being tested, tag is used by OTP internal CI) label and removed the waiting (waiting for changes/input from author) label Feb 6, 2026

dgud previously approved these changes Feb 6, 2026


michalmuskala reviewed Feb 6, 2026


bjorng removed the testing (currently being tested, tag is used by OTP internal CI) label Feb 7, 2026

dgud dismissed their stale review via 552db0d February 7, 2026 15:50

Fix shrink bugs

30f9bf6

And added more regression tests.

richcarl force-pushed the array-improvements branch from 552db0d to cbf14e5 on February 8, 2026 13:35

dgud and others added 2 commits February 8, 2026 14:44

Fix dialyzer array tests

a375a9f

Improve shrink code

0daddab

richcarl force-pushed the array-improvements branch from cbf14e5 to 0daddab on February 8, 2026 13:44

IngelaAndin added this to the OTP-29.0 milestone Feb 10, 2026
