this is a very simple bench, inspired by Andrew Kelley's talk on data-oriented programming.
i started with a simple cache-alignment benchmark, which i subsequently deleted. you can see the remnants of that test in metadata_layouts.rs
i wanted to look into this because i noticed const META_SIZE: usize = 73 in bench_hashing.rs in the agave repo.
i tracked it down to accounts_db.rs, lines 4829-4862:
const META_SIZE: usize = 8 /* lamports */ + 1 /* executable */ + 32 /* owner */ + 32 /* pubkey */;
with this layout, pubkey occupies bytes 41..73, so it has to cross from cache line 1 into cache line 2: the whole record is 73 bytes, i.e. 64 bytes + 9 bytes.
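to make the boundary crossing concrete, here's a minimal sketch (field names come from the META_SIZE comments, not the actual agave structs) that just computes the packed offsets:

```rust
// Packed layout implied by META_SIZE: 8 + 1 + 32 + 32 = 73 bytes.
// Offsets: lamports 0..8, executable 8..9, owner 9..41, pubkey 41..73.
fn main() {
    let (lamports, executable, owner, pubkey) = (8usize, 1usize, 32usize, 32usize);
    let pubkey_start = lamports + executable + owner; // 41
    let pubkey_end = pubkey_start + pubkey;           // 73
    let cache_line = 64;
    // pubkey starts inside the first cache line but ends in the second:
    assert!(pubkey_start < cache_line && pubkey_end > cache_line);
    println!("pubkey occupies bytes {pubkey_start}..{pubkey_end}, crossing the {cache_line}-byte boundary");
}
```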
i tried rearranging the fields and padding the structs to see whether optimizing for cache-line alignment would be faster even at the cost of extra memory. it wasn't: no measurable difference at all.
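for reference, the rearrangement was roughly this kind of thing (a sketch, not the exact structs from metadata_layouts.rs): forcing 64-byte alignment pads each record from 73 to 128 bytes, so no field ever straddles a cache-line boundary, at the price of nearly doubling the memory per account.

```rust
use std::mem::{align_of, size_of};

// Hypothetical padded layout: cache-line aligned, so pubkey and owner
// each sit entirely inside one 64-byte line, but the struct is rounded
// up from 73 bytes of payload to 128 bytes of storage.
#[repr(C, align(64))]
struct PaddedMeta {
    pubkey: [u8; 32],  // bytes 0..32  (fully inside cache line 0)
    owner: [u8; 32],   // bytes 32..64 (fills out cache line 0)
    lamports: u64,     // bytes 64..72 (starts cleanly on cache line 1)
    executable: bool,  // byte 72
}

fn main() {
    assert_eq!(align_of::<PaddedMeta>(), 64);
    assert_eq!(size_of::<PaddedMeta>(), 128); // 73 bytes rounded up to alignment
    println!(
        "PaddedMeta: {} bytes, {}-byte aligned",
        size_of::<PaddedMeta>(),
        align_of::<PaddedMeta>()
    );
}
```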
the actual test i ended up with redoes the blake3 hash writing to take a Struct of Arrays (SoA) instead of an Array of Structs (AoS), to see whether that improves cache locality and speeds things up.
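the two layouts look roughly like this (a self-contained sketch: the bench hashes with blake3, but std's DefaultHasher stands in here so the example needs no external crates, and the type names are mine, not the bench's):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// AoS: one struct per account, fields interleaved in memory.
struct Account {
    lamports: u64,
    executable: bool,
    owner: [u8; 32],
    pubkey: [u8; 32],
}

// SoA: one Vec per field, each field stored contiguously.
#[derive(Default)]
struct Accounts {
    lamports: Vec<u64>,
    executable: Vec<bool>,
    owner: Vec<[u8; 32]>,
    pubkey: Vec<[u8; 32]>,
}

// Hash every field of every account, AoS-style.
fn hash_aos(accounts: &[Account]) -> u64 {
    let mut h = DefaultHasher::new();
    for a in accounts {
        h.write_u64(a.lamports);
        h.write_u8(a.executable as u8);
        h.write(&a.owner);
        h.write(&a.pubkey);
    }
    h.finish()
}

// Same bytes in the same order, read out of the SoA layout.
fn hash_soa(accounts: &Accounts) -> u64 {
    let mut h = DefaultHasher::new();
    for i in 0..accounts.lamports.len() {
        h.write_u64(accounts.lamports[i]);
        h.write_u8(accounts.executable[i] as u8);
        h.write(&accounts.owner[i]);
        h.write(&accounts.pubkey[i]);
    }
    h.finish()
}

fn main() {
    let aos: Vec<Account> = (0..100u64)
        .map(|i| Account { lamports: i, executable: false, owner: [0; 32], pubkey: [i as u8; 32] })
        .collect();
    let mut soa = Accounts::default();
    for a in &aos {
        soa.lamports.push(a.lamports);
        soa.executable.push(a.executable);
        soa.owner.push(a.owner);
        soa.pubkey.push(a.pubkey);
    }
    // Either layout feeds the hasher identical bytes, so the digests match.
    assert_eq!(hash_aos(&aos), hash_soa(&soa));
    println!("digests match");
}
```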
the three tests, each a simple A/B comparison of SoA vs AoS:
- batch hashing 10,000 accounts
- add up the lamports of 10,000 accounts (i.e. reading a single field)
- creating 10,000 accounts
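the single-field read (the lamports sum) can be sketched like this (type names are mine): with SoA the sum walks one dense Vec<u64>, eight lamports per cache line; with AoS it strides through full 73-byte-plus records to pull 8 bytes out of each.

```rust
// AoS: lamports are interleaved with ~65 bytes of other fields per account.
struct Account {
    lamports: u64,
    executable: bool,
    owner: [u8; 32],
    pubkey: [u8; 32],
}

fn sum_aos(accounts: &[Account]) -> u64 {
    accounts.iter().map(|a| a.lamports).sum()
}

// SoA: lamports are packed contiguously, so every loaded cache line is
// nothing but lamports.
fn sum_soa(lamports: &[u64]) -> u64 {
    lamports.iter().sum()
}

fn main() {
    let aos: Vec<Account> = (1..=10_000u64)
        .map(|i| Account { lamports: i, executable: false, owner: [0; 32], pubkey: [0; 32] })
        .collect();
    let soa_lamports: Vec<u64> = aos.iter().map(|a| a.lamports).collect();
    assert_eq!(sum_aos(&aos), sum_soa(&soa_lamports));
    assert_eq!(sum_soa(&soa_lamports), 50_005_000); // 1 + 2 + ... + 10_000
    println!("sum = {}", sum_soa(&soa_lamports));
}
```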
the results are reliably as follows:
- batch hashing: barely any difference (both layouts have to read every field of every account in its entirety anyway)
- summing lamports: SoA wins 3-4x, thanks to sequential memory access. it's cheaper to traverse one dense vec of lamports (SoA) than to stride through every full account struct (AoS) just to pull one field out of each. this test adds up the lamports of all 10,000 accounts and gets to ignore every other field.
- creating accounts: AoS wins ~1.5-2x. creating an account is one push into a single vec, versus SoA doing four separate pushes (lamports, executable, owner, pubkey) for each account.
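the creation asymmetry boils down to push counts (again a sketch with hypothetical type names, not the bench's actual code): AoS appends one contiguous record, while SoA touches four separate Vecs, each with its own length/capacity check and occasional reallocation.

```rust
struct Account {
    lamports: u64,
    executable: bool,
    owner: [u8; 32],
    pubkey: [u8; 32],
}

#[derive(Default)]
struct Accounts {
    lamports: Vec<u64>,
    executable: Vec<bool>,
    owner: Vec<[u8; 32]>,
    pubkey: Vec<[u8; 32]>,
}

fn main() {
    // AoS: one push per account.
    let mut aos: Vec<Account> = Vec::new();
    aos.push(Account { lamports: 1, executable: false, owner: [0; 32], pubkey: [0; 32] });

    // SoA: four pushes per account, one per field Vec.
    let mut soa = Accounts::default();
    soa.lamports.push(1);
    soa.executable.push(false);
    soa.owner.push([0; 32]);
    soa.pubkey.push([0; 32]);

    assert_eq!(aos.len(), 1);
    assert_eq!(soa.lamports.len(), 1);
    println!("aos: 1 push per account, soa: 4 pushes per account");
}
```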