This repository was archived by the owner on Jan 24, 2024. It is now read-only.

Speeding Up Geth

Nicolas Laurent edited this page Feb 13, 2022 · 5 revisions

Below are notes that mostly transcribe George Hotz' ideas on how to make Geth 10x faster.

George would actually claim he can make Geth 100x faster, but 10x of that comes from requiring faster hardware and larger disks so that the gas limit can be increased.

Blazing Fast Sync

An anecdote from George is that he recently did a full archive node sync. It took him ~2 days with Geth and ~4 days with Erigon. This is (a) incredibly fast and (b) the opposite of the commonly expected result, namely that Erigon syncs faster. He also noted that during the sync the process was often CPU-bound rather than IO-bound (i.e. Geth's main process sat at 100% CPU utilization).

How did George make the sync so fast? He had a few tricks:

  • He increased Geth's (and Erigon's) database cache size to 64GB (up from 16MB).
  • LevelDB was writing to a volume spread across multiple NVMe SSDs (to be clarified: he did mention four 1TB NVMe drives in RAID in one of our conversations).
  • He ran the sync over a gigabit Ethernet connection.
    • However, it is unclear to me (norswap) whether this was a bottleneck at all.
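
As a rough illustration of the cache setting, here is a minimal Go sketch that assembles a Geth command line for an archive sync. The notes do not say which option George actually changed, so using `--cache` is an assumption on my part. That flag is expressed in megabytes, and Geth splits the allowance between the database and its other caches, so the database only receives part of it. The `gbToMB` and `gethArgs` helpers and the data directory path are made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// gbToMB converts a size in gigabytes to megabytes (1 GB = 1024 MB here).
func gbToMB(gb int) int {
	return gb * 1024
}

// gethArgs is a hypothetical helper that builds the arguments for an
// archive sync. --cache takes megabytes; Geth divides this allowance
// between the database and its other caches.
func gethArgs(dataDir string, cacheGB int) []string {
	return []string{
		"--datadir", dataDir,
		"--syncmode", "full",
		"--gcmode", "archive",
		"--cache", strconv.Itoa(gbToMB(cacheGB)),
	}
}

func main() {
	// The path is invented; it stands in for a directory on the NVMe array.
	fmt.Println(gethArgs("/mnt/nvme-raid/geth", 64))
}
```

The resulting slice could be handed to `exec.Command("geth", args...)`. Whether this reproduces George's exact setup is unknown.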

Erigon's degraded performance can be explained by two facts:

  • It separates the sync into phases: first download all headers, then validate all headers, then download all blocks, etc.
    • It is unclear to me (norswap) why this speeds up the sync: I would have expected the disk to be the clear bottleneck. Does improving cache locality and making codepaths "warmer" when running the blocks really make such a difference? I do trust the Erigon team that this works in general.
  • It writes more data to disk. This is related to Erigon's DB architecture, which, as far as my limited understanding goes, allows reading key-value pairs without traversing the state Merkle trie.
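
To make the last point more concrete, here is a toy Go sketch contrasting a lookup that walks a trie node by node with a lookup in a flat key-value table. It is not Geth's or Erigon's actual storage layout: the trie has no hashing, keys are plain strings, and all type and function names (`trieStore`, `flatStore`, `reads`) are invented for the illustration. The `reads` counters stand in for database fetches.

```go
package main

import "fmt"

// node is one trie node. In this toy model, fetching a node counts as one read.
type node struct {
	children map[byte]*node
	value    string
	hasValue bool
}

// trieStore only exposes values through a walk from the root.
type trieStore struct {
	root  *node
	reads int // simulated node fetches
}

func newTrieStore() *trieStore {
	return &trieStore{root: &node{children: map[byte]*node{}}}
}

// put inserts a value, creating one node per key byte as needed.
func (t *trieStore) put(key, value string) {
	n := t.root
	for i := 0; i < len(key); i++ {
		child, ok := n.children[key[i]]
		if !ok {
			child = &node{children: map[byte]*node{}}
			n.children[key[i]] = child
		}
		n = child
	}
	n.value, n.hasValue = value, true
}

// get walks the trie and counts one read for the root plus one per key byte.
func (t *trieStore) get(key string) (string, bool) {
	n := t.root
	t.reads++ // fetch the root
	for i := 0; i < len(key); i++ {
		child, ok := n.children[key[i]]
		if !ok {
			return "", false
		}
		t.reads++ // fetch the next node on the path
		n = child
	}
	return n.value, n.hasValue
}

// flatStore maps keys directly to values, so a lookup is a single read.
type flatStore struct {
	table map[string]string
	reads int
}

func (f *flatStore) get(key string) (string, bool) {
	f.reads++
	v, ok := f.table[key]
	return v, ok
}

func main() {
	ts := newTrieStore()
	ts.put("a1b2", "balance=7")
	fs := &flatStore{table: map[string]string{"a1b2": "balance=7"}}

	ts.get("a1b2")
	fs.get("a1b2")
	fmt.Printf("trie reads: %d, flat reads: %d\n", ts.reads, fs.reads)
}
```

With a four-byte key the trie walk costs five reads against one for the flat table. The trade-off in this model is that the flat table is data kept purely to speed up reads.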

In general, George's solution to making Geth 10x faster comes down to:

  1. Improving the database.
  2. Optimizing Geth under the assumption that the CPU, not the disk, is the bottleneck.

Let's examine those in turn.

Improving the Database

TODO

Optimizing Geth's CPU usage

TODO
