Feature/forking by michalpalkowski · Pull Request #372 · dojoengine/katana

michalpalkowski · 2025-12-02T13:42:18Z

Description

This PR enables Katana to fork from a remote Starknet network and continue producing blocks locally with correct state root computation.
The core challenge solved here is maintaining valid Merkle proofs when working with partial state.

Why these changes were needed

When forking from a remote network, Katana doesn't have the complete state trie locally—it only has the data it explicitly fetches. This creates a fundamental problem:

State root computation requires full tries: To compute a valid state root after applying local transactions, you need access to the entire Merkle trie structure.
Fetching the entire state is impractical: A full Starknet state is too large to download during fork initialization.
Naive approaches fail: Simply applying state updates locally without the underlying trie structure produces incorrect state roots, breaking consensus validation.

The solution: Partial Tries with Multiproofs

This PR introduces partial trie support, allowing Katana to:

Insert state updates into a local partial trie, built from the leaves that changed in the block, while maintaining cryptographic correctness.
Fetch Merkle multiproofs for leaves from other blocks that do not yet exist in the partial trie. These leaves are saved in the partial tries and later used to compute the correct state root instead of being fetched again from the RPC.
Reconstruct the minimal trie paths needed to compute valid roots.
Continue producing blocks with verifiable state roots.

The key insight is that you don't need the full trie—you only need the Merkle proof paths for the nodes you're modifying, plus the original root as an anchor point.
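To illustrate the idea, here is a minimal sketch of recomputing a root from a single proof path. It assumes a plain binary Merkle tree and uses `DefaultHasher` as a stand-in hash; Starknet's actual tries use a different structure and different hash functions, and every name below (`ProofStep`, `root_from_path`, `update_leaf`, `full_root`) is hypothetical rather than taken from this PR.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in node hash for illustration only (not the hash Starknet uses).
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

/// One step of a proof path: the sibling hash and which side it sits on.
#[derive(Clone, Copy)]
struct ProofStep {
    sibling: u64,
    sibling_is_left: bool,
}

/// Fold a leaf up its proof path to obtain the root implied by that leaf.
fn root_from_path(leaf: u64, path: &[ProofStep]) -> u64 {
    path.iter().fold(leaf, |node, step| {
        if step.sibling_is_left {
            hash_pair(step.sibling, node)
        } else {
            hash_pair(node, step.sibling)
        }
    })
}

/// Check the old leaf against the anchor root, then compute the root that
/// results from replacing it. Only the proof path is needed, not the full tree.
fn update_leaf(anchor_root: u64, old_leaf: u64, new_leaf: u64, path: &[ProofStep]) -> Option<u64> {
    if root_from_path(old_leaf, path) != anchor_root {
        return None; // the proof does not match the anchor root
    }
    Some(root_from_path(new_leaf, path))
}

/// Root of a complete four-leaf tree, used only to cross-check the sketch.
fn full_root(leaves: [u64; 4]) -> u64 {
    hash_pair(hash_pair(leaves[0], leaves[1]), hash_pair(leaves[2], leaves[3]))
}
```

With leaves `[10, 20, 30, 40]`, the path for the third leaf is the sibling `40` on the right followed by `hash_pair(10, 20)` on the left. Updating that leaf through `update_leaf` yields the same root as rebuilding the whole tree with the new value.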

On the first iteration, when creating a fork, we construct partial tries based solely on multiproofs fetched from the RPC, using its latest roots and the paths to the leaves we want to insert. On subsequent iterations, we use the locally constructed partial tries and their values. For nodes that do not exist in the partial tries but do exist on the RPC, we lazily fetch them using proofs. This ensures that the state root matches what the RPC’s state root would be if the same values were inserted there.
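The lazy-fetch behaviour described above can be sketched as a cache that consults the remote source only on a miss. This is a simplified illustration, not the PR's implementation: `RemoteProofSource` and `PartialLeaves` are hypothetical types, the partial trie is reduced to a map of leaf values, and proof verification is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the remote RPC; counts how often it is queried.
struct RemoteProofSource {
    values: HashMap<u64, u64>,
    requests: usize,
}

impl RemoteProofSource {
    /// Pretend to fetch a leaf (with its proof) from the remote node.
    fn fetch_leaf(&mut self, key: u64) -> Option<u64> {
        self.requests += 1;
        self.values.get(&key).copied()
    }
}

/// Hypothetical partial trie, reduced here to the leaves known locally.
#[derive(Default)]
struct PartialLeaves {
    known: HashMap<u64, u64>,
}

impl PartialLeaves {
    /// Return the local value if present; otherwise fetch it once and keep it.
    fn get_or_fetch(&mut self, key: u64, remote: &mut RemoteProofSource) -> Option<u64> {
        if let Some(value) = self.known.get(&key) {
            return Some(*value);
        }
        // A real implementation would verify the proof against the anchor root here.
        let value = remote.fetch_leaf(key)?;
        self.known.insert(key, value);
        Some(value)
    }

    /// Apply a local state update; later reads no longer touch the remote.
    fn set(&mut self, key: u64, value: u64) {
        self.known.insert(key, value);
    }
}
```

Once a leaf has been fetched or written locally, repeated reads are served from `known`, so the request counter on the remote stays unchanged.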

Why the API changed

The TrieWriter trait now includes a compute_state_root method that can be overridden by ForkedProvider.

This is necessary because:

Regular providers have full local tries and can compute roots directly.
Forked providers must fetch proofs from remote RPC and use partial trie operations.
The default implementation works for non-forked cases, while forked providers override it with proof-based logic.
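The default-plus-override pattern can be sketched as follows. The trait below is a simplified, hypothetical stand-in: the real `TrieWriter` signature is not shown in this description, `local_leaves` and `toy_root` are invented for illustration, and the "root" is a toy fold rather than a Merkle root.

```rust
/// Toy "root": an order-sensitive fold over leaf values. Not a Merkle root.
fn toy_root(leaves: impl Iterator<Item = u64>) -> u64 {
    leaves.fold(0u64, |acc, leaf| acc.wrapping_mul(31).wrapping_add(leaf))
}

/// Hypothetical, simplified version of the trait described above.
trait TrieWriter {
    /// Leaves available in the local database.
    fn local_leaves(&self) -> Vec<u64>;

    /// Default: the provider holds the full state, so local data suffices.
    fn compute_state_root(&self) -> u64 {
        toy_root(self.local_leaves().into_iter())
    }
}

/// A provider with the full state; it keeps the default implementation.
struct FullProvider {
    leaves: Vec<u64>,
}

impl TrieWriter for FullProvider {
    fn local_leaves(&self) -> Vec<u64> {
        self.leaves.clone()
    }
}

/// A forked provider holds only part of the state locally.
struct ForkedProvider {
    /// Stand-in for data that would be obtained from the remote via proofs.
    remote_leaves: Vec<u64>,
    local: Vec<u64>,
}

impl TrieWriter for ForkedProvider {
    fn local_leaves(&self) -> Vec<u64> {
        self.local.clone()
    }

    /// Override: combine remotely proven data with local updates.
    fn compute_state_root(&self) -> u64 {
        toy_root(self.remote_leaves.iter().copied().chain(self.local.iter().copied()))
    }
}
```

In this sketch, a `ForkedProvider` with remote leaves `[1, 2]` and local leaf `[3]` produces the same value as a `FullProvider` holding `[1, 2, 3]`, whereas the default implementation applied to the forked provider's local leaves alone would not.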

SNOS compatibility

These changes also enable SNOS (Starknet OS) proof generation for forked chains.
SNOS requires access to storage proofs to generate validity proofs, which is now possible through the multiproof infrastructure.

Testing

Applying local transactions with storage updates, contract deployments, and class declarations.
State root correctness across multiple blocks.

…e#373) Spawn a separate task on the cpu-bound blocking task for performing the actual state trie computation to avoid blocking the async executor.

crates/rpc/rpc-server/src/starknet/mod.rs

michalpalkowski · 2025-12-12T08:00:50Z

crates/node/src/lib.rs

-            let global_class_cache = class_cache.build_global()?;
+            // Try to use existing global cache if already initialized (useful for tests with multiple nodes)
+            // Otherwise, build and initialize a new global cache
+            let global_class_cache = match ClassCache::try_global() {


Change: allows running two Katana instances in parallel in one test.
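The reuse-or-build pattern in this hunk could look roughly like the sketch below, assuming a process-wide `OnceLock`. The `ClassCache` here is a hypothetical stand-in, and the signatures of `try_global` and `build_global` are assumptions, since the hunk is cut off before the match arms.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in; not Katana's actual ClassCache.
#[derive(Debug)]
struct ClassCache {
    capacity: usize,
}

/// Process-wide slot that can be initialized at most once.
static GLOBAL: OnceLock<Arc<ClassCache>> = OnceLock::new();

impl ClassCache {
    /// Return the global cache if one has already been initialized.
    fn try_global() -> Option<Arc<ClassCache>> {
        GLOBAL.get().cloned()
    }

    /// Build a cache and install it globally; fails if one already exists.
    fn build_global(capacity: usize) -> Result<Arc<ClassCache>, &'static str> {
        let cache = Arc::new(ClassCache { capacity });
        GLOBAL
            .set(cache.clone())
            .map_err(|_| "global class cache already initialized")?;
        Ok(cache)
    }
}

/// Reuse the existing global cache, or build one if none exists yet.
fn global_or_build(capacity: usize) -> Result<Arc<ClassCache>, &'static str> {
    match ClassCache::try_global() {
        Some(existing) => Ok(existing),
        None => ClassCache::build_global(capacity),
    }
}
```

A second node started in the same process gets the cache created by the first one, and its own configuration is ignored. Note that two threads racing through `global_or_build` could still see one `build_global` call fail in this sketch.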

michalpalkowski · 2025-12-12T08:05:11Z

crates/rpc/rpc-server/src/starknet/trace.rs


-        let indices = provider.block_body_indices(block_id)?.ok_or(BlockNotFound)?;
-        let tx_hashes = provider.transaction_hashes_in_range(indices.into())?;
+        let traces = self


michalpalkowski · 2025-12-12T08:05:28Z

crates/storage/fork/src/lib.rs

+    pub fn get_block_transactions_traces(
+        &self,
+        block_id: BlockHashOrNumber,
+    ) -> Result<Option<TraceBlockTransactionsResponse>, BackendClientError> {


michalpalkowski · 2025-12-12T08:05:36Z

crates/storage/fork/src/lib.rs

+                    BlockHashOrNumber::Num(n) => ConfirmedBlockIdOrTag::Number(n),
+                };
+
+                self.dedup_request(


michalpalkowski · 2025-12-12T08:05:44Z

crates/storage/provider/provider-api/src/transaction.rs

        &self,
        block_id: BlockHashOrNumber,
-    ) -> ProviderResult<Option<Vec<TypedTransactionExecutionInfo>>>;
+    ) -> ProviderResult<Option<Vec<TxTraceWithHash>>>;


michalpalkowski · 2025-12-12T08:05:56Z

crates/storage/provider/provider/src/providers/db/mod.rs

+    ) -> ProviderResult<Option<Vec<TxTraceWithHash>>> {
        if let Some(index) = self.block_body_indices(block_id)? {
-            let traces = self.transaction_executions_in_range(index.into())?;
+            let traces = self.transaction_executions_in_range(index.clone().into())?;


fix -traces -chudas

michalpalkowski · 2025-12-12T08:06:17Z

crates/storage/provider/provider/src/providers/db/mod.rs

-                .get::<tables::ContractInfoChangeSet>(addr)?
-                .ok_or(ProviderError::MissingContractInfoChangeSet { address: addr })?;
+            let new_change_set =
+                if let Some(mut change_set) = self.0.get::<tables::ContractInfoChangeSet>(addr)? {


fix - karyi- already on main

michalpalkowski · 2025-12-12T08:09:14Z

crates/storage/provider/provider/src/providers/fork/mod.rs

-            Err(err) => Err(err),
-        }
+        let local_latest = match self.local_db.latest_number() {
+            Ok(num) => num,


fix - to run fork without any block

michalpalkowski · 2025-12-12T08:13:42Z

crates/storage/provider/provider/src/providers/fork/mod.rs

+        let fork_point = self.block_id();
+        let latest_num = self.latest_number()?;
+
+        if latest_num > fork_point {


fix - to run fork without any block

michalpalkowski · 2025-12-12T08:13:59Z

crates/storage/provider/provider/src/providers/fork/mod.rs

        &self,
        block_id: BlockHashOrNumber,
-    ) -> ProviderResult<Option<Vec<TypedTransactionExecutionInfo>>> {
+    ) -> ProviderResult<Option<Vec<TxTraceWithHash>>> {


fix for traces

michalpalkowski · 2025-12-12T08:15:22Z

crates/storage/provider/provider/src/providers/fork/mod.rs

        executions: Vec<TypedTransactionExecutionInfo>,
    ) -> ProviderResult<()> {
+        // BUGFIX: Before inserting state updates, ensure all contracts referenced in nonce_updates
+        // have their ContractInfo in local_db. For forked contracts, the class_hash may only exist


michalpalkowski · 2025-12-12T08:44:37Z