Feature/forking #372

Open
michalpalkowski wants to merge 23 commits into dojoengine:main from michalpalkowski:feature/forking

@michalpalkowski michalpalkowski commented Dec 2, 2025

Description

This PR enables Katana to fork from a remote Starknet network and continue producing blocks locally with correct state root computation.
The core challenge solved here is maintaining valid Merkle proofs when working with partial state.


Why these changes were needed

When forking from a remote network, Katana doesn't have the complete state trie locally; it only has the data it explicitly fetches. This creates a fundamental problem:

  • State root computation requires full tries
    To compute a valid state root after applying local transactions, you need access to the entire Merkle trie structure.

  • Fetching the entire state is impractical
    A full Starknet state is large, making it impractical to download during fork initialization.

  • Naive approaches fail
    Simply applying state updates locally without the underlying trie structure produces incorrect state roots, breaking consensus validation.


The solution: Partial Tries with Multiproofs

This PR introduces partial trie support, allowing Katana to:

  • Insert state updates into a local partial trie, built from the leaves that changed in the block, while maintaining cryptographic correctness.
  • Fetch Merkle multiproofs for leaves from different blocks even when they do not yet exist in the partial trie. These leaves are saved in the partial tries and later used to calculate the correct state root, instead of being fetched again from the RPC.
  • Reconstruct the minimal trie paths needed to compute valid roots.
  • Continue producing blocks with verifiable state roots.

The key insight is that you don't need the full trie; you only need the Merkle proof paths for the nodes you're modifying, plus the original root as an anchor point.
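To illustrate why a proof path plus the original root is enough, here is a minimal sketch on a toy binary Merkle tree. It is not Katana's implementation: the real tries are Starknet Patricia tries with their own hash functions, while this sketch assumes a fixed-size binary tree and uses the standard library's `DefaultHasher` as a stand-in hash. All function names (`hash_pair`, `full_root`, `proof_for`, `root_from_proof`) are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for the trie hash function (NOT the hash Starknet uses).
fn hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (left, right).hash(&mut h);
    h.finish()
}

/// Root of a full binary tree; `leaves` must be non-empty with a power-of-two length.
fn full_root(leaves: &[u64]) -> u64 {
    let mut level = leaves.to_vec();
    while level.len() > 1 {
        level = level.chunks(2).map(|p| hash_pair(p[0], p[1])).collect();
    }
    level[0]
}

/// Sibling hashes on the path from leaf `index` up to the root (bottom first).
fn proof_for(leaves: &[u64], mut index: usize) -> Vec<u64> {
    let mut level = leaves.to_vec();
    let mut siblings = Vec::new();
    while level.len() > 1 {
        siblings.push(level[index ^ 1]);
        level = level.chunks(2).map(|p| hash_pair(p[0], p[1])).collect();
        index /= 2;
    }
    siblings
}

/// Recompute a root from a single leaf value and its sibling path only.
fn root_from_proof(leaf: u64, mut index: usize, siblings: &[u64]) -> u64 {
    let mut node = leaf;
    for &sibling in siblings {
        node = if index % 2 == 0 { hash_pair(node, sibling) } else { hash_pair(sibling, node) };
        index /= 2;
    }
    node
}

fn main() {
    // "Remote" side: holds the full tree.
    let mut remote: Vec<u64> = vec![10, 20, 30, 40, 50, 60, 70, 80];
    let anchor_root = full_root(&remote);
    let proof = proof_for(&remote, 5);

    // "Fork" side: knows only the anchor root, one leaf, and that leaf's proof.
    assert_eq!(root_from_proof(60, 5, &proof), anchor_root); // the proof checks out
    let new_root = root_from_proof(61, 5, &proof); // apply a local update

    // The result equals the root of the full tree with the same update applied.
    remote[5] = 61;
    assert_eq!(new_root, full_root(&remote));
    println!("roots match");
}
```

The fork side never touches the other seven leaves; it only needs the three sibling hashes on the path of the leaf it modifies.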

On the first iteration, when creating a fork, we construct partial tries based solely on multiproofs fetched from the RPC, using its latest roots and the paths to the leaves we want to insert. On subsequent iterations, we use the locally constructed partial tries and their values. For nodes that do not exist in the partial tries but do exist on the RPC, we lazily fetch them using proofs. This ensures that the state root matches what the RPC’s state root would be if the same values were inserted there.
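The lazy-fetch behaviour described above can be sketched as a local cache that only calls the remote side on a miss. This is a simplified illustration under stated assumptions: `PartialLeaves`, `get_or_fetch`, and the `fake_rpc` closure are hypothetical names, and the closure returns a bare value where the real code would fetch and verify a proof.

```rust
use std::collections::HashMap;

/// Hypothetical local store of leaves already obtained from the remote side.
struct PartialLeaves {
    known: HashMap<u64, u64>,
    remote_fetches: usize,
}

impl PartialLeaves {
    fn new() -> Self {
        Self { known: HashMap::new(), remote_fetches: 0 }
    }

    /// Return the leaf at `key`, calling `fetch_remote` only when it is not stored locally.
    fn get_or_fetch(&mut self, key: u64, fetch_remote: impl FnOnce(u64) -> u64) -> u64 {
        if let Some(&value) = self.known.get(&key) {
            return value;
        }
        // In a real fork this step would fetch a proof and check it against the anchor root.
        let value = fetch_remote(key);
        self.remote_fetches += 1;
        self.known.insert(key, value);
        value
    }
}

fn main() {
    let mut leaves = PartialLeaves::new();
    let fake_rpc = |key: u64| key * 10; // stand-in for a remote call

    let first = leaves.get_or_fetch(7, fake_rpc);
    let second = leaves.get_or_fetch(7, fake_rpc);

    assert_eq!((first, second), (70, 70));
    assert_eq!(leaves.remote_fetches, 1); // the second read was served locally
}
```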


Why the API changed

The TrieWriter trait now includes a compute_state_root method that can be overridden by ForkedProvider.

This is necessary because:

  • Regular providers have full local tries and can compute roots directly.
  • Forked providers must fetch proofs from the remote RPC and use partial trie operations.
  • The default implementation works for non-forked cases, while forked providers override it with proof-based logic.
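The default-plus-override pattern described in the list above might look roughly like the sketch below. The names `TrieWriter`, `compute_state_root`, and `ForkedProvider` come from this description, but the signatures, the `local_leaves` method, `FullProvider`, and the `toy_root` function are assumptions made for illustration; the real trait operates on state updates and tries, not on plain integer lists.

```rust
/// Toy "root": an order-dependent fold, standing in for a real trie commitment.
fn toy_root(leaves: &[u64]) -> u64 {
    leaves.iter().fold(0u64, |acc, leaf| acc.wrapping_mul(31).wrapping_add(*leaf))
}

/// Hypothetical, simplified shape of the trait.
trait TrieWriter {
    /// Leaf values this provider holds locally.
    fn local_leaves(&self) -> Vec<u64>;

    /// Default: compute the root directly from local data.
    fn compute_state_root(&self) -> u64 {
        toy_root(&self.local_leaves())
    }
}

/// Non-forked provider: has everything locally, so the default is enough.
struct FullProvider {
    leaves: Vec<u64>,
}

impl TrieWriter for FullProvider {
    fn local_leaves(&self) -> Vec<u64> {
        self.leaves.clone()
    }
}

/// Forked provider: part of the state comes from the remote side.
struct ForkedProvider {
    local: Vec<u64>,
    proven_remote: Vec<u64>, // stand-in for data obtained via proofs
}

impl TrieWriter for ForkedProvider {
    fn local_leaves(&self) -> Vec<u64> {
        self.local.clone()
    }

    /// Override: combine remotely proven data with local changes.
    fn compute_state_root(&self) -> u64 {
        let mut all = self.proven_remote.clone();
        all.extend_from_slice(&self.local);
        toy_root(&all)
    }
}

fn main() {
    let full = FullProvider { leaves: vec![1, 2, 3] };
    let forked = ForkedProvider { local: vec![3], proven_remote: vec![1, 2] };
    // Both providers arrive at the same root through different code paths.
    assert_eq!(full.compute_state_root(), forked.compute_state_root());
}
```

Callers only ever invoke `compute_state_root`, so the forked behaviour stays an implementation detail of the provider.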

SNOS compatibility

These changes also enable SNOS (Starknet OS) proof generation for forked chains.
SNOS requires access to storage proofs to generate validity proofs, which is now possible through the multiproof infrastructure.


Testing

The tests cover:

  • Applying local transactions with storage updates, contract deployments, and class declarations.
  • State root correctness across multiple blocks.

let global_class_cache = class_cache.build_global()?;
// Try to use existing global cache if already initialized (useful for tests with multiple nodes)
// Otherwise, build and initialize a new global cache
let global_class_cache = match ClassCache::try_global() {

@michalpalkowski michalpalkowski Dec 12, 2025

Change: allows running two Katana instances in parallel in one test.


let indices = provider.block_body_indices(block_id)?.ok_or(BlockNotFound)?;
let tx_hashes = provider.transaction_hashes_in_range(indices.into())?;
let traces = self

fix

pub fn get_block_transactions_traces(
&self,
block_id: BlockHashOrNumber,
) -> Result<Option<TraceBlockTransactionsResponse>, BackendClientError> {

fix

BlockHashOrNumber::Num(n) => ConfirmedBlockIdOrTag::Number(n),
};

self.dedup_request(

fix

&self,
block_id: BlockHashOrNumber,
) -> ProviderResult<Option<Vec<TypedTransactionExecutionInfo>>>;
) -> ProviderResult<Option<Vec<TxTraceWithHash>>>;

fix

) -> ProviderResult<Option<Vec<TxTraceWithHash>>> {
if let Some(index) = self.block_body_indices(block_id)? {
let traces = self.transaction_executions_in_range(index.into())?;
let traces = self.transaction_executions_in_range(index.clone().into())?;

@michalpalkowski michalpalkowski Dec 12, 2025

fix - traces - chudas

.get::<tables::ContractInfoChangeSet>(addr)?
.ok_or(ProviderError::MissingContractInfoChangeSet { address: addr })?;
let new_change_set =
if let Some(mut change_set) = self.0.get::<tables::ContractInfoChangeSet>(addr)? {

@michalpalkowski michalpalkowski Dec 12, 2025

fix - karyi - already on main

Err(err) => Err(err),
}
let local_latest = match self.local_db.latest_number() {
Ok(num) => num,

fix - to run fork without any block

let fork_point = self.block_id();
let latest_num = self.latest_number()?;

if latest_num > fork_point {

fix - to run fork without any block

&self,
block_id: BlockHashOrNumber,
) -> ProviderResult<Option<Vec<TypedTransactionExecutionInfo>>> {
) -> ProviderResult<Option<Vec<TxTraceWithHash>>> {

fix for traces

executions: Vec<TypedTransactionExecutionInfo>,
) -> ProviderResult<()> {
// BUGFIX: Before inserting state updates, ensure all contracts referenced in nonce_updates
// have their ContractInfo in local_db. For forked contracts, the class_hash may only exist

fix

let local_latest = match self.local_provider.0.latest_number() {
Ok(num) => num,
Err(ProviderError::MissingLatestBlockNumber) => fork_point,
Err(err) => return Err(err),

@michalpalkowski michalpalkowski Dec 12, 2025

fix to run fork without genesis
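The fallback shown in this hunk can be reproduced in isolation as follows. This is a self-contained sketch: the `ProviderError` enum here is a two-variant stand-in (the `Other` variant is hypothetical), and `latest_or_fork_point` is an illustrative helper name, not a function from the PR.

```rust
/// Minimal stand-in for the provider error type; only one variant matters here.
#[derive(Debug, PartialEq)]
enum ProviderError {
    MissingLatestBlockNumber,
    Other(String),
}

/// Use the local latest block number if there is one; if the local database
/// has no blocks yet, fall back to the fork point. Other errors propagate.
fn latest_or_fork_point(
    local_latest: Result<u64, ProviderError>,
    fork_point: u64,
) -> Result<u64, ProviderError> {
    match local_latest {
        Ok(num) => Ok(num),
        Err(ProviderError::MissingLatestBlockNumber) => Ok(fork_point),
        Err(err) => Err(err),
    }
}

fn main() {
    // Fresh fork: nothing produced locally yet, so the fork point is the latest block.
    let fresh = latest_or_fork_point(Err(ProviderError::MissingLatestBlockNumber), 100);
    assert_eq!(fresh, Ok(100));

    // After local blocks exist, the local number wins.
    assert_eq!(latest_or_fork_point(Ok(103), 100), Ok(103));

    // Unrelated failures are not masked by the fallback.
    let failed = latest_or_fork_point(Err(ProviderError::Other("db closed".into())), 100);
    assert!(failed.is_err());
}
```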

// TEMPFIX:
//
// This check is required due to the limitation on how we're storing updates for
// contracts that were deployed before the fork point. For those contracts,

fix - karyi

if let res @ Some(..) = self.local_provider.nonce(address)? {
Ok(res)
if let Some(nonce) = self.local_provider.nonce(address)? {
// TEMPFIX:

fix - karyi

let fork_point = self.fork_provider.block_id;
let latest_block_number = match self.local_provider.0.latest_number() {
Ok(num) => num,
// return the fork block number if the local db returns this error. this can only happen when

fix to run forking without genesis

Err(ProviderError::MissingLatestBlockNumber) => self.fork_provider.block_id,
Err(err) => return Err(err),
};
let latest_block_number = self.latest_block_number()?;

fix to run forking without genesis

Err(ProviderError::MissingLatestBlockNumber) => self.fork_provider.block_id,
Err(err) => return Err(err),
};
let latest_block_number = self.latest_block_number()?;

fix to run forking without genesis

Err(ProviderError::MissingLatestBlockNumber) => self.fork_provider.block_id,
Err(err) => return Err(err),
};
let latest_block_number = self.latest_block_number()?;

fix to run forking without genesis

Err(ProviderError::MissingLatestBlockNumber) => self.fork_provider.block_id,
Err(err) => return Err(err),
};
let latest_block_number = self.latest_block_number()?;

fix to run forking without genesis

Err(ProviderError::MissingLatestBlockNumber) => self.fork_provider.block_id,
Err(err) => return Err(err),
};
let latest_block_number = self.latest_block_number()?;

fix to run forking without genesis

// TODO: this is technically wrong, we probably should insert the
// `ClassChangeHistory` entry on the state update level instead.
let entry = ContractClassChange::deployed(address, hash);

fix - this was changing historical class hashes

let block_id = self.target_block();

let provider_mut = self.fork_provider.db.provider_mut();
provider_mut.tx().put::<tables::NonceChangeHistory>(block, entry)?;

fix - this was changing historical nonces

return Ok(None);
}

if let class @ Some(..) =

fix - here we want to read values at the fork point, not at the fork's current block, because that block might not exist on the main instance


if let Some(compiled_hash) =
self.fork_provider.backend.get_compiled_class_hash(hash, self.local_provider.block())?
self.fork_provider.backend.get_compiled_class_hash(hash, self.fork_provider.block_id)?

fix - here we want to read values at the fork point, not at the fork's current block, because that block might not exist on the main instance

let provider_mut = self.fork_provider.db.provider_mut();
provider_mut.tx().put::<tables::StorageChangeSet>(key, block_list)?;
provider_mut.tx().put::<tables::StorageChangeHistory>(block, change_entry)?;
provider_mut.commit()?;

fix - this was changing historical storage values


assert_eq!(actual_block_env, Some(expected_block_env));
let expected_executions: Vec<TxTraceWithHash> = expected_block
.body

fix - traces - chudas

.contracts_proof
.contract_leaves_data
.iter()
.zip(contract_addresses.iter())

@michalpalkowski michalpalkowski Dec 12, 2025

This zip might cause problems; it needs a better solution.
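The comment does not say which problem is meant. One known hazard of `Iterator::zip` is that it stops at the shorter input, so a length mismatch between leaves and addresses would silently drop entries. A possible guard, sketched with a hypothetical helper `pair_checked` and placeholder data, is to check lengths before zipping:

```rust
/// Pair each leaf with its address, returning an error on mismatched lengths
/// instead of silently truncating as a plain `zip` would.
fn pair_checked<'a, L, A>(
    leaves: &'a [L],
    addresses: &'a [A],
) -> Result<Vec<(&'a L, &'a A)>, String> {
    if leaves.len() != addresses.len() {
        return Err(format!(
            "length mismatch: {} leaves vs {} addresses",
            leaves.len(),
            addresses.len()
        ));
    }
    Ok(leaves.iter().zip(addresses.iter()).collect())
}

fn main() {
    let leaves = ["leaf_a", "leaf_b"];
    let addresses = [0x1_u64, 0x2, 0x3];

    // Plain zip yields only two pairs; the third address is dropped without warning.
    assert_eq!(leaves.iter().zip(addresses.iter()).count(), 2);

    // The checked version surfaces the mismatch.
    assert!(pair_checked(&leaves[..], &addresses[..]).is_err());

    // Equal lengths behave exactly like zip.
    let pairs = pair_checked(&leaves[..], &addresses[..2]).unwrap();
    assert_eq!(pairs.len(), 2);
}
```

A length check does not address ordering: if the two sequences can arrive in different orders, keying the leaves by address would be the safer design.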

@kariy kariy force-pushed the main branch 2 times, most recently from e3bd68b to e897bbb Compare December 30, 2025 17:28

3 participants