
Conversation

@dorianvp
Member

@dorianvp dorianvp commented Oct 8, 2025

On top of #592.

Motivation

#210.

This PR also specifies the algorithm used to generate the UTXO set snapshot hash.

Solution

Implement gettxoutsetinfo.

PR Checklist

  • The PR name is suitable for the release notes.
  • The solution is tested.
  • The documentation is up to date.

@dorianvp dorianvp self-assigned this Oct 8, 2025
@dorianvp dorianvp linked an issue Oct 8, 2025 that may be closed by this pull request
@dorianvp dorianvp force-pushed the feat/rpc-gettxoutsetinfo branch from 876bba8 to 1a4a340 Compare October 10, 2025 03:21
@dorianvp
Member Author

This PR is mostly ready. I've only left a couple of review comments about upgrading some types, but functionally it is ready for review.

@dorianvp dorianvp marked this pull request as ready for review October 14, 2025 05:04
@dorianvp dorianvp requested review from AloeareV and idky137 October 14, 2025 05:05
@zancas
Member

zancas commented Oct 16, 2025

I see several equivalence tests; how about some non-equivalence tests proving sensitivity to variance in the txout sets?
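Such a non-equivalence test could look like this minimal sketch, where `snapshot_hash` is a stand-in digest for illustration only, not the PR's actual `hash_serialized` implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest over (txid, vout) outpoints; illustrative only.
fn snapshot_hash(set: &[([u8; 32], u32)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    set.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Two UTXO sets that differ only in which output index is unspent:
    let a = [([1u8; 32], 0u32)];
    let b = [([1u8; 32], 1u32)];
    // A non-equivalence test asserts the digests differ.
    assert_ne!(snapshot_hash(&a), snapshot_hash(&b));
}
```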

- `UTXO set`: a finite multimap keyed by outpoints `(txid, vout)` to outputs `(value_zat, scriptPubKey)`, where:

- `txid` is a 32-byte transaction hash (internal byte order).
- `vout` is a 32-bit output index (0-based).
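The definition above might be modeled like this (illustrative types only; the names below are not taken from the PR):

```rust
use std::collections::BTreeMap;

// Illustrative outpoint key: the full (txid, vout) pair.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct OutPoint {
    txid: [u8; 32], // 32-byte transaction hash, internal byte order
    vout: u32,      // 0-based output index
}

#[derive(Clone, PartialEq, Debug)]
struct Output {
    value_zat: u64,
    script_pub_key: Vec<u8>,
}

// Keying by the full outpoint lets one transaction contribute several
// entries, matching the "multimap keyed by outpoints" wording.
type UtxoSet = BTreeMap<OutPoint, Output>;

fn main() {
    let mut set = UtxoSet::new();
    let txid = [7u8; 32];
    set.insert(OutPoint { txid, vout: 0 }, Output { value_zat: 1, script_pub_key: vec![] });
    set.insert(OutPoint { txid, vout: 1 }, Output { value_zat: 1, script_pub_key: vec![] });
    assert_eq!(set.len(), 2); // same txid, distinct outpoints
}
```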
Member Author

@dorianvp dorianvp Oct 18, 2025

Leaving a note here:

If we serialize per unspent as txid || value || script and a transaction contains two outputs with identical (value, script), then two different UTXO sets that differ only by which index is unspent will serialize to the same bytes (and hash).
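A sketch of why the output index belongs in the per-UTXO serialization (`serialize_utxo` is a hypothetical helper, not PR code):

```rust
// Hypothetical per-UTXO serialization that includes the output index,
// removing the ambiguity described above.
fn serialize_utxo(txid: &[u8; 32], vout: u32, value: u64, script: &[u8]) -> Vec<u8> {
    let mut bytes = txid.to_vec();
    bytes.extend_from_slice(&vout.to_le_bytes()); // disambiguates identical outputs
    bytes.extend_from_slice(&value.to_le_bytes());
    bytes.extend_from_slice(script);
    bytes
}

fn main() {
    let txid = [0xabu8; 32];
    // Two outputs of one transaction with identical (value, script):
    let a = serialize_utxo(&txid, 0, 5_000, &[0x51]);
    let b = serialize_utxo(&txid, 1, 5_000, &[0x51]);
    // Without the vout field these byte strings would be equal.
    assert_ne!(a, b);
}
```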

Contributor

vout is a misleading name for this; I would call it output_index because vout is generally used to refer to the vector of outputs of a transaction; I was confused by this name before I got to this line.


The snapshot **MUST** be ordered as follows, independent of the node’s in-memory layout:

1. Sort by `txid` ascending, comparing the raw 32-byte values as unsigned bytes.
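The ordering rule can be expressed directly in Rust; note the tie-break on the output index is an assumption here, since only rule 1 is quoted:

```rust
fn main() {
    // Outpoints in arbitrary in-memory order.
    let mut outpoints: Vec<([u8; 32], u32)> = vec![
        ([0xff; 32], 0),
        ([0x00; 32], 1),
        ([0x00; 32], 0),
    ];
    // Rust compares byte arrays lexicographically as unsigned bytes, which is
    // exactly the required txid ordering; ties fall back to the output index.
    outpoints.sort_by(|a, b| a.0.cmp(&b.0).then(a.1.cmp(&b.1)));
    assert_eq!(outpoints[0], ([0x00; 32], 0));
    assert_eq!(outpoints[1], ([0x00; 32], 1));
    assert_eq!(outpoints[2], ([0xff; 32], 0));
}
```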
Contributor

This seems like a bad serialization, because it requires recomputation over the entire UTXO set whenever a new block is received. The UTXO set can be very large; it would be much better to choose a snapshot protocol where snapshot hashes can incrementally build on the snapshot hash prior to the addition of a new block.
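One shape such an incremental scheme could take, as a toy sketch only (a real design would use a cryptographic construction such as MuHash rather than the standard library's hasher):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy incremental set digest: XOR of per-element hashes. Adding or removing
// a UTXO updates the digest in O(1), independent of set size and order.
fn elem_hash(e: &(u64, u32)) -> u64 {
    let mut h = DefaultHasher::new();
    e.hash(&mut h);
    h.finish()
}

fn main() {
    let mut digest = 0u64;
    // A new block adds two UTXOs:
    digest ^= elem_hash(&(1, 0));
    digest ^= elem_hash(&(2, 1));
    // Spending (removing) a UTXO re-XORs its hash out:
    digest ^= elem_hash(&(1, 0));
    assert_eq!(digest, elem_hash(&(2, 1)));
}
```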

Contributor

If the UTXO set were stored in a B-tree data structure that internally kept Merkle hashes at the nodes, then it might be okay to use the Merkle root of that data structure for the snapshot identifier. It would need to be the case that the fanout of the B-tree and the insertion semantics were well-specified to ensure that everyone uses the same hashing approach.

One possibility that would allow for this to work as-specified would be to use a separate B-tree (implementing a set, rather than a map) for producing the hashes; since the txid commits to the effects of each transaction, one could build the snapshot identifier alongside the actual data, but building that identifier in parallel would have a risk of data inconsistencies with the primary store.

In general, I feel like the UTXO set would be best represented as a persistent data structure with good amortized append costs.

@arya2 arya2 Oct 28, 2025

The UTXO set can be very large; it would be much better to choose a snapshot protocol where snapshot hashes can incrementally build on the snapshot hash prior to the addition of a new block.

sparse-merkle-tree could be useful here, Zaino could:

  • Implement Value for a struct representing the transaction output data to which hash_serialized is committing,
  • Implement StoreReadOps/StoreWriteOps for on-disk storage of the tree,
  • Update the tree and a cache with the other fields in TxOutSetInfo when Zaino is indexing blocks, and
  • Return the cached TxOutSetInfo from the RPC method.

@arya2 arya2 left a comment

There are some performance and concurrency issues with this design. It would be better to update a cached TxOutSetInfo as Zaino indexes blocks and to use a merkle tree to update the hash_serialized field.


let mut state = state.clone();

let zebra_state::ReadResponse::Tip { 0: tip } = state

Nitpick: Why destructure it this way rather than as a tuple struct?

Suggested change
let zebra_state::ReadResponse::Tip { 0: tip } = state
let zebra_state::ReadResponse::Tip(tip) = state

}
}

/// Fetches all UTXOs from the state service, and returns them in a map.

Loading every UTXO into memory shouldn't be necessary and could require more memory than is available on some systems.
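A streaming alternative could fold each UTXO into the hasher as it is read, keeping memory use constant; this is a sketch with a stand-in iterator, not the PR's code:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn main() {
    // Stand-in for a paginated read of (txid, vout) pairs from the state service.
    let utxo_stream = (0u32..1_000).map(|i| ([i as u8; 32], i));

    // Fold each UTXO into the hasher as it arrives: O(1) memory,
    // instead of materializing the whole set in a map first.
    let mut hasher = DefaultHasher::new();
    let mut count = 0u64;
    for (txid, vout) in utxo_stream {
        (txid, vout).hash(&mut hasher);
        count += 1;
    }
    assert_eq!(count, 1_000);
    println!("digest = {:x}", hasher.finish());
}
```

This only works as-is because the stand-in stream yields UTXOs in the canonical order; an unordered source would still need sorting or an order-independent digest.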


## Motivation

Different nodes (e.g., `zcashd`, Zebra, indexers) may expose distinct internals or storage layouts. Operators often need a cheap way to verify “we’re looking at the same unspent set” without transporting the entire set. A canonical, versioned snapshot hash solves this.

I see that zcashd had this method returning a hash_serialized already, but, why is it not enough to check that the block hashes match?


## Inputs

To compute the snapshot hash, the implementation needs:

Why include anything other than the UTXOs as inputs in the snapshot hash? Shouldn't we already know that we're looking at the same UTXO set if the best block hashes match?



Comment on lines +893 to +894
let blk = state
.call(ReadRequest::Block(HashOrHeight::Height(Height(h))))

It could be faster to read just the unspent outputs by iterating over `TransactionLocation { height: 0, index: 0 }..={ height: target_height, index: OutputIndex::MAX }` in the `utxo_by_out_loc` column family, and then read any remaining blocks to add any UTXOs from blocks that were committed during the iteration.

};

for tx in block.transactions.clone() {
let txid = tx.hash();

It may be faster to look up the transaction hashes in the `tx_loc_by_hash` column family.

async fn get_txout_set_info(&self) -> Result<GetTxOutSetInfo, Self::Error> {
let txouts = Self::get_txout_set(&self.read_state_service).await.unwrap();

let best_block_hash = self.get_best_blockhash().await.unwrap().hash();

get_txout_set() may not (and on Mainnet, generally won't) read transactions from all blocks up to the best block hash being read here. It would be better to return the tip hash from get_txout_set().

@dorianvp
Member Author

dorianvp commented Nov 3, 2025

Hey folks, thanks for reviewing this! I'll be addressing these comments soon.

@zancas
Member

zancas commented Nov 4, 2025

@dorianvp this RPC involves specification work that is scheduled for the next milestone. I am requesting to move this RPC to that milestone.

@zancas zancas assigned AloeareV and unassigned dorianvp Nov 7, 2025


Development

Successfully merging this pull request may close these issues.

1.14: Add Zcash RPC gettxoutsetinfo

5 participants