A CLI tool that extracts and analyzes data from CometBFT databases (currently state.db and tx_index.db) for inspection and debugging.
- Multi-database support: Extract data from various CometBFT databases
  - state.db: Blockchain state, validators, consensus params, ABCI responses
  - tx_index.db: Transaction index data with height and event indexes
- Streaming output: Memory-efficient streaming JSON output for large databases
- Skip-full-dump mode: Generate summary-only analysis to save memory
- Decode and analyze: Automatically decode protobuf data structures
- Comprehensive statistics: Generate detailed reports about database contents
- JSON export: Export data to human-readable JSON format
- Size analysis: Identify largest keys and size distribution by type
- Progress reporting: Real-time progress updates every 10,000 keys
- Modular CLI: Easy to extend with new database extraction commands
go install github.com/altuslabsxyz/cda@latest
Or clone and build:
git clone https://github.com/altuslabsxyz/cda.git
cd cda
go build
cda <command> [flags]
- extract state-db: Extract and analyze data from state.db
- extract tx-index-db: Extract and analyze data from tx_index.db
- help: Display help information
- version: Display version information
- -d, --db: Path to the database directory (required)
- -o, --output: Output directory for JSON files (default varies by command)
- --skip-full-dump: Skip full data dump to save memory (only generate summary)
- -h, --help: Display help information
# Display help
cda --help
cda extract --help
cda extract state-db --help
cda extract tx-index-db --help
# Extract state.db from a CometBFT node
cda extract state-db --db ~/.cometbft/data
# Extract with custom output directory
cda extract state-db --db ~/.cometbft/data --output ./analysis_results
# Extract tx_index.db
cda extract tx-index-db --db ~/.cometbft/data
# Extract tx_index.db with custom output
cda extract tx-index-db --db ~/.cometbft/data --output ./tx_analysis
# Quick analysis - summary only (recommended for large production databases)
# Only generates summary.json (~10-200MB) instead of full dumps (100GB+)
cda extract state-db --db ~/.cometbft/data --skip-full-dump
cda extract tx-index-db --db ~/.cometbft/data --skip-full-dump
# Analyze from a custom chain
cda extract state-db --db ~/.mychain/data --output ./mychain_analysis
The extract state-db command generates the following files in the output directory:
| File | Description | Typical Size | Notes |
|---|---|---|---|
| summary.json | Overall statistics and key type distribution | ~10-20MB | Always generated, includes metadata |
| stateKey_dump.json | Current blockchain state (block height, validators, consensus params, etc.) | ~5KB | Single entry with current state |
| validatorsKey_dump.json | Validator set history at each height | ~20-50MB | Size depends on validator count and chain history |
| consensusParamsKey_dump.json | Consensus parameters history | ~50-150MB | Records parameter changes over time |
| abciResponsesKey_dump.json | ABCI responses for each block | ~100-200GB+ | |
| lastABCIResponseKey_dump.json | Most recent ABCI response | ~4MB | Latest block execution result |
| offlineStateSyncHeightKey_dump.json | State sync height (if applicable) | ~100B | Only present if state sync was used |
| unknown_dump.json | Any unrecognized keys (e.g., genesisDoc) | ~30KB | Rarely populated |
Note: The abciResponsesKey_dump.json file can be extremely large (100GB+) for chains with a long history. Consider using --skip-full-dump to generate only the summary if disk space is limited.
The extract tx-index-db command generates the following files in the output directory:
| File | Description | Typical Size | Notes |
|---|---|---|---|
| summary.json | Overall statistics, key type distribution, height range, and transaction count | ~100-200KB | Always generated, includes height range and tx count |
| tx_hash_dump.json | Transaction results with execution details, events, gas usage | ~100-200GB+ | |
| height_index_dump.json | Height-based transaction indexes | ~50-150MB | Maps heights to transaction hashes |
| event_index_dump.json | Event-based transaction indexes for querying | ~5-10GB | Enables event-based transaction lookup |
Note: The tx_hash_dump.json file can be extremely large (100GB+) depending on transaction volume and chain history. Use --skip-full-dump if you only need summary statistics.
The tool displays a summary on the console including:
- Total number of keys
- Key type distribution with sizes
- Top 10 largest keys
- Key breakdown by type
Example output:
=== State.db Analysis ===
Database path: /Users/user/.aultd/data
Output directory: ./analysis_results
=== SUMMARY ===
Total Keys: 9061
Key Type Distribution:
abciResponsesKey : 3018 keys | Total: 10556929 bytes | Avg: 3498.0 bytes
validatorsKey : 3020 keys | Total: 6166 bytes | Avg: 2.0 bytes
consensusParamsKey : 3019 keys | Total: 147804 bytes | Avg: 49.0 bytes
stateKey : 1 keys | Total: 619 bytes | Avg: 619.0 bytes
...
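The per-type rows in this summary are plain aggregates: a key count, a total byte size, and their ratio. The following is a minimal Go sketch of that arithmetic, not the tool's actual code; `TypeStats` and `FormatLine` are hypothetical names introduced for illustration, and the input figures are copied from the example output above.

```go
package main

import (
	"fmt"
	"sort"
)

// TypeStats is a hypothetical per-key-type accumulator; the tool's real types may differ.
type TypeStats struct {
	Count      int
	TotalBytes int64
}

// Avg returns the mean size in bytes, or 0 when no keys were seen.
func (s TypeStats) Avg() float64 {
	if s.Count == 0 {
		return 0
	}
	return float64(s.TotalBytes) / float64(s.Count)
}

// FormatLine renders one row in the style of the console summary.
func FormatLine(keyType string, s TypeStats) string {
	return fmt.Sprintf("%s : %d keys | Total: %d bytes | Avg: %.1f bytes",
		keyType, s.Count, s.TotalBytes, s.Avg())
}

func main() {
	stats := map[string]TypeStats{
		"abciResponsesKey":   {Count: 3018, TotalBytes: 10556929},
		"validatorsKey":      {Count: 3020, TotalBytes: 6166},
		"consensusParamsKey": {Count: 3019, TotalBytes: 147804},
		"stateKey":           {Count: 1, TotalBytes: 619},
	}

	// Order key types by total size, largest first.
	names := make([]string, 0, len(stats))
	for name := range stats {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		return stats[names[i]].TotalBytes > stats[names[j]].TotalBytes
	})

	for _, name := range names {
		fmt.Println(FormatLine(name, stats[name]))
	}
}
```

Averages are rounded to one decimal place, which is why 6166 bytes over 3020 keys is reported as 2.0.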
The analyzer recognizes the following key types from CometBFT's state store:
| Key Type | Description | Storage Pattern |
|---|---|---|
| stateKey | Current blockchain state | Single entry |
| validatorsKey:{height} | Validator set at specific height | Per height/checkpoint |
| consensusParamsKey:{height} | Consensus parameters at height | Per height when changed |
| abciResponsesKey:{height} | ABCI responses for block | Per block (if not pruned) |
| lastABCIResponseKey | Most recent ABCI response | Single entry |
| offlineStateSyncHeightKey | State sync height | Single entry |
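The key patterns in the table can be matched with simple string handling. The sketch below shows one way to do it in Go; `ClassifyKey` is a hypothetical helper written for this README, not the analyzer's actual function, and it assumes heights are stored as decimal text after the colon. Keys that match no pattern (for example genesisDoc) are labelled unknown, mirroring unknown_dump.json.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ClassifyKey splits a raw state.db key into its type and, for per-height
// keys, the height. hasHeight is false for single-entry and unknown keys.
// Hypothetical helper: the analyzer's real decoding logic may differ.
func ClassifyKey(key []byte) (keyType string, height int64, hasHeight bool) {
	s := string(key)

	// Single-entry keys carry no height suffix.
	switch s {
	case "stateKey", "lastABCIResponseKey", "offlineStateSyncHeightKey":
		return s, 0, false
	}

	// Per-height keys look like "<prefix>:<height>".
	prefix, rest, found := strings.Cut(s, ":")
	if !found {
		return "unknown", 0, false
	}
	switch prefix {
	case "validatorsKey", "consensusParamsKey", "abciResponsesKey":
		h, err := strconv.ParseInt(rest, 10, 64)
		if err != nil {
			return "unknown", 0, false
		}
		return prefix, h, true
	}
	return "unknown", 0, false
}

func main() {
	for _, k := range []string{"stateKey", "validatorsKey:42", "abciResponsesKey:1000", "genesisDoc"} {
		keyType, height, hasHeight := ClassifyKey([]byte(k))
		fmt.Println(k, keyType, height, hasHeight)
	}
}
```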
- CometBFT v0.38.x
- Database backend: GoLevelDB
- Debug state database issues
- Analyze database size and growth patterns
- Inspect historical validator sets
- Review consensus parameter changes
- Extract block execution results
- Identify data pruning candidates
cda/
├── main.go # Entry point
├── cmd/ # CLI commands
│ ├── root.go # Root command
│ ├── extract.go # Extract parent command
│ ├── extract_state_db.go # State.db extraction command
│ └── extract_tx_index.go # Tx_index.db extraction command
└── analyzer/ # Analysis logic
    ├── types.go # Common types and StreamWriter
    ├── state_db.go # State.db analysis
    ├── tx_index.go # Tx_index.db analysis
    └── output.go # Output utilities
To add support for a new database:
- Create analysis logic in analyzer/ with decode functions for the database key types
- Create a new command in cmd/ (e.g., extract_blockstore.go)
- The command will automatically be registered via the init() function
- Update the README with the new command usage
Example extractors already implemented:
- analyzer/state_db.go: State database analysis
- analyzer/tx_index.go: Transaction index database analysis
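To illustrate the init()-based registration mentioned in the steps above, here is a self-contained Go sketch. It does not use the project's CLI framework or its real command types: `Extractor`, `Register`, `Commands`, and the "blockstore" entry are all hypothetical stand-ins built on the standard library. The point is only that Go runs every init() before main, so adding a file with its own init() is enough to make a command known.

```go
package main

import (
	"fmt"
	"sort"
)

// Extractor is a hypothetical signature for an extraction command.
type Extractor func(dbPath, outDir string) error

// registry maps command names to extractors (illustrative only).
var registry = map[string]Extractor{}

// Register adds an extractor under a command name; intended to be called from init().
func Register(name string, fn Extractor) {
	registry[name] = fn
}

// In a real layout this init() would live in its own file,
// for example cmd/extract_blockstore.go.
func init() {
	Register("blockstore", func(dbPath, outDir string) error {
		fmt.Printf("would extract blockstore data from %s into %s\n", dbPath, outDir)
		return nil
	})
}

// Commands returns the registered command names in sorted order.
func Commands() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// init() has already run, so the command is available here.
	for _, name := range Commands() {
		if err := registry[name]("./data", "./out"); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```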
- The tool opens the database in read-only mode
- Make sure the CometBFT node is not running when analyzing the database
- Uses streaming output to efficiently handle databases of any size
- Progress is reported every 10,000 keys during extraction
- Large databases may produce very large output files (100GB+)
- Full extraction of production chains can take significant time (30+ minutes) and disk space
- Consider using --skip-full-dump in these scenarios:
  - Limited disk space (outputs only summary.json, typically <200MB)
  - Quick database health checks
  - Key distribution analysis without full data export
  - Memory-constrained environments
- Summary files alone provide comprehensive statistics including:
  - Total key counts by type
  - Size distribution and largest keys
  - Height ranges and transaction counts (for tx_index.db)
  - Database health metrics
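The streaming and progress-reporting behaviour described in these notes can be sketched as follows. This is an assumption-laden illustration, not the tool's StreamWriter: `Entry`, `StreamArray`, and `dumpToString` are hypothetical names, and the JSON shape is invented for the example. Each entry is encoded and written as soon as it is read, so memory use stays flat regardless of how many keys the database holds.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Entry is a hypothetical decoded record; the real dump format may differ.
type Entry struct {
	Key  string `json:"key"`
	Size int    `json:"size"`
}

// StreamArray writes entries pulled from next as a JSON array, one element
// at a time. next returns false when the iterator is exhausted. A progress
// line is written every `every` keys (10,000 in the tool's description).
func StreamArray(w io.Writer, next func() (Entry, bool), progress io.Writer, every int) (int, error) {
	if _, err := io.WriteString(w, "["); err != nil {
		return 0, err
	}
	n := 0
	for {
		e, ok := next()
		if !ok {
			break
		}
		if n > 0 {
			if _, err := io.WriteString(w, ","); err != nil {
				return n, err
			}
		}
		b, err := json.Marshal(e)
		if err != nil {
			return n, err
		}
		if _, err := w.Write(b); err != nil {
			return n, err
		}
		n++
		if every > 0 && n%every == 0 {
			fmt.Fprintf(progress, "processed %d keys\n", n)
		}
	}
	_, err := io.WriteString(w, "]")
	return n, err
}

// dumpToString streams a slice into a string, discarding progress output.
func dumpToString(entries []Entry) (string, int, error) {
	var sb strings.Builder
	i := 0
	next := func() (Entry, bool) {
		if i >= len(entries) {
			return Entry{}, false
		}
		e := entries[i]
		i++
		return e, true
	}
	n, err := StreamArray(&sb, next, io.Discard, 10000)
	return sb.String(), n, err
}

func main() {
	out, n, err := dumpToString([]Entry{
		{Key: "stateKey", Size: 619},
		{Key: "validatorsKey:1", Size: 2},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n, out)
}
```

A summary-only mode along the lines of --skip-full-dump would keep the same loop and counters but skip the writes to `w`.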
MIT License
Contributions are welcome! Please feel free to submit a Pull Request.