
Add Prometheus metrics tracking for ClickHouse and RPC operations#257

Merged
YaroShkvorets merged 5 commits into main from add-clickhouse-rpc-metrics
Feb 3, 2026

Conversation

@YaroShkvorets
Member

This pull request introduces comprehensive Prometheus monitoring for the TRON ERC-20 token scraper, adds detailed documentation for available metrics, and refines metric tracking for ClickHouse and RPC operations. It also updates some configuration and code structure for improved observability and maintainability.

Monitoring and Metrics Enhancements:

  • Added detailed Prometheus metrics for ClickHouse operations (read/write), RPC requests (method/status), and task completion, including new histogram and counter metrics, exposed on a configurable port.
  • Updated README.md with a new Monitoring section describing all available metrics, their labels, and example Prometheus queries.

Codebase and Configuration Updates:

  • Refactored lib/batch-insert.ts, lib/clickhouse.ts, and lib/rpc.ts to integrate metric tracking for ClickHouse and RPC operations via new helper functions.
  • Improved configuration metrics reporting to handle unset environment variables gracefully.
  • Updated the biome.json schema path to use a local node modules reference for better compatibility.

Contributor

Copilot AI left a comment

Pull request overview

Adds Prometheus-based observability for key scraper operations (ClickHouse reads/writes and RPC requests), and documents the available metrics for monitoring and alerting.

Changes:

  • Introduces new Prometheus histogram metrics for ClickHouse operations and RPC request latency, plus helper tracking functions.
  • Instruments lib/rpc.ts, lib/clickhouse.ts, and lib/batch-insert.ts to record success/error latency metrics.
  • Updates documentation (README) and adjusts Biome schema reference for local resolution.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

Summary per file:
  • lib/rpc.ts: Tracks per-attempt RPC latency with success/error labels.
  • lib/prometheus.ts: Adds histogram metrics and tracking helpers; tweaks config info labeling behavior.
  • lib/clickhouse.ts: Records ClickHouse read operation latency on success/error paths.
  • lib/batch-insert.ts: Records ClickHouse write (batch insert) latency on success/error paths.
  • biome.json: Switches $schema to a local node_modules path.
  • README.md: Adds a Monitoring section describing metrics and sample queries.

Comment on lines 44 to 48
const rpcRequests = new promClient.Histogram({
  name: 'scraper_rpc_requests',
  help: 'Duration of RPC requests in seconds',
  labelNames: ['method', 'status'],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
Copilot AI Feb 3, 2026

This histogram tracks a duration in seconds but the metric name doesn’t include a unit suffix. Prometheus naming conventions recommend suffixing duration metrics with _seconds (e.g., scraper_rpc_requests_seconds) to avoid ambiguity and improve compatibility with standard dashboards/alerts. Since this is newly introduced, it’s a good time to align the name (and update README queries accordingly).

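The rename suggested above could look like the following sketch. The options object mirrors the quoted snippet; the `hasUnitSuffix` guard is a hypothetical helper, not part of the PR:

```typescript
// Sketch: the histogram options with a unit-suffixed name, following the
// Prometheus convention that duration metrics end in `_seconds`.
const rpcRequestsOptions = {
  name: 'scraper_rpc_requests_seconds', // was: 'scraper_rpc_requests'
  help: 'Duration of RPC requests in seconds',
  labelNames: ['method', 'status'] as const,
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
};

// Hypothetical guard that a test could run over all registered metric
// names to catch unsuffixed duration/size metrics early.
function hasUnitSuffix(metricName: string): boolean {
  return /_(seconds|bytes|total)$/.test(metricName);
}

console.log(hasUnitSuffix(rpcRequestsOptions.name)); // true
```

Renaming before the first release avoids a breaking change to any dashboards or alerts built on the old name.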
Comment on lines 159 to 163
/**
 * Track a ClickHouse operation
 * @param operationType - Type of operation ('read' or 'write')
 * @param durationSeconds - Duration of the operation in seconds
 */
Copilot AI Feb 3, 2026

JSDoc for trackClickHouseOperation is out of sync with the actual function signature: it mentions durationSeconds but the function takes status and startTime and computes the duration internally. Please update the doc (and include status) so callers don’t get misled.

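A corrected JSDoc matching the actual signature (status plus startTime, with the duration computed internally) could read as follows. The function body here is a stand-in that returns the computed duration for illustration; the real implementation observes a prom-client histogram and returns void:

```typescript
/**
 * Track a ClickHouse operation by recording its duration.
 * @param operationType - Type of operation ('read' or 'write')
 * @param status - Outcome of the operation ('success' or 'error')
 * @param startTime - Timestamp (ms since epoch) when the operation began;
 *                    the duration in seconds is computed internally
 */
function trackClickHouseOperation(
  operationType: 'read' | 'write',
  status: 'success' | 'error',
  startTime: number,
): number {
  const durationSeconds = (Date.now() - startTime) / 1000;
  // Real implementation (sketch):
  // clickhouseOperations.labels(operationType, status).observe(durationSeconds);
  return durationSeconds;
}
```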
Member Author

@copilot adjust the JSDoc right here in this PR

Comment on lines 174 to 178
 * Track an RPC request
 * @param method - RPC method name (e.g., 'eth_getBlockByNumber')
 * @param status - Request status ('success' or 'error')
 * @param durationSeconds - Duration of the request in seconds
 */
Copilot AI Feb 3, 2026

JSDoc for trackRpcRequest is out of sync with the actual function signature: it mentions durationSeconds but the function takes startTime and computes the duration internally. Please update the doc to match the parameters and behavior.

Comment on lines +164 to +168
export function trackClickHouseOperation(
  operationType: 'read' | 'write',
  status: 'success' | 'error',
  startTime: number,
): void {
Copilot AI Feb 3, 2026

lib/prometheus.test.ts covers the server/counters but doesn’t exercise the newly exported histogram helpers (trackClickHouseOperation / trackRpcRequest). Adding a small test that calls them and asserts the corresponding metric series (with expected labels) appears would help prevent regressions in metric names/labels (monitoring API stability).

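One way to assert a series appears with the expected labels, without depending on prom-client internals, is to check the text exposition format that `/metrics` serves. The helper below is a sketch (`hasSeries` and the sample payload are assumptions, not code from the PR); histogram helpers emit `_bucket`/`_sum`/`_count` series, so matching on the name prefix covers all of them:

```typescript
// Sketch: given Prometheus text-format output, check that a series with
// the expected metric name prefix and label pairs is present.
function hasSeries(
  exposition: string,
  name: string,
  labels: Record<string, string>,
): boolean {
  const labelStr = Object.entries(labels)
    .map(([k, v]) => `${k}="${v}"`)
    .join(',');
  return exposition
    .split('\n')
    .some((line) => line.startsWith(name) && line.includes(labelStr));
}

// Hypothetical scrape output line for illustration.
const sample = 'scraper_rpc_requests_sum{method="eth_call",status="success"} 0.12';
console.log(hasSeries(sample, 'scraper_rpc_requests', { method: 'eth_call', status: 'success' }));
```

A regression test could call `trackRpcRequest(...)`, fetch `/metrics`, and run a check like this to pin down metric names and labels.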
Comment on lines 78 to 82
  .labels(
    CLICKHOUSE_URL,
    CLICKHOUSE_DATABASE || 'not_set',
    NODE_URL || 'not_set',
  )
Copilot AI Feb 3, 2026

scraper_config_info label values are sourced from CLICKHOUSE_URL / CLICKHOUSE_DATABASE / NODE_URL constants imported from lib/config.ts, but cli.ts mutates process.env after module import to apply CLI overrides. This means the gauge can report stale values (often not_set) even when the env vars are set via CLI flags. Consider reading from process.env inside startPrometheusServer() when initializing the gauge (mirroring lib/clickhouse.ts / lib/rpc.ts runtime reads) so the metric reflects the effective runtime configuration.

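The runtime-read approach suggested above could be sketched like this. `currentConfigLabels` is a hypothetical helper; the point is that it reads `process.env` when the gauge is initialized, so overrides applied by cli.ts after module import are picked up:

```typescript
// Sketch: resolve config label values from process.env at server start,
// not from module-level constants captured at import time.
function currentConfigLabels(): { clickhouse_database: string; node_url: string } {
  return {
    clickhouse_database: process.env.CLICKHOUSE_DATABASE || 'not_set',
    node_url: process.env.NODE_URL || 'not_set',
  };
}

// e.g. a CLI flag handler mutates process.env after import:
process.env.CLICKHOUSE_DATABASE = 'tron';
console.log(currentConfigLabels().clickhouse_database); // 'tron'
```

Calling this inside `startPrometheusServer()` (mirroring the runtime reads in lib/clickhouse.ts and lib/rpc.ts) would make the gauge reflect the effective configuration.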
Comment on lines 35 to 39
const clickhouseOperations = new promClient.Histogram({
  name: 'scraper_clickhouse_operations',
  help: 'Duration of ClickHouse operations in seconds',
  labelNames: ['operation_type', 'status'],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
Copilot AI Feb 3, 2026

This histogram tracks a duration in seconds but the metric name doesn’t include a unit suffix. Prometheus naming conventions recommend suffixing duration metrics with _seconds (e.g., scraper_clickhouse_operations_seconds) to avoid ambiguity and improve compatibility with standard dashboards/alerts. Since this is newly introduced, it’s a good time to align the name (and update README queries accordingly).

README.md Outdated
Comment on lines 132 to 134
**Configuration Info**
- `scraper_config_info` (gauge) - Configuration metadata
- Labels: `clickhouse_url`, `clickhouse_database`, `node_url`
Copilot AI Feb 3, 2026

Documenting scraper_config_info with labels clickhouse_url, clickhouse_database, and node_url reflects that full configuration URLs are being exposed as Prometheus metric labels, which can leak database credentials or RPC API keys via the /metrics endpoint. Anyone who can reach the metrics port (by default bound to all interfaces) could scrape these labels and reuse embedded passwords or tokens to access your ClickHouse instance or RPC provider. To avoid leaking secrets, avoid emitting full URLs or credentials as metric labels; instead, expose only non-sensitive, sanitized identifiers (for example, a redacted host or a boolean flag indicating whether a config value is set).

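The sanitized-host approach that the follow-up commit adopts ("Use sanitized node_host instead of boolean node_url_configured") could look like this sketch, using the built-in `URL` class; the function name is an assumption:

```typescript
// Sketch: keep only a non-sensitive host identifier from a configured
// URL before exposing it as a metric label value. URL.host excludes the
// userinfo (credentials) as well as the path and query, where API keys
// often live.
function sanitizedHost(rawUrl: string | undefined): string {
  if (!rawUrl) return 'not_set';
  try {
    return new URL(rawUrl).host;
  } catch {
    return 'invalid_url';
  }
}

console.log(sanitizedHost('https://user:secret@db.example.com:8443/default'));
// 'db.example.com:8443'
```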
Contributor

Copilot AI commented Feb 3, 2026

@YaroShkvorets I've opened a new pull request, #258, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@YaroShkvorets
Member Author

@copilot open a new pull request to apply changes based on the comments in this thread

Contributor

Copilot AI commented Feb 3, 2026

@YaroShkvorets I've opened a new pull request, #259, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Address PR review comments: fix metric naming, JSDoc, security, and add tests

Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>

* Improve JSDoc clarity and simplify fallback pattern

Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>

* Use sanitized node_host instead of boolean node_url_configured

Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>
@YaroShkvorets
Member Author

@copilot tests fail

Contributor

Copilot AI commented Feb 3, 2026

@YaroShkvorets I've opened a new pull request, #260, to work on those changes. Once the pull request is ready, I'll request review from you.

@YaroShkvorets
Member Author

@copilot your PR was garbage so I closed it.
Let's try again.
You added some tests that are failing with these errors:

❯ bun test lib/prometheus.test.ts
bun test v1.3.8 (b64edcb4)

lib/prometheus.test.ts:
2026-02-03 21:18:38.820 INFO    lib/prometheus.ts:127   token-api-scraper:prometheus    Prometheus server started {
  port: 19001,
  url: 'http://0.0.0.0:19001/metrics' 
}
2026-02-03 21:18:38.928 INFO    lib/prometheus.ts:159   token-api-scraper:prometheus    Prometheus server stopped
✓ Prometheus Server > should start and stop server [213.86ms]
2026-02-03 21:18:39.032 INFO    lib/prometheus.ts:127   token-api-scraper:prometheus    Prometheus server started {
  port: 19002,
  url: 'http://0.0.0.0:19002/metrics' 
}
2026-02-03 21:18:39.135 INFO    lib/prometheus.ts:159   token-api-scraper:prometheus    Prometheus server stopped
✓ Prometheus Server > should expose key metrics [106.67ms]
2026-02-03 21:18:39.139 INFO    lib/prometheus.ts:127   token-api-scraper:prometheus    Prometheus server started {
  port: 19003,
  url: 'http://0.0.0.0:19003/metrics' 
}
2026-02-03 21:18:39.243 INFO    lib/prometheus.ts:159   token-api-scraper:prometheus    Prometheus server stopped
✓ Prometheus Server > should update metrics correctly [107.84ms]
2026-02-03 21:18:39.247 INFO    lib/prometheus.ts:127   token-api-scraper:prometheus    Prometheus server started {
  port: 19004,
  url: 'http://0.0.0.0:19004/metrics' 
}
2026-02-03 21:18:39.248 WARN    lib/prometheus.ts:110   token-api-scraper:prometheus    Prometheus server already running {
  port: 19004 
}
2026-02-03 21:18:39.250 INFO    lib/prometheus.ts:159   token-api-scraper:prometheus    Prometheus server stopped
✓ Prometheus Server > should handle starting server on already used port [7.42ms]
2026-02-03 21:18:39.255 ERROR   lib/prometheus.ts:135   token-api-scraper:prometheus    Prometheus server error {
  error: 'Failed to start server. Is port 19005 in use?' 
}
✓ Prometheus Server > should reject when port is already used by external process [5.06ms]
2026-02-03 21:18:39.256 WARN    lib/prometheus.ts:110   token-api-scraper:prometheus    Prometheus server already running {
  port: 19006 
}
error: Unable to connect. Is the computer able to access the url?
  path: "http://localhost:19006/metrics",
 errno: 0,
  code: "ConnectionRefused"

✗ Prometheus Histogram Helpers > should track ClickHouse operations with correct labels [103.59ms]
2026-02-03 21:18:39.359 WARN    lib/prometheus.ts:110   token-api-scraper:prometheus    Prometheus server already running {
  port: 19007 
}
error: Unable to connect. Is the computer able to access the url?
  path: "http://localhost:19007/metrics",
 errno: 0,
  code: "ConnectionRefused"

✗ Prometheus Histogram Helpers > should track RPC requests with correct labels [101.42ms]

 5 pass
 2 fail
 13 expect() calls
Ran 7 tests across 1 file. [705.00ms]

2 tests that you added are failing because you are probably setting up the environment wrong. It has nothing to do with running concurrently - all tests are running prometheus server on different ports. Fix tests without changing the tested functions and make sure bun test lib/prometheus.test.ts completes successfully
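The log above hints at the root cause: the "Prometheus server already running" warning fires on a port no test has used, suggesting a module-level server handle was left set after a failed start, so the next `start` becomes a no-op and the subsequent fetch gets connection-refused. The follow-up commit message ("Fix Prometheus test failures by cleaning up server state on error") matches that reading. A minimal model of the state-cleanup fix, with the real server and bind logic replaced by stand-ins:

```typescript
// Sketch: a module-level handle guards against double-starts. Without
// the cleanup in the catch block, a failed bind leaves the handle set
// and every later start() is silently skipped.
let server: { port: number } | null = null;

function startServer(port: number, bind: (port: number) => void): boolean {
  if (server) return false; // logs "already running" and skips binding
  server = { port };
  try {
    bind(port);
    return true;
  } catch {
    server = null; // the fix: clear state so later starts can proceed
    return false;
  }
}

function stopServer(): void {
  server = null;
}

const failingBind = (_port: number): void => {
  throw new Error('port in use');
};
startServer(19005, failingBind); // simulated bind failure, state cleaned up
console.log(startServer(19006, () => {})); // true: a fresh start still works
```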

Contributor

Copilot AI commented Feb 3, 2026

@YaroShkvorets I've opened a new pull request, #261, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Fix Prometheus test failures by cleaning up server state on error

Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>

* use bun sleep

* lint

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>
Co-authored-by: YaroShkvorets <shkvorets@gmail.com>
@YaroShkvorets YaroShkvorets merged commit 6f25d51 into main Feb 3, 2026
1 check passed
