Skip to content

Commit 6f25d51

Browse files
YaroShkvoretsCopilotCopilot
authored
Add Prometheus metrics tracking for ClickHouse and RPC operations (#257)
* Add Prometheus metrics tracking for ClickHouse and RPC operations * adjust buckets * Update README.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix Prometheus metric naming, security, and documentation issues (#259) * Initial plan * Address PR review comments: fix metric naming, JSDoc, security, and add tests Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> * Improve JSDoc clarity and simplify fallback pattern Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> * Use sanitized node_host instead of boolean node_url_configured Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> * Fix Prometheus server state leak on startup failure (#261) * Initial plan * Fix Prometheus test failures by cleaning up server state on error Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> * use bun sleep * lint --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com> Co-authored-by: YaroShkvorets <shkvorets@gmail.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: YaroShkvorets <29608734+YaroShkvorets@users.noreply.github.com>
1 parent e0a8106 commit 6f25d51

File tree

7 files changed

+210
-13
lines changed

7 files changed

+210
-13
lines changed

README.md

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ A specialized tool for scraping and indexing ERC-20 token data on the TRON block
77
- **ERC-20 Focus**: Designed specifically for TRON ERC-20 tokens
88
- **Continuous Query Mechanism**: Tracks block numbers to enable incremental balance updates
99
- **Efficient Processing**: Only queries new or updated transfers, avoiding redundant RPC calls
10-
- **Prometheus Metrics**: Real-time monitoring with Prometheus metrics support
10+
- **Prometheus Metrics**: Real-time monitoring with comprehensive metrics (see [Monitoring](#monitoring))
1111
- **Concurrent Processing**: Configurable concurrency for optimal RPC throughput
1212
- **RPC Batch Requests**: Optional batching of multiple RPC calls for improved performance
1313

@@ -107,6 +107,44 @@ docker run --env-file .env -p 9090:9090 token-api-scraper \
107107

108108
See [Docker Guide](docs/DOCKER.md) for Docker Compose examples and production deployment.
109109

110+
## Monitoring
111+
112+
The scraper exposes Prometheus metrics on port `9090` (configurable via `PROMETHEUS_PORT`).
113+
114+
### Available Metrics
115+
116+
**ClickHouse Operations**
117+
- `scraper_clickhouse_operations_seconds` (histogram) - Duration of ClickHouse operations in seconds
118+
- Labels: `operation_type` (`read`, `write`), `status` (`success`, `error`)
119+
- Automatically tracked in `lib/clickhouse.ts` and `lib/batch-insert.ts`
120+
121+
**RPC Requests**
122+
- `scraper_rpc_requests_seconds` (histogram) - Duration of RPC requests in seconds
123+
- Labels: `method` (e.g., `eth_call`, `eth_getBalance`), `status` (`success`, `error`)
124+
- Automatically tracked in `lib/rpc.ts`
125+
126+
**Task Completion**
127+
- `scraper_completed_tasks_total` (counter) - Total completed tasks
128+
- Labels: `service`, `status` (`success`, `error`)
129+
- `scraper_error_tasks_total` (counter) - Total failed tasks
130+
- Labels: `service`
131+
132+
**Configuration Info**
133+
- `scraper_config_info` (gauge) - Configuration metadata
134+
- Labels: `clickhouse_host` (sanitized hostname only), `clickhouse_database`, `node_host` (sanitized hostname only)
135+
136+
### Accessing Metrics
137+
138+
```bash
139+
# View metrics endpoint
140+
curl http://localhost:9090/metrics
141+
142+
# Example Prometheus queries
143+
scraper_clickhouse_operations_seconds_bucket{operation_type="write"}
144+
rate(scraper_rpc_requests_seconds_count{status="success"}[5m])
145+
histogram_quantile(0.95, sum(rate(scraper_clickhouse_operations_seconds_bucket{operation_type="read",status="success"}[5m])) by (le))
146+
```
147+
110148
## Testing
111149

112150
```bash

biome.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
{
2-
"$schema": "https://biomejs.dev/schemas/2.3.11/schema.json",
2+
"$schema": "./node_modules/@biomejs/biome/configuration_schema.json",
33
"vcs": {
44
"enabled": true,
55
"clientKind": "git",

lib/batch-insert.ts

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
import { insertClient } from './clickhouse';
22
import { NODE_URL } from './config';
33
import { createLogger } from './logger';
4+
import { trackClickHouseOperation } from './prometheus';
45

56
const log = createLogger('batch-insert');
67

@@ -60,11 +61,14 @@ export class BatchInsertQueue {
6061

6162
// Get all items and clear the queue
6263
const items = queue.splice(0);
64+
const startTime = performance.now();
6365

6466
try {
6567
await this.insertImmediate(table, items);
68+
trackClickHouseOperation('write', 'success', startTime);
6669
} catch (error) {
6770
this.handleInsertError(error, table, items.length);
71+
trackClickHouseOperation('write', 'error', startTime);
6872
}
6973
}
7074

lib/clickhouse.ts

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
import { type ClickHouseClient, createClient } from '@clickhouse/client';
22
import { createLogger } from './logger';
3+
import { trackClickHouseOperation } from './prometheus';
34

45
const log = createLogger('clickhouse');
56

@@ -156,6 +157,7 @@ export async function query<T = any>(
156157
query_params,
157158
format: 'JSONEachRow',
158159
});
160+
trackClickHouseOperation('read', 'success', startTime);
159161
const queryEndTime = performance.now();
160162

161163
// Track data parsing time
@@ -189,6 +191,7 @@ export async function query<T = any>(
189191
},
190192
};
191193
} catch (error: unknown) {
194+
trackClickHouseOperation('read', 'error', startTime);
192195
const url = process.env.CLICKHOUSE_URL || 'http://localhost:8123';
193196
const urlObj = new URL(url);
194197
const host = urlObj.hostname;

lib/prometheus.test.ts

Lines changed: 81 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,12 @@
11
import { describe, expect, test } from 'bun:test';
2+
import { sleep } from 'bun';
23
import {
34
incrementError,
45
incrementSuccess,
56
startPrometheusServer,
67
stopPrometheusServer,
8+
trackClickHouseOperation,
9+
trackRpcRequest,
710
} from './prometheus';
811

912
describe('Prometheus Server', () => {
@@ -12,8 +15,7 @@ describe('Prometheus Server', () => {
1215

1316
await startPrometheusServer(port);
1417

15-
// Wait for server to start
16-
await new Promise((resolve) => setTimeout(resolve, 100));
18+
await sleep(100);
1719

1820
// Verify the Prometheus metrics endpoint is accessible
1921
const response = await fetch(`http://localhost:${port}/metrics`);
@@ -26,7 +28,7 @@ describe('Prometheus Server', () => {
2628
await stopPrometheusServer();
2729

2830
// Wait for server to close
29-
await new Promise((resolve) => setTimeout(resolve, 100));
31+
await sleep(100);
3032

3133
// Verify server is closed
3234
try {
@@ -50,8 +52,7 @@ describe('Prometheus Server', () => {
5052

5153
await startPrometheusServer(port);
5254

53-
// Wait for server to start
54-
await new Promise((resolve) => setTimeout(resolve, 100));
55+
await sleep(100);
5556

5657
// Fetch metrics
5758
const response = await fetch(`http://localhost:${port}/metrics`);
@@ -65,9 +66,9 @@ describe('Prometheus Server', () => {
6566
expect(metricsText).toContain('scraper_config_info');
6667

6768
// Verify config info has labels
68-
expect(metricsText).toContain('clickhouse_url');
69+
expect(metricsText).toContain('clickhouse_host');
6970
expect(metricsText).toContain('clickhouse_database');
70-
expect(metricsText).toContain('node_url');
71+
expect(metricsText).toContain('node_host');
7172

7273
await stopPrometheusServer();
7374
});
@@ -78,8 +79,7 @@ describe('Prometheus Server', () => {
7879

7980
await startPrometheusServer(port);
8081

81-
// Wait for server to start
82-
await new Promise((resolve) => setTimeout(resolve, 100));
82+
await sleep(100);
8383

8484
// Set metrics
8585
incrementSuccess(serviceName);
@@ -93,6 +93,9 @@ describe('Prometheus Server', () => {
9393
expect(metricsText).toContain(serviceName);
9494

9595
await stopPrometheusServer();
96+
97+
// Wait for server to fully close
98+
await sleep(100);
9699
});
97100

98101
test('should handle starting server on already used port', async () => {
@@ -108,6 +111,9 @@ describe('Prometheus Server', () => {
108111
expect(response.ok).toBe(true);
109112

110113
await stopPrometheusServer();
114+
115+
// Wait for server to fully close
116+
await sleep(100);
111117
});
112118

113119
test('should reject when port is already used by external process', async () => {
@@ -130,3 +136,69 @@ describe('Prometheus Server', () => {
130136
}
131137
});
132138
});
139+
140+
describe('Prometheus Histogram Helpers', () => {
141+
test('should track ClickHouse operations with correct labels', async () => {
142+
const port = 19006;
143+
144+
await startPrometheusServer(port);
145+
146+
await sleep(100);
147+
148+
// Track some ClickHouse operations
149+
const startTime = performance.now();
150+
trackClickHouseOperation('read', 'success', startTime);
151+
trackClickHouseOperation('write', 'success', startTime);
152+
trackClickHouseOperation('read', 'error', startTime);
153+
154+
// Fetch metrics
155+
const response = await fetch(`http://localhost:${port}/metrics`);
156+
const metricsText = await response.text();
157+
158+
// Verify histogram metric is present with correct name
159+
expect(metricsText).toContain('scraper_clickhouse_operations_seconds');
160+
161+
// Verify labels are present
162+
expect(metricsText).toContain('operation_type="read"');
163+
expect(metricsText).toContain('operation_type="write"');
164+
expect(metricsText).toContain('status="success"');
165+
expect(metricsText).toContain('status="error"');
166+
167+
await stopPrometheusServer();
168+
169+
// Wait for server to fully close
170+
await sleep(100);
171+
});
172+
173+
test('should track RPC requests with correct labels', async () => {
174+
const port = 19007;
175+
176+
await startPrometheusServer(port);
177+
178+
await sleep(100);
179+
180+
// Track some RPC requests
181+
const startTime = performance.now();
182+
trackRpcRequest('eth_call', 'success', startTime);
183+
trackRpcRequest('eth_getBalance', 'success', startTime);
184+
trackRpcRequest('eth_call', 'error', startTime);
185+
186+
// Fetch metrics
187+
const response = await fetch(`http://localhost:${port}/metrics`);
188+
const metricsText = await response.text();
189+
190+
// Verify histogram metric is present with correct name
191+
expect(metricsText).toContain('scraper_rpc_requests_seconds');
192+
193+
// Verify labels are present
194+
expect(metricsText).toContain('method="eth_call"');
195+
expect(metricsText).toContain('method="eth_getBalance"');
196+
expect(metricsText).toContain('status="success"');
197+
expect(metricsText).toContain('status="error"');
198+
199+
await stopPrometheusServer();
200+
201+
// Wait for server to fully close
202+
await sleep(100);
203+
});
204+
});

lib/prometheus.ts

Lines changed: 78 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -31,18 +31,53 @@ const errorTasksCounter = new promClient.Counter({
3131
registers: [register],
3232
});
3333

34+
// ClickHouse operation metrics
35+
const clickhouseOperations = new promClient.Histogram({
36+
name: 'scraper_clickhouse_operations_seconds',
37+
help: 'Duration of ClickHouse operations in seconds',
38+
labelNames: ['operation_type', 'status'],
39+
buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
40+
registers: [register],
41+
});
42+
43+
// RPC request metrics
44+
const rpcRequests = new promClient.Histogram({
45+
name: 'scraper_rpc_requests_seconds',
46+
help: 'Duration of RPC requests in seconds',
47+
labelNames: ['method', 'status'],
48+
buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 10],
49+
registers: [register],
50+
});
51+
3452
// Configuration info metrics
3553
const configInfoGauge = new promClient.Gauge({
3654
name: 'scraper_config_info',
3755
help: 'Configuration information for the scraper',
38-
labelNames: ['clickhouse_url', 'clickhouse_database', 'node_url'],
56+
labelNames: ['clickhouse_host', 'clickhouse_database', 'node_host'],
3957
registers: [register],
4058
});
4159

4260
// Track whether config metrics have been initialized
4361
let configMetricsInitialized = false;
4462
let prometheusServer: http.Server | undefined;
4563

64+
/**
65+
* Sanitize a URL to extract only the hostname, removing credentials, port, and path
66+
* @param url - The URL to sanitize
67+
* @returns The sanitized hostname or 'not_set' if invalid
68+
*/
69+
function sanitizeUrl(url: string | undefined): string {
70+
if (!url || url === 'not_set') {
71+
return 'not_set';
72+
}
73+
try {
74+
const parsed = new URL(url);
75+
return parsed.hostname;
76+
} catch {
77+
return 'redacted';
78+
}
79+
}
80+
4681
/**
4782
* Initialize Prometheus server
4883
* @param port - Port to listen on
@@ -56,8 +91,18 @@ export function startPrometheusServer(
5691
return new Promise((resolve, reject) => {
5792
// Set configuration info metrics once (only on first initialization)
5893
if (!configMetricsInitialized) {
94+
// Read from process.env at runtime to get CLI overrides
95+
const clickhouseUrl = process.env.CLICKHOUSE_URL || CLICKHOUSE_URL;
96+
const clickhouseDatabase =
97+
process.env.CLICKHOUSE_DATABASE || CLICKHOUSE_DATABASE;
98+
const nodeUrl = process.env.NODE_URL || NODE_URL;
99+
59100
configInfoGauge
60-
.labels(CLICKHOUSE_URL, CLICKHOUSE_DATABASE, NODE_URL)
101+
.labels(
102+
sanitizeUrl(clickhouseUrl),
103+
clickhouseDatabase || 'not_set',
104+
sanitizeUrl(nodeUrl),
105+
)
61106
.set(1);
62107
configMetricsInitialized = true;
63108
}
@@ -89,6 +134,7 @@ export function startPrometheusServer(
89134

90135
prometheusServer.on('error', (err) => {
91136
log.error('Prometheus server error', { error: err.message });
137+
prometheusServer = undefined;
92138
reject(err);
93139
});
94140
});
@@ -133,3 +179,33 @@ export function incrementError(serviceName: string): void {
133179
completedTasksCounter.labels(serviceName, 'error').inc();
134180
errorTasksCounter.labels(serviceName).inc();
135181
}
182+
183+
/**
184+
* Track a ClickHouse operation
185+
* @param operationType - Type of operation ('read' or 'write')
186+
* @param status - Operation status ('success' or 'error')
187+
* @param startTime - Start time in milliseconds (from performance.now())
188+
*/
189+
export function trackClickHouseOperation(
190+
operationType: 'read' | 'write',
191+
status: 'success' | 'error',
192+
startTime: number,
193+
): void {
194+
const durationSeconds = (performance.now() - startTime) / 1000;
195+
clickhouseOperations.labels(operationType, status).observe(durationSeconds);
196+
}
197+
198+
/**
199+
* Track an RPC request
200+
* @param method - RPC method name (e.g., 'eth_getBlockByNumber')
201+
* @param status - Request status ('success' or 'error')
202+
* @param startTime - Start time in milliseconds (from performance.now())
203+
*/
204+
export function trackRpcRequest(
205+
method: string,
206+
status: 'success' | 'error',
207+
startTime: number,
208+
): void {
209+
const durationSeconds = (performance.now() - startTime) / 1000;
210+
rpcRequests.labels(method, status).observe(durationSeconds);
211+
}

0 commit comments

Comments
 (0)