DuckDB OpenTelemetry (OTLP) Extension

Query OpenTelemetry traces, logs, and metrics with SQL. Works with OTLP file exports from any OpenTelemetry Collector and uses a row-based schema inspired by the ClickHouse OpenTelemetry exporter.

-- Install from DuckDB community extensions
INSTALL otlp FROM community;
LOAD otlp;

-- Query slow traces
SELECT
    trace_id,
    span_name,
    duration / 1000000 AS duration_ms
FROM read_otlp_traces('traces.jsonl')
WHERE duration > 1000000000  -- over 1 second
ORDER BY duration DESC
LIMIT 5;

Output:

┌─────────────────────────────────┬──────────────────┬──────────────┐
│            trace_id             │    span_name     │ duration_ms  │
├─────────────────────────────────┼──────────────────┼──────────────┤
│ 7a3f92e8b4c1d6f0a9e2...         │ POST /checkout   │    1523.4    │
│ 8b1e45c9f2a7d3e6b0f1...         │ GET /search      │    1205.7    │
│ 3c2d19f8e4b6a0c7d1f9...         │ PUT /cart/items  │    1089.2    │
└─────────────────────────────────┴──────────────────┴──────────────┘

Want to stream OTLP data directly to DuckDB or Parquet in cloud storage? Check out https://github.com/smithclay/otlp2parquet

What you can do

  • Analyze production telemetry - Query OTLP file exports with familiar SQL syntax
  • Archive to your data lake - Convert OpenTelemetry data to Parquet with schemas intact
  • Debug faster - Filter logs by severity, find slow traces, aggregate metrics
  • Integrate with data tools - Use DuckDB's ecosystem (MotherDuck, Jupyter, DBT, etc.)

Get started in 3 minutes

→ Quick Start Guide - Install, load sample data, run your first query

Try it in your browser

→ Interactive Demo - Query OTLP data directly in your browser using DuckDB-WASM

The browser demo lets you:

  • Load sample OTLP traces, logs, and metrics
  • Run SQL queries without installing anything
  • Upload your own JSONL files for analysis

Note: The WASM demo supports JSON format only. For protobuf support, install the native extension.

Common use cases

Find slow requests

SELECT span_name, AVG(duration) / 1000000 AS avg_ms
FROM read_otlp_traces('prod-traces/*.jsonl')
WHERE span_kind = 2  -- SERVER
GROUP BY span_name
HAVING AVG(duration) > 1000000000  -- average over 1 second
ORDER BY avg_ms DESC;

Export telemetry to Parquet

COPY (
  SELECT * FROM read_otlp_traces('otel-export/*.jsonl')
) TO 'data-lake/daily_traces.parquet' (FORMAT PARQUET);
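
For larger archives, DuckDB's COPY can also write Hive-partitioned Parquet. A minimal sketch, assuming the traces schema exposes a service_name column (as in the ClickHouse-style schema):

COPY (
  SELECT * FROM read_otlp_traces('otel-export/*.jsonl')
) TO 'data-lake/traces' (FORMAT PARQUET, PARTITION_BY (service_name));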

Filter logs by severity

SELECT timestamp, service_name, body
FROM read_otlp_logs('app-logs/*.jsonl')
WHERE severity_text IN ('ERROR', 'FATAL')
ORDER BY timestamp DESC;

Build metrics dashboards

CREATE TABLE metrics_gauge AS
SELECT timestamp, service_name, metric_name, value
FROM read_otlp_metrics_gauge('metrics/*.jsonl');
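
From there, dashboard queries are plain SQL. A sketch using DuckDB's time_bucket to compute per-minute averages, assuming the timestamp column is a TIMESTAMP:

SELECT
    time_bucket(INTERVAL '1 minute', timestamp) AS minute,
    metric_name,
    AVG(value) AS avg_value
FROM metrics_gauge
GROUP BY minute, metric_name
ORDER BY minute;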

→ See more examples in the Cookbook

Limits

Individual files are limited to 100 MB to prevent memory exhaustion. The limit applies to a whole protobuf file and to each individual document within a JSONL file.

What's inside

Table Functions

  • read_otlp_traces(path) - Stream trace spans (25 columns) with identifiers, attributes, events, and links
  • read_otlp_logs(path) - Read log records (15 columns) with severity, body, and trace correlation
  • read_otlp_metrics_gauge(path) - Read gauge metrics (16 columns)
  • read_otlp_metrics_sum(path) - Read sum/counter metrics (18 columns) with aggregation temporality
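
To inspect the exact column names and types any of these functions emits, DESCRIBE works on a query over it, for example:

DESCRIBE SELECT * FROM read_otlp_traces('traces.jsonl');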

→ Full API Reference

Features

  • Automatic format detection - Works with JSON, JSONL, and protobuf OTLP files (protobuf requires native extension)
  • DuckDB file systems - Read from local files, S3, HTTP(S), Azure Blob, GCS (see the example after this list)
  • ClickHouse-inspired schema - Row-based schema modeled on the OpenTelemetry ClickHouse exporter
  • Browser support - Run queries in-browser with DuckDB-WASM (JSON only)
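
For instance, reading straight from S3 goes through DuckDB's standard httpfs extension. A sketch with a placeholder bucket name (credentials are configured via DuckDB's S3 settings or CREATE SECRET):

INSTALL httpfs;
LOAD httpfs;

-- 'example-bucket' is a placeholder; point this at your own export bucket
SELECT count(*) AS span_count
FROM read_otlp_traces('s3://example-bucket/otel/traces/*.jsonl');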

Installation

Option 1: Install from community (recommended)

INSTALL otlp FROM community;
LOAD otlp;

Option 2: Build from source

See CONTRIBUTING.md for build instructions.

Documentation

Documentation Hub →

  • 📚 Guides - Task-based tutorials with real examples
  • 📖 Reference - Schemas, API signatures, error handling
  • ⚙️ Setup - Installation, collector configuration, sample data

How it works

In short: you load files created with the OpenTelemetry Collector's file exporter, then query them with the table functions above.

OpenTelemetry     File         DuckDB OTLP          SQL
Collector      Exporter       Extension          Results
   │              │               │                 │
   │  OTLP/gRPC   │               │                 │
   ├─────────────►│  .jsonl/.pb   │                 │
   │              ├──────────────►│  read_otlp_*()  │
   │              │               ├────────────────►│
   │              │               │                 │

The extension reads OTLP files (JSON or protobuf), detects the format automatically, and streams strongly-typed rows into DuckDB tables. Schemas match the ClickHouse exporter format for compatibility.
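
In practice that means the same call works regardless of which format the collector wrote. A sketch with hypothetical paths:

-- JSONL and protobuf exports are read by the same function;
-- the format is detected per file (protobuf requires the native extension)
CREATE TABLE spans AS
SELECT * FROM read_otlp_traces('otel-export/traces-*.pb');

SELECT count(*) FROM spans;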

→ Learn more in Architecture Guide

Schemas

All table functions emit strongly-typed columns with snake_case naming:

  • Traces: 25 columns - identifiers, timestamps, attributes, events, links
  • Logs: 15 columns - severity, body, trace correlation, attributes
  • Metrics: 16-18 columns depending on metric type (gauge, sum)
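
Because logs carry trace correlation and spans share the same trace_id, the two can be joined directly. A sketch, assuming both exports cover the same deployment (note the join fans out to one row per matching span):

SELECT l.timestamp, l.severity_text, l.body, t.span_name
FROM read_otlp_logs('app-logs/*.jsonl') AS l
JOIN read_otlp_traces('prod-traces/*.jsonl') AS t
  ON l.trace_id = t.trace_id
WHERE l.severity_text = 'ERROR'
ORDER BY l.timestamp DESC;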

→ Schema Reference


License

MIT - See LICENSE for details


Learn more: OpenTelemetry Protocol (OTLP) | ClickHouse Exporter | DuckDB Extensions
