Commit 7504a0b

feat(rivetkit): traces

1 parent 9524546

File tree

31 files changed: +4958 −330 lines changed

TRACES_DESIGN_NOTES.md

Lines changed: 127 additions & 0 deletions
1. Use Span[] (make sure to link to the docs on this in code).
2. Use OTLP v1 (link to the docs on this in code).
3. Use raw binary instead of hex/base64 for maximum compactness.
4. We don't need to store resources, since this is written to local disk for a single resource (e.g. an actor). We'll attach resource data when we export to OTLP systems later. Make sure this is documented.
5. Use KeyValue. Have the key be an index into the strings lookup map. Have the value be encoded using cbor-x.
6. I think we need something more complicated: at the beginning of each chunk, store data for all active spans and which bucket/chunk each one started in; then we can look up that bucket/chunk manually. Do you have any recommendations on how we could improve this? How does this affect our read/write system?
7. Yes.
8. Yes, an explicit clamped property.
9. We have a heavy write load and these spans can last months. Is this still what you would recommend? Give me a few recommendations.

Did you get what I said about storing a lookup map for all strings?
# Traces design notes / questions
# Traces design notes / questions

Primary references (OTLP/JSON schema and structure):

https://opentelemetry.io/docs/specs/otlp/
https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/
https://github.com/open-telemetry/opentelemetry-proto
https://protodoc.io/open-telemetry/opentelemetry-proto/opentelemetry.proto.collector.trace.v1
https://protodoc.io/Helicone/helicone/opentelemetry.proto.trace.v1
https://opentelemetry.io/docs/specs/otel/common/
https://opentelemetry.io/docs/concepts/resources/

---
1) OTLP/JSON “flavors”: what they are

- OTLP/JSON ExportTraceServiceRequest is the canonical OTLP trace payload. It’s the protobuf ExportTraceServiceRequest encoded as JSON (proto3 JSON mapping + OTLP-specific rules). The structure is resourceSpans → scopeSpans → spans.
- A “Span[] only” subset would be a custom format (not standard OTLP), which means any off‑the‑shelf collector won’t accept it. OTLP JSON examples show spans always nested under resource and scope.

Recommendation: use the standard OTLP/JSON envelope for interoperability, even if we store compact internal records and reconstruct on read.
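For concreteness, a TypeScript sketch of that envelope. Field names follow the proto3 JSON mapping of ExportTraceServiceRequest; this is a trimmed subset of the real schema (no events, links, status, etc.), written here only to show the nesting:

```typescript
// Trimmed sketch of the OTLP/JSON trace envelope: resourceSpans → scopeSpans → spans.
interface ExportTraceServiceRequest {
  resourceSpans: ResourceSpans[];
}

interface ResourceSpans {
  resource?: { attributes?: KeyValue[] };
  scopeSpans: ScopeSpans[];
}

interface ScopeSpans {
  scope?: { name?: string; version?: string };
  spans: Span[];
}

interface KeyValue {
  key: string;
  value: AnyValue;
}

// Tagged union: exactly one field is set per value.
interface AnyValue {
  stringValue?: string;
  intValue?: string;   // 64-bit ints are JSON strings per the proto3 mapping
  boolValue?: boolean;
  doubleValue?: number;
  arrayValue?: { values: AnyValue[] };
  kvlistValue?: { values: KeyValue[] };
  bytesValue?: string; // base64 per the proto3 mapping (IDs are the hex exception)
}

interface Span {
  traceId: string;           // hex, per OTLP/JSON rules
  spanId: string;            // hex
  parentSpanId?: string;
  name: string;
  kind?: number;             // OTLP/JSON encodes enums as integers
  startTimeUnixNano: string; // stringified uint64
  endTimeUnixNano: string;
  attributes?: KeyValue[];
}
```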
---

2) OTLP versions and available fields

- OTLP trace payloads are defined by the proto schemas in the opentelemetry-proto repo (trace + collector/trace). That’s the authoritative field list.
- High‑level structure: ExportTraceServiceRequest.resourceSpans[] → each has resource + scopeSpans[] → each has scope + spans[].
- Span fields include IDs, timestamps, name/kind, attributes, events, links, status, dropped counts, flags, etc. (see trace.proto via protodoc link).

Recommendation: target “current OTLP v1” (stable) and treat the proto as source‑of‑truth. The OTLP spec is stable for trace signals.
---

3) ID encoding: hex vs base64

- OTLP/JSON explicitly requires hex strings for traceId/spanId (not base64).

Pros:
- Spec‑compliant; matches OTel APIs (hex is the canonical external form).

Cons:
- Larger than binary (hex is 2× size).

Recommendation: use hex strings in JSON output; store internally as bytes for compactness.
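The binary-internal / hex-external split is two one-liners with Node's Buffer (a sketch; function names are ours):

```typescript
// Store 16-byte trace IDs and 8-byte span IDs as raw bytes internally;
// convert to the hex strings OTLP/JSON requires only at export time.
function idToHex(id: Uint8Array): string {
  return Buffer.from(id).toString("hex");
}

function hexToId(hex: string): Uint8Array {
  return new Uint8Array(Buffer.from(hex, "hex"));
}
```

Hex doubles the byte count, so keeping binary on disk and converting at the edge is what keeps chunks compact.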
---

4) Resource vs scope (instrumentation scope)

- Resource describes the entity producing telemetry (service, host, deployment, etc.).
- Instrumentation scope describes the library that produced the spans (name/version/attributes).
- In OTLP, spans are grouped by resource, then by scope.

Practical difference: resource = “who/where,” scope = “which instrumentation library,” and both are preserved in OTLP JSON.
---

5) “JSON-ish types” vs OTLP AnyValue

OTLP attributes are a list of KeyValue, where the value is AnyValue (a tagged union: string/int/bool/double/bytes/array/map).

So you have two internal options:
- JSON-ish: store arbitrary JSON objects/arrays directly and convert to AnyValue at read time.
- OTLP-style: store AnyValue/KeyValue structures internally and serialize directly.

Given the preference for a compact internal schema + string table, a good fit is:
- Internal: compact AnyValue-like union + string table
- External: OTLP/JSON reconstructed from that
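One possible shape for that compact internal form (all names here are hypothetical, not an existing API): strings are interned into a per-chunk table and referenced by index, and the union maps 1:1 onto OTLP AnyValue at read time.

```typescript
// Compact attribute value: a small tagged union where all strings (including
// map keys) are indices into a per-chunk string table.
type CompactValue =
  | { t: "str"; v: number }                    // index into the string table
  | { t: "int"; v: bigint }
  | { t: "bool"; v: boolean }
  | { t: "dbl"; v: number }
  | { t: "arr"; v: CompactValue[] }
  | { t: "map"; v: [number, CompactValue][] }; // [key index, value]

class StringTable {
  private byString = new Map<string, number>();
  readonly strings: string[] = [];

  // Return the existing index for s, or append it and return the new index.
  intern(s: string): number {
    const existing = this.byString.get(s);
    if (existing !== undefined) return existing;
    const idx = this.strings.length;
    this.strings.push(s);
    this.byString.set(s, idx);
    return idx;
  }
}

// Reconstruct an OTLP/JSON-shaped AnyValue from the compact form on read.
function toAnyValue(v: CompactValue, table: StringTable): unknown {
  switch (v.t) {
    case "str": return { stringValue: table.strings[v.v] };
    case "int": return { intValue: v.v.toString() }; // 64-bit ints as strings
    case "bool": return { boolValue: v.v };
    case "dbl": return { doubleValue: v.v };
    case "arr": return { arrayValue: { values: v.v.map((x) => toAnyValue(x, table)) } };
    case "map": return {
      kvlistValue: {
        values: v.v.map(([k, x]) => ({ key: table.strings[k], value: toAnyValue(x, table) })),
      },
    };
  }
}
```

The string table itself is what gets serialized alongside the records (e.g. via cbor-x), so repeated attribute keys cost one index per occurrence instead of one string.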
---

6) Span lifetime vs time buckets

OTel doesn’t define buckets; spans can start and end at any time. Typically, spans are exported after they end. So you can safely:
- keep open spans in memory,
- write ended spans to disk,
- read hybrid (memory + disk) for queries.

If you need to persist long‑running spans without mutating disk entries, use append‑only records:
- SpanStart, SpanEvent, SpanEnd records (or “SpanDelta”)
- reconstruct on read into an OTLP Span

This avoids rewriting chunks.
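A minimal sketch of that delta model (record names are ours, not OTel's): a long-lived span becomes one SpanStart record, zero or more SpanEvent records, and a SpanEnd record, folded back into one span at read time.

```typescript
// Append-only records: each is written once and never mutated on disk.
type SpanRecord =
  | { kind: "start"; spanId: string; name: string; startTimeUnixNano: string }
  | { kind: "event"; spanId: string; name: string; timeUnixNano: string }
  | { kind: "end"; spanId: string; endTimeUnixNano: string };

interface ReconstructedSpan {
  spanId: string;
  name: string;
  startTimeUnixNano: string;
  endTimeUnixNano?: string; // absent => span still open
  events: { name: string; timeUnixNano: string }[];
}

// Fold a stream of records back into spans. A real reader would also chase
// starts that live in earlier chunks (the active-span bookkeeping above).
function reconstruct(records: SpanRecord[]): Map<string, ReconstructedSpan> {
  const spans = new Map<string, ReconstructedSpan>();
  for (const r of records) {
    if (r.kind === "start") {
      spans.set(r.spanId, {
        spanId: r.spanId,
        name: r.name,
        startTimeUnixNano: r.startTimeUnixNano,
        events: [],
      });
    } else {
      const s = spans.get(r.spanId);
      if (!s) continue; // start record is in an earlier chunk
      if (r.kind === "event") s.events.push({ name: r.name, timeUnixNano: r.timeUnixNano });
      else s.endTimeUnixNano = r.endTimeUnixNano;
    }
  }
  return spans;
}
```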
Question: do you want the append‑only delta model, or is “open spans stay only in memory until end” acceptable (with the risk of losing open spans on crash)?
---

7) emitEvent without a span?

In OTLP trace data, events are part of a span (Span.Event). There is no standalone trace event in OTLP JSON.

Options:
- require an active span, or
- create an implicit span (e.g., span name = event name), or
- treat it as a log signal (not part of trace data).
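The implicit-span option could look like this sketch (emitEvent, MiniSpan, and the zero-duration choice are all assumptions, not an existing API):

```typescript
interface SpanEvent { name: string; timeUnixNano: string }
interface MiniSpan {
  name: string;
  startTimeUnixNano: string;
  endTimeUnixNano: string;
  events: SpanEvent[];
}

// Attach the event to the active span if there is one; otherwise synthesize
// a zero-duration span named after the event so it survives in trace data.
function emitEvent(name: string, currentSpan: MiniSpan | undefined, out: MiniSpan[]): void {
  const now = (BigInt(Date.now()) * 1000000n).toString(); // ms -> ns
  const event = { name, timeUnixNano: now };
  if (currentSpan) {
    currentSpan.events.push(event);
  } else {
    out.push({ name, startTimeUnixNano: now, endTimeUnixNano: now, events: [event] });
  }
}
```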
Question: should emitEvent error without an active span, or should it auto‑create a short-lived span?
---

8) Read semantics (default picks)

- Filter by: span startTimeUnixNano
- Range: [start, end) (inclusive start, exclusive end) to avoid double‑counting when paginating
- Sort: by startTimeUnixNano, tie-break by traceId/spanId
- Limit: clamp to MAX_LIMIT to avoid runaway allocations; return the actual count
- Mid‑chunk: it’s fine to stop after reaching the limit mid‑chunk (we can stream-decode and stop early)
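Those picks fit in one small function; this is an in-memory sketch of the semantics, not the streaming chunk reader (MAX_LIMIT and the names are illustrative):

```typescript
const MAX_LIMIT = 1000; // illustrative cap

interface SpanRow { traceId: string; spanId: string; startTimeUnixNano: bigint }

// Half-open [start, end) filter on startTimeUnixNano, deterministic sort with
// traceId/spanId tie-break, and a clamped limit reported back to the caller.
function query(
  rows: SpanRow[],
  start: bigint,
  end: bigint,
  limit: number,
): { spans: SpanRow[]; clamped: boolean } {
  const clamped = limit > MAX_LIMIT;
  const effective = clamped ? MAX_LIMIT : limit;
  const spans = rows
    .filter((r) => r.startTimeUnixNano >= start && r.startTimeUnixNano < end)
    .sort((a, b) =>
      a.startTimeUnixNano < b.startTimeUnixNano ? -1 :
      a.startTimeUnixNano > b.startTimeUnixNano ? 1 :
      (a.traceId + a.spanId).localeCompare(b.traceId + b.spanId))
    .slice(0, effective);
  return { spans, clamped };
}
```

The half-open range means a paginating caller can pass the previous page's end as the next page's start without seeing a span twice.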
Question: do you want an explicit “limit was clamped” signal in the API, or just silently clamp?
---

9) Defaults (proposal)

- Bucket size: 1h
- Chunk target: 512 KiB (keeps well under 1 MiB)
- Flush age: 5–10s (pick 5s for low‑traffic latency)
- Size estimation: encode records into a per‑chunk buffer as they are added; size = buffer length (accurate and simple)
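As constants, with the flush check the size-estimation bullet implies (values from the proposal above; names are illustrative):

```typescript
const BUCKET_SIZE_MS = 60 * 60 * 1000; // 1h buckets
const CHUNK_TARGET_BYTES = 512 * 1024; // 512 KiB, well under 1 MiB
const FLUSH_AGE_MS = 5_000;            // 5s, the low-traffic-latency pick

// Because records are encoded into the chunk buffer as they arrive, "size"
// is just the buffer length; no estimation pass is needed.
function shouldFlush(bufferLength: number, chunkAgeMs: number): boolean {
  return bufferLength >= CHUNK_TARGET_BYTES || chunkAgeMs >= FLUSH_AGE_MS;
}
```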
Pending inputs to finalize:
1) append‑only deltas vs only write on end
2) emitEvent behavior without a span
3) limit clamp behavior
