1. use Span[] (make sure to link to docs on this in code)
2. use OTLP v1 (link to docs on this in code)
3. use raw binary instead of hex/base64 for max compactness
4. we don't need to store resources, since this is written to local disk for a single resource (e.g. an actor). we'll attach resource data when we export to OTLP systems later. make sure this is documented
5. use KeyValue. have the key be in the strings lookup map. have the value be encoded using cbor-x
6. i think we need something more complicated where we store, at the beginning of each chunk, data for all active spans and which bucket/span each started in. then we can look up that bucket/span manually. do you have any recommendations on how we could improve this? how does this affect our read/write system?
7. yes
8. yes, explicit clamped property
9. we have a heavy write load and these spans can last months. is this still what you would recommend? give me a few recommendations.

did you get what i said about storing a lookup map for all strings?

# Traces design notes / questions

Primary references (OTLP/JSON schema and structure):

https://opentelemetry.io/docs/specs/otlp/
https://opentelemetry.io/docs/specs/otel/protocol/file-exporter/
https://github.com/open-telemetry/opentelemetry-proto
https://protodoc.io/open-telemetry/opentelemetry-proto/opentelemetry.proto.collector.trace.v1
https://protodoc.io/Helicone/helicone/opentelemetry.proto.trace.v1
https://opentelemetry.io/docs/specs/otel/common/
https://opentelemetry.io/docs/concepts/resources/

---

1) OTLP/JSON “flavors”: what they are
- OTLP/JSON ExportTraceServiceRequest is the canonical OTLP trace payload: the protobuf ExportTraceServiceRequest encoded as JSON (proto3 JSON mapping plus OTLP-specific rules). The structure is resourceSpans → scopeSpans → spans.
- A “Span[] only” subset would be a custom format (not standard OTLP), so any off‑the‑shelf collector won’t accept it. OTLP/JSON examples always show spans nested under resource and scope.

Recommendation: use the standard OTLP/JSON envelope for interoperability, even if we store compact internal records and reconstruct on read.
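
As a concrete sketch of what reconstruction would target, here is a minimal ExportTraceServiceRequest-shaped object following the proto3 JSON field names; the service/scope/span values are illustrative placeholders, not anything from our system:

```typescript
// Minimal OTLP/JSON envelope: resourceSpans → scopeSpans → spans.
// Field names follow the proto3 JSON mapping; values are placeholders.
const exportRequest = {
  resourceSpans: [
    {
      resource: {
        attributes: [{ key: "service.name", value: { stringValue: "my-actor" } }],
      },
      scopeSpans: [
        {
          scope: { name: "my-instrumentation-lib", version: "1.0.0" },
          spans: [
            {
              traceId: "5b8efff798038103d269b633813fc60c", // 16 bytes as lowercase hex
              spanId: "eee19b7ec3c1b174", // 8 bytes as lowercase hex
              name: "example-span",
              kind: 2, // SPAN_KIND_SERVER
              startTimeUnixNano: "1700000000000000000",
              endTimeUnixNano: "1700000001000000000",
              attributes: [],
              status: {},
            },
          ],
        },
      ],
    },
  ],
};

const json = JSON.stringify(exportRequest);
```

This is the shape a collector expects on the wire; our compact internal records would be re-nested into it at export time.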

---

2) OTLP versions and available fields
- OTLP trace payloads are defined by the proto schemas in the opentelemetry-proto repo (trace + collector/trace). That’s the authoritative field list.
- High‑level structure: ExportTraceServiceRequest.resourceSpans[] → each has resource + scopeSpans[] → each has scope + spans[].
- Span fields include IDs, timestamps, name/kind, attributes, events, links, status, dropped counts, flags, etc. (see trace.proto via the protodoc link).

Recommendation: target “current OTLP v1” (stable) and treat the proto as source of truth. The OTLP spec is stable for trace signals.

---

3) ID encoding: hex vs base64
- OTLP/JSON explicitly requires hex strings for traceId/spanId (not base64).

Pros:
- Spec‑compliant; matches OTel APIs (hex is the canonical external form).

Cons:
- Larger than binary (hex is 2× the size).

Recommendation: use hex strings in JSON output; store internally as bytes for compactness.
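
The bytes-internally/hex-externally split only needs two small conversion helpers, sketched here (the helper names are ours, not from any library):

```typescript
// Store IDs as raw bytes internally; render lowercase hex for OTLP/JSON output.
function bytesToHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function hexToBytes(hex: string): Uint8Array {
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// A 16-byte traceId costs 16 bytes on disk but 32 chars in JSON output.
const traceId = new Uint8Array([
  0x5b, 0x8e, 0xff, 0xf7, 0x98, 0x03, 0x81, 0x03,
  0xd2, 0x69, 0xb6, 0x33, 0x81, 0x3f, 0xc6, 0x0c,
]);
const traceIdHex = bytesToHex(traceId); // "5b8efff798038103d269b633813fc60c"
```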

---

4) Resource vs scope (instrumentation scope)
- Resource describes the entity producing telemetry (service, host, deployment, etc.).
- Instrumentation scope describes the library that produced the spans (name/version/attributes).
- In OTLP, spans are grouped by resource, then by scope.

Practical difference: resource = “who/where,” scope = “which instrumentation library,” and both are preserved in OTLP/JSON.

---

5) “JSON-ish types” vs OTLP AnyValue
OTLP attributes are a list of KeyValue, where the value is AnyValue (a tagged union: string/int/bool/double/bytes/array/map).

So there are two internal options:
- JSON-ish: store arbitrary JSON objects/arrays directly and convert to AnyValue at read time.
- OTLP-style: store AnyValue/KeyValue structures internally and serialize directly.

Given the preference for a compact internal schema + string table, a good fit is:
- Internal: compact AnyValue-like union + string table
- External: OTLP/JSON reconstructed from that
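
A sketch of what the compact internal side could look like, assuming keys (and string values) are interned into a per-chunk string table and the value union is handed to cbor-x's encode() at write time (the cbor-x call is omitted here to keep the sketch self-contained; all names are illustrative, not a fixed schema):

```typescript
// Compact AnyValue-like union: strings become indices into the string table.
type CompactValue =
  | { t: "s"; v: number } // string: index into string table
  | { t: "i"; v: bigint } // int64
  | { t: "d"; v: number } // double
  | { t: "b"; v: boolean } // bool
  | { t: "y"; v: Uint8Array } // bytes
  | { t: "a"; v: CompactValue[] } // array
  | { t: "m"; v: [number, CompactValue][] }; // map: interned key + value

// Per-chunk string table: interning the same string twice returns one index.
class StringTable {
  private index = new Map<string, number>();
  readonly strings: string[] = [];
  intern(s: string): number {
    let i = this.index.get(s);
    if (i === undefined) {
      i = this.strings.length;
      this.strings.push(s);
      this.index.set(s, i);
    }
    return i;
  }
}

const table = new StringTable();
const attr: [number, CompactValue] = [
  table.intern("http.method"),
  { t: "s", v: table.intern("GET") },
];
// Repeated attribute keys across spans in a chunk cost one table entry total.
```

On read, indices resolve back through `table.strings` and the union is re-expanded into OTLP KeyValue/AnyValue.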

---

6) Span lifetime vs time buckets
OTel doesn’t define buckets; spans can start and end at any time. Typically, spans are exported after they end, so you can safely:
- keep open spans in memory,
- write ended spans to disk,
- read hybrid (memory + disk) for queries.

If you need to persist long‑running spans without mutating disk entries, use append‑only records:
- SpanStart, SpanEvent, SpanEnd records (or “SpanDelta”)
- reconstruct into an OTLP Span on read

This avoids rewriting chunks.

Question: do you want the append‑only delta model, or is “open spans stay only in memory until end” acceptable (with the risk of losing open spans on a crash)?
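
The append‑only delta model above can be sketched as three record kinds plus a read-time fold; the record shapes and names are illustrative, not a finalized schema:

```typescript
// Append-only records written as spans progress; never rewritten in place.
type SpanDelta =
  | { kind: "start"; spanId: string; name: string; startTimeUnixNano: bigint }
  | { kind: "event"; spanId: string; name: string; timeUnixNano: bigint }
  | { kind: "end"; spanId: string; endTimeUnixNano: bigint };

interface ReconstructedSpan {
  spanId: string;
  name: string;
  startTimeUnixNano: bigint;
  endTimeUnixNano?: bigint; // undefined while the span is still open
  events: { name: string; timeUnixNano: bigint }[];
}

// Fold a chunk's deltas into spans at read time.
function reconstruct(deltas: SpanDelta[]): Map<string, ReconstructedSpan> {
  const spans = new Map<string, ReconstructedSpan>();
  for (const d of deltas) {
    if (d.kind === "start") {
      spans.set(d.spanId, {
        spanId: d.spanId,
        name: d.name,
        startTimeUnixNano: d.startTimeUnixNano,
        events: [],
      });
    } else {
      const s = spans.get(d.spanId);
      // Start record may live in an earlier chunk; that's where the
      // "which bucket/chunk did this span start in" lookup comes in.
      if (!s) continue;
      if (d.kind === "event") s.events.push({ name: d.name, timeUnixNano: d.timeUnixNano });
      else s.endTimeUnixNano = d.endTimeUnixNano;
    }
  }
  return spans;
}
```

Reads then become: fold deltas per chunk, chase cross-chunk starts via the lookup, and merge with in-memory open spans.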

---

7) emitEvent without a span?
In OTLP trace data, events are part of a span (Span.Event). There is no standalone trace event in OTLP/JSON.

Options:
- require an active span, or
- create an implicit span (e.g., span name = event name), or
- treat it as a log signal (not part of trace data).

Question: should emitEvent error without an active span, or should it auto‑create a short-lived span?
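
To make the choice concrete, here are the two behaviors side by side; everything here is a hypothetical sketch, not an existing API:

```typescript
// Hypothetical minimal span type for illustrating the two emitEvent options.
type SpanEvent = { name: string; timeUnixNano: bigint };

class MiniSpan {
  events: SpanEvent[] = [];
  constructor(public name: string) {}
  addEvent(name: string) {
    this.events.push({ name, timeUnixNano: BigInt(Date.now()) * 1_000_000n });
  }
}

let activeSpan: MiniSpan | undefined; // undefined: no span in scope

// Option A: strict — error when there is no active span.
function emitEventStrict(name: string) {
  if (!activeSpan) throw new Error("emitEvent called without an active span");
  activeSpan.addEvent(name);
}

// Option B: implicit — auto-create a short-lived span named after the event.
function emitEventImplicit(name: string): MiniSpan {
  const span = activeSpan ?? new MiniSpan(name);
  span.addEvent(name);
  return span;
}
```

Option A keeps the data model clean; option B keeps the API forgiving at the cost of one-event spans in storage.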

---

8) Read semantics (default picks)
- Filter by: span startTimeUnixNano
- Range: [start, end) (inclusive start, exclusive end) to avoid double‑counting when paginating
- Sort: by startTimeUnixNano, tie-break by traceId/spanId
- Limit: clamp to MAX_LIMIT to avoid runaway allocations; return the actual count
- Mid‑chunk: it’s fine to stop after reaching the limit mid‑chunk (we can stream-decode and stop early)

Question: do you want an explicit “limit was clamped” signal in the API, or just silently clamp?
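
The defaults above can be sketched as one read function, including the explicit clamped flag from reply 8; MAX_LIMIT and the return shape are placeholders:

```typescript
const MAX_LIMIT = 10_000; // placeholder cap

interface SpanRow {
  traceId: string;
  spanId: string;
  startTimeUnixNano: bigint;
}

// Filter over [start, end), sort by start time with traceId/spanId tie-break,
// clamp the limit, and report whether clamping happened.
function readSpans(rows: SpanRow[], start: bigint, end: bigint, limit: number) {
  const clamped = limit > MAX_LIMIT;
  const effective = clamped ? MAX_LIMIT : limit;
  const spans = rows
    .filter((r) => r.startTimeUnixNano >= start && r.startTimeUnixNano < end)
    .sort((a, b) =>
      a.startTimeUnixNano !== b.startTimeUnixNano
        ? (a.startTimeUnixNano < b.startTimeUnixNano ? -1 : 1)
        : (a.traceId + a.spanId).localeCompare(b.traceId + b.spanId))
    .slice(0, effective);
  return { spans, clamped };
}
```

In the real reader this would stream-decode chunks and stop early at the limit rather than materialize everything first.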

---

9) Defaults (proposal)
- Bucket size: 1h
- Chunk target: 512 KiB (keeps well under 1 MiB)
- Flush age: 5–10s (pick 5s for low‑traffic latency)
- Size estimation: encode records into a per‑chunk buffer as they are added; size = buffer length (accurate and simple)

Pending inputs to finalize:
1) append‑only deltas vs write-on-end only
2) emitEvent behavior without a span
3) limit clamp behavior
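
The proposed defaults and the buffer-length size estimation could look like this; constant names and the flush-signaling shape are illustrative:

```typescript
// Proposed defaults from section 9, as constants.
const BUCKET_SIZE_MS = 60 * 60 * 1000; // 1h buckets
const CHUNK_TARGET_BYTES = 512 * 1024; // 512 KiB, well under 1 MiB
const FLUSH_AGE_MS = 5_000; // 5s flush age for low-traffic latency

// Size estimation = buffer length: records are encoded on append, so the
// tracked size is exact, not an estimate.
class ChunkBuffer {
  private parts: Uint8Array[] = [];
  private size = 0;
  // Returns true when the chunk has reached its target and should flush.
  append(encodedRecord: Uint8Array): boolean {
    this.parts.push(encodedRecord);
    this.size += encodedRecord.byteLength;
    return this.size >= CHUNK_TARGET_BYTES;
  }
  get byteLength(): number {
    return this.size;
  }
}
```

A timer at FLUSH_AGE_MS would force out partially filled chunks so low-traffic periods still flush promptly.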