-
-
Notifications
You must be signed in to change notification settings - Fork 320
Add OpenTelemetry tracing to backend and frontend with full chain support #1256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: tankerkiller125 <3457368+tankerkiller125@users.noreply.github.com>
…acing, and HTTP middleware Co-authored-by: tankerkiller125 <3457368+tankerkiller125@users.noreply.github.com>
…proxy, and composables Co-authored-by: tankerkiller125 <3457368+tankerkiller125@users.noreply.github.com>
…yload limit Co-authored-by: tankerkiller125 <3457368+tankerkiller125@users.noreply.github.com>
|
@copilot the frontend telemetry should be enabled automatically when the backend telemetry is enabled. This should be done via the backend info endpoint which the front-end queries for information. |
…bled via status endpoint Co-authored-by: tankerkiller125 <3457368+tankerkiller125@users.noreply.github.com>
Implemented in ea84a16. The frontend now automatically enables telemetry when the backend has it enabled:
When you enable backend telemetry with |
WalkthroughThis PR adds comprehensive OpenTelemetry integration to both backend and frontend systems. Backend changes include a new OTEL provider, database tracing instrumentation, HTTP middleware integration, and a proxy endpoint for frontend telemetry collection. Frontend changes provide Vue composables, client initialization, and automatic trace instrumentation. Configuration structures support OTEL enablement, exporters, protocols, and per-component tracing toggles. Dependencies are updated with OTEL packages across SDK, exporters, and instrumentation libraries. Changes
Sequence Diagram(s)sequenceDiagram
participant Frontend as Frontend<br/>(Browser)
participant Plugin as OTEL Plugin<br/>(otel.client.ts)
participant Backend as Backend App<br/>(main.go)
participant BEStatus as /api/v1/status<br/>Endpoint
participant OTelLib as Frontend OTel<br/>Library
Frontend->>Plugin: Page Load
Plugin->>BEStatus: Fetch Backend Status<br/>(3s timeout)
alt Backend Telemetry Enabled
BEStatus-->>Plugin: APISummary {<br/>telemetry.enabled: true}
Plugin->>OTelLib: initializeOTel(config)
OTelLib->>OTelLib: Create Provider +<br/>BackendProxyExporter
OTelLib-->>Plugin: Ready
Plugin->>Frontend: Provide otelEnabled: true
Frontend->>Frontend: Instrument fetch,<br/>spans
else Backend Telemetry Disabled
BEStatus-->>Plugin: APISummary {<br/>telemetry.enabled: false}
Plugin->>Frontend: Provide otelEnabled: false
end
sequenceDiagram
participant Frontend as Frontend Span<br/>(FrontendSpan)
participant Exporter as BackendProxy<br/>Exporter
participant BEProxy as Backend<br/>/telemetry<br/>ProxyHandler
participant OTLPClient as OTLP Client<br/>(gRPC/HTTP)
participant OTLP as OTLP Collector<br/>(User's Endpoint)
Frontend->>Exporter: Batch Spans
Exporter->>BEProxy: POST /telemetry<br/>(TelemetryPayload JSON)
BEProxy->>BEProxy: Parse & Validate<br/>Payload
BEProxy->>BEProxy: Convert to OTLP<br/>Format
BEProxy->>OTLPClient: Forward Spans
OTLPClient->>OTLP: Send OTLP Protobuf<br/>(gRPC/HTTP)
OTLP-->>OTLPClient: ACK
OTLPClient-->>BEProxy: Success
BEProxy-->>Exporter: HTTP 200
Exporter-->>Frontend: Sent
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes The PR introduces substantial new OTEL integration logic across backend and frontend with ~1.5K lines of new code in the OTEL package alone (otel.go, proxy.go, database.go). Multiple file types (configuration structs, initialization code, instrumentation hooks, client plugins) and heterogeneous changes demand careful review. Key areas requiring attention: OTEL provider initialization sequencing, database instrumentation conditional logic, proxy endpoint security (payload size limits, validation), frontend initialization timing with backend status fetch, and dependency version compatibility. Security Recommendations
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/internal/sys/config/conf.go (1)
19-35: Redact OTEL headers when printing config.
Line 34:Config.Print()will now includeOtel.Headers, which can carry OTLP auth tokens. This risks leaking secrets into logs/console. Consider redacting before marshaling.🔒 Proposed redaction in
Config.Print()func (c *Config) Print() { - res, err := json.MarshalIndent(c, "", " ") + redacted := *c + if redacted.Otel.Headers != "" { + redacted.Otel.Headers = "[redacted]" + } + res, err := json.MarshalIndent(redacted, "", " ") if err != nil { panic(err) }
🤖 Fix all issues with AI agents
In `@backend/app/api/main.go`:
- Around line 228-236: The middleware list registers
otelchimetric.NewResponseSizeBytes twice causing duplicate response-size
metrics; edit the router.Use call in main.go to remove the duplicate
otelchimetric.NewResponseSizeBytes entry (keep one instance alongside
otelchi.Middleware and otelchimetric.NewRequestInFlight), or replace the second
call with the intended different metric if a second metric was meant to be
added.
In `@backend/internal/sys/otel/otel.go`:
- Around line 368-379: The metric gRPC exporter uses
otlpmetricgrpc.WithInsecure() while traces use
grpc.WithTransportCredentials(insecure.NewCredentials()) via WithDialOption;
update the metric (otlpmetricgrpc.New call) and log exporter usage to use
otlpmetricgrpc.WithDialOption(grpc.WithTransportCredentials(insecure.NewCredentials()))
when cfg.Insecure is true instead of WithInsecure(), preserve existing endpoint
and headers options (WithEndpoint, WithHeaders), and add a short comment near
the cfg.Insecure branch stating that insecure mode disables TLS (only for
dev/testing or trusted networks).
In `@backend/internal/sys/otel/proxy.go`:
- Around line 238-253: The gRPC branch builds dial options only for the insecure
case and omits setting TLS transport credentials when p.cfg.Insecure is false;
update the grpc dial option construction used before grpc.NewClient so that when
p.cfg.Insecure is false you append a secure transport credential (e.g.,
grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")) or
another appropriate credentials.TransportCredentials) instead of leaving
dialOpts empty, ensure the change is applied where grpc.NewClient is called and
that the credentials package is imported, leaving the grpcClient /
collogspb.NewTraceServiceClient creation unchanged.
In `@frontend/lib/otel/index.ts`:
- Around line 104-109: The code incorrectly tries to read parentSpanId via a
fragile cast on span; instead read the parent span ID from the span's
parentSpanContext. Replace the cast and use span.parentSpanContext?.spanId (from
the ReadableSpan's parentSpanContext?: SpanContext) to populate parentSpanId so
the returned object uses spanContext.traceId, spanContext.spanId, and
parentSpanContext?.spanId for parentSpanId.
🧹 Nitpick comments (14)
backend/internal/data/repo/repo_item_attachments.go (1)
867-889: Consider recording errors on spans for better observability.The span instrumentation is good, but you could enhance debugging by recording errors when they occur. This helps correlate failures in your tracing backend.
♻️ Optional: Add error recording to spans
func (r *AttachmentRepo) processThumbnailFromImage(ctx context.Context, groupId uuid.UUID, img image.Image, title string, orientation uint16) (UploadResult, error) { ctx, span := otel.Tracer("data").Start(ctx, "repo.AttachmentRepo.processThumbnailFromImage") defer span.End() bounds := img.Bounds() span.AddEvent("applying EXIF orientation") // Apply EXIF orientation if needed if orientation > 1 { img = utils.ApplyOrientation(img, orientation) bounds = img.Bounds() } span.AddEvent("resizing image") newWidth, newHeight := calculateThumbnailDimensions(bounds.Dx(), bounds.Dy(), r.thumbnail.Width, r.thumbnail.Height) dst := image.NewRGBA(image.Rect(0, 0, newWidth, newHeight)) draw.CatmullRom.Scale(dst, dst.Rect, img, img.Bounds(), draw.Over, nil) span.AddEvent("encoding thumbnail as webp") buf := new(bytes.Buffer) err := webp.Encode(buf, dst, webp.Options{Quality: 80, Lossless: false}) if err != nil { + span.RecordError(err) return UploadResult{}, err }frontend/plugins/otel.client.ts (1)
18-41: AbortController refactoring requires API modifications.The suggested improvement is architecturally sound—using AbortController is the proper way to cancel HTTP requests and prevent resource leaks—but requires changes beyond the current code:
- The
RequestArgstype infrontend/lib/requests/requests.tsdoes not include asignalpropertyPublicApi.status()takes no parameters; it would need to accept options- The
Requests.get()method would need to forward the signal to the fetch APIWhile the current
Promise.raceimplementation works correctly and safely handles timeout scenarios, it does leave the API request running after the timeout fires, which is inefficient. To implement this refactoring cleanly, first extendRequestArgsto supportsignal: AbortSignal | undefined, then update the API method signatures accordingly.From a security perspective: The current timeout handling is safe but could benefit from proper request cancellation to avoid potential memory leaks in long-lived applications.
backend/internal/sys/config/conf_otel.go (1)
33-34: Add validation for SampleRate bounds at config load time.The
SampleRatefield should be validated to be within [0.0, 1.0]. While the initialization code gracefully handles out-of-bounds values, early validation at config load time would catch misconfigurations immediately and provide clearer error messages. Additionally, invalid sampling rates have security implications—misconfiguration could inadvertently disable tracing (audit trail loss) or apply unexpected sampling, affecting compliance and observability.backend/internal/sys/otel/zerolog_bridge.go (2)
24-39: Consider propagating request context for better trace correlation.The comment on line 37 acknowledges this, but using
context.Background()means log records won't be correlated with active spans. If zerolog's Hook interface allowed passing context, this would enable better distributed tracing correlation.For now, this is acceptable given the zerolog.Hook interface constraints.
54-57: PanicLevel maps to SeverityError - consider SeverityFatal4 for distinction.Both
FatalLevelandPanicLevelindicate critical failures, but mappingPanicLeveltoSeverityErrorloses some severity information. OpenTelemetry defines multiple fatal severities (Fatal, Fatal2, Fatal3, Fatal4) that could differentiate these.♻️ Optional enhancement
case zerolog.FatalLevel: return otellog.SeverityFatal case zerolog.PanicLevel: - return otellog.SeverityError + return otellog.SeverityFatal4backend/internal/sys/otel/proxy.go (2)
312-332: 🔐 Security: Validate trace/span ID lengths to prevent malformed data.Frontend-supplied trace IDs (32 hex chars = 16 bytes) and span IDs (16 hex chars = 8 bytes) should be validated. Malformed IDs could cause issues downstream or indicate malicious input.
♻️ Proposed validation
func convertSpanToProto(frontendSpan FrontendSpan) (*tracepb.Span, error) { // Parse trace ID + if len(frontendSpan.TraceID) != 32 { + return nil, fmt.Errorf("invalid trace ID length: expected 32, got %d", len(frontendSpan.TraceID)) + } traceID, err := hex.DecodeString(frontendSpan.TraceID) if err != nil { return nil, fmt.Errorf("invalid trace ID: %w", err) } // Parse span ID + if len(frontendSpan.SpanID) != 16 { + return nil, fmt.Errorf("invalid span ID length: expected 16, got %d", len(frontendSpan.SpanID)) + } spanID, err := hex.DecodeString(frontendSpan.SpanID)
386-393: Potential integer overflow when converting milliseconds to nanoseconds.Multiplying
int64millisecond timestamps by1000000and casting touint64can overflow for timestamps far in the future (though unlikely in practice). Consider using explicit safe conversion.The current approach works for dates until ~year 292471, so this is low-risk but worth noting for defensive coding.
frontend/lib/otel/index.ts (1)
203-204: Static analysis ReDoS warning - safe in this context.The static analysis tool flags
new RegExp(^${window.location.origin})as a potential ReDoS risk. However,window.location.originis browser-controlled and contains onlyscheme://host:portformat without regex metacharacters that could cause catastrophic backtracking.For defense in depth, you could escape the origin, but it's not strictly necessary here:
♻️ Optional defensive escaping
+function escapeRegExp(str: string): string { + return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); +} + new FetchInstrumentation({ - propagateTraceHeaderCorsUrls: [new RegExp(`^${window.location.origin}`)], + propagateTraceHeaderCorsUrls: [new RegExp(`^${escapeRegExp(window.location.origin)}`)],frontend/composables/use-otel.ts (1)
26-29: Empty span object may cause runtime errors if callback uses span methods.When OTEL is disabled, passing
{} as Spanto the callback could cause runtime errors if the callback attempts to call span methods likesetAttribute()orsetStatus(). Consider providing a no-op span implementation.♻️ Proposed no-op span implementation
+// No-op span for when OTel is disabled +const noopSpan: Span = { + spanContext: () => ({ traceId: '', spanId: '', traceFlags: 0 }), + setAttribute: () => noopSpan, + setAttributes: () => noopSpan, + addEvent: () => noopSpan, + setStatus: () => noopSpan, + updateName: () => noopSpan, + end: () => {}, + isRecording: () => false, + recordException: () => {}, +} as unknown as Span; async function traced<T>( name: string, fn: (span: Span) => Promise<T> | T, options?: { ... } ): Promise<T> { if (!isEnabled.value) { - return fn({} as Span); + return fn(noopSpan); } return withSpan(name, fn, options); }backend/internal/sys/otel/database.go (2)
40-43: 🔐 Security consideration: SQL queries in spans may expose sensitive data.
DisableQuery: falseincludes full SQL queries in trace spans. While useful for debugging, this could expose sensitive data (emails, passwords in queries, PII) in your telemetry backend.Consider:
- Using
DisableQuery: truein production- Adding a configuration option to toggle this behavior
- Ensuring your telemetry backend has appropriate access controls
♻️ Optional: make configurable
otelsql.WithSpanOptions(otelsql.SpanOptions{ - DisableQuery: false, // Include SQL queries in spans + DisableQuery: !p.cfg.IncludeSQLInSpans, // Configurable SQL query inclusion }),
99-108: WrapEntDriver doesn't actually wrap - consider removing or documenting clearly.The function name suggests it wraps a driver with tracing, but it simply returns the input driver unchanged. The comment explains the assumption, but callers may be confused.
Either:
- Remove this function if it serves no purpose
- Rename to clarify behavior (e.g.,
ValidateTracedDriver)- Actually implement wrapping logic for drivers not created via
OpenEntDriverbackend/internal/sys/otel/otel.go (3)
175-181: Security: Avoid logging potentially sensitive header values.The
cfg.Endpointis logged here. While endpoint logging is generally acceptable, be cautious if headers contain authentication tokens or API keys. The current code does not logcfg.Headers, which is good, but consider adding a note or ensuring that sensitive configuration is never accidentally logged in future changes.
461-464: Hardcoded environment value limits observability utility.
getEnvironment()always returns"production", which may not accurately reflect actual deployment stages (development, staging, etc.). Consider making this configurable viaOTelConfigor reading from an environment variable.♻️ Suggested improvement
// getEnvironment returns the current deployment environment. -func getEnvironment() string { - return "production" +func getEnvironment(cfg *config.OTelConfig) string { + if cfg.Environment != "" { + return cfg.Environment + } + return "production" }Then update the call site at line 127 to pass
cfg.
192-198: Silent failure on invalid MetricsInterval configuration.If
time.ParseDurationfails, the error is silently ignored and the default 15-second interval is used. Consider logging a warning so operators are aware their configuration was not applied.♻️ Suggested improvement
metricsInterval := 15 * time.Second if cfg.MetricsInterval != "" { - if parsed, err := time.ParseDuration(cfg.MetricsInterval); err == nil { + parsed, err := time.ParseDuration(cfg.MetricsInterval) + if err != nil { + log.Warn().Str("interval", cfg.MetricsInterval).Err(err).Msg("Invalid metrics interval, using default 15s") + } else { metricsInterval = parsed } }
| if otelProvider.IsEnabled() && cfg.Otel.EnableHTTPTracing { | ||
| otelChiBaseCfg := otelchimetric.NewBaseConfig(cfg.Otel.ServiceName, otelchimetric.WithMeterProvider(otelProvider.MeterProvider())) | ||
| router.Use( | ||
| otelchi.Middleware(cfg.Otel.ServiceName, otelchi.WithChiRoutes(router)), | ||
| otelchimetric.NewResponseSizeBytes(otelChiBaseCfg), | ||
| otelchimetric.NewRequestInFlight(otelChiBaseCfg), | ||
| otelchimetric.NewResponseSizeBytes(otelChiBaseCfg), | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate NewResponseSizeBytes middleware - likely copy-paste error.
otelchimetric.NewResponseSizeBytes is added twice (lines 232 and 234), which will record duplicate metrics for response sizes. This appears to be a copy-paste error.
🐛 Proposed fix
otelChiBaseCfg := otelchimetric.NewBaseConfig(cfg.Otel.ServiceName, otelchimetric.WithMeterProvider(otelProvider.MeterProvider()))
router.Use(
otelchi.Middleware(cfg.Otel.ServiceName, otelchi.WithChiRoutes(router)),
- otelchimetric.NewResponseSizeBytes(otelChiBaseCfg),
otelchimetric.NewRequestInFlight(otelChiBaseCfg),
otelchimetric.NewResponseSizeBytes(otelChiBaseCfg),
)🤖 Prompt for AI Agents
In `@backend/app/api/main.go` around lines 228 - 236, The middleware list
registers otelchimetric.NewResponseSizeBytes twice causing duplicate
response-size metrics; edit the router.Use call in main.go to remove the
duplicate otelchimetric.NewResponseSizeBytes entry (keep one instance alongside
otelchi.Middleware and otelchimetric.NewRequestInFlight), or replace the second
call with the intended different metric if a second metric was meant to be
added.
| case "grpc", "": | ||
| opts := []otlpmetricgrpc.Option{} | ||
| if cfg.Endpoint != "" { | ||
| opts = append(opts, otlpmetricgrpc.WithEndpoint(cfg.Endpoint)) | ||
| } | ||
| if cfg.Insecure { | ||
| opts = append(opts, otlpmetricgrpc.WithInsecure()) | ||
| } | ||
| if len(headers) > 0 { | ||
| opts = append(opts, otlpmetricgrpc.WithHeaders(headers)) | ||
| } | ||
| return otlpmetricgrpc.New(ctx, opts...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cat -n backend/internal/sys/otel/otel.go | head -400 | tail -150Repository: sysadminsmedia/homebox
Length of output: 5610
🌐 Web query:
otlpmetricgrpc.WithInsecure() grpc credentials insecure
💡 Result:
otlpmetricgrpc.WithInsecure() tells the OpenTelemetry OTLP metric gRPC exporter to disable TLS on its client connection (i.e., plaintext gRPC), similar in effect to the deprecated grpc.WithInsecure(). This is mainly for local/dev or when your collector endpoint is not using TLS. [1]
If your goal is specifically “use the new gRPC insecure credentials API (instead of grpc.WithInsecure)”, pass a dial option:
import (
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
)
exp, err := otlpmetricgrpc.New(ctx,
otlpmetricgrpc.WithEndpoint("localhost:4317"),
otlpmetricgrpc.WithDialOption(grpc.WithTransportCredentials(insecure.NewCredentials())),
)This is the direct replacement pattern for deprecated grpc.WithInsecure() (use grpc.WithTransportCredentials(insecure.NewCredentials())). [2]
You can also set it via env vars (OTEL_EXPORTER_OTLP_METRICS_INSECURE=true / OTEL_EXPORTER_OTLP_INSECURE=true) when using the OTLP metric gRPC exporter. [1]
Security note: disabling TLS means data is unencrypted in transit, so avoid it on untrusted networks. [3]
Sources: [1] [2] [3]
🏁 Script executed:
cat -n backend/internal/sys/otel/otel.go | sed -n '400,450p'Repository: sysadminsmedia/homebox
Length of output: 1881
Align all gRPC exporters to use modern insecure credentials pattern for consistency and best practices.
The trace exporter (line 320) applies grpc.WithTransportCredentials(insecure.NewCredentials()) via WithDialOption(), but the metric exporter (line 374) and log exporter (line 428) only use WithInsecure(). Standardize all three gRPC exporters to the modern credentials API pattern for consistency. Additionally, document that insecure mode disables TLS—data travels unencrypted—so this should only be used in secure network environments (dev/testing).
🤖 Prompt for AI Agents
In `@backend/internal/sys/otel/otel.go` around lines 368 - 379, The metric gRPC
exporter uses otlpmetricgrpc.WithInsecure() while traces use
grpc.WithTransportCredentials(insecure.NewCredentials()) via WithDialOption;
update the metric (otlpmetricgrpc.New call) and log exporter usage to use
otlpmetricgrpc.WithDialOption(grpc.WithTransportCredentials(insecure.NewCredentials()))
when cfg.Insecure is true instead of WithInsecure(), preserve existing endpoint
and headers options (WithEndpoint, WithHeaders), and add a short comment near
the cfg.Insecure branch stating that insecure mode disables TLS (only for
dev/testing or trusted networks).
| case "grpc", "": | ||
| dialOpts := []grpc.DialOption{} | ||
| if p.cfg.Insecure { | ||
| dialOpts = append(dialOpts, grpc.WithTransportCredentials(insecure.NewCredentials())) | ||
| } | ||
|
|
||
| conn, err := grpc.NewClient(p.cfg.Endpoint, dialOpts...) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to create gRPC client: %w", err) | ||
| } | ||
|
|
||
| client := collogspb.NewTraceServiceClient(conn) | ||
| return &grpcClient{ | ||
| client: client, | ||
| conn: conn, | ||
| }, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔐 Security: gRPC client missing TLS credentials when Insecure is false.
When p.cfg.Insecure is false, the gRPC client is created without any transport credentials. This will likely cause connection failures or, worse, fall back to insecure connections depending on the gRPC version defaults.
🐛 Proposed fix - add TLS credentials for secure connections
+import "google.golang.org/grpc/credentials"
case "grpc", "":
dialOpts := []grpc.DialOption{}
if p.cfg.Insecure {
dialOpts = append(dialOpts, grpc.WithTransportCredentials(insecure.NewCredentials()))
+ } else {
+ dialOpts = append(dialOpts, grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")))
}🤖 Prompt for AI Agents
In `@backend/internal/sys/otel/proxy.go` around lines 238 - 253, The gRPC branch
builds dial options only for the insecure case and omits setting TLS transport
credentials when p.cfg.Insecure is false; update the grpc dial option
construction used before grpc.NewClient so that when p.cfg.Insecure is false you
append a secure transport credential (e.g.,
grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, "")) or
another appropriate credentials.TransportCredentials) instead of leaving
dialOpts empty, ensure the change is applied where grpc.NewClient is called and
that the credentials package is imported, leaving the grpcClient /
collogspb.NewTraceServiceClient creation unchanged.
| // Get parent span ID from the span context if available | ||
| const parentSpanContext = (span as unknown as { parentSpanId?: string }).parentSpanId; | ||
| return { | ||
| traceId: spanContext.traceId, | ||
| spanId: spanContext.spanId, | ||
| parentSpanId: parentSpanContext || undefined, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
cat -n frontend/lib/otel/index.ts | head -120 | tail -30Repository: sysadminsmedia/homebox
Length of output: 1323
🌐 Web query:
OpenTelemetry SDK ReadableSpan interface parentSpanId property
💡 Result:
In opentelemetry-js, the SDK ReadableSpan no longer has a parentSpanId property. The current ReadableSpan interface exposes the parent as an optional parentSpanContext?: SpanContext instead. (open-telemetry.github.io)
So, to get the parent span id you now do:
const parentSpanId = span.parentSpanContext?.spanId; // undefined for root spansThis was a documented breaking change: parentSpanId was removed and replaced by parentSpanContext (migration: span.parentSpanId → span.parentSpanContext?.spanId). (classic.yarnpkg.com)
This aligns with the OpenTelemetry trace SDK spec guidance that implementations must expose at least the full parent SpanContext. (opentelemetry.io)
Citations:
- 1: https://open-telemetry.github.io/opentelemetry-js/interfaces/_opentelemetry_sdk-trace-base.ReadableSpan.html
- 2: https://classic.yarnpkg.com/en/package/%40opentelemetry/sdk-trace-web
- 3: https://opentelemetry.io/docs/specs/otel/trace/sdk/?utm_source=openai
🏁 Script executed:
grep -r "ReadableSpan" frontend/lib/otel/ --include="*.ts"Repository: sysadminsmedia/homebox
Length of output: 366
🏁 Script executed:
grep -r "from.*opentelemetry" frontend/lib/otel/index.ts -A 5Repository: sysadminsmedia/homebox
Length of output: 991
Fix incorrect parentSpanId access on ReadableSpan.
The fragile type assertion is attempting to access parentSpanId, which doesn't exist on the OpenTelemetry SDK's ReadableSpan interface. The parentSpanId property was removed in favor of parentSpanContext?: SpanContext. Access the parent span ID correctly via the span context:
♻️ Correct implementation
- const parentSpanContext = (span as unknown as { parentSpanId?: string }).parentSpanId;
+ const parentSpanId = span.parentSpanContext?.spanId;
return {
traceId: spanContext.traceId,
spanId: spanContext.spanId,
- parentSpanId: parentSpanContext || undefined,
+ parentSpanId: parentSpanId,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Get parent span ID from the span context if available | |
| const parentSpanContext = (span as unknown as { parentSpanId?: string }).parentSpanId; | |
| return { | |
| traceId: spanContext.traceId, | |
| spanId: spanContext.spanId, | |
| parentSpanId: parentSpanContext || undefined, | |
| // Get parent span ID from the span context if available | |
| const parentSpanId = span.parentSpanContext?.spanId; | |
| return { | |
| traceId: spanContext.traceId, | |
| spanId: spanContext.spanId, | |
| parentSpanId: parentSpanId, |
🤖 Prompt for AI Agents
In `@frontend/lib/otel/index.ts` around lines 104 - 109, The code incorrectly
tries to read parentSpanId via a fragile cast on span; instead read the parent
span ID from the span's parentSpanContext. Replace the cast and use
span.parentSpanContext?.spanId (from the ReadableSpan's parentSpanContext?:
SpanContext) to populate parentSpanId so the returned object uses
spanContext.traceId, spanContext.spanId, and parentSpanContext?.spanId for
parentSpanId.
What type of PR is this?
What this PR does / why we need it:
Adds comprehensive OpenTelemetry tracing across the full stack with proper trace context propagation from frontend to backend to database.
Backend (
backend/internal/sys/otel/)otel.go- Provider init with OTLP/stdout exporters, configurable samplingmiddleware.go- HTTP middleware via otelhttpdatabase.go- Ent ORM tracing via otelsqlproxy.go- Authenticated proxy endpoint at/api/v1/telemetryfor frontend spansBackend Config (
backend/internal/sys/config/conf_otel.go)HBOX_OTEL_ENABLED,HBOX_OTEL_ENDPOINT,HBOX_OTEL_PROTOCOL,HBOX_OTEL_SAMPLE_RATE, etc.Backend Status Endpoint (
/api/v1/status)telemetry.enabledfield to APISummary responseFrontend (
frontend/lib/otel/)BackendProxyExportersends spans through authenticated backend endpointFetchInstrumentation(same-origin only for security)useOTel()composable for manual span creation in componentsUsage:
# Backend - enabling this automatically enables frontend telemetry HBOX_OTEL_ENABLED=true HBOX_OTEL_ENDPOINT=localhost:4317 HBOX_OTEL_INSECURE=trueWhich issue(s) this PR fixes:
Special notes for your reviewer:
/api/v1/statusendpoint)Testing
go vetOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.