Skip to content

Conversation

@aaryan359
Copy link
Contributor

Which problem is this PR solving?

Description of the changes

  • Implements health status propagation from the jaeger-query server to the collector runtime
  • Reports StatusOK when the query server starts successfully
  • Reports StatusFatalError when HTTP or gRPC servers fail at runtime
  • Reports appropriate status during shutdown
  • Uses the existing componentstatus and telemetry.Settings.ReportStatus mechanisms
  • Removes the outdated TODO comment related to health propagation
  • Adds unit tests to verify status reporting, transitions, and thread safety

How was this change tested?

  • Added unit tests covering:
    • Initial health status reporting
    • Health status transitions
    • Thread-safe status updates
    • Handling of all supported health status types
  • Ran existing test suite to ensure no regressions

Checklist

Signed-off-by: aaryan359 <aaryanmeena96@gmail.com>
@aaryan359 aaryan359 requested a review from a team as a code owner February 7, 2026 12:30
@dosubot dosubot bot added the enhancement label Feb 7, 2026
@codecov
Copy link

codecov bot commented Feb 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.47%. Comparing base (9c047cb) to head (d8e1ce7).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7992   +/-   ##
=======================================
  Coverage   95.47%   95.47%           
=======================================
  Files         316      316           
  Lines       16756    16756           
=======================================
  Hits        15997    15997           
  Misses        593      593           
  Partials      166      166           
Flag Coverage Δ
badger_v1 9.16% <ø> (ø)
badger_v2 1.89% <ø> (ø)
cassandra-4.x-v1-manual 13.35% <ø> (ø)
cassandra-4.x-v2-auto 1.88% <ø> (ø)
cassandra-4.x-v2-manual 1.88% <ø> (ø)
cassandra-5.x-v1-manual 13.35% <ø> (ø)
cassandra-5.x-v2-auto 1.88% <ø> (ø)
cassandra-5.x-v2-manual 1.88% <ø> (ø)
clickhouse 1.97% <ø> (ø)
elasticsearch-6.x-v1 17.19% <ø> (ø)
elasticsearch-7.x-v1 17.22% <ø> (ø)
elasticsearch-8.x-v1 17.37% <ø> (ø)
elasticsearch-8.x-v2 1.89% <ø> (ø)
elasticsearch-9.x-v2 1.89% <ø> (ø)
grpc_v1 8.40% <ø> (ø)
grpc_v2 1.89% <ø> (ø)
kafka-3.x-v2 1.89% <ø> (ø)
memory_v2 1.89% <ø> (ø)
opensearch-1.x-v1 17.26% <ø> (ø)
opensearch-2.x-v1 17.26% <ø> (ø)
opensearch-2.x-v2 1.89% <ø> (ø)
opensearch-3.x-v2 1.89% <ø> (ø)
query 1.89% <ø> (ø)
tailsampling-processor 0.54% <ø> (ø)
unittests 94.15% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: aaryan359 <aaryanmeena96@gmail.com>
@aaryan359
Copy link
Contributor Author

@jkowall, @yurishkuro, can you review this?

if err != nil {
s.telset.Logger.Error("Could not start GRPC server", zap.Error(err))
s.reportHealthStatus(componentstatus.StatusFatalError)
s.telset.ReportStatus(componentstatus.NewFatalErrorEvent(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is literally a call to report status in the next line. What's the point of duplicating it?

}

// Report initial status as OK
s.reportHealthStatus(componentstatus.StatusOK)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the point of this if nothing got started yet?

aaryan359 and others added 3 commits February 9, 2026 00:53
Signed-off-by: aaryan359 <aaryanmeena96@gmail.com>
Signed-off-by: aaryan359 <aaryanmeena96@gmail.com>
@aaryan359
Copy link
Contributor Author

@yurishkuro, I initially added explicit status reporting (OK / Stopped) based on a literal reading of the issue, but I understand now that in collector semantics, health is implicit and only failures should be reported.

Copilot AI review requested due to automatic review settings February 11, 2026 15:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Propagates Jaeger query server health into the OpenTelemetry Collector runtime by removing outdated TODO/comment references and aligning server lifecycle code with updated status reporting behavior and tests.

Changes:

  • Removed the outdated TODO about propagating healthcheck updates to the collector runtime.
  • Removed/cleaned up a stale inline comment and adjusted formatting in server lifecycle methods.
  • Minor whitespace tweaks for readability and consistency.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
cmd/jaeger/internal/extension/jaegerquery/server.go Removes outdated TODO related to health propagation.
cmd/jaeger/internal/extension/jaegerquery/internal/server.go Removes stale comment and applies minor formatting/whitespace adjustments in Start/Close.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants