14 changes: 14 additions & 0 deletions internal/storage/v2/badger/tracestore/package_test.go
@@ -0,0 +1,14 @@
// Copyright (c) 2025 The Jaeger Authors.
Member:

Time travel?

Contributor Author:

Haha, good catch! I'm still mentally in 2025. I will update the copyright headers to 2026. 😅

// SPDX-License-Identifier: Apache-2.0

package tracestore

import (
    "testing"

    "github.com/jaegertracing/jaeger/internal/testutils"
)

func TestMain(m *testing.M) {
    testutils.VerifyGoLeaks(m)
}
102 changes: 102 additions & 0 deletions internal/storage/v2/badger/tracestore/reader.go
@@ -0,0 +1,102 @@
// Copyright (c) 2025 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package tracestore

import (
    "context"
    "errors"
    "iter"

    "github.com/dgraph-io/badger/v4"
    "go.opentelemetry.io/collector/pdata/ptrace"

    "github.com/jaegertracing/jaeger/internal/storage/v2/api/tracestore"
)

var _ tracestore.Reader = (*Reader)(nil)

const (
    // operationNameIndexKey (0x82) is the prefix for the operations index.
    operationNameIndexKey byte = 0x82

    // sizeOfTraceID is 16 bytes for a TraceID.
    sizeOfTraceID = 16
)

// Reader implements the tracestore.Reader interface for Badger.
type Reader struct {
    db *badger.DB
}

// NewReader creates a new Reader instance.
func NewReader(db *badger.DB) *Reader {
    return &Reader{
        db: db,
    }
}

// GetOperations returns all operation names for a given service.

Comment on lines +39 to +40

Copilot AI commented on Feb 1, 2026:

Consider removing the extra blank line between the function comment and the function declaration to maintain consistency with Go conventions. Most functions in the codebase don't have this blank line.
func (r *Reader) GetOperations(_ context.Context, query tracestore.OperationQueryParams) ([]tracestore.Operation, error) {
Copilot AI commented on Feb 1, 2026:

The OperationQueryParams.SpanKind parameter is being ignored in this implementation. While this is consistent with the v1 Badger implementation (which doesn't store span kind in the operation index), the query parameter should be validated. If a user provides a non-empty SpanKind, the implementation should either:

  1. Return an error indicating that span kind filtering is not supported
  2. Log a warning and ignore the parameter
  3. Document this limitation in the function comment

Looking at other v2 implementations (memory, clickhouse), they do support SpanKind filtering, so silently ignoring it here could lead to confusion where queries work differently depending on the backend.
    if query.SpanKind != "" {
        return nil, errors.New("badger storage does not support SpanKind filtering")
    }
    if query.ServiceName == "" {
        return nil, errors.New("service name is required")
    }

    var operations []tracestore.Operation

    err := r.db.View(func(txn *badger.Txn) error {
Member:

There is already an implementation of this logic in v1. Rewriting it from scratch and introducing unknown bugs is not how we approach code migration.

Contributor Author:

I see your point. My goal was to provide a native v2 implementation, but I agree duplication is bad. Could you clarify whether you'd prefer I expose the v1 CacheStore logic so it can be reused here, or whether v2 should structurally wrap the v1 implementation for now? I want to align with the migration path you have in mind.

Member:

The way we upgraded the Elasticsearch implementation was by refactoring its internal "core storage layer" to work on a dbmodel that was independent of the v1/v2 Storage API. Both the v1 and v2 ES implementations then shared that core storage layer. The refactoring was surgical, without large code changes or code movements, which makes the code reviewable with high confidence (while copy & paste is not).
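
To visualize the shared-core shape described in that comment, here is a minimal Go sketch; the package and identifier names (core, DBOperation, CoreReader) are illustrative assumptions, not the actual Elasticsearch code:

// Hypothetical sketch of the shared-core pattern described above;
// names are illustrative, not the actual Elasticsearch packages.
package core

// DBOperation is a storage-level model, independent of the v1/v2 Storage APIs.
type DBOperation struct {
    Name     string
    SpanKind string
}

// CoreReader holds the backend-specific query logic exactly once.
type CoreReader struct {
    // db handle / client lives here
}

// GetOperations is implemented once against the db-level model; both the
// v1 and v2 readers call it and convert DBOperation to their own API types.
func (*CoreReader) GetOperations(service string) ([]DBOperation, error) {
    _ = service // single implementation of the index scan would go here
    return nil, nil
}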

Contributor Author:

Ohh, thanks for explaining the dbmodel pattern. I agree that having a shared core layer is much cleaner and safer than duplicating the logic. I'll pause on this implementation and start looking into extracting the CacheStore/operations logic from v1 into a shared layer that both versions can use. Thanks again!

Contributor Author:

Hi @yurishkuro,
I've analyzed the codebase and I'd like to propose the following refactoring plan to achieve the same result for Badger (clean separation without duplication):

  1. Create a shared core (internal/storage/badger/spanstore): a new package to hold the storage logic that is independent of the v1/v2 API.
  • cache.go: Move the CacheStore logic here (currently in v1).
  • writer.go (or core.go): Move the logic for creating badger.Entry (key formatting, indexing, encoding) here. It will operate on model.Span, since Badger stores data in the v1/Proto format natively.
  2. Refactor v1 (internal/storage/v1/badger).
  • Update the v1 Writer/Reader to import and use this new shared core.
  • Remove the duplicated logic from the v1 files.
  3. Implement v2 (internal/storage/v2/badger); a rough sketch of the Writer adapter follows below.
  • Writer: A thin adapter that accepts OTEL ptrace.Traces, converts them to model.Span (using existing adapters), and passes them to the shared core Writer.
  • Reader: Uses the shared cache for metadata and the shared core for fetching spans, converting model.Span back to OTEL format for the response.

Does this plan align with your expectations for the migration? If so, I'll proceed with extracting the core logic first.
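
To make step 3 of the plan concrete, here is a minimal sketch of the thin v2 Writer adapter, assuming the shared core from step 1 exists; CoreWriter and protoFromTraces are hypothetical stand-ins for the extracted core and the existing OTLP-to-model adapters, and the import paths are assumptions:

package badger

import (
    "context"

    "go.opentelemetry.io/collector/pdata/ptrace"

    "github.com/jaegertracing/jaeger-idl/model/v1"
)

// CoreWriter stands in for the shared core extracted in step 1 (hypothetical).
type CoreWriter interface {
    WriteSpan(ctx context.Context, span *model.Span) error
}

// Writer is the thin v2 adapter: OTLP traces in, v1 model spans out to the core.
type Writer struct {
    core CoreWriter
}

func (w *Writer) WriteTraces(ctx context.Context, td ptrace.Traces) error {
    // protoFromTraces is a placeholder for the existing OTLP-to-model adapters.
    for _, span := range protoFromTraces(td) {
        if err := w.core.WriteSpan(ctx, span); err != nil {
            return err
        }
    }
    return nil
}

func protoFromTraces(ptrace.Traces) []*model.Span { return nil } // placeholder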

Contributor:

I don't think the Elasticsearch approach is going to work for Badger, because we don't have anything like dbmodel.Span in Badger. For v1, spans were stored directly in the db without conversion to any database model. Therefore I think we need to refactor v1 directly from model.Span to ptrace.Traces and then move the reader/writer to v2.
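
A rough sketch of the direction suggested in that comment, converting spans decoded from Badger (v1 model.Span) into ptrace.Traces for the v2 API; toTraces is a hypothetical helper and the import paths are assumptions; a real implementation would reuse Jaeger's existing model-to-OTLP adapters rather than mapping fields by hand:

package badger

import (
    "go.opentelemetry.io/collector/pdata/ptrace"

    "github.com/jaegertracing/jaeger-idl/model/v1"
)

// toTraces is a hypothetical helper; a real implementation would reuse
// Jaeger's existing model-to-OTLP adapters instead of this hand mapping.
func toTraces(spans []*model.Span) ptrace.Traces {
    td := ptrace.NewTraces()
    for _, s := range spans {
        rs := td.ResourceSpans().AppendEmpty()
        otelSpan := rs.ScopeSpans().AppendEmpty().Spans().AppendEmpty()
        otelSpan.SetName(s.OperationName)
        // trace/span IDs, timestamps, and tags omitted for brevity
    }
    return td
}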

        // Create the search prefix: [0x82][ServiceName]
        prefix := make([]byte, len(query.ServiceName)+1)
        prefix[0] = operationNameIndexKey
        copy(prefix[1:], query.ServiceName)

        opts := badger.DefaultIteratorOptions
        opts.PrefetchValues = false
        it := txn.NewIterator(opts)
        defer it.Close()

        uniqueOps := make(map[string]struct{})
        for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
            key := it.Item().Key()

            // Key layout: [Prefix(1)][Service(N)][Operation(M)][Time(8)][TraceID(16)]
            opNameStart := 1 + len(query.ServiceName)
            opNameEnd := len(key) - (8 + sizeOfTraceID)

            if opNameEnd > opNameStart {
                opName := string(key[opNameStart:opNameEnd])
                if _, exists := uniqueOps[opName]; !exists {
                    uniqueOps[opName] = struct{}{}
                    operations = append(operations, tracestore.Operation{
                        Name: opName,
                        // TODO: https://github.com/jaegertracing/jaeger/issues/1922
                        // SpanKind is not stored in the Badger v1/v2 index
                    })
                }
            }
        }
        return nil
    })
    return operations, err
}

// Stubs
func (*Reader) GetServices(context.Context) ([]string, error) {
    return nil, nil
}

func (*Reader) GetTraces(context.Context, ...tracestore.GetTraceParams) iter.Seq2[[]ptrace.Traces, error] {
    return func(_ func([]ptrace.Traces, error) bool) {}
}

func (*Reader) FindTraces(context.Context, tracestore.TraceQueryParams) iter.Seq2[[]ptrace.Traces, error] {
    return func(_ func([]ptrace.Traces, error) bool) {}
}

func (*Reader) FindTraceIDs(context.Context, tracestore.TraceQueryParams) iter.Seq2[[]tracestore.FoundTraceID, error] {
    return func(_ func([]tracestore.FoundTraceID, error) bool) {}
}
90 changes: 90 additions & 0 deletions internal/storage/v2/badger/tracestore/reader_test.go
@@ -0,0 +1,90 @@
// Copyright (c) 2025 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package tracestore

import (
    "context"
    "encoding/binary"
    "testing"

    "github.com/dgraph-io/badger/v4"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"

    "github.com/jaegertracing/jaeger/internal/storage/v2/api/tracestore"
)

func TestGetOperations(t *testing.T) {
    // 1. Setup Badger with a temporary directory (standard pattern)
    opts := badger.DefaultOptions(t.TempDir()).WithLoggingLevel(badger.ERROR)
    db, err := badger.Open(opts)
    require.NoError(t, err)

    // Ensure DB is closed after test
    t.Cleanup(func() {
        require.NoError(t, db.Close())
    })

    reader := NewReader(db)

    // 2. Insert dummy data
    err = db.Update(func(txn *badger.Txn) error {
        // Helper to create a key in the exact format:
        // [0x82][ServiceName][OperationName][Timestamp(8)][TraceID(16)]
        createKey := func(service, operation string, id uint64) []byte {
            key := make([]byte, 1+len(service)+len(operation)+8+16)
            key[0] = operationNameIndexKey
            n := 1
            n += copy(key[n:], service)
            n += copy(key[n:], operation)

            // Add dummy timestamp (8 bytes)
            binary.BigEndian.PutUint64(key[n:], 123456)
            n += 8

            // Add dummy trace ID (16 bytes)
            binary.BigEndian.PutUint64(key[n:], 0)    // High
            binary.BigEndian.PutUint64(key[n+8:], id) // Low
            return key
        }

        // Add two operations for "service-a" and one for "service-b"
        require.NoError(t, txn.Set(createKey("service-a", "op-1", 1), []byte{}))
        require.NoError(t, txn.Set(createKey("service-a", "op-1", 2), []byte{})) // Duplicate op name, different trace
        require.NoError(t, txn.Set(createKey("service-a", "op-2", 3), []byte{}))
        require.NoError(t, txn.Set(createKey("service-b", "op-3", 4), []byte{}))
        return nil
    })

    require.NoError(t, err)

    // 3. Verify GetOperations for "service-a"
    ops, err := reader.GetOperations(context.Background(),
        tracestore.OperationQueryParams{
            ServiceName: "service-a",
        })
    require.NoError(t, err)

    // We expect 2 unique operations: "op-1" and "op-2"
    assert.Len(t, ops, 2)
    assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-1")
    assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-2")

    // 4. Verify GetOperations for "service-b"
Copilot AI commented on Feb 1, 2026:

The comment says "4. Verify GetOperations for "service-b"" but this appears to be step 3 of the test (after setup, insert data, and verify service-a). The numbering should be corrected to "3." to match the actual test flow.

Suggested change:
- // 4. Verify GetOperations for "service-b"
+ // 3. Verify GetOperations for "service-b"
    ops, err = reader.GetOperations(context.Background(),
        tracestore.OperationQueryParams{
            ServiceName: "service-b",
        })
    require.NoError(t, err)
    assert.Len(t, ops, 1)
    assert.Equal(t, "op-3", ops[0].Name)

    // 5. Verify GetOperations for non-existent service
    ops, err = reader.GetOperations(context.Background(),
        tracestore.OperationQueryParams{
            ServiceName: "service-c",
        })
    require.NoError(t, err)
    assert.Empty(t, ops)
}