Skip to content

Conversation

@Sudhanshu-NITR
Copy link
Contributor

Which problem is this PR solving?

Description of the changes

  • Implemented GetOperations using efficient prefix scanning (0x82) to retrieve operation names.
  • Added reader_test.go with unit tests covering duplicates, multiple services, and empty results.
  • Added package_test.go to ensure goroutine leak compliance (testutils.VerifyGoLeaks).
  • Ensured full compliance with project naming conventions, error wrapping strings, and import grouping.

How was this change tested?

  • Unit Tests: Added TestGetOperations in reader_test.go which verifies correct data retrieval and de-duplication against a temporary Badger DB (t.TempDir()).
  • Linting: Verified with make lint (passed with 0 issues).
  • Compliance: Verified imports and formatting with make fmt.

Checklist

Signed-off-by: Sudhanshu-NITR <sudhanshu.kadam.99@gmail.com>
@Sudhanshu-NITR Sudhanshu-NITR requested a review from a team as a code owner February 1, 2026 11:23
Copilot AI review requested due to automatic review settings February 1, 2026 11:23
@dosubot
Copy link

dosubot bot commented Feb 1, 2026

Related Documentation

No published documentation to review for changes on this repository.

Write your first living document

How did I do? Any feedback?  Join Discord

@dosubot dosubot bot added the storage/badger Issues related to badger storage label Feb 1, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements the GetOperations method for the Badger v2 storage backend, which is part of the ongoing effort to upgrade Badger storage to the v2 API (issue #7937). The implementation reads operation names from Badger's operation index using efficient prefix scanning.

Changes:

  • Implemented GetOperations method in the Badger v2 Reader to retrieve operation names for a given service
  • Added comprehensive unit tests covering duplicate handling, multiple services, and various scenarios
  • Added goroutine leak detection test infrastructure

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File Description
internal/storage/v2/badger/tracestore/reader.go Implements the GetOperations method using prefix scanning on operation index keys (0x82), plus stub implementations for other Reader interface methods
internal/storage/v2/badger/tracestore/reader_test.go Unit tests verifying correct operation retrieval, deduplication, and multi-service scenarios
internal/storage/v2/badger/tracestore/package_test.go Adds goroutine leak detection using testutils.VerifyGoLeaks

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


require.NoError(t, err)

// Verify GetOperations for "service-a"
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment references "3. Verify GetOperations" but there's no explicit comment for step 3. The comment numbering should be consistent throughout the test. Consider adding a comment "// 3. Verify GetOperations for 'service-a'" above line 62 to maintain clarity.

Suggested change
// Verify GetOperations for "service-a"
// 3. Verify GetOperations for "service-a"

Copilot uses AI. Check for mistakes.
Comment on lines 18 to 82
func TestGetOperations(t *testing.T) {
// 1. Setup Badger with a temporary directory (standard pattern)
opts := badger.DefaultOptions(t.TempDir()).WithLoggingLevel(badger.ERROR)
db, err := badger.Open(opts)
require.NoError(t, err)

// Ensure DB is closed after test
t.Cleanup(func() {
require.NoError(t, db.Close())
})

reader := NewReader(db)

// 2. Insert Dummy data
err = db.Update(func(txn *badger.Txn) error {
// Helper to create a key in the exact format:
// [0x82][ServiceName][Timestamp(8)][TraceID(16)]
createKey := func(service, operation string, id uint64) []byte {
key := make([]byte, 1+len(service)+len(operation)+8+16)
key[0] = operationNameIndexKey
n := 1
n += copy(key[n:], service)
n += copy(key[n:], operation)

// Add dummy timestamp (8 bytes)
binary.BigEndian.PutUint64(key[n:], 123456)
n += 8

// Add dummy trace ID (16 bytes)
binary.BigEndian.PutUint64(key[n:], 0) // High
binary.BigEndian.PutUint64(key[n+8:], id) // Low
return key
}

// Add two operations for "service-a" and one for "service-b"
require.NoError(t, txn.Set(createKey("service-a", "op-1", 1), []byte{}))
require.NoError(t, txn.Set(createKey("service-a", "op-1", 2), []byte{})) // Duplicate op name, different trace
require.NoError(t, txn.Set(createKey("service-a", "op-2", 3), []byte{}))
require.NoError(t, txn.Set(createKey("service-b", "op-3", 4), []byte{}))
return nil
})

require.NoError(t, err)

// Verify GetOperations for "service-a"
ops, err := reader.GetOperations(context.Background(),
tracestore.OperationQueryParams{
ServiceName: "service-a",
})
require.NoError(t, err)

// We expect 2 unique operations: "op-1" and "op-2"
assert.Len(t, ops, 2)
assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-1")
assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-2")

// 4. Verify GetOperations for "service-b"
ops, err = reader.GetOperations(context.Background(),
tracestore.OperationQueryParams{
ServiceName: "service-b",
})
require.NoError(t, err)
assert.Len(t, ops, 1)
assert.Equal(t, "op-3", ops[0].Name)
}
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test doesn't cover the edge case where an empty or non-existent service name is queried. Consider adding a test case that verifies GetOperations returns an empty slice (not an error) when querying for a service that doesn't exist, which appears to be the expected behavior based on the v1 implementation (internal/storage/v1/badger/spanstore/cache.go:90).

Copilot uses AI. Check for mistakes.
defer it.Close()

uniqueOps := make(map[string]struct{})
for it.Seek(prefix); it.ValidForPrefix((prefix)); it.Next() {
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra parentheses around "prefix" in "it.ValidForPrefix((prefix))" are unnecessary. This should be "it.ValidForPrefix(prefix)" for consistency with the rest of the codebase (see internal/storage/v1/badger/spanstore/reader.go:128, 184, 633, 657).

Suggested change
for it.Seek(prefix); it.ValidForPrefix((prefix)); it.Next() {
for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {

Copilot uses AI. Check for mistakes.
// 2. Insert Dummy data
err = db.Update(func(txn *badger.Txn) error {
// Helper to create a key in the exact format:
// [0x82][ServiceName][Timestamp(8)][TraceID(16)]
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 34 says the key format is "[0x82][ServiceName][Timestamp(8)][TraceID(16)]", but this is incomplete. According to the writer implementation (internal/storage/v1/badger/spanstore/writer.go:73), the actual format is "[0x82][ServiceName+OperationName][Timestamp(8)][TraceID(16)]". The comment should be updated to "[0x82][ServiceName][OperationName][Timestamp(8)][TraceID(16)]" to accurately reflect what the test is creating.

Suggested change
// [0x82][ServiceName][Timestamp(8)][TraceID(16)]
// [0x82][ServiceName][OperationName][Timestamp(8)][TraceID(16)]

Copilot uses AI. Check for mistakes.
Comment on lines 43 to 52
// Create the search prefix: [0x82][ServiceName]
prefix := make([]byte, len(query.ServiceName)+1)
prefix[0] = operationNameIndexKey
copy(prefix[1:], query.ServiceName)

it := txn.NewIterator(badger.DefaultIteratorOptions)
defer it.Close()

uniqueOps := make(map[string]struct{})
for it.Seek(prefix); it.ValidForPrefix((prefix)); it.Next() {
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator is using DefaultIteratorOptions, which prefetches values by default. However, this implementation only reads keys and never accesses the values (which are empty for index entries anyway). For better performance, consider setting "opts.PrefetchValues = false" as done in similar code in the v1 implementation (internal/storage/v1/badger/spanstore/reader.go:586). This would reduce unnecessary I/O operations.

Suggested change
// Create the search prefix: [0x82][ServiceName]
prefix := make([]byte, len(query.ServiceName)+1)
prefix[0] = operationNameIndexKey
copy(prefix[1:], query.ServiceName)
it := txn.NewIterator(badger.DefaultIteratorOptions)
defer it.Close()
uniqueOps := make(map[string]struct{})
for it.Seek(prefix); it.ValidForPrefix((prefix)); it.Next() {
// Create the search prefix: [0x82][ServiceName]
prefix := make([]byte, len(query.ServiceName)+1)
prefix[0] = operationNameIndexKey
copy(prefix[1:], query.ServiceName)
opts := badger.DefaultIteratorOptions
opts.PrefetchValues = false
it := txn.NewIterator(opts)
defer it.Close()
uniqueOps := make(map[string]struct{})
for it.Seek(prefix); it.ValidForPrefix((prefix)); it.Next() {

Copilot uses AI. Check for mistakes.
Comment on lines 63 to 65
operations = append(operations, tracestore.Operation{
Name: opName,
})
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Operation struct includes a SpanKind field that is not being populated. According to the API definition in tracestore.Operation (reader.go:128-131) and implementations in other backends (e.g., memory, clickhouse), the SpanKind field should be set based on the span data. However, Badger v1 stores operation names as concatenated "ServiceName+OperationName" without span kind information in the key, which means span kind cannot be extracted from the index key alone.

This is a known limitation in Badger v1 (see the TODO comment at internal/storage/v1/badger/spanstore/cache.go:103-104). Since this v2 implementation is directly reading from the same key format as v1, it inherits this limitation.

For now, leaving SpanKind empty is acceptable since:

  1. The v1 implementation doesn't support it either
  2. The OperationQueryParams.SpanKind filter parameter is also not being used, which is consistent with not storing span kind
  3. This matches the current behavior of the v1adapter

However, this should be documented, and the SpanKind query parameter should ideally be validated to ensure it's not used (or a warning should be logged if provided).

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct! span kind should be present in v2

}

// GetOperations returns all operation names for a given service.
func (r *Reader) GetOperations(_ context.Context, query tracestore.OperationQueryParams) ([]tracestore.Operation, error) {
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OperationQueryParams.SpanKind parameter is being ignored in this implementation. While this is consistent with the v1 Badger implementation (which doesn't store span kind in the operation index), the query parameter should be validated. If a user provides a non-empty SpanKind, the implementation should either:

  1. Return an error indicating that span kind filtering is not supported
  2. Log a warning and ignore the parameter
  3. Document this limitation in the function comment

Looking at other v2 implementations (memory, clickhouse), they do support SpanKind filtering, so silently ignoring it here could lead to confusion where queries work differently depending on the backend.

Copilot uses AI. Check for mistakes.
assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-1")
assert.Contains(t, []string{ops[0].Name, ops[1].Name}, "op-2")

// 4. Verify GetOperations for "service-b"
Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "4. Verify GetOperations for "service-b"" but this appears to be step 3 of the test (after setup, insert data, and verify service-a). The numbering should be corrected to "3." to match the actual test flow.

Suggested change
// 4. Verify GetOperations for "service-b"
// 3. Verify GetOperations for "service-b"

Copilot uses AI. Check for mistakes.
Signed-off-by: Sudhanshu-NITR <sudhanshu.kadam.99@gmail.com>
Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
Signed-off-by: Sudhanshu Umesh Kadam <145262601+Sudhanshu-NITR@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 1, 2026 12:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +39 to +40
// GetOperations returns all operation names for a given service.

Copy link

Copilot AI Feb 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider removing the extra blank line between the function comment and the function declaration to maintain consistency with Go conventions. Most functions in the codebase don't have this blank line.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,14 @@
// Copyright (c) 2025 The Jaeger Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time travel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, good catch! I'm still mentally in 2025. I will update the copyright headers code to 2026. 😅


var operations []tracestore.Operation

err := r.db.View(func(txn *badger.Txn) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already implementation of this logic in v1. This is not how we approach code migration, by rewriting it and introducing unknown bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. My goal was to provide a native v2 implementation, but I agree duplication is bad. Could you clarify if you prefer I expose the v1 CacheStore logic to be reusable here, or should v2 structurally wrap the v1 implementation for now? I want to align with the migration path you have in mind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way we upgraded Elasticsearch implementation is we refactored its internal "core storage layer" to work on a dbmodel that was independent of v1/v2 Storage API. Both v1 and v2 ES implementations then shared that core storage layer. Refactoring was surgical, without large code changes or code movements, which makes the code reviewable with high confidence (while copy & paste is not).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, Thanks for explaining the dbmodel pattern—I agree that having a shared core layer is much cleaner and safer than duplicating the logic.
I'll pause on this implementation and start looking into extracting the CacheStore/operations logic from v1 into a shared layer that both versions can use.
Thanks again!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yurishkuro,
I've analyzed the codebase and I'd like to propose the following refactoring plan to achieve the same result for Badger (clean separation without duplication):

  1. Create Shared Core (internal/storage/badger/spanstore) I will create a new package to hold the storage logic that is independent of the V1/V2 API.
  • cache.go: Move the CacheStore logic here (currently in v1).
  • writer.go (or core.go): Move the logic for creating badger.Entry (Key formatting, Indexing, Encoding) here. It will operate on model.Span (Since Badger stores data in V1/Proto format natively).
  1. Refactor V1 (internal/storage/v1/badger)
  • Update the V1 Writer/Reader to import and use this new Shared Core.
  • Remove the duplicated logic from V1 files.
  1. Implement V2 (internal/storage/v2/badger)
  • Writer: A thin adapter that accepts OTEL ptrace.Traces, converts them to model.Span (using existing adapters), and passes them to the Shared Core Writer.
  • Reader: Uses the Shared Cache for metadata and Shared Core for fetching spans, converting model.Span back to OTEL format for the response.

Does this plan align with your expectations for the migration? If so, I'll proceed with extracting the core logic first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the elasticsearch approach is gonna work for badger because we don't have anything like dbmodel.Span in badger. For v1, spans were directly stored in db without changing to any database model. Therefore I think we need to directly refactor model.Span to ptrace.Traces in v1 and then move the reader/writer to v2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

storage/badger Issues related to badger storage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants