Context
Deferred from PR #235 — Copilot flagged redundant API calls for duplicate texts, and we replied it's a valid but low-priority future optimization.
Problem
If embed_batch(&["foo", "bar", "foo"]) is called with duplicates and none are cached, miss_texts includes "foo" twice, resulting in a redundant API call for the same content. The cache still returns correct results — it just wastes an HTTP round-trip for the duplicate.
Proposed Solution
Before calling the inner provider, group misses by cache key:
- Build a HashMap<String, Vec<usize>> mapping unique text → list of original indices
- Call the inner provider only for unique texts
- Fan out the returned embeddings back to all original indices
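The steps above can be sketched as follows. This is a minimal illustration, not the crate's actual API: mock_provider and embed_unique are hypothetical names standing in for the inner provider and the cache's miss path.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the inner provider: returns one fake
// embedding (here, just the text length) per input text.
fn mock_provider(texts: &[&str]) -> Vec<Vec<f32>> {
    texts.iter().map(|t| vec![t.len() as f32]).collect()
}

// Deduplicate miss_texts before the provider call, then fan the
// returned embeddings back out to every original index.
fn embed_unique(miss_texts: &[&str]) -> Vec<Vec<f32>> {
    // unique text -> list of original indices
    let mut index_map: HashMap<&str, Vec<usize>> = HashMap::new();
    let mut unique: Vec<&str> = Vec::new();
    for (i, &text) in miss_texts.iter().enumerate() {
        index_map
            .entry(text)
            .or_insert_with(|| {
                unique.push(text); // first occurrence: remember order
                Vec::new()
            })
            .push(i);
    }

    // Single provider call covering unique texts only.
    let embeddings = mock_provider(&unique);

    // Fan out: copy each unique embedding to all its original slots.
    let mut results = vec![Vec::new(); miss_texts.len()];
    for (emb, text) in embeddings.into_iter().zip(unique.into_iter()) {
        for &i in &index_map[text] {
            results[i] = emb.clone();
        }
    }
    results
}
```

With this shape, embed_unique(&["foo", "bar", "foo"]) issues one provider call for ["foo", "bar"] and still returns three embeddings in the original order.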
Why It Was Deferred
- Cache works correctly as-is — the redundant call is a minor inefficiency
- In practice, callers rarely pass duplicate texts in a single batch
- The dedup + fan-out logic adds complexity for minimal real-world savings
Effort
Medium — need the dedup map, unique-only provider call, and index fan-out.