perf: deduplicate identical texts within a single embed_batch() call #243

@ztsalexey

Description

Context

Deferred from PR #235. Copilot flagged redundant API calls for duplicate texts; we replied that it's a valid but low-priority future optimization.

Problem

If embed_batch(&["foo", "bar", "foo"]) is called with duplicates and none are cached, miss_texts includes "foo" twice, resulting in a redundant API call for the same content. The cache still returns correct results — it just wastes an HTTP round-trip for the duplicate.

Proposed Solution

Before calling the inner provider, group misses by cache key:

  1. Build a HashMap<String, Vec<usize>> mapping unique text → list of original indices
  2. Call the inner provider only for unique texts
  3. Fan out the returned embeddings back to all original indices
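The three steps above could be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `dedup_and_fan_out` and the `embed_unique` callback are hypothetical names, and the real implementation would go through the inner provider trait and handle errors.

```rust
use std::collections::HashMap;

// Hypothetical sketch of the proposed dedup + fan-out for miss_texts.
// `embed_unique` stands in for the inner provider call.
fn dedup_and_fan_out(
    miss_texts: &[String],
    embed_unique: impl Fn(&[String]) -> Vec<Vec<f32>>,
) -> Vec<Vec<f32>> {
    // 1. Build a map: unique text -> list of original indices.
    let mut groups: HashMap<String, Vec<usize>> = HashMap::new();
    let mut unique: Vec<String> = Vec::new();
    for (i, text) in miss_texts.iter().enumerate() {
        groups
            .entry(text.clone())
            .or_insert_with(|| {
                // First time we see this text: remember it for the provider call.
                unique.push(text.clone());
                Vec::new()
            })
            .push(i);
    }

    // 2. Call the inner provider only for the unique texts.
    let embeddings = embed_unique(&unique);

    // 3. Fan each returned embedding back out to all of its original indices.
    let mut out = vec![Vec::new(); miss_texts.len()];
    for (text, emb) in unique.iter().zip(embeddings) {
        for &i in &groups[text] {
            out[i] = emb.clone();
        }
    }
    out
}
```

With `&["foo", "bar", "foo"]`, the provider sees only `["foo", "bar"]`, and the two `"foo"` slots in the output receive clones of the same embedding.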

Why It Was Deferred

  • Cache works correctly as-is — the redundant call is a minor inefficiency
  • In practice, callers rarely pass duplicate texts in a single batch
  • The dedup + fan-out logic adds complexity for minimal real-world savings

Effort

Medium — need the dedup map, unique-only provider call, and index fan-out.
