Enhancement: Hybrid Reasoning Step for Similarity Confirmation #56

@Kavirubc

Problem

Simili Bot currently identifies duplicates using vector similarity (the mathematical distance between issue vectors) alone. This is efficient, but it can produce false positives when issues share similar keywords yet have different technical root causes.

Proposed Solution

Implement a "Hybrid Reasoning Step" that uses an LLM to verify similarity results before taking action.

Process:

  1. Discovery: Use the existing Vector Search to find the Top 5 most similar issues.
  2. Reasoning: Send the new issue and the Top 5 candidates to an LLM.
  3. Confirmation: The LLM analyzes the technical context and confirms whether any of the candidates are true duplicates.
  4. Action: Trigger the triage workflow (or the two-phase cooldown) only if the LLM reports a high-confidence match.
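
The four steps above could be wired together roughly as follows. This is a minimal sketch, not Simili Bot's actual code: the `Issue` and `Verdict` types, the `search` and `judge` callables, and the 0.8 threshold are all assumptions made for illustration (the proposal only says "high-confidence"). Both external dependencies are injected so the gating logic can be tested without a vector store or an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

TOP_K = 5                   # from the proposal: Top 5 most similar issues
CONFIDENCE_THRESHOLD = 0.8  # assumed value; the proposal does not specify one


@dataclass(frozen=True)
class Issue:
    number: int
    title: str
    body: str


@dataclass(frozen=True)
class Verdict:
    candidate: int      # issue number of the candidate being judged
    is_duplicate: bool  # the LLM's yes/no answer for this candidate
    confidence: float   # 0.0 to 1.0, as reported by the LLM


# Hypothetical interfaces standing in for the vector search and the LLM call.
VectorSearch = Callable[[Issue, int], Sequence[Issue]]
LLMJudge = Callable[[Issue, Sequence[Issue]], Sequence[Verdict]]


def confirm_duplicate(
    new_issue: Issue,
    search: VectorSearch,
    judge: LLMJudge,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Optional[Verdict]:
    """Return the best confirmed duplicate, or None if triage should not run."""
    # 1. Discovery: cheap vector search narrows the field to TOP_K candidates.
    candidates = search(new_issue, TOP_K)
    if not candidates:
        return None  # nothing similar, so skip the LLM call entirely

    # 2. Reasoning: the LLM sees the new issue plus all candidates at once.
    verdicts = judge(new_issue, candidates)

    # 3. Confirmation: keep only verdicts the LLM marks as true duplicates
    #    with confidence at or above the threshold.
    confirmed = [
        v for v in verdicts if v.is_duplicate and v.confidence >= threshold
    ]

    # 4. Action: the caller triggers triage only when a verdict is returned.
    return max(confirmed, key=lambda v: v.confidence) if confirmed else None
```

Returning `None` for "no action" keeps the decision in one place: the caller triggers the triage workflow only when it receives a `Verdict`, and a vector match that the LLM rejects never reaches that point.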

Benefits:

  • Precision: Combines the speed of vector search with the deep reasoning of an LLM.
  • Context Awareness: Can ignore issues that look similar (e.g., both mention 'Auth') but describe different bugs.
  • Better Triage: Enables more nuanced decisions, such as "Issue A is a duplicate of B, but also related to C."
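
A verdict like "duplicate of B, but also related to C" implies the LLM reply carries more than a yes/no answer. One possible approach is to ask the model for a JSON array and validate it strictly before acting. The schema below (`issue`, `relation`, `confidence`) and the relation names are hypothetical, chosen only to illustrate the idea.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed vocabulary; the proposal only mentions "duplicate" and "related".
ALLOWED_RELATIONS = {"duplicate", "related", "unrelated"}


@dataclass(frozen=True)
class Relation:
    issue: int         # candidate issue number
    relation: str      # one of ALLOWED_RELATIONS
    confidence: float  # 0.0 to 1.0


def parse_relations(reply: str) -> List[Relation]:
    """Parse an LLM reply; malformed entries are dropped rather than trusted."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return []  # unparseable reply: treat as "no confirmed relations"
    if not isinstance(data, list):
        return []

    relations: List[Relation] = []
    for item in data:
        if not isinstance(item, dict):
            continue
        issue = item.get("issue")
        relation = item.get("relation")
        confidence = item.get("confidence")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(issue, bool) or not isinstance(issue, int):
            continue
        if not isinstance(relation, str) or relation not in ALLOWED_RELATIONS:
            continue
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            continue
        if not 0.0 <= confidence <= 1.0:
            continue
        relations.append(Relation(issue, relation, float(confidence)))
    return relations
```

Treating an unparseable or malformed reply as "no relations" keeps the failure mode conservative: the bot takes no action rather than acting on a guess.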

Related Issues:

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request), v0.2.0 (Target for v0.2.0)
Status: In Progress