Problem
Simili Bot currently identifies duplicates using vector similarity alone (mathematical distance). This is efficient, but it can produce false positives when issues share similar keywords but have different technical root causes.
Proposed Solution
Implement a "Hybrid Reasoning Step" that uses an LLM to verify similarity results before taking action.
Process:
- Discovery: Use the existing Vector Search to find the Top 5 most similar issues.
- Reasoning: Send the new issue and the Top 5 candidates to an LLM.
- Confirmation: The LLM analyzes the technical context and confirms whether any of the candidates are true duplicates.
- Action: Only trigger the triage workflow (or the two-phase cooldown) if the LLM reports a high-confidence match.
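
The four steps above could be wired together roughly as follows. This is a minimal sketch, not Simili Bot's actual code: the names `Candidate`, `Verdict`, `hybrid_duplicate_check`, the `search` and `judge` callables, and the `0.8` confidence threshold are all hypothetical and stand in for the existing vector search and whichever LLM client is chosen.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """One result from the existing vector search (hypothetical shape)."""
    number: int
    title: str
    body: str
    score: float  # vector similarity score


@dataclass
class Verdict:
    """What the LLM step is assumed to return (hypothetical shape)."""
    duplicate_of: Optional[int]  # issue number, or None if no true duplicate
    confidence: float            # 0.0 to 1.0


def hybrid_duplicate_check(
    new_issue: str,
    search: Callable[[str, int], List[Candidate]],
    judge: Callable[[str, List[Candidate]], Verdict],
    top_k: int = 5,
    min_confidence: float = 0.8,  # assumed threshold, to be tuned
) -> Optional[int]:
    """Return the confirmed duplicate's issue number, or None to take no action."""
    # Discovery: reuse the existing vector search.
    candidates = search(new_issue, top_k)
    if not candidates:
        return None

    # Reasoning + Confirmation: ask the LLM about the shortlisted candidates.
    verdict = judge(new_issue, candidates)

    # Action: only act on a high-confidence match that was actually shortlisted,
    # which also guards against the LLM naming an issue it was never shown.
    shortlisted = {c.number for c in candidates}
    if verdict.duplicate_of in shortlisted and verdict.confidence >= min_confidence:
        return verdict.duplicate_of
    return None
```

Passing `search` and `judge` in as callables keeps the gate independent of any particular embedding store or LLM provider, and makes the decision logic testable with stubs.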
Benefits:
- Precision: Combines the speed of vector search with the deep reasoning of an LLM.
- Context Awareness: Can ignore issues that look similar (e.g., both mention 'Auth') but describe different bugs.
- Better Triage: Enables more nuanced decisions, such as "Issue A is a duplicate of B, but also related to C."
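
To support decisions like "duplicate of B, but also related to C", the LLM could be asked to reply in a small structured format. The sketch below assumes a JSON reply with hypothetical keys `duplicate_of`, `related_to`, and `confidence`; the `TriageDecision` type and `parse_verdict` function are illustrative names, not part of Simili Bot.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class TriageDecision:
    duplicate_of: Optional[int] = None
    related_to: List[int] = field(default_factory=list)
    confidence: float = 0.0


def _is_issue_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def parse_verdict(raw: str, candidate_numbers: Set[int]) -> TriageDecision:
    """Parse the LLM's JSON reply, keeping only issues that were shortlisted."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return TriageDecision()  # unparseable reply: take no action
    if not isinstance(data, dict):
        return TriageDecision()

    dup = data.get("duplicate_of")
    if not (_is_issue_number(dup) and dup in candidate_numbers):
        dup = None

    raw_related = data.get("related_to")
    if not isinstance(raw_related, list):
        raw_related = []
    related = [
        n for n in raw_related
        if _is_issue_number(n) and n in candidate_numbers and n != dup
    ]

    conf = data.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        conf = 0.0
    conf = min(max(float(conf), 0.0), 1.0)  # clamp to [0, 1]

    return TriageDecision(duplicate_of=dup, related_to=related, confidence=conf)
```

Falling back to an empty `TriageDecision` on malformed output means a bad LLM reply results in no action rather than a wrong one.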
Related Issues: