v0.1.5

@Kavirubc released this 11 Feb 12:15
· 38 commits to main since this release
cee4697

Release v0.1.5

🎉 Major Features

🔧 Batch CLI Command

Process multiple issues from JSON files against the vector database without making any GitHub writes.

Features:

  • Process arrays of issues from JSON files
  • Support JSON (detailed) and CSV (stakeholder-friendly) output formats
  • Concurrent processing with configurable worker pool
  • Override collection, thresholds, and top-k via CLI flags
  • Perfect for testing bot logic on historical data
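
The batch flow listed above (read an array of issues, analyze them concurrently, write a flat CSV) can be sketched roughly as follows. This is an illustrative Python sketch, not simili's actual implementation: the `analyze` function, the issue fields (`number`, `title`), and the CSV columns are assumptions made up for the example.

```python
import csv
import io
import json
from concurrent.futures import ThreadPoolExecutor


def analyze(issue):
    # Hypothetical stand-in for the real analysis pipeline.
    # It only reads its input: no network calls, no GitHub writes.
    return {"number": issue["number"], "title": issue["title"], "is_duplicate": False}


def run_batch(issues_json, workers=5):
    """Analyze a JSON array of issues with a worker pool and return CSV text."""
    issues = json.loads(issues_json)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in input order even though work runs concurrently.
        results = list(pool.map(analyze, issues))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["number", "title", "is_duplicate"])
    writer.writeheader()
    writer.writerows(results)
    return out.getvalue()
```

Keeping results in input order makes the CSV easy to compare against the source file when reviewing historical data.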

Usage:

simili batch --file issues.json --out-file results.csv --format csv --workers 5

Use Cases:

  • Test bot logic on historical issues without spamming repos
  • Generate analysis reports for stakeholders
  • Validate similarity search and duplicate detection accuracy
  • Audit quality assessment without requiring write access

🌐 Web UI (simili-web)

Interactive web interface for real-time issue analysis with a minimal shadcn-style design.

Features:

  • Submit issues via web form (title, body, org, repo)
  • View similar issues with similarity scores
  • See duplicate detection with LLM reasoning
  • Get quality scores and improvement suggestions
  • Receive label and transfer recommendations
  • Single-binary deployment with embedded static files

Usage:

export GEMINI_API_KEY=xxx QDRANT_URL=xxx QDRANT_API_KEY=xxx QDRANT_COLLECTION=xxx
./simili-web
# Open http://localhost:8080

🐛 Bug Fixes

Duplicate Detection Improvements

  • Fixed: Related issues incorrectly marked as duplicates
  • Root cause: Only titles were passed to the LLM, not full issue bodies
  • Solution: Now extracts and passes the full text field from the Qdrant payload
  • Rewrote duplicate detection prompt to be more conservative
  • Raised confidence threshold to 0.85
  • Added DuplicateReason field to capture LLM reasoning

Before: Issue chain #8640 → #8641 → #8642 incorrectly marked as duplicates
After: Issues correctly identified as related but not duplicates
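
As a rough illustration of the fix, the sketch below prefers the full text from a stored payload over the title and only accepts a duplicate verdict at or above the 0.85 threshold. It is a Python sketch under assumptions, not the project's code: the payload keys (`text`, `title`), the `Verdict` type, and the use of `>=` rather than `>` at the threshold are all hypothetical.

```python
from dataclasses import dataclass

DUPLICATE_THRESHOLD = 0.85  # value stated in these release notes


@dataclass
class Verdict:
    is_duplicate: bool
    duplicate_reason: str  # mirrors the idea of the DuplicateReason field


def candidate_text(payload):
    # Prefer the full issue text; fall back to the title if no text is stored.
    # The key names "text" and "title" are assumptions for this sketch.
    return payload.get("text") or payload.get("title", "")


def decide(llm_says_duplicate, confidence, reason):
    # A duplicate is only reported when the model says so AND is confident enough.
    is_dup = bool(llm_says_duplicate) and confidence >= DUPLICATE_THRESHOLD
    return Verdict(is_duplicate=is_dup, duplicate_reason=reason)
```

With this shape, a related-but-distinct issue that the model flags with 0.8 confidence is kept as related, and the reasoning is still recorded.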

Transfer Detection

  • Fixed the transfer comment message shown when an issue stays in the current repo
  • Prevented incorrect routing by including the current repo in the candidates
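
One way to picture the routing fix: if the current repo competes with the other candidates, an issue that already sits in the best-matching repo produces no transfer. The Python sketch below assumes hypothetical names (`pick_transfer_target`, `transfer_message`), a simple score map, and invented message wording; it is not the bot's actual logic.

```python
def pick_transfer_target(current_repo, scores):
    """Return the repo to transfer to, or None if the issue should stay put.

    `scores` maps repo name -> match score (a hypothetical shape).
    """
    candidates = dict(scores)
    # The current repo always takes part in the comparison.
    candidates.setdefault(current_repo, 0.0)
    best = max(candidates, key=candidates.get)
    return None if best == current_repo else best


def transfer_message(current_repo, target):
    # Message wording is illustrative only.
    if target is None:
        return f"This issue stays in {current_repo}."
    return f"Suggested transfer: {current_repo} -> {target}."
```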

✨ Enhancements

Message Templates

  • Redesigned all bot messages with professional branding
  • Improved clarity and consistency across all interactions

Hybrid Routing

  • Implemented hybrid routing with repository documentation learning
  • Combines rule-based and VDB semantic matching for better accuracy
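
To make the idea of combining rule-based and semantic signals concrete, here is a minimal Python sketch that blends a keyword rule score with a semantic similarity score. The weighting scheme, the `rule_weight` default, and both function names are assumptions for illustration; the release notes do not describe how simili actually combines the two.

```python
def rule_score(text, keywords):
    # Fraction of a repo's keywords that appear in the issue text.
    lowered = text.lower()
    if not keywords:
        return 0.0
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords)


def hybrid_score(text, keywords, semantic, rule_weight=0.4):
    # Weighted blend: `semantic` is a similarity in [0, 1] from a vector search.
    return rule_weight * rule_score(text, keywords) + (1 - rule_weight) * semantic
```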

🔧 Technical Changes

  • Extracted reusable ExecutePipeline() function for batch processing
  • Added URL sanitization to prevent XSS vulnerabilities
  • Improved error handling for JSON encoding operations
  • Fixed parameter shadowing issues
  • Updated embedding model to gemini-embedding-001
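
For readers unfamiliar with URL sanitization, a common approach is to allow only `http` and `https` links and replace everything else (such as `javascript:` URLs) with a harmless fallback. The Python sketch below shows that general technique; the function name `safe_url` and the `#` fallback are assumptions, and the project's own sanitizer may work differently.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def safe_url(url, fallback="#"):
    """Return the URL if it is an absolute http(s) link, else a safe fallback."""
    cleaned = url.strip()
    try:
        parts = urlsplit(cleaned)
    except ValueError:
        # Malformed input (for example a broken IPv6 host) is rejected outright.
        return fallback
    if parts.scheme.lower() in ALLOWED_SCHEMES and parts.netloc:
        return cleaned
    return fallback
```

An allowlist of schemes is generally safer than trying to block known-bad ones, since new dangerous schemes do not need to be anticipated.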

📚 Documentation

  • Added comprehensive README for simili-web
  • Updated main README with batch CLI usage
  • Added API documentation and troubleshooting guides
  • Included examples for both CLI and web UI

📦 What's Changed

Full Changelog: v0.1.0...v0.1.5

Issues Closed

  • Closes #34: Batch Issue Processing CLI
  • Closes #35: Simple Web Dashboard for Dry run