Description
Parent epic: #74
Phase: 3 (quality upgrade)
Problem: Lexical Search Misses Meaning
The current recall system (Phase 1-2) uses lexical search: it finds archived messages by matching exact words. This works when the query and the archive share vocabulary, but fails when they use different words for the same thing:
| User asks | Archive contains | Lexical match? |
|---|---|---|
| "breast cancer gene" | "BRCA1 mutation analysis" | ❌ No |
| "run the analysis" | "execute the pipeline" | ❌ No |
| "the paper about autism" | "ASD prevalence study (PMID:12345)" | ❌ No |
Scientific conversations have high terminology variation: gene symbols vs full names, abbreviations vs expansions, colloquial vs formal terms.
Solution: vecflow-style Hybrid Recall
Based on the design work in the semantic-memory-cli workspace, we adopt a single-binary, filesystem-native approach inspired by the vecflow concept:
Core Principles
- All-in-one binary: Archive, index, search (lexical + vector), recall in one tool
- Filesystem-native state: Memory artifacts are files, compatible with Git/Dropbox
- Fail-open hybrid retrieval: Degrades to lexical-only when vectors unavailable
- Per-workspace isolation: Each routed channel gets its own memory directory
Per-Workspace Memory Architecture
workspace/                      # e.g., ~/Dropbox/sciclaw/nihc3i-dave/
├── sessions/                   # Live session state
├── discord-archives/           # Phase 1-2 archives (.md)
└── memory/                     # NEW: vecflow-style memory
    ├── archives/               # Archived transcripts
    ├── index/
    │   ├── lexical/            # BM25 inverted index
    │   └── vectors/            # Embedding vectors (.vec sidecars)
    ├── chunks/                 # Pre-chunked text (~200 words)
    └── config.toml             # Model, chunk size, fusion weights
Critical: Each Discord channel routes to a separate workspace, so memory is automatically isolated per-channel. No cross-channel context bleeding.
Hybrid Search Pipeline
Query ─┬─→ BM25 Search ───────────────→ Lexical Ranks ─┐
       │                                               ├─→ RRF Merge → Top-K
       └─→ Embed Query → kNN Search ──→ Vector Ranks ──┘
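Reciprocal Rank Fusion (RRF):
RRF(d) = Σ_r 1/(k + rank_r(d)), summed over the rankers r (here BM25 and vector kNN), where k=60 and rank_r(d) is the 1-based position of d in ranker r's list
RRF uses only ranks, so no score normalization is needed and the merge is robust to differences between the BM25 and cosine score distributions.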
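A minimal Go sketch of the fusion step, assuming each ranker returns a deduplicated list of chunk IDs ordered best first. The names `rrfMerge` and `fused` and the chunk IDs are illustrative only. A chunk missing from one list simply contributes nothing for that ranker.

```go
package main

import (
	"fmt"
	"sort"
)

// fused is one merged result with its RRF score.
type fused struct {
	ID    string
	Score float64
}

// rrfMerge fuses ranked ID lists (best first, no duplicates within a list).
// Ranks are 1-based: the first entry of each list contributes 1/(k+1).
func rrfMerge(k float64, topK int, rankings ...[]string) []fused {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	out := make([]fused, 0, len(scores))
	for id, s := range scores {
		out = append(out, fused{id, s})
	}
	// Sort by score descending; break ties by ID so output is deterministic.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID
	})
	if topK >= 0 && topK < len(out) {
		out = out[:topK]
	}
	return out
}

func main() {
	lexical := []string{"c1", "c2", "c3"} // BM25 order (toy data)
	vector := []string{"c4", "c2", "c1"}  // kNN order (toy data)
	for _, f := range rrfMerge(60, 4, lexical, vector) {
		fmt.Printf("%s %.5f\n", f.ID, f.Score)
	}
}
```

With these toy lists, c1 (ranks 1 and 3) narrowly beats c2 (rank 2 in both), and both outrank c4 and c3, which each appear in only one list.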
Implementation: Pure Go with chromem-go
Why Go? Matches sciclaw's existing codebase, cross-compiles easily, excellent for CLI.
Vector DB: chromem-go — pure Go, zero deps, Chroma-like API
- In-memory with file persistence
- Built-in cosine similarity search
- Stores documents, embeddings ([]float32), metadata
Embedding: Local ONNX inference
- all-MiniLM-L6-v2 (22M params, 384-dim, <5ms/chunk on CPU)
- Or nomic-embed-text via Ollama if deployed (768-dim, ~20ms/chunk)
- Model configured per-workspace in config.toml
CLI Commands (integrated into sciclaw)
sciclaw memory init # Initialize memory dir in workspace
sciclaw memory index # Build/rebuild lexical + vector indices
sciclaw memory search "BRCA1" # Hybrid search
sciclaw memory recall "gene mutation" # Search + format for context injection
sciclaw memory status # Index health, chunk count, staleness
Latency Budget (<200ms total)
| Stage | Target |
|---|---|
| Query embedding | <50ms |
| Vector kNN search | <20ms |
| BM25 search | <20ms |
| RRF fusion + format | <10ms |
| I/O overhead | <100ms |
Fail-Open Behavior
When the vector index is unavailable:
- Log warning
- Fall back to lexical-only search
- Agent continues working (no crash, no empty results)
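The fallback path might look like the following sketch. All names here (`searchFunc`, `hybridSearch`, `errNoVectors`, `unionDedup`) are hypothetical. `unionDedup` is a simple stand-in for the RRF merge so that the example stays focused on the fail-open control flow, and the two searches run sequentially here rather than in parallel.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// searchFunc returns up to n chunk IDs, best first.
type searchFunc func(query string, n int) ([]string, error)

// errNoVectors is the error a vector backend might return (hypothetical).
var errNoVectors = errors.New("vector index unavailable")

// unionDedup is a stand-in for the RRF merge: it concatenates the lists in
// order, drops duplicates, and keeps at most n IDs.
func unionDedup(n int, lists ...[]string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, l := range lists {
		for _, id := range l {
			if !seen[id] && len(out) < n {
				seen[id] = true
				out = append(out, id)
			}
		}
	}
	return out
}

// hybridSearch fails open: a vector-side error is logged and the lexical
// results are returned alone. The bool reports whether it degraded.
func hybridSearch(query string, n int, lexical, vector searchFunc) ([]string, bool, error) {
	lex, err := lexical(query, n)
	if err != nil {
		return nil, false, fmt.Errorf("lexical search: %w", err)
	}
	vec, verr := vector(query, n)
	if verr != nil {
		log.Printf("warning: %v; falling back to lexical-only", verr)
		return lex, true, nil
	}
	return unionDedup(n, lex, vec), false, nil
}

func main() {
	lexical := func(q string, n int) ([]string, error) { return []string{"c1", "c2"}, nil }
	broken := func(q string, n int) ([]string, error) { return nil, errNoVectors }

	ids, degraded, err := hybridSearch("BRCA1", 5, lexical, broken)
	fmt.Println(ids, degraded, err)
}
```

Returning the `degraded` flag alongside the results lets the caller (or `sciclaw memory status`) surface that recall quality is reduced without treating it as a failure.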
Implementation Plan
- Add chromem-go dependency and embedding wrapper
- Chunking logic: Split archives on paragraphs, ~200 words, overlap 50
- Vector index: .vec sidecar files alongside chunks
- Hybrid recall: Parallel BM25 + kNN, RRF merge, dedup
- CLI surface: sciclaw memory {init,index,search,recall,status}
- Integration: Wire into agent loop's auto-recall path
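The chunking step in the plan could be sketched as below. Assumptions: paragraphs are separated by a blank line (`"\n\n"`), "overlap 50" means 50 words carried from the end of one chunk into the next, and a paragraph longer than the target is split mid-paragraph. The function name `chunkText` is hypothetical. The demo uses tiny sizes for readability, and the plan's values would be `chunkText(text, 200, 50)`.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkText packs paragraphs into chunks of at most target words. Each new
// chunk starts with the last `overlap` words of the previous one.
func chunkText(text string, target, overlap int) []string {
	if target <= 0 {
		return nil
	}
	if overlap < 0 {
		overlap = 0
	}
	if overlap >= target {
		overlap = target - 1 // keep room for at least one new word per chunk
	}

	var chunks []string
	var cur []string // words of the chunk being built, including carried overlap
	fresh := 0       // words in cur that are not yet part of an emitted chunk

	emit := func() {
		chunks = append(chunks, strings.Join(cur, " "))
		tail := overlap
		if tail > len(cur) {
			tail = len(cur)
		}
		cur = append([]string(nil), cur[len(cur)-tail:]...)
		fresh = 0
	}

	for _, para := range strings.Split(text, "\n\n") {
		words := strings.Fields(para)
		if len(words) == 0 {
			continue
		}
		// Prefer to break at a paragraph boundary when this one would overflow.
		if fresh > 0 && len(cur)+len(words) > target {
			emit()
		}
		for _, w := range words {
			// A paragraph longer than the target is split mid-paragraph.
			if len(cur) >= target {
				emit()
			}
			cur = append(cur, w)
			fresh++
		}
	}
	if fresh > 0 {
		emit()
	}
	return chunks
}

func main() {
	text := "a b c\n\nd e f\n\ng h"
	for i, c := range chunkText(text, 5, 2) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

With a target of 5 and an overlap of 2, the demo text yields "a b c", "b c d e f", and "e f g h": breaks fall on paragraph boundaries and each chunk repeats the last two words of its predecessor.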
TDD Gates
- Semantic match test: "breast cancer gene" finds "BRCA1 mutation"
- Hybrid beats lexical: Measurable recall@K improvement
- Fail-open test: Missing .vec → lexical-only works
- Latency test: <200ms end-to-end
- Per-workspace isolation: Channel A memory not visible to Channel B
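For the "hybrid beats lexical" gate, recall@K needs a concrete definition. One common choice, sketched here, is the fraction of known-relevant chunks that appear in the top K results. The function name `recallAtK` is hypothetical, and the chunk IDs below are toy placeholders, not measured results.

```go
package main

import "fmt"

// recallAtK returns the fraction of relevant IDs found in the first k
// retrieved IDs. It returns 0 when there are no relevant IDs.
func recallAtK(retrieved []string, relevant map[string]bool, k int) float64 {
	if len(relevant) == 0 || k <= 0 {
		return 0
	}
	if k > len(retrieved) {
		k = len(retrieved)
	}
	found := 0
	seen := make(map[string]bool)
	for _, id := range retrieved[:k] {
		if relevant[id] && !seen[id] {
			seen[id] = true
			found++
		}
	}
	return float64(found) / float64(len(relevant))
}

func main() {
	// Toy judgment set: two chunks are relevant to the query.
	relevant := map[string]bool{"c1": true, "c2": true}

	lexical := []string{"c9", "c1", "c8"} // placeholder lexical-only ranking
	hybrid := []string{"c2", "c1", "c9"}  // placeholder hybrid ranking

	fmt.Println("lexical recall@3:", recallAtK(lexical, relevant, 3))
	fmt.Println("hybrid  recall@3:", recallAtK(hybrid, relevant, 3))
}
```

A gate built on this would compare the two numbers averaged over a fixed query set with hand-labeled relevant chunks, and fail if hybrid does not exceed lexical.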
References
- Context Rot (Hong et al. 2025) — empirical proof of degradation
- chromem-go — pure Go vector DB
- RRF (Cormack et al. 2009) — fusion strategy
- A-MEM (arXiv:2502.12110) — agentic memory
- MemGPT/Letta (arXiv:2310.08560) — memory hierarchy
Acceptance Criteria
- Hybrid recall finds semantic matches that lexical misses
- Per-workspace isolation verified (no cross-channel recall)
- Latency <200ms
- Fail-open when vectors unavailable
- sciclaw memory CLI commands working
- Integration tests pass