feat(semantic): add hybrid lexical+vector recall index for Discord archives

Parent epic: #74
Phase: 3 (quality upgrade)

## Problem: Lexical Search Misses Meaning

The current recall system (Phase 1-2) uses **lexical search** — it finds archived messages by matching exact words. This works for simple cases but fails when:

| User asks | Archive contains | Lexical match? |
|-----------|------------------|----------------|
| "breast cancer gene" | "BRCA1 mutation analysis" | ❌ No |
| "run the analysis" | "execute the pipeline" | ❌ No |
| "the paper about autism" | "ASD prevalence study (PMID:12345)" | ❌ No |

Scientific conversations have high terminology variation: gene symbols vs full names, abbreviations vs expansions, colloquial vs formal terms.

## Solution: vecflow-style Hybrid Recall

Based on the design work in `semantic-memory-cli` workspace, we adopt a **single-binary, filesystem-native** approach inspired by the vecflow concept:

### Core Principles

1. **All-in-one binary**: Archive, index, search (lexical + vector), recall in one tool
2. **Filesystem-native state**: Memory artifacts are files, compatible with Git/Dropbox
3. **Fail-open hybrid retrieval**: Degrades to lexical-only when vectors unavailable
4. **Per-workspace isolation**: Each routed channel gets its own memory directory

### Per-Workspace Memory Architecture

```
workspace/                          # e.g., ~/Dropbox/sciclaw/nihc3i-dave/
├── sessions/                       # Live session state
├── discord-archives/               # Phase 1-2 archives (.md)
└── memory/                         # NEW: vecflow-style memory
    ├── archives/                   # Archived transcripts
    ├── index/
    │   ├── lexical/                # BM25 inverted index
    │   └── vectors/                # Embedding vectors (.vec sidecars)
    ├── chunks/                     # Pre-chunked text (~200 words)
    └── config.toml                 # Model, chunk size, fusion weights
```

**Critical: Each Discord channel routes to a separate workspace, so memory is automatically isolated per-channel.** No cross-channel context bleeding.

### Hybrid Search Pipeline

```
Query → ┬─→ BM25 Search ──────→ Lexical Ranks ─┐
        │                                       ├─→ RRF Merge → Top-K
        └─→ Embed Query → kNN Search → Vector Ranks ─┘
```

**Reciprocal Rank Fusion (RRF):**
```
RRF(d) = Σ 1/(k + rank(d))  where k=60
```

No score normalization needed, robust to distribution differences.

### Implementation: Pure Go with chromem-go

**Why Go?** Matches sciclaw's existing codebase, cross-compiles easily, excellent for CLI.

**Vector DB:** [chromem-go](https://github.com/philippgille/chromem-go) — pure Go, zero deps, Chroma-like API
- In-memory with file persistence
- Built-in cosine similarity search
- Stores documents, embeddings ([]float32), metadata

**Embedding:** Local ONNX inference
- `all-MiniLM-L6-v2` (22M params, 384-dim, <5ms/chunk on CPU)
- Or `nomic-embed-text` via Ollama if deployed (768-dim, ~20ms/chunk)
- Model configured per-workspace in `config.toml`

### CLI Commands (integrated into sciclaw)

```bash
sciclaw memory init                    # Initialize memory dir in workspace
sciclaw memory index                   # Build/rebuild lexical + vector indices
sciclaw memory search "BRCA1"          # Hybrid search
sciclaw memory recall "gene mutation"  # Search + format for context injection
sciclaw memory status                  # Index health, chunk count, staleness
```

### Latency Budget (<200ms total)

| Stage | Target |
|-------|--------|
| Query embedding | <50ms |
| Vector kNN search | <20ms |
| BM25 search | <20ms |
| RRF fusion + format | <10ms |
| I/O overhead | <100ms |

### Fail-Open Behavior

When vector index is unavailable:
1. Log warning
2. Fall back to lexical-only search
3. Agent continues working (no crash, no empty results)

## Implementation Plan

1. **Add chromem-go dependency** and embedding wrapper
2. **Chunking logic**: Split archives on paragraphs, ~200 words, overlap 50
3. **Vector index**: `.vec` sidecar files alongside chunks
4. **Hybrid recall**: Parallel BM25 + kNN, RRF merge, dedup
5. **CLI surface**: `sciclaw memory {init,index,search,recall,status}`
6. **Integration**: Wire into agent loop's auto-recall path

## TDD Gates

1. **Semantic match test**: "breast cancer gene" finds "BRCA1 mutation"
2. **Hybrid beats lexical**: Measurable recall@K improvement
3. **Fail-open test**: Missing .vec → lexical-only works
4. **Latency test**: <200ms end-to-end
5. **Per-workspace isolation**: Channel A memory not visible to Channel B

## References

- [Context Rot (Hong et al. 2025)](https://research.chromadb.dev/context-rot) — empirical proof of degradation
- [chromem-go](https://github.com/philippgille/chromem-go) — pure Go vector DB
- [RRF (Cormack et al. 2009)](https://dl.acm.org/doi/10.1145/1571941.1572114) — fusion strategy
- [A-MEM (arXiv:2502.12110)](https://arxiv.org/abs/2502.12110) — agentic memory
- [MemGPT/Letta (arXiv:2310.08560)](https://arxiv.org/abs/2310.08560) — memory hierarchy

## Acceptance Criteria

- [ ] Hybrid recall finds semantic matches that lexical misses
- [ ] Per-workspace isolation verified (no cross-channel recall)
- [ ] Latency <200ms
- [ ] Fail-open when vectors unavailable
- [ ] `sciclaw memory` CLI commands working
- [ ] Integration tests pass

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(semantic): add hybrid lexical+vector recall index for Discord archives #78

Problem: Lexical Search Misses Meaning

Solution: vecflow-style Hybrid Recall

Core Principles

Per-Workspace Memory Architecture

Hybrid Search Pipeline

Implementation: Pure Go with chromem-go

CLI Commands (integrated into sciclaw)

Latency Budget (<200ms total)

Fail-Open Behavior

Implementation Plan

TDD Gates

References

Acceptance Criteria

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

User asks	Archive contains	Lexical match?
"breast cancer gene"	"BRCA1 mutation analysis"	❌ No
"run the analysis"	"execute the pipeline"	❌ No
"the paper about autism"	"ASD prevalence study (PMID:12345)"	❌ No

Stage	Target
Query embedding	<50ms
Vector kNN search	<20ms
BM25 search	<20ms
RRF fusion + format	<10ms
I/O overhead	<100ms

feat(semantic): add hybrid lexical+vector recall index for Discord archives #78

Description

Problem: Lexical Search Misses Meaning

Solution: vecflow-style Hybrid Recall

Core Principles

Per-Workspace Memory Architecture

Hybrid Search Pipeline

Implementation: Pure Go with chromem-go

CLI Commands (integrated into sciclaw)

Latency Budget (<200ms total)

Fail-Open Behavior

Implementation Plan

TDD Gates

References

Acceptance Criteria

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions