The architectural intelligence layer for AI coding agents. Structural graph, architecture governance, multi-agent orchestration, vulnerability mapping, runtime analysis -- one CLI, zero API keys.
95 commands · 26 languages · architecture OS · 100% local
Roam is a structural intelligence engine for software. It pre-indexes your codebase into a semantic graph -- symbols, dependencies, call graphs, architecture layers, git history, and runtime traces -- stored in a local SQLite DB. Agents query it via CLI or MCP instead of repeatedly grepping files and guessing structure.
Unlike LSPs (editor-bound, language-specific) or Sourcegraph (hosted search), Roam provides architecture-level graph queries -- offline, cross-language, and compact. It goes beyond comprehension: Roam governs architecture through budget gates, simulates refactoring outcomes, orchestrates multi-agent swarms with zero-conflict guarantees, maps vulnerability reachability paths, and enables graph-level code editing without syntax errors.
Codebase ──> [Index] ──> Semantic Graph ──> 95 Commands ──> AI Agent
                │               │                │
           tree-sitter     symbols          comprehend
           26 languages    + edges          govern
           git history     + metrics        refactor
           runtime traces  + architecture   orchestrate
Coding agents explore codebases inefficiently: dozens of grep/read cycles, high token cost, no structural understanding. Roam replaces this with one graph query:
$ roam context Flask
Callers: 47 Callees: 3
Affected tests: 31
Files to read:
src/flask/app.py:76-963 # definition
src/flask/__init__.py:1-15 # re-export
src/flask/testing.py:22-45 # caller: FlaskClient.__init__
tests/test_basic.py:12-30 # caller: test_app_factory
...12 more files
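
An agent that consumes this plain-text output needs the paths and line ranges as data. The following is a minimal sketch, assuming the `path:start-end # note` line shape shown above stays stable. The names `FileRange` and `parse_files_to_read` are hypothetical and not part of Roam; the `--json` envelope is likely the sturdier route, but its schema is not shown in this document.

```python
import re
from typing import NamedTuple


class FileRange(NamedTuple):
    path: str
    start: int
    end: int
    note: str


# Matches lines such as "src/flask/app.py:76-963 # definition".
_LINE = re.compile(
    r"^\s*(?P<path>\S+):(?P<start>\d+)-(?P<end>\d+)\s*(?:#\s*(?P<note>.*))?$"
)


def parse_files_to_read(output: str) -> list[FileRange]:
    """Extract (path, start, end, note) entries; non-matching lines are skipped."""
    ranges = []
    for line in output.splitlines():
        m = _LINE.match(line)
        if m:
            note = (m["note"] or "").strip()
            ranges.append(FileRange(m["path"], int(m["start"]), int(m["end"]), note))
    return ranges


sample = """Callers: 47 Callees: 3
Affected tests: 31
Files to read:
src/flask/app.py:76-963 # definition
src/flask/__init__.py:1-15 # re-export
src/flask/testing.py:22-45 # caller: FlaskClient.__init__
tests/test_basic.py:12-30 # caller: test_app_factory
...12 more files"""

for entry in parse_files_to_read(sample):
    print(entry.path, entry.start, entry.end, entry.note)
```

Summary lines (`Callers:`, `...12 more files`) do not match the pattern and are ignored, so the elided files stay elided.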
$ roam understand # full codebase briefing
$ roam context <name> # files-to-read with exact line ranges
$ roam preflight <name> # blast radius + tests + complexity + architecture rules
$ roam health # composite score (0-100)
$ roam diff # blast radius of uncommitted changes

- Agent-assisted coding -- structured answers that reduce token usage vs raw file exploration
- Large codebases (100+ files) -- graph queries beat linear search at scale
- Architecture governance -- health scores, CI quality gates, budget enforcement, fitness functions
- Safe refactoring -- blast radius, affected tests, pre-change safety checks, graph-level editing
- Multi-agent orchestration -- partition codebases for parallel agent work with zero-conflict guarantees
- Security analysis -- vulnerability reachability mapping, auth gaps, CVE path tracing
- Algorithm optimization -- detect O(n^2) loops, N+1 queries, and 21 other anti-patterns with suggested fixes
- Backend quality -- auth gaps, missing indexes, over-fetching models, non-idempotent migrations, orphan routes, API drift
- Runtime analysis -- overlay production trace data onto the static graph for hotspot detection
- Multi-repo projects -- cross-repo API edge detection between frontend and backend
- Real-time type checking -- use an LSP (pyright, gopls, tsserver). Roam is static and offline.
- Small scripts (<10 files) -- just read the files directly.
- Pure text search -- ripgrep is faster for raw string matching.
Speed. One command replaces 5-10 tool calls (in typical workflows). Under 0.5s for any query.
Dependency-aware. Computes structure, not string matches. Knows Flask has 47 dependents and 31 affected tests. grep knows it appears 847 times.
LLM-optimized output. Plain ASCII, compact abbreviations (fn, cls, meth), --json envelopes. Designed for agent consumption, not human decoration.
Fully local. No API keys, telemetry, or network calls. Works in air-gapped environments.
Algorithm-aware. Built-in catalog of 23 anti-patterns. Detects suboptimal algorithms (quadratic loops, N+1 queries, unbounded recursion) and suggests fixes with Big-O improvements and confidence scores. Receiver-aware loop-invariant analysis minimizes false positives.
CI-ready. --json output, --gate quality gates, GitHub Action, SARIF 2.1.0.
| | Without Roam | With Roam |
|---|---|---|
| Tool calls | 8 | 1 |
| Wall time | ~11s | <0.5s |
| Tokens consumed | ~15,000 | ~3,000 |
Measured on a typical agent workflow in a 200-file Python project (Flask). See benchmarks for more.
Table of Contents
Getting Started: What is Roam? · Best for · Why use Roam · Install · Quick Start
Using Roam: Commands · Walkthrough · AI Coding Tools · MCP Server
Operations: CI/CD Integration · SARIF Output · For Teams
Reference: Language Support · Performance · How It Works · How Roam Compares · FAQ
More: Limitations · Troubleshooting · Update / Uninstall · Development · Contributing
pip install roam-code
# Recommended: isolated environment
pipx install roam-code
# or
uv tool install roam-code
# From source
pip install git+https://github.com/Cranot/roam-code.git

Requires Python 3.9+. Works on Linux, macOS, and Windows.

Windows: If `roam` is not found after installing with `uv`, run `uv tool update-shell` and restart your terminal.
cd your-project
roam init # indexes codebase, creates config + CI workflow
roam understand # full codebase briefing

First index takes ~5s for 200 files, ~15s for 1,000 files. Subsequent runs are incremental and near-instant.
Next steps:
- Set up your AI agent: `roam describe --write` (auto-detects CLAUDE.md, AGENTS.md, .cursor/rules, etc.; see integration instructions)
- Explore: `roam health` → `roam weather` → `roam map`
- Add to CI: `roam init` already generated a GitHub Action
Try it on Roam itself
git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
roam init
roam understand
roam health

Claude Code • Cursor • Windsurf • GitHub Copilot • Aider • Cline • Gemini CLI • OpenAI Codex CLI • MCP • GitHub Actions • GitLab CI • Azure DevOps
The 5 core commands shown above cover ~80% of agent workflows. 95 commands are organized into 7 categories.
Full command reference
| Command | Description |
|---|---|
| `roam index [--force] [--verbose]` | Build or rebuild the codebase index |
| `roam init` | Guided onboarding: creates .roam/fitness.yaml, CI workflow, runs index, shows health |
| `roam understand` | Full codebase briefing: tech stack, architecture, key abstractions, health, conventions, complexity overview, entry points |
| `roam tour [--write PATH]` | Auto-generated onboarding guide: top symbols, reading order, entry points, language breakdown. --write saves to Markdown |
| `roam describe [--write] [--force] [-o PATH] [--agent-prompt]` | Auto-generate project description for AI agents. --write auto-detects your agent's config file. --agent-prompt returns a compact (<500 token) system prompt |
| `roam minimap [--update] [-o FILE] [--init-notes]` | Compact annotated codebase snapshot for CLAUDE.md injection: stack, annotated directory tree, key symbols by PageRank, high fan-in symbols to avoid touching, hotspots, conventions. Sentinel-based in-place updates |
| `roam config [KEY [VALUE]]` | View or set configuration options |
| `roam map [-n N] [--full] [--budget N]` | Project skeleton: files, languages, entry points, top symbols by PageRank. --budget caps output to N tokens |
| `roam schema [--diff] [--version V]` | JSON envelope schema versioning: view, diff, and validate output schemas |
| Command | Description |
|---|---|
| `roam file <path> [--full] [--changed] [--deps-of PATH]` | File skeleton: all definitions with signatures, cognitive load index, health score |
| `roam symbol <name> [--full]` | Symbol definition + callers + callees + metrics. Supports file:symbol disambiguation |
| `roam context <symbol> [--task MODE] [--for-file PATH]` | AI-optimized context: definition + callers + callees + files-to-read with line ranges |
| `roam search <pattern> [--kind KIND]` | Find symbols by name pattern, PageRank-ranked |
| `roam grep <pattern> [-g glob] [-n N]` | Text search annotated with enclosing symbol context |
| `roam deps <path> [--full]` | What a file imports and what imports it |
| `roam trace <source> <target> [-k N]` | Dependency paths with coupling strength and hub detection |
| `roam impact <symbol>` | Blast radius: what breaks if a symbol changes (Personalized PageRank weighted) |
| `roam diff [--staged] [--full] [REV_RANGE]` | Blast radius of uncommitted changes or a commit range |
| `roam pr-risk [REV_RANGE]` | PR risk score (0-100, multiplicative model) + structural spread + suggested reviewers |
| `roam pr-diff [--staged] [--range R] [--format markdown]` | Structural PR diff: metric deltas, edge analysis, symbol changes, footprint. Not a text diff but a graph delta |
| `roam attest [REV_RANGE] [--format markdown] [--sign]` | Proof-carrying PR attestation: bundles blast radius, risk, breaking changes, fitness, budget, tests, effects into one verifiable artifact |
| `roam annotate <symbol> <note>` | Attach persistent notes to symbols (agentic memory across sessions) |
| `roam annotations [--file F] [--symbol S]` | View stored annotations |
| `roam diagnose <symbol> [--depth N]` | Root cause analysis: ranks suspects by z-score normalized risk |
| `roam preflight <symbol\|file>` | Compound pre-change check: blast radius + tests + complexity + coupling + fitness |
| `roam safe-delete <symbol>` | Safe deletion check: SAFE/REVIEW/UNSAFE verdict |
| `roam test-map <name>` | Map a symbol or file to its test coverage |
| `roam adversarial [--staged] [--range R]` | Adversarial architecture review: generates targeted challenges based on changes |
| `roam plan [--staged] [--range R] [--agents N]` | Agent work planner: decompose changes into sequenced, dependency-aware steps |
| `roam closure <symbol> [--rename] [--delete]` | Minimal-change synthesis: all files to touch for a safe rename/delete |
| `roam mutate move\|rename\|add-call\|extract` | Graph-level code editing: move symbols, rename across codebase, add calls, extract functions. Dry-run by default |
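
Several of the commands above (`impact`, `preflight`, `test-map`) rest on walking the call graph backwards from a symbol. The sketch below shows that idea with a plain breadth-first search over a toy graph. It is an illustration under stated assumptions, not Roam's implementation: the `calls` mapping, the function name `transitive_callers`, and the `test_` prefix convention are all hypothetical, and the Personalized PageRank weighting mentioned for `roam impact` is not modeled.

```python
from collections import deque


def transitive_callers(calls: dict[str, set[str]], target: str) -> set[str]:
    """Return every symbol that can reach `target` through the call graph."""
    # Invert caller -> callees into callee -> callers.
    reverse: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)

    seen: set[str] = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for caller in reverse.get(node, ()):
            if caller not in seen and caller != target:
                seen.add(caller)
                queue.append(caller)
    return seen


# Hypothetical call graph: caller -> set of callees.
calls = {
    "test_login": {"login"},
    "login": {"check_password"},
    "signup": {"hash_password"},
    "check_password": {"hash_password"},
}

affected = transitive_callers(calls, "hash_password")
affected_tests = {name for name in affected if name.startswith("test_")}
print(sorted(affected), sorted(affected_tests))
```

A grep for direct references to `hash_password` would find `signup` and `check_password` but miss `test_login`, which only reaches it through two intermediate calls.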
| Command | Description |
|---|---|
| `roam health [--no-framework]` | Composite health score (0-100): weighted geometric mean of tangle ratio, god components, bottlenecks, layer violations. Includes propagation cost and algebraic connectivity |
| `roam complexity [--bumpy-road]` | Per-function cognitive complexity (SonarSource-compatible, triangular nesting penalty) + Halstead metrics (volume, difficulty, effort, bugs) + cyclomatic density |
| `roam algo [--task T] [--confidence C]` | Algorithm anti-pattern detection: 23-pattern catalog detects suboptimal algorithms (O(n^2) loops, N+1 queries, quadratic string building, branching recursion, loop-invariant calls) and suggests better approaches with Big-O improvements. Confidence calibration via caller-count and bounded-loop analysis. Language-aware tips. Alias: roam math |
| `roam n1 [--confidence C] [--verbose]` | Implicit N+1 I/O detection: finds ORM model computed properties ($appends/accessors) that trigger lazy-loaded DB queries in collection contexts. Cross-references with eager loading config. Supports Laravel, Django, Rails, SQLAlchemy, JPA |
| `roam over-fetch [--threshold N] [--confidence C]` | Detect models serializing too many fields: large $fillable without $hidden/$visible, direct controller returns bypassing API Resources, poor exposed-to-hidden ratio |
| `roam missing-index [--table T] [--confidence C]` | Find queries on non-indexed columns: cross-references WHERE/ORDER BY clauses, foreign keys, and paginated queries against migration-defined indexes |
| `roam weather [-n N]` | Hotspots ranked by geometric mean of churn x complexity (percentile-normalized) |
| `roam debt` | Hotspot-weighted tech debt prioritization with SQALE remediation cost estimates |
| `roam fitness [--explain]` | Architectural fitness functions from .roam/fitness.yaml |
| `roam alerts` | Health degradation trend detection (Mann-Kendall + Sen's slope) |
| `roam snapshot [--tag TAG]` | Persist health metrics snapshot for trend tracking |
| `roam trend` | Health score history with sparkline visualization |
| `roam digest [--brief] [--since TAG]` | Compare current metrics against last snapshot |
| `roam forecast [--symbol S] [--horizon N] [--alert-only]` | Predict when metrics will exceed thresholds: Theil-Sen regression on snapshot history + churn-weighted per-symbol risk |
| `roam budget [--init] [--staged] [--range R]` | Architectural budget enforcement: per-PR delta limits on health, cycles, complexity. CI gate (exit 1 on violation) |
| `roam bisect [--metric M] [--range R]` | Architectural git bisect: find the commit that degraded a specific metric |
| `roam ingest-trace <file> [--otel\|--jaeger\|--zipkin\|--generic]` | Ingest runtime trace data (OpenTelemetry, Jaeger, Zipkin) for hotspot overlay |
| `roam hotspots [--runtime] [--discrepancy]` | Runtime hotspot analysis: find symbols missed by static analysis but critical at runtime |
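
The health score is described above as a weighted geometric mean of several sub-metrics. The sketch below shows what that aggregation looks like in general. It assumes each sub-metric has already been normalized to a score in (0, 1], where 1 is best; the sub-score names, the weights, and the function `composite_score` are hypothetical, since this document does not give Roam's actual normalization or weights.

```python
import math


def composite_score(subscores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted geometric mean of sub-scores in (0, 1], scaled to 0-100."""
    total_weight = sum(weights.values())
    # Work in log space: exp(sum(w_i * ln(s_i)) / sum(w_i)).
    # The floor avoids log(0) when a sub-score bottoms out.
    log_sum = sum(
        weights[name] * math.log(max(subscores[name], 1e-9)) for name in weights
    )
    return 100.0 * math.exp(log_sum / total_weight)


# Hypothetical inputs: one poor dimension, three clean ones, equal weights.
subscores = {"tangle": 1.0, "god_components": 0.25, "bottlenecks": 1.0, "layers": 1.0}
weights = {"tangle": 1.0, "god_components": 1.0, "bottlenecks": 1.0, "layers": 1.0}
print(round(composite_score(subscores, weights), 1))
```

The design point of a geometric mean is that one bad dimension drags the total down harder than an arithmetic mean would: with sub-scores 0.25 and 1.0 at equal weight, the arithmetic mean is 62.5 while the geometric mean is 50.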
roam algo — algorithm anti-pattern catalog (23 patterns)
roam algo scans every indexed function against a 23-pattern catalog, ranks findings by confidence, and shows the exact Big-O improvement available. Tips are language-aware (Python, JS, Go, Rust, Java, etc.):
$ roam algo
VERDICT: 8 algorithmic improvements found (3 high, 4 medium, 1 low)
Nested loop lookup (2):
fn resolve_permissions src/auth/rbac.py:112 [high]
Current: Nested iteration -- O(n*m)
Better: Hash-map join -- O(n+m)
Tip: Build a dict/set from one collection, iterate the other
fn find_matching_rule src/rules/engine.py:67 [high]
Current: Nested iteration -- O(n*m)
Better: Hash-map join -- O(n+m)
Tip: Build a dict/set from one collection, iterate the other
String building (1):
meth build_query src/db/query.py:88 [high]
Current: Loop concatenation -- O(n^2)
Better: Join / StringBuilder -- O(n)
Tip: Collect parts in a list, join once at the end
Branching recursion without memoization (1):
fn compute_cost src/pricing/calc.py:34 [medium]
Current: Naive branching recursion -- O(2^n)
Better: Memoized / iterative DP -- O(n)
Tip: Add @cache / @lru_cache, or convert to iterative with a table
Full catalog — 23 patterns:
| Pattern | Anti-pattern detected | Better approach | Improvement |
|---|---|---|---|
| Nested loop lookup | `for x in a: for y in b: if x==y` | Hash-map join | O(n·m) → O(n+m) |
| Membership test | `if x in list` in a loop | Set lookup | O(n) → O(1) per check |
| Sorting | Bubble / selection sort | Built-in sort | O(n²) → O(n log n) |
| Search in sorted data | Linear scan on sorted sequence | Binary search | O(n) → O(log n) |
| String building | `s += chunk` in loop | `join()` / `StringBuilder` | O(n²) → O(n) |
| Deduplication | Nested loop dedup | `set()` / `dict.fromkeys` | O(n²) → O(n) |
| Max / min | Manual tracking loop | `max()` / `min()` | idiom |
| Accumulation | Manual accumulator | `sum()` / `reduce()` | idiom |
| Group by key | Manual key-existence check | `defaultdict` / `groupingBy` | idiom |
| Fibonacci | Naive recursion | Iterative / `@lru_cache` | O(2ⁿ) → O(n) |
| Exponentiation | Loop multiplication | `pow(b, e, mod)` | O(n) → O(log n) |
| GCD | Manual loop | `math.gcd()` | O(n) → O(log n) |
| Matrix multiply | Naive triple loop | NumPy / BLAS | same asymptotic, ~1000× faster via SIMD |
| Busy wait | `while True: sleep()` poll | Event / condition variable | O(k) → O(1) wake-up |
| Regex in loop | `re.match()` compiled per iteration | Pre-compiled pattern | O(n·(p+m)) → O(p + n·m) |
| N+1 query | Per-item DB / API call in loop | Batch `WHERE IN (...)` | n round-trips → 1 |
| List front operations | `list.insert(0, x)` in loop | `collections.deque` | O(n) → O(1) per op |
| Sort to select | `sorted(x)[0]` or `sorted(x)[:k]` | `min()` / `heapq.nsmallest` | O(n log n) → O(n) or O(n log k) |
| Repeated lookup | `.index()` / `.contains()` inside loop | Pre-built set / dict | O(m) → O(1) per lookup |
| Branching recursion | Naive `f(n-1) + f(n-2)` without cache | `@cache` / iterative DP | O(2ⁿ) → O(n) |
| Quadratic string building | `result += chunk` across multiple scopes | `parts.append` + `join` at end | O(n²) → O(n) |
| Loop-invariant call | `len(col)` or `get_config()` inside loop body | Hoist before loop | per-iter cost → O(1) |
| String reversal | Manual char-by-char loop | `s[::-1]` / `.reverse()` | idiom |
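
To make the first catalog row concrete, here is the nested-loop lookup next to its hash-map join rewrite. This is an illustrative sketch: the `users` and `orders` records and both function names are hypothetical, and the rewrite assumes user ids are unique.

```python
def matching_pairs_nested(users, orders):
    """O(n*m): compares every user with every order."""
    pairs = []
    for user in users:
        for order in orders:
            if user["id"] == order["user_id"]:
                pairs.append((user["name"], order["total"]))
    return pairs


def matching_pairs_hashed(users, orders):
    """O(n+m): index one side by key, then make a single pass over the other."""
    name_by_id = {user["id"]: user["name"] for user in users}
    return [
        (name_by_id[order["user_id"]], order["total"])
        for order in orders
        if order["user_id"] in name_by_id
    ]


users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "lin"}]
orders = [
    {"user_id": 2, "total": 30},
    {"user_id": 1, "total": 12},
    {"user_id": 2, "total": 7},
    {"user_id": 9, "total": 99},  # no matching user; dropped by both versions
]

print(sorted(matching_pairs_nested(users, orders)))
print(sorted(matching_pairs_hashed(users, orders)))
```

The two versions return the same pairs but in a different order (grouped by user versus order of appearance in `orders`), which is worth checking before applying the rewrite to code that depends on ordering.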
Filtering:
roam algo --task nested-lookup # one pattern type only
roam algo --confidence high # high-confidence findings only
roam algo --task io-in-loop -n 5 # top 5 N+1 query sites
roam --json algo # machine-readable output

Confidence calibration: high = strong structural signal (unbounded loop + high caller count + pattern confirmed); medium = pattern matched but loop may be bounded; low = heuristic signal only.
roam minimap — annotated codebase snapshot for CLAUDE.md
roam minimap generates a compact block (stack, annotated directory tree, key symbols, hotspots, conventions) wrapped in sentinel comments for in-place CLAUDE.md updates:
$ roam minimap
<!-- roam:minimap generated=2026-02-18 -->
**Stack:** Python · JavaScript · YAML
.github/ (4 files)
benchmarks/ (75 files)
src/
roam/
bridges/
base.py # LanguageBridge
registry.py # register_bridge, detect_bridges
commands/ (93 files) # is_test_file, get_changed_files
db/
connection.py # find_project_root, batched_in
schema.py
graph/
builder.py # build_symbol_graph, build_file_graph
pagerank.py # compute_pagerank, compute_centrality
languages/ (18 files) # ApexExtractor
output/
formatter.py # to_json, json_envelope
cli.py # cli, LazyGroup
mcp_server.py
tests/ (70 files)
Key symbols (PageRank): open_db · ensure_index · json_envelope · to_json · LanguageExtractor
Touch carefully (fan-in >= 15): to_json (116 callers) · json_envelope (116 callers) · open_db (105 callers) · ensure_index (100 callers)
Hotspots (churn x complexity): cmd_context.py · csharp_lang.py · cmd_dead.py
Conventions: snake_case fns, PascalCase classes
**Workflow:**
```bash
roam minimap # print to stdout
roam minimap --update # replace sentinel block in CLAUDE.md in-place
roam minimap -o docs/AGENTS.md # target a different file
roam minimap --init-notes # scaffold .roam/minimap-notes.md for project gotchas
```
The sentinel pair <!-- roam:minimap --> / <!-- /roam:minimap --> is replaced on each run — surrounding content is left intact. Add project-specific gotchas to .roam/minimap-notes.md and they appear in every subsequent output.
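
The sentinel mechanism described above can be sketched in a few lines. This is an illustration of the general technique, not Roam's code: the function name `replace_minimap` is hypothetical, and appending the block when no sentinels are present is an assumption about sensible behavior rather than something this document states.

```python
import re

# Matches the opening sentinel (with optional attributes such as generated=...),
# everything up to the closing sentinel, and the closing sentinel itself.
_BLOCK = re.compile(
    r"<!-- roam:minimap\b[^>]*-->.*?<!-- /roam:minimap -->", re.DOTALL
)


def replace_minimap(document: str, new_block: str) -> str:
    """Swap the sentinel-delimited block; append it if no block exists yet."""
    if _BLOCK.search(document):
        # A function replacement keeps backslashes in new_block literal.
        return _BLOCK.sub(lambda _match: new_block, document, count=1)
    return document.rstrip("\n") + "\n\n" + new_block + "\n"


doc = (
    "# Project notes\n"
    "<!-- roam:minimap generated=2026-02-18 -->\n"
    "old snapshot\n"
    "<!-- /roam:minimap -->\n"
    "Hand-written footer\n"
)
new_block = (
    "<!-- roam:minimap generated=2026-03-01 -->\n"
    "new snapshot\n"
    "<!-- /roam:minimap -->"
)
print(replace_minimap(doc, new_block))
```

Only the text between the sentinels changes; the heading and footer around it come through untouched.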
Tree annotations come from the top exported symbols by fan-in per file. Non-source root directories (.github/, benchmarks/, docs/) are collapsed immediately. Large subdirectories (e.g. commands/, languages/) are collapsed at depth 2+ with a file count.
| Command | Description |
|---|---|
| `roam clusters [--min-size N]` | Community detection vs directory structure. Modularity Q-score (Newman 2004) + per-cluster conductance |
| `roam layers` | Topological dependency layers + upward violations + Gini balance |
| `roam dead [--all] [--summary] [--clusters]` | Unreferenced exported symbols with safety verdicts + confidence scoring (60-95%) |
| `roam fan [symbol\|file] [-n N] [--no-framework]` | Fan-in/fan-out: most connected symbols or files |
| `roam risk [-n N] [--domain KW] [--explain]` | Domain-weighted risk ranking |
| `roam why <name> [name2 ...]` | Role classification (Hub/Bridge/Core/Leaf), reach, criticality |
| `roam split <file>` | Internal symbol groups with isolation % and extraction suggestions |
| `roam entry-points` | Entry point catalog with protocol classification |
| `roam patterns` | Architectural pattern recognition: Strategy, Factory, Observer, etc. |
| `roam visualize [--format mermaid\|dot] [--focus NAME] [--limit N]` | Generate Mermaid or DOT architecture diagrams. Smart filtering via PageRank, cluster grouping, cycle highlighting |
| `roam effects [TARGET] [--file F] [--type T]` | Side-effect classification: DB writes, network I/O, filesystem, global mutation. Direct + transitive effects through call graph |
| `roam dark-matter [--min-cochanges N]` | Detect hidden co-change couplings not explained by import/call edges |
| `roam simulate move\|extract\|merge\|delete` | Counterfactual architecture simulator: test refactoring ideas in-memory, see metric deltas before writing code |
| `roam orchestrate --agents N [--files P]` | Multi-agent swarm partitioning: split codebase for parallel agents with zero-conflict guarantees |
| `roam fingerprint [--compact] [--compare F]` | Topology fingerprint: extract/compare architectural signatures across repos |
| `roam cut <target> [--depth N]` | Minimum graph cuts: find critical edges whose removal disconnects components |
| `roam safe-zones` | Graph-based containment boundaries |
| `roam coverage-gaps` | Unprotected entry points with no path to gate symbols |
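
The modularity Q-score mentioned for `roam clusters` measures how much denser the edges inside each cluster are than chance would predict. The sketch below computes the standard Newman modularity for an undirected, unweighted graph with at least one edge. The function name and toy graph are hypothetical, and how Roam builds its graph or treats edge direction and weights is not covered here.

```python
def modularity(edges: list[tuple[str, str]], community: dict[str, str]) -> float:
    """Newman modularity Q = sum_c (L_c / m - (d_c / 2m)^2) for an undirected graph."""
    m = len(edges)
    internal: dict[str, int] = {}  # L_c: edges with both endpoints in community c
    degree: dict[str, int] = {}    # d_c: summed node degrees in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
by_triangle = {"a": "x", "b": "x", "c": "x", "d": "y", "e": "y", "f": "y"}
one_cluster = {node: "all" for node in by_triangle}

print(modularity(edges, by_triangle), modularity(edges, one_cluster))
```

Splitting along the bridge scores higher than lumping everything together, which is the signal a comparison of detected clusters against directory structure relies on.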
| Command | Description |
|---|---|
| `roam module <path>` | Directory contents: exports, signatures, dependencies, cohesion |
| `roam sketch <dir> [--full]` | Compact structural skeleton of a directory |
| `roam uses <name>` | All consumers: callers, importers, inheritors |
| `roam owner <path>` | Code ownership: who owns a file or directory |
| `roam coupling [-n N] [--set]` | Temporal coupling: file pairs that change together (NPMI + lift) |
| `roam fn-coupling` | Function-level temporal coupling across files |
| `roam bus-factor [--brain-methods]` | Knowledge loss risk per module |
| `roam doc-staleness` | Detect stale docstrings |
| `roam conventions` | Auto-detect naming styles, import preferences. Flags outliers |
| `roam breaking [REV_RANGE]` | Breaking change detection: removed exports, signature changes |
| `roam affected-tests <symbol\|file>` | Trace reverse call graph to test files |
| `roam relate <sym1> <sym2>` | Show relationship between two symbols: shared callers, shortest path, common ancestors |
| `roam search-semantic <query>` | Semantic search: find symbols by meaning, not just name pattern |
| `roam intent [--staged] [--range R]` | Doc-to-code linking: match documentation to symbols, detect drift |
| `roam schema [--diff] [--version V]` | JSON envelope schema versioning: view, diff, and validate output schemas |
| `roam x-lang [--bridges] [--edges]` | Cross-language edge browser: inspect bridge-resolved connections |
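
Temporal coupling in the table above is scored with NPMI and lift. The sketch below applies the textbook definitions to a list of commits, each represented as the set of files it touched. The function name and sample history are hypothetical, the commit list is assumed to be non-empty, and any thresholds or smoothing Roam applies are not shown in this document.

```python
import math


def cochange_stats(commits: list[set[str]], a: str, b: str) -> tuple[float, float]:
    """Return (lift, npmi) for two files over a list of per-commit file sets."""
    n = len(commits)
    p_a = sum(a in files for files in commits) / n
    p_b = sum(b in files for files in commits) / n
    p_ab = sum(a in files and b in files for files in commits) / n
    if p_ab == 0:
        return 0.0, -1.0  # never change together
    lift = p_ab / (p_a * p_b)
    if p_ab == 1:
        return lift, 1.0  # always change together; -log(1) would divide by zero
    # NPMI = PMI / -log p(a, b), which bounds the score to [-1, 1].
    return lift, math.log(lift) / -math.log(p_ab)


history = [
    {"models.py", "schema.sql"},
    {"models.py", "schema.sql"},
    {"README.md"},
    {"README.md"},
]
print(cochange_stats(history, "models.py", "schema.sql"))
print(cochange_stats(history, "models.py", "README.md"))
```

Lift says how many times more often the pair co-occurs than independence would predict; NPMI rescales that to a fixed range so pairs with different base rates can be ranked together.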
| Command | Description |
|---|---|
| `roam report [--list] [--config FILE] [PRESET]` | Compound presets: first-contact, security, pre-pr, refactor |
| `roam describe --write` | Generate agent config (auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.) |
| `roam auth-gaps [--routes-only] [--controllers-only] [--min-confidence C]` | Find endpoints missing authentication or authorization: routes outside auth middleware groups, CRUD methods without $this->authorize() / Gate::allows() checks. String-aware PHP brace parsing |
| `roam orphan-routes [-n N] [--confidence C]` | Detect backend routes with no frontend consumer: parses route definitions, searches frontend for API call references, reports controller methods with no route mapping |
| `roam migration-safety [-n N] [--include-archive]` | Detect non-idempotent migrations: missing hasTable/hasColumn guards, raw SQL without IF NOT EXISTS, index operations without existence checks |
| `roam api-drift [--model M] [--confidence C]` | Detect mismatches between PHP model $fillable/$appends fields and TypeScript interface properties. Auto-converts snake_case/camelCase for comparison. Single-repo; cross-repo planned for roam ws api-drift |
| `roam path-coverage [--from P] [--to P] [--max-depth N]` | Find critical call paths (entry -> sink) with zero test protection. Suggests optimal test insertion points |
| `roam capsule [--redact-paths] [--no-signatures] [--output F]` | Export sanitized structural graph (no code bodies) for external architectural review |
| `roam rules [--init] [--ci] [--rules-dir D]` | Plugin DSL for governance: user-defined architectural rules via .roam/rules/ YAML |
| `roam vuln-map --generic\|--npm-audit\|--trivy F` | Ingest vulnerability reports and match to codebase symbols |
| `roam vuln-reach [--cve C] [--from E]` | Vulnerability reachability: exact paths from entry points to vulnerable calls |
| `roam invariants [--staged] [--range R]` | Discover architectural contracts (invariants) from the codebase structure |
| Command | Description |
|---|---|
| `roam ws init <repo1> <repo2> [--name NAME]` | Initialize a workspace from sibling repos. Auto-detects frontend/backend roles |
| `roam ws status` | Show workspace repos, index ages, cross-repo edge count |
| `roam ws resolve` | Scan for REST API endpoints and match frontend calls to backend routes |
| `roam ws understand` | Unified workspace overview: per-repo stats + cross-repo connections |
| `roam ws health` | Workspace-wide health report with cross-repo coupling assessment |
| `roam ws context <symbol>` | Cross-repo augmented context: find a symbol across repos + show API callers |
| `roam ws trace <source> <target>` | Trace cross-repo paths via API edges |
| Option | Description |
|---|---|
| `roam --json <command>` | Structured JSON output with consistent envelope |
| `roam --compact <command>` | Token-efficient output: TSV tables, minimal JSON envelope |
| `roam --sarif <command>` | SARIF 2.1.0 output for dead, health, complexity, rules (GitHub/CI integration) |
| `roam <command> --gate EXPR` | CI quality gate (e.g., --gate score>=70). Exit code 1 on failure |
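
A gate expression such as `score>=70` boils down to a metric name, a comparison, and a threshold. The sketch below evaluates that shape and maps the result to an exit code. It is an assumption-laden illustration: only the `score>=70` form appears in this document, so the other operators, the function names, and the `metrics` dictionary are hypothetical.

```python
import operator
import re

_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
# Two-character operators come first in the alternation so ">=" is not read as ">".
_GATE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def gate_passes(expr: str, metrics: dict[str, float]) -> bool:
    """Evaluate a 'name OP number' expression against a metrics mapping."""
    match = _GATE.match(expr)
    if not match:
        raise ValueError(f"unsupported gate expression: {expr!r}")
    name, op, threshold = match.groups()
    return _OPS[op](metrics[name], float(threshold))


def gate_exit_code(expr: str, metrics: dict[str, float]) -> int:
    """0 when the gate holds, 1 on failure, matching the CI convention above."""
    return 0 if gate_passes(expr, metrics) else 1


metrics = {"score": 78.0}
print(gate_exit_code("score>=70", metrics), gate_exit_code("score>=80", metrics))
```

When typing a gate in a POSIX shell, quote the expression (`--gate "score>=70"`), because an unquoted `>` is interpreted as output redirection.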
10-step walkthrough using Flask as an example
Here's how you'd use Roam to understand a project you've never seen before, using Flask as the example:
Step 1: Onboard and get the full picture
$ roam init
Created .roam/fitness.yaml (6 starter rules)
Created .github/workflows/roam.yml
Done. 226 files, 1132 symbols, 233 edges.
Health: 78/100
$ roam understand
Tech stack: Python (flask, jinja2, werkzeug)
Architecture: Monolithic — 3 layers, 5 clusters
Key abstractions: Flask, Blueprint, Request, Response
Health: 78/100 — 1 god component (Flask)
Entry points: src/flask/__init__.py, src/flask/cli.py
Conventions: snake_case functions, PascalCase classes, relative imports
Complexity: avg 4.2, 3 high (>15), 0 critical (>25)
Step 2: Drill into a key file
$ roam file src/flask/app.py
src/flask/app.py (python, 963 lines)
cls Flask(App) :76-963
meth __init__(self, import_name, ...) :152
meth route(self, rule, **options) :411
meth register_blueprint(self, blueprint, ...) :580
meth make_response(self, rv) :742
...12 more methods
Step 3: Who depends on this?
$ roam deps src/flask/app.py
Imported by:
file symbols
-------------------------- -------
src/flask/__init__.py 3
src/flask/testing.py 2
tests/test_basic.py 1
...18 files total
Step 4: Find the hotspots
$ roam weather
=== Hotspots (churn x complexity) ===
Score Churn Complexity Path Lang
----- ----- ---------- ---------------------- ------
18420 460 40.0 src/flask/app.py python
12180 348 35.0 src/flask/blueprints.py python
Step 5: Check architecture health
$ roam health
Health: 78/100
Tangle: 0.0% (0/1132 symbols in cycles)
1 god component (Flask, degree 47, actionable)
0 bottlenecks, 0 layer violations
=== God Components (degree > 20) ===
Sev Name Kind Degree Cat File
------- ----- ---- ------ --- ------------------
WARNING Flask cls 47 act src/flask/app.py
Step 6: Get AI-ready context for a symbol
$ roam context Flask
Files to read:
src/flask/app.py:76-963 # definition
src/flask/__init__.py:1-15 # re-export
src/flask/testing.py:22-45 # caller: FlaskClient.__init__
tests/test_basic.py:12-30 # caller: test_app_factory
...12 more files
Callers: 47 Callees: 3
Step 7: Pre-change safety check
$ roam preflight Flask
=== Preflight: Flask ===
Blast radius: 47 callers, 89 transitive
Affected tests: 31 (DIRECT: 12, TRANSITIVE: 19)
Complexity: cc=40 (critical), nesting=6
Coupling: 3 hidden co-change partners
Fitness: 1 violation (max-complexity exceeded)
Verdict: HIGH RISK — consider splitting before modifying
Step 8: Decompose a large file
$ roam split src/flask/app.py
=== Split analysis: src/flask/app.py ===
87 symbols, 42 internal edges, 95 external edges
Cross-group coupling: 18%
Group 1 (routing) — 12 symbols, isolation: 83% [extractable]
meth route L411 PR=0.0088
meth add_url_rule L450 PR=0.0045
...
=== Extraction Suggestions ===
Extract 'routing' group: route, add_url_rule, endpoint (+9 more)
83% isolated, only 3 edges to other groups
Step 9: Understand why a symbol matters
$ roam why Flask url_for Blueprint
Symbol Role Fan Reach Risk Verdict
--------- ------------ ---------- -------- -------- --------------------------------------------------
Flask Hub fan-in:47 reach:89 CRITICAL God symbol (47 in, 12 out). Consider splitting.
url_for Core utility fan-in:31 reach:45 HIGH Widely used utility (31 callers). Stable interface.
Blueprint Bridge fan-in:18 reach:34 moderate Coupling point between clusters.
Step 10: Generate docs and set up CI
$ roam describe --write
Wrote CLAUDE.md (98 lines) # auto-detects: CLAUDE.md, AGENTS.md, .cursor/rules, etc.
$ roam health --gate "score>=70"
Health: 78/100 — PASS
Ten commands. Complete picture: structure, dependencies, hotspots, health, context, safety checks, decomposition, and CI gates.
Roam is designed to be called by coding agents via shell commands. Instead of repeatedly grepping and reading files, the agent runs one roam command and gets structured output.
Decision order for agents:
| Situation | Command |
|---|---|
| First time in a repo | roam understand then roam tour |
| Need to modify a symbol | roam preflight <name> (blast radius + tests + fitness) |
| Debugging a failure | roam diagnose <name> (root cause ranking) |
| Need files to read | roam context <name> (files + line ranges) |
| Need to find a symbol | roam search <pattern> |
| Need file structure | roam file <path> |
| Pre-PR check | roam pr-risk HEAD~3..HEAD |
| What breaks if I change X? | roam impact <symbol> |
| Check for N+1 queries | roam n1 (implicit lazy-load detection) |
| Check auth coverage | roam auth-gaps (routes + controllers) |
| Check migration safety | roam migration-safety (idempotency guards) |
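
For agents that shell out from Python, the decision table above maps onto a thin wrapper. The sketch below builds the argument vector with `--json` placed before the subcommand, as in `roam --json <command>`. The helper names `roam_argv` and `run_roam` are hypothetical, and decoding the output with `json.loads` assumes the envelope is a single JSON document, which this document implies but does not spell out.

```python
import json
import subprocess


def roam_argv(command: str, *args: str, json_output: bool = True) -> list[str]:
    """Build the argument vector; the global --json flag precedes the subcommand."""
    argv = ["roam"]
    if json_output:
        argv.append("--json")
    return argv + [command, *args]


def run_roam(command: str, *args: str):
    """Run roam and decode its JSON output (needs roam on PATH and an indexed repo)."""
    result = subprocess.run(
        roam_argv(command, *args), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


# Building the command line does not require roam to be installed.
print(roam_argv("preflight", "Flask"))
print(roam_argv("pr-risk", "HEAD~3..HEAD"))
```

Passing arguments as a list avoids shell quoting issues. Note that `check=True` raises `CalledProcessError` on a non-zero exit, so a caller that uses `--gate` would want to catch that rather than treat it as a crash.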
Fastest setup:
roam describe --write # auto-detects your agent's config file
roam describe --write -o AGENTS.md # or specify an explicit path
roam describe --agent-prompt # compact ~500-token prompt (append to any config)
roam minimap --update # inject/refresh annotated codebase minimap in CLAUDE.md

Agent not using Roam correctly? If your agent is ignoring Roam and falling back to grep/read exploration, it likely doesn't have the instructions. Run:
roam describe --write # writes instructions to your agent's config (CLAUDE.md, AGENTS.md, etc.)

If you already have a config file and don't want to overwrite it:
roam describe --agent-prompt # prints a compact prompt — copy-paste into your existing config
roam minimap --update # injects an annotated codebase snapshot into CLAUDE.md (won't touch other content)

This teaches the agent which Roam command to use for each situation (e.g., roam preflight before changes, roam context for files to read, roam diagnose for debugging).
Copy-paste agent instructions
## Codebase navigation
This project uses `roam` for codebase comprehension. Always prefer roam over Glob/Grep/Read exploration.
Before modifying any code:
1. First time in the repo: `roam understand` then `roam tour`
2. Find a symbol: `roam search <pattern>`
3. Before changing a symbol: `roam preflight <name>` (blast radius + tests + fitness)
4. Need files to read: `roam context <name>` (files + line ranges, prioritized)
5. Debugging a failure: `roam diagnose <name>` (root cause ranking)
6. After making changes: `roam diff` (blast radius of uncommitted changes)
Additional: `roam health` (0-100 score), `roam impact <name>` (what breaks),
`roam pr-risk` (PR risk), `roam file <path>` (file skeleton).
Run `roam --help` for all commands. Use `roam --json <cmd>` for structured output.

Where to put this for each tool
| Tool | Config file |
|---|---|
| Claude Code | CLAUDE.md in your project root |
| OpenAI Codex CLI | AGENTS.md in your project root |
| Gemini CLI | GEMINI.md in your project root |
| Cursor | .cursor/rules/roam.mdc (add alwaysApply: true frontmatter) |
| Windsurf | .windsurf/rules/roam.md (add trigger: always_on frontmatter) |
| GitHub Copilot | .github/copilot-instructions.md |
| Aider | CONVENTIONS.md |
| Continue.dev | config.yaml rules |
| Cline | .clinerules/ directory |
Roam vs native tools
| Task | Use Roam | Use native tools |
|---|---|---|
| "What calls this function?" | `roam symbol <name>` | LSP / Grep |
| "What files do I need to read?" | `roam context <name>` | Manual tracing (5+ calls) |
| "Is it safe to change X?" | `roam preflight <name>` | Multiple manual checks |
| "Show me this file's structure" | `roam file <path>` | Read the file directly |
| "Understand project architecture" | `roam understand` | Manual exploration |
| "What breaks if I change X?" | `roam impact <symbol>` | No direct equivalent |
| "What tests to run?" | `roam affected-tests <name>` | Grep for imports (misses indirect) |
| "What's causing this bug?" | `roam diagnose <name>` | Manual call-chain tracing |
| "Codebase health score for CI" | `roam health --gate score>=70` | No equivalent |
Roam includes a Model Context Protocol server for direct integration with tools that support MCP.
pip install roam-code[mcp]
roam mcp
61 tools and 2 resources. All tools are read-only and query the index -- they never modify your code.
Lite mode (default): 16 core tools are exposed to keep the tool list manageable for agents. Set ROAM_MCP_LITE=0 to expose all 61 tools:
ROAM_MCP_LITE=0 roam mcp
Core tools in lite mode: roam_understand, roam_search_symbol, roam_context, roam_file_info, roam_deps, roam_preflight, roam_diff, roam_pr_risk, roam_affected_tests, roam_impact, roam_uses, roam_health, roam_dead_code, roam_complexity_report, roam_diagnose, roam_trace.
MCP tool list (all 61)
| Tool | Description |
|---|---|
| roam_understand | Full codebase briefing |
| roam_health | Health score (0-100) + issues |
| roam_preflight | Pre-change safety check |
| roam_search_symbol | Find symbols by name |
| roam_context | Files-to-read for modifying a symbol |
| roam_trace | Dependency path between two symbols |
| roam_impact | Blast radius of changing a symbol |
| roam_file_info | File skeleton with all definitions |
| roam_pr_risk | Risk score for pending changes |
| roam_breaking_changes | Detect breaking changes between refs |
| roam_affected_tests | Find tests affected by a change |
| roam_dead_code | List unreferenced exports |
| roam_complexity_report | Per-symbol cognitive complexity |
| roam_repo_map | Project skeleton with key symbols |
| roam_tour | Auto-generated onboarding guide |
| roam_diagnose | Root cause analysis for debugging |
| roam_visualize | Generate Mermaid or DOT architecture diagrams |
| roam_algo | Algorithm anti-pattern detection with language-aware tips |
| roam_ws_understand | Unified multi-repo workspace overview |
| roam_ws_context | Cross-repo augmented symbol context |
| roam_pr_diff | Structural PR diff: metric deltas, edge analysis, symbol changes |
| roam_budget_check | Check changes against architectural budgets |
| roam_effects | Side-effect classification (DB writes, network, filesystem) |
| roam_attest | Proof-carrying PR attestation with all evidence bundled |
| roam_capsule_export | Export sanitized structural graph (no code bodies) |
| roam_path_coverage | Find critical untested call paths (entry -> sink) |
| roam_forecast | Predict when metrics will exceed thresholds |
| roam_simulate | Counterfactual architecture simulator |
| roam_orchestrate | Multi-agent swarm partitioning |
| roam_fingerprint | Topology fingerprint comparison |
| roam_mutate | Graph-level code editing (move/rename/extract) |
| roam_dark_matter | Hidden co-change coupling detection |
| roam_closure | Minimal-change synthesis for rename/delete |
| roam_adversarial_review | Adversarial architecture review |
| roam_generate_plan | Agent work planner |
| roam_get_invariants | Architectural invariant discovery |
| roam_bisect_blame | Architectural git bisect |
| roam_doc_intent | Doc-to-code linking |
| roam_cut_analysis | Minimum graph cut analysis |
| roam_annotate_symbol | Attach persistent notes to symbols |
| roam_get_annotations | View stored annotations |
| roam_relate | Show relationship between two symbols |
| roam_search_semantic | Semantic search by meaning |
| roam_rules_check | Plugin DSL governance rules |
| roam_vuln_map | Vulnerability report ingestion |
| roam_vuln_reach | Vulnerability reachability paths |
| roam_ingest_trace | Ingest runtime trace data |
| roam_runtime_hotspots | Runtime hotspot analysis |
| roam_diff | Blast radius of uncommitted/committed changes |
| roam_symbol | Symbol definition, callers, callees, metrics |
| roam_deps | File-level import/imported-by relationships |
| roam_uses | All consumers of a symbol by edge type |
| roam_weather | Code hotspots: churn x complexity ranking |
| roam_debt | Hotspot-weighted technical debt prioritization |
| roam_n1 | Detect N+1 I/O patterns in ORM code |
| roam_auth_gaps | Find endpoints missing auth |
| roam_over_fetch | Detect models serializing too many fields |
| roam_missing_index | Find queries on non-indexed columns |
| roam_orphan_routes | Detect dead backend routes |
| roam_migration_safety | Detect non-idempotent migrations |
| roam_api_drift | Backend/frontend model mismatch detection |
Resources: roam://health (current health score), roam://summary (project overview)
Claude Code
claude mcp add roam-code -- roam mcp
Or add to .mcp.json in your project root:
{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"],
      "cwd": "/path/to/your/project"
    }
  }
}
Cursor
Add to .cursor/mcp.json:
{
  "mcpServers": {
    "roam-code": {
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
VS Code + Copilot
Add to .vscode/mcp.json:
{
  "servers": {
    "roam-code": {
      "type": "stdio",
      "command": "roam",
      "args": ["mcp"]
    }
  }
}
All you need is Python 3.9+ and pip install roam-code.
# .github/workflows/roam.yml
name: Roam Analysis
on: [pull_request]
jobs:
  roam:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: Cranot/roam-code@main
        with:
          command: health --gate score>=70
          comment: true
          fail-on-violation: true
Use roam init to auto-generate this workflow.
| Input | Default | Description |
|---|---|---|
| command | health | Roam command to run |
| python-version | 3.12 | Python version |
| comment | false | Post results as PR comment |
| fail-on-violation | false | Fail the job on violations |
| roam-version | (latest) | Pin to a specific version |
GitLab CI
roam-analysis:
  stage: test
  image: python:3.12-slim
  before_script:
    - pip install roam-code
  script:
    - roam index
    # Quote the gate so the shell does not treat ">" as an output redirect
    - roam health --gate "score>=70"
    - roam --json pr-risk origin/main..HEAD > roam-report.json
  artifacts:
    paths:
      - roam-report.json
  rules:
    - if: $CI_MERGE_REQUEST_IID
Azure DevOps / any CI
Universal pattern:
pip install roam-code
roam index
roam health --gate "score>=70" # exit 1 on failure; quoted so the shell does not treat ">" as a redirect
roam --json health > report.json
Roam exports analysis results in SARIF 2.1.0 format for GitHub Code Scanning.
from roam.output.sarif import health_to_sarif, write_sarif
# health_data is assumed to hold the results of a prior health analysis
sarif = health_to_sarif(health_data)
write_sarif(sarif, "roam-health.sarif")
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: roam-health.sarif
Zero infrastructure, zero vendor lock-in, zero data leaving your network.
| Tool | Annual cost (20-dev team) | Infrastructure | Setup time |
|---|---|---|---|
| SonarQube Server | $15,000-$45,000 | Self-hosted server | Days |
| CodeScene | $20,000-$60,000 | SaaS or on-prem | Hours |
| Code Climate | $12,000-$36,000 | SaaS | Hours |
| Roam | $0 (MIT license) | None (local) | 5 minutes |
Team rollout guide
Week 1-2 (pilot): 1-2 developers run roam init on one repo. Use roam preflight before changes, roam pr-risk before PRs.
Week 3-4 (expand): Add roam health --gate score>=60 to CI as a non-blocking check.
Month 2+ (standardize): Tighten to --gate score>=70. Expand to additional repos. Track trajectory with roam trend.
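The rollout above tightens the gate from score>=60 to score>=70. As a minimal sketch of how such a gate expression could map to a CI exit code, the Python below parses a `metric operator threshold` string and compares it against a metrics dictionary. The names `evaluate_gate` and `exit_code`, and the grammar itself, are illustrative assumptions, not Roam's actual parser.

```python
import operator
import re

# Hypothetical gate grammar: "<metric><op><number>", e.g. "score>=70".
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
_GATE_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")


def evaluate_gate(expression, metrics):
    """Return True when the metric named in `expression` satisfies the gate."""
    match = _GATE_RE.match(expression)
    if match is None:
        raise ValueError(f"unrecognized gate expression: {expression!r}")
    name, op, threshold = match.groups()
    return _OPS[op](metrics[name], float(threshold))


def exit_code(expression, metrics):
    """Map a gate result to a process exit code: 0 on pass, 1 on failure."""
    return 0 if evaluate_gate(expression, metrics) else 1


if __name__ == "__main__":
    pilot = {"score": 65}
    print(exit_code("score>=60", pilot))  # passes the week 3-4 gate
    print(exit_code("score>=70", pilot))  # fails the month 2+ gate
```

When a gate like this is passed on a shell command line, quote it (`--gate "score>=70"`); otherwise the shell treats `>` as an output redirect.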
Complements your existing stack
| If you use... | Roam adds... |
|---|---|
| SonarQube | Architecture-level analysis: dependency cycles, god components, blast radius, health scoring |
| CodeScene | Free, local alternative for health scoring and hotspot analysis |
| ESLint / Pylint | Cross-language architecture checks. Linters enforce style per file; Roam enforces architecture across the codebase |
| LSP | AI-agent-optimized queries. roam context answers "what calls this?" with PageRank-ranked results in one call |
| Language | Extensions | Symbols | References | Inheritance |
|---|---|---|---|---|
| Python | .py .pyi | classes, functions, methods, decorators, variables | imports, calls, inheritance | extends, __all__ exports |
| JavaScript | .js .jsx .mjs .cjs | classes, functions, arrow functions, CJS exports | imports, require(), calls | extends |
| TypeScript | .ts .tsx .mts .cts | interfaces, type aliases, enums + all JS | imports, calls, type refs | extends, implements |
| Java | .java | classes, interfaces, enums, constructors, fields | imports, calls | extends, implements |
| Go | .go | structs, interfaces, functions, methods, fields | imports, calls | embedded structs |
| Rust | .rs | structs, traits, impls, enums, functions | use, calls | impl Trait for Struct |
| C / C++ | .c .h .cpp .hpp .cc | structs, classes, functions, namespaces, templates | includes, calls | extends |
| C# | .cs | classes, interfaces, structs, enums, records, methods, constructors, properties, delegates, events, fields | using directives, calls, new, attributes | extends, implements |
| PHP | .php | classes, interfaces, traits, enums, methods, properties | namespace use, calls, static calls, new | extends, implements, use (traits) |
| Visual FoxPro | .prg | functions, procedures, classes, methods, properties, constants | DO, SET PROCEDURE/CLASSLIB, CREATEOBJECT, =func(), obj.method() | DEFINE CLASS ... AS |
| YAML (CI/CD) | .yml .yaml | GitLab CI: jobs, template anchors, stages. GitHub Actions: workflow name, jobs, reusable workflows. Generic: top-level keys | extends:, needs:, !reference, uses: | -- |
| HCL / Terraform | .tf .tfvars .hcl | resource, data, variable, output, module, provider, locals entries | var.*, module.*, data.*, local.*, resource cross-refs | -- |
| Vue | .vue | via <script> block extraction (TS/JS) | imports, calls, type refs | extends, implements |
| Svelte | .svelte | via <script> block extraction (TS/JS) | imports, calls, type refs | extends, implements |
Salesforce ecosystem (Tier 1)
| Language | Extensions | Symbols | References |
|---|---|---|---|
| Apex | .cls .trigger | classes, triggers, SOQL, annotations | imports, calls, System.Label, generic type refs |
| Aura | .cmp .app .evt .intf .design | components, attributes, methods, events | controller refs, component refs |
| LWC (JavaScript) | .js (in LWC dirs) | anonymous class from filename | @salesforce/apex/, @salesforce/schema/, @salesforce/label/ |
| Visualforce | .page .component | pages, components | controller/extensions, merge fields, includes |
| SF Metadata XML | *-meta.xml | objects, fields, rules, layouts | Apex class refs, formula field refs, Flow actionCalls |
Cross-language edges mean roam impact AccountService shows blast radius across Apex, LWC, Aura, Visualforce, and Flows.
| Language | Extensions | Symbols | References | Inheritance |
|---|---|---|---|---|
| Ruby | .rb | classes, modules, methods, singleton methods, constants | require, require_relative, include/extend, calls, ClassName.new | class inheritance |
| JSONC | .jsonc | via JSON grammar | -- | -- |
| MDX | .mdx | via Markdown grammar | -- | -- |
Tier 2: Kotlin (.kt .kts), Swift (.swift), Scala (.scala .sc)
Tier 2 languages get symbol extraction and basic inheritance via a generic tree-sitter walker.
| Metric | Value |
|---|---|
| Index 200 files | ~3-5s |
| Index 3,000 files | ~2 min |
| Incremental (no changes) | <1s |
| Any query command | <0.5s |
Detailed benchmarks
| Project | Language | Files | Symbols | Edges | Index Time | Rate |
|---|---|---|---|---|---|---|
| Express | JS | 211 | 624 | 804 | 3s | 70 files/s |
| Axios | JS | 237 | 1,065 | 868 | 6s | 41 files/s |
| Vue | TS | 697 | 5,335 | 8,984 | 25s | 28 files/s |
| Laravel | PHP | 3,058 | 39,097 | 38,045 | 1m46s | 29 files/s |
| Svelte | TS | 8,445 | 16,445 | 19,618 | 2m40s | 52 files/s |
| Repo | Language | Score | Coverage | Edge Density | Commands |
|---|---|---|---|---|---|
| Laravel | PHP | 9.55 | 91.2% | 0.97 | 29/29 |
| Vue | TS | 9.27 | 85.8% | 1.68 | 29/29 |
| Svelte | TS | 9.04 | 94.7% | 1.19 | 29/29 |
| Axios | JS | 8.98 | 85.9% | 0.82 | 29/29 |
| Express | JS | 8.46 | 96.0% | 1.29 | 29/29 |
| Metric | Value |
|---|---|
| 1,600-line file → roam file | ~5,000 chars (~70:1 compression) |
| Full project map | ~4,000 chars |
| --compact mode | 40-50% additional token reduction |
| roam preflight replaces | 5-7 separate agent tool calls |
Codebase
|
[1] Discovery ──── git ls-files (respects .gitignore + .roamignore)
|
[2] Parse ──────── tree-sitter AST per file (26 languages)
|
[3] Extract ────── symbols + references (calls, imports, inheritance)
|
[4] Resolve ────── match references to definitions → edges
|
[5] Metrics ────── adaptive PageRank, betweenness, cognitive complexity, Halstead
|
[6] Algorithms ── 23-pattern anti-pattern catalog (O(n^2) loops, N+1, recursion)
|
[7] Git ────────── churn, co-change matrix, authorship, Renyi entropy
|
[8] Clusters ───── Louvain community detection
|
[9] Health ─────── per-file scores (7-factor) + composite score (0-100)
|
[10] Store ─────── .roam/index.db (SQLite, WAL mode)
After the first full index, roam index only re-processes changed files (mtime + SHA-256 hash). Incremental updates are near-instant.
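As a rough illustration of mtime + SHA-256 change detection, the sketch below checks the cheap signal (mtime) first and hashes a file only when the mtime has moved. The function names and the `(mtime_ns, sha256)` tuple are assumptions made for this example; Roam's own incremental.py may store and compare these values differently.

```python
import hashlib
import os


def file_fingerprint(path):
    """Return (mtime_ns, sha256 hex digest) for a file."""
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return os.stat(path).st_mtime_ns, digest


def needs_reindex(path, stored):
    """Decide whether `path` changed since `stored` = (mtime_ns, sha256).

    An unchanged mtime is treated as "no change" without reading the file.
    Only when the mtime differs is the content hashed, so a file that was
    touched but left byte-identical is not re-processed.
    """
    if stored is None:
        return True  # never indexed before
    stored_mtime, stored_hash = stored
    if os.stat(path).st_mtime_ns == stored_mtime:
        return False
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest() != stored_hash
```

Checking mtime before hashing keeps the no-change case down to one `stat` call per file, which is consistent with a near-instant incremental run.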
Graph algorithms
- Adaptive PageRank -- damping factor auto-tunes based on cycle density (0.82-0.92); identifies the most important symbols (used by map, search, context)
- Personalized PageRank -- distance-weighted blast radius for impact (Gleich, 2015)
- Adaptive betweenness centrality -- exact for small graphs, sqrt-scaled sampling for large (Brandes & Pich, 2007); finds bottleneck symbols
- Edge betweenness centrality -- identifies critical cycle-breaking edges in SCCs (Brandes, 2001)
- Tarjan's SCC -- detects dependency cycles with tangle ratio
- Propagation Cost -- fraction of system affected by any change, via transitive closure (MacCormack, Rusnak & Baldwin, 2006)
- Algebraic connectivity (Fiedler value) -- second-smallest Laplacian eigenvalue; measures architectural robustness (Fiedler, 1973)
- Louvain community detection -- groups related symbols into clusters
- Modularity Q-score -- measures if cluster boundaries match natural community structure (Newman, 2004)
- Conductance -- per-cluster boundary tightness: cut(S, S_bar) / min(vol(S), vol(S_bar)) (Yang & Leskovec)
- Topological sort -- computes dependency layers, Gini coefficient for layer balance (Gini, 1912), weighted violation severity
- k-shortest simple paths -- traces dependency paths with coupling strength
- Renyi entropy (order 2) -- measures co-change distribution; more robust to outliers than Shannon (Renyi, 1961)
- Mann-Kendall trend test -- non-parametric degradation detection, robust to noise (Mann, 1945; Kendall, 1975)
- Sen's slope estimator -- robust trend magnitude, resistant to outliers (Sen, 1968)
- NPMI -- Normalized Pointwise Mutual Information for coupling strength (Bouma, 2009)
- Lift -- association rule mining metric for co-change statistical significance (Agrawal & Srikant, 1994)
- Halstead metrics -- volume, difficulty, effort, and predicted bugs from operator/operand counts (Halstead, 1977)
- SQALE remediation cost -- time-to-fix estimates per issue type for tech debt prioritization (Letouzey, 2012)
- Algorithm anti-pattern catalog -- 23 patterns detecting suboptimal algorithms (quadratic loops, N+1 queries, quadratic string building, branching recursion, manual top-k, loop-invariant calls) with confidence calibration via caller-count and bounded-loop analysis
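To make one of the metrics above concrete, here is a small pure-Python sketch of propagation cost: the fraction of ordered (source, target) pairs where the target is reachable from the source. It runs a breadth-first search from every node rather than using NetworkX, and it counts each node as reaching itself. Both are choices made for this example, not a description of Roam's implementation.

```python
from collections import deque


def propagation_cost(edges, nodes):
    """Density of the transitive closure of a dependency graph.

    `edges` is an iterable of (source, target) pairs and `nodes` lists every
    node. Each node is counted as reaching itself (path length 0).
    """
    adjacency = {node: [] for node in nodes}
    for src, dst in edges:
        adjacency[src].append(dst)

    reachable_pairs = 0
    for start in nodes:
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in adjacency[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        reachable_pairs += len(seen)
    return reachable_pairs / (len(nodes) ** 2)


if __name__ == "__main__":
    nodes = ["a", "b", "c"]
    chain = [("a", "b"), ("b", "c")]
    cycle = chain + [("c", "a")]
    print(propagation_cost(chain, nodes))  # 6 of 9 pairs reachable
    print(propagation_cost(cycle, nodes))  # every pair reachable
```

Adding the single edge that closes the cycle raises the value from 6/9 to 1.0, which is why cycles weigh so heavily on change impact.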
Health scoring
Composite health score (0-100) using a weighted geometric mean of sigmoid health factors. Non-compensatory: a zero in any dimension cannot be masked by high scores in others.
| Factor | Weight | What it measures |
|---|---|---|
| Tangle ratio | 30% | % of symbols in dependency cycles |
| God components | 20% | Symbols with extreme fan-in/fan-out |
| Bottlenecks | 15% | High-betweenness chokepoints |
| Layer violations | 15% | Upward dependency violations (severity-weighted by layer distance) |
| Per-file health | 20% | Average of 7-factor file health scores |
Each factor uses sigmoid health: h = e^(-signal/scale) (1 = pristine, approaches 0 = worst). Score = 100 * product(h_i ^ w_i). Also reports propagation cost (MacCormack 2006) and algebraic connectivity (Fiedler 1973). Per-file health (1-10) combines: cognitive complexity (triangular nesting penalty per Sweller's Cognitive Load Theory), indentation complexity, cycle membership, god component membership, dead export ratio, co-change entropy, and churn amplification.
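The composite formula can be worked through in a few lines of Python. The weights come from the table above; the factor keys, signal values, and scales are placeholders chosen for illustration, since the actual per-factor signals and scales are not given here.

```python
import math

# Weights from the health factor table (they sum to 1.0).
WEIGHTS = {
    "tangle_ratio": 0.30,
    "god_components": 0.20,
    "bottlenecks": 0.15,
    "layer_violations": 0.15,
    "file_health": 0.20,
}


def factor_health(signal, scale):
    """h = e^(-signal/scale): 1.0 when signal is 0, approaching 0 as it grows."""
    return math.exp(-signal / scale)


def composite_score(signals, scales):
    """Score = 100 * product(h_i ^ w_i), a weighted geometric mean."""
    return 100 * math.prod(
        factor_health(signals[name], scales[name]) ** weight
        for name, weight in WEIGHTS.items()
    )


if __name__ == "__main__":
    scales = {name: 1.0 for name in WEIGHTS}  # hypothetical scales
    clean = {name: 0.0 for name in WEIGHTS}
    print(composite_score(clean, scales))  # no problem signals at all

    tangled = dict(clean, tangle_ratio=1.0)  # one factor at signal == scale
    print(composite_score(tangled, scales))  # 100 * e^(-0.30)

    broken = dict(clean, tangle_ratio=50.0)  # one factor collapses
    print(composite_score(broken, scales))   # near 0 despite four clean factors
```

The last case shows the non-compensatory property: because the factors are multiplied rather than averaged, four pristine factors cannot lift the score when one factor is close to zero.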
Roam is not a replacement for your linter, LSP, or SonarQube. It fills a different gap: giving AI agents structural understanding of the codebase in a format optimized for LLM consumption.
| Tool | What it does | How Roam differs |
|---|---|---|
| ctags / cscope | Symbol index for editors | Roam adds graph metrics, git signals, architecture analysis, and AI-optimized output |
| LSP (pyright, gopls) | Real-time type checking | LSP requires a running server and file:line:col queries. Roam is offline, exploratory, and cross-language |
| Sourcegraph / Cody | Code search + AI | Requires hosted deployment. Roam is local-only, MIT-licensed, zero infrastructure |
| Aider repo map | Tree-sitter + PageRank | Context selection for chat. Roam adds git signals, 95 architecture commands, CI gates, multi-agent orchestration |
| CodeScene | Behavioral code analysis | Commercial SaaS ($20-60k/yr). Roam is free, local, uses peer-reviewed algorithms (Mann-Kendall, NPMI, Personalized PageRank) |
| SonarQube | Code quality + security | Heavy server ($15-45k/yr). Roam's cognitive complexity follows SonarSource spec |
| Serena MCP | LSP-based symbol navigation | 6 MCP tools for navigation. Roam has 61 MCP tools covering architecture, governance, simulation, and orchestration |
| Repomix / code2prompt | Codebase packing for LLMs | Flat file packing with no graph intelligence. Roam gives structural queries, not raw file dumps |
| Augment Code | Cloud context engine | Cloud-hosted, enterprise-priced. Roam is 100% local, air-gapped, MIT-licensed |
| grep / ripgrep | Text search | No semantic understanding. Can't distinguish definitions from usage |
Does Roam send any data externally? No. Zero network calls. No telemetry, no analytics, no update checks.
Can Roam run in air-gapped environments? Yes. Once installed, no internet access is required.
Does Roam modify my source code?
Read-only by default. Creates .roam/ with an index database. The roam mutate command can apply code changes (move/rename/extract) but defaults to --dry-run mode — you must explicitly pass --apply to write changes.
How does Roam handle monorepos? Indexes from the root. Batched SQL handles 100k+ symbols. Incremental updates stay fast.
How does Roam handle multi-repo projects (e.g., frontend + backend)?
Use roam ws init <repo1> <repo2> to create a workspace. Each repo keeps its own index; a workspace overlay DB stores cross-repo API edges. roam ws resolve scans for REST endpoints and matches frontend calls to backend routes. Then roam ws context, roam ws trace, etc. work across repos.
Is Roam compatible with SonarQube / CodeScene? Yes. Roam complements existing tools. Both can run in the same CI pipeline. SARIF output integrates with GitHub Code Scanning.
Static analysis trade-offs:
- Static analysis primarily -- can't trace dynamic dispatch, reflection, or eval'd code. Runtime trace ingestion (roam ingest-trace) adds production data but requires external trace export
- Import resolution is heuristic -- complex re-exports or conditional imports may not resolve
- Limited cross-language edges -- Salesforce, Protobuf, REST API, and multi-repo edges are supported, but not arbitrary FFI
- Tier 2 languages (Kotlin, Swift, Scala) get basic symbol extraction only
- Large monorepos (100k+ files) may have slow initial indexing
| Problem | Solution |
|---|---|
| roam: command not found | Ensure install location is on PATH. For uv: uv tool update-shell |
| Another indexing process is running | Delete .roam/index.lock and retry |
| database is locked | roam index --force to rebuild |
| Unicode errors on Windows | chcp 65001 for UTF-8 |
| Symbol resolves to wrong file | Use file:symbol syntax: roam symbol myfile:MyFunction |
| Health score seems wrong | roam health --json for factor breakdown |
| Index stale after git pull | roam index (incremental). After major refactors: roam index --force |
# Update
pipx upgrade roam-code
uv tool upgrade roam-code
pip install --upgrade roam-code
# Uninstall
pipx uninstall roam-code
uv tool uninstall roam-code
pip uninstall roam-code
Delete .roam/ from your project root to clean up local data.
git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e ".[dev]" # includes pytest, ruff
pytest tests/ # 2656 tests, Python 3.9-3.13
# Or use Make targets:
make dev # install with dev extras
make test # run tests
make lint # ruff check
Project structure
roam-code/
├── pyproject.toml
├── action.yml # Reusable GitHub Action
├── src/roam/
│ ├── __init__.py # Version (from pyproject.toml)
│ ├── cli.py # Click CLI (95 commands, 7 categories)
│ ├── mcp_server.py # MCP server (61 tools, 2 resources)
│ ├── db/
│ │ ├── connection.py # SQLite (WAL, pragmas, batched IN)
│ │ ├── schema.py # Tables, indexes, migrations
│ │ └── queries.py # Named SQL constants
│ ├── index/
│ │ ├── indexer.py # Orchestrates full pipeline
│ │ ├── discovery.py # git ls-files, .gitignore
│ │ ├── parser.py # Tree-sitter parsing
│ │ ├── symbols.py # Symbol + reference extraction
│ │ ├── relations.py # Reference resolution -> edges
│ │ ├── complexity.py # Cognitive complexity (SonarSource) + Halstead metrics
│ │ ├── git_stats.py # Churn, co-change, blame, Renyi entropy
│ │ ├── incremental.py # mtime + hash change detection
│ │ ├── file_roles.py # Smart file role classifier
│ │ └── test_conventions.py # Pluggable test naming adapters
│ ├── languages/
│ │ ├── base.py # Abstract LanguageExtractor
│ │ ├── registry.py # Language detection + aliasing
│ │ ├── *_lang.py # One file per language (17 Tier 1)
│ │ └── generic_lang.py # Tier 2 fallback
│ ├── bridges/
│ │ ├── base.py, registry.py # Cross-language bridge framework
│ │ ├── bridge_salesforce.py # Apex <-> Aura/LWC/Visualforce
│ │ └── bridge_protobuf.py # .proto -> Go/Java/Python stubs
│ ├── catalog/
│ │ ├── tasks.py # Universal algorithm catalog (23 patterns)
│ │ └── detectors.py # Anti-pattern detectors with confidence calibration
│ ├── workspace/
│ │ ├── config.py # .roam-workspace.json
│ │ ├── db.py # Workspace overlay DB
│ │ ├── api_scanner.py # REST API endpoint detection
│ │ └── aggregator.py # Cross-repo aggregation
│ ├── graph/
│ │ ├── builder.py, pagerank.py # DB -> NetworkX, PageRank
│ │ ├── cycles.py, clusters.py # Tarjan SCC, propagation cost, Louvain, modularity Q
│ │ ├── layers.py, pathfinding.py # Topo layers, k-shortest paths
│ │ ├── split.py, why.py # Decomposition, role classification
│ │ └── anomaly.py # Statistical anomaly detection
│ ├── commands/
│ │ ├── resolve.py # Shared symbol resolution
│ │ ├── graph_helpers.py # Shared graph utilities (adj builders, BFS)
│ │ ├── context_helpers.py # Data-gathering helpers for context command
│ │ ├── gate_presets.py # Framework-specific gate rules
│ │ └── cmd_*.py # One module per command
│ ├── analysis/
│ │ └── effects.py # Side-effect classification engine
│ ├── refactor/
│ │ ├── codegen.py # Import generation (Python/JS/Go)
│ │ └── transforms.py # move/rename/add-call/extract transforms
│ ├── rules/
│ │ └── engine.py # YAML rule parser + graph query evaluator
│ ├── runtime/
│ │ ├── trace_ingest.py # OpenTelemetry/Jaeger/Zipkin ingestion
│ │ └── hotspots.py # Runtime hotspot analysis
│ ├── search/
│ │ ├── tfidf.py # TF-IDF semantic search engine
│ │ └── index_embeddings.py # Embedding index builder
│ ├── security/
│ │ ├── vuln_store.py # CVE/vulnerability storage
│ │ └── vuln_reach.py # Vulnerability reachability paths
│ └── output/
│ ├── formatter.py # Token-efficient formatting
│ ├── sarif.py # SARIF 2.1.0 output
│ └── schema_registry.py # JSON envelope schema versioning
└── tests/ # Test suite across 70 test files
| Package | Purpose |
|---|---|
| click >= 8.0 | CLI framework |
| tree-sitter >= 0.23 | AST parsing |
| tree-sitter-language-pack >= 0.6 | 165+ grammars |
| networkx >= 3.0 | Graph algorithms |
Optional: fastmcp >= 2.0 (MCP server — install with pip install roam-code[mcp])
- Composite health scoring (v7.0)
- MCP server -- 19 tools, 2 resources (v7.0-v7.4)
- SARIF 2.1.0 output (v7.0)
- GitHub Action (v7.0)
- Large-repo batched SQL (v7.1)
- Salesforce cross-language edges (v7.1)
- Cognitive load index, tour, diagnose (v7.2)
- Multi-repo workspace support (v7.4)
- Research-backed algorithms: adaptive PageRank, Personalized PageRank, Mann-Kendall, NPMI, Sen's slope, sigmoid-bounded health, Gini layer balance (v7.4)
- Advanced math: Halstead metrics, Renyi entropy, propagation cost, algebraic connectivity, modularity Q-score, conductance, edge betweenness, SQALE remediation cost, multiplicative PR risk, weighted geometric mean health, dead code confidence scoring, cyclomatic density (v7.5)
- C# Tier 1 support (v8.0)
- Deep Python extractor: instance attrs, assignment type refs, forward refs (v8.1)
- Internal complexity reduction: 50+ functions refactored below CC=25 (v9.0)
- Scoring math audit: fixed boolean-op double-counting, unified percentile implementations (v9.0)
- Test speed optimization: in-process indexing for fixtures (v9.0)
- Algorithm anti-pattern detection: 23-pattern catalog, AST signal extraction, confidence calibration (v9.0)
- .roamignore support for excluding files from indexing (v9.1)
- Implicit N+1 I/O detection: ORM model $appends/accessor lazy-load analysis (v9.1)
- 7 new backend analysis commands: n1, auth-gaps, over-fetch, missing-index, orphan-routes, migration-safety, api-drift (v9.1)
- Ruby Tier 1 support: classes, modules, methods, constants, require/include (v9.1)
- --sarif CLI flag for direct SARIF export on dead, health, complexity, rules (v9.1)
- Architecture simulation: roam simulate move|extract|merge|delete (v9.1)
- Multi-agent orchestration: roam orchestrate --agents N with zero-conflict partitioning (v9.1)
- Graph-level editing: roam mutate move|rename|add-call|extract (v9.1)
- Vulnerability mapping: roam vuln-map + roam vuln-reach with CVE reachability paths (v9.1)
- Runtime trace overlay: roam ingest-trace + roam hotspots (v9.1)
- Governance DSL: roam rules with .roam/rules/ YAML plugin system (v9.1)
- Topology fingerprinting: roam fingerprint with cross-repo comparison (v9.1)
- 30+ new commands: simulate, orchestrate, mutate, closure, adversarial, plan, invariants, bisect, intent, cut, effects, dark-matter, capsule, forecast, path-coverage, fingerprint, rules, vuln-map, vuln-reach, ingest-trace, hotspots, and more (v9.1)
- MCP lite mode: ROAM_MCP_LITE=1 for 15 core tools (v10.0)
- YAML/HCL Tier 1 support: CI/CD pipelines, Terraform configs (v10.0)
- Compact annotated minimap for CLAUDE.md injection (v10.0)
- Algorithm detection false-positive reduction: receiver-aware loop-invariant analysis (v10.0)
- Terminal demo GIF
- Docker image for CI
- VS Code extension (CodeLens for callers/callees, inline health indicators)
- File-system watch mode for sub-second incremental re-indexing
- Embedding-based semantic search via local models (Ollama integration)
- Official GitHub Action marketplace listing
- Token budget management (--max-tokens flag for context-aware output)
git clone https://github.com/Cranot/roam-code.git
cd roam-code
pip install -e .
pytest tests/ # All 2656 tests must pass
Good first contributions: add a Tier 1 language (see go_lang.py or php_lang.py as templates), improve reference resolution, add benchmark repos, extend SARIF converters, add MCP tools.
Please open an issue first to discuss larger changes.