feat(memory): add citation support, line tracking, and cognitive routines#63
feat(memory): add citation support, line tracking, and cognitive routines#63joe-rlo wants to merge 25 commits intonearai:mainfrom
Conversation
- Add 'path' field to RankedResult and SearchResult structs - Include document path in FTS and vector search queries - Add 'citation' field to memory_search tool output - Prepare for future line number tracking (line_start/line_end fields) Phase 1 of citation support - path-level citations work now, line-level citations need chunker changes (Phase 2).
- Add ChunkWithPosition struct with line_start/line_end - Add chunk_document_with_positions() function - Add line_start/line_end fields to MemoryChunk - Add citation() method for formatting 'line X' or 'lines X-Y' - Add comprehensive tests for position tracking Enables citations like 'Source: daily/2026-02-12.md#lines 15-23'
- Add CheckpointTracker for tracking exchanges and triggering checkpoints - Add pre_game_instructions() for task preparation checklist - Add after_action_template() for post-task reviews - Add post_compaction_recovery() for context loss recovery - Add CognitiveConfig for enabling/disabling features - Add write_checkpoint() helper for workspace integration Implements the cognitive architecture patterns from OpenClaw: - Pre-game routine before task execution - Mandatory checkpointing every N exchanges (default 15) - After-action review templates - Post-compaction recovery instructions
- Add cognitive_instructions() helper function - Include mandatory memory recall requirements in system prompt - Add pre-game routine checklist - Add checkpointing reminder Ensures the agent follows recall-first patterns and pre-game routines automatically via system prompt injection.
- Add GeminiEmbeddings struct with embedContent and batchEmbedContents support - Support gemini-embedding-001 (768 dims) and text-embedding-004 - Proper error handling for auth, rate limits, and API errors - Export from workspace module
- Add V9 migration for line_start/line_end columns - Add insert_chunk_with_lines() for storing line positions - Update FTS and vector search queries to return line numbers - Update RankedResult and SearchResult with line fields - Format citations as 'path#line N' or 'path#lines N-M' Citations now work end-to-end: 1. Chunker tracks line positions 2. DB stores line_start/line_end 3. Search returns line info 4. memory_search formats proper citations
- Add 'index' command for reindexing embeddings - Add --full flag to search for complete content - Add --from/--lines flags to read for line ranges - Show citations with line numbers in search results - Add has_embeddings() helper to Workspace CLI now supports: ironclaw memory search 'query' --limit 10 --full ironclaw memory read MEMORY.md --from 15 --lines 20 ironclaw memory index [--force]
- normalize_path() now rejects '..' sequences and null bytes - Added validate_path() wrapper with proper error handling - Added InvalidPath variant to WorkspaceError - All file ops (read/write/append/delete/list) now validate paths Security audit found: - [FIXED] Path traversal in workspace operations - [DOCUMENTED] Log injection in cognitive routines (caller responsibility) - [DOCUMENTED] SQL stored proc in repository (external validation) Tests: - 4 new unit tests for path validation - Created tests/security_integration.rs (8 tests) - Created tests/cognitive_integration.rs (9 tests) - Created tests/chunker_integration.rs (14 tests) - All 24 new tests pass
Summary of ChangesHello @joe-rlo, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly upgrades the agent's memory and operational capabilities. It introduces precise source citation for memory search results, enhances agent reliability through structured cognitive routines like checkpointing and pre-task planning, and strengthens the system's security by preventing malicious path manipulations. These changes aim to make the agent more robust, transparent, and capable of handling complex, long-running tasks. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces key features from OpenClaw to IronClaw, including memory citations with line number tracking, cognitive routines, and significant security hardening against path traversal. However, critical security issues were identified, specifically a potential prompt injection vector in the cognitive checkpointing logic and the insecure transmission of API keys in the Gemini embedding provider. Furthermore, the new citation feature is currently ineffective for new or modified documents because the reindex_document function incorrectly uses chunk_document instead of chunk_document_with_positions. Minor comments were also left regarding code efficiency, CLI usability (including a panic condition in line-range reading logic), and potential refactoring.
- fix(security): Move Gemini API key from URL query param to x-goog-api-key header - fix(security): Add sanitize_checkpoint_text() to prevent markdown/prompt injection - refactor: Add SearchResult::citation() method to centralize citation formatting - refactor: Improve add_topic() efficiency (avoid unnecessary String allocation) - cleanup: Remove unimplemented --force flag from memory index command Addresses all 5 issues from Gemini Code Assist review: nearai#63
|
/gemini review |
There was a problem hiding this comment.
Code Review
This is a great PR that adds significant new capabilities to the memory system, including citation support with line numbers, cognitive routines for the agent, and important security hardening against path traversal. The code is well-structured and includes a comprehensive set of new tests.
I have a couple of suggestions for improvement in the comments.
One critical point that is not in the changed files but is highly relevant to this PR's features: the reindex_document function in src/workspace/mod.rs does not seem to be updated to use the new chunk_document_with_positions function. This means that documents created or updated via write and append will not have line number information stored for their chunks, which undermines the new citation feature. This seems like a critical oversight that should be addressed for the new features to work as intended.
Critical fix: reindex_document was using chunk_document() instead of chunk_document_with_positions(), meaning line numbers were never stored. Now properly tracks line_start/line_end for all chunks. Thanks to Gemini Code Assist for catching this oversight.
Replace verbose match block with map_or + iterator methods. More idiomatic Rust and easier to read.
Renamed test_checkpoint_content_escaping to test_checkpoint_content_sanitization. Now verifies specific sanitization transformations: - # escaped to \# - Newlines flattened to spaces - --- replaced with en-dashes - ``` replaced with '''
Code Review: PR #63 - "feat(memory): add citation support, line tracking, and cognitive routines"SummaryThis PR introduces significant features to the IronClaw memory system:
The PR adds ~1600 new lines across 12 files with comprehensive test coverage. The changes are well-structured and include proper migration scripts. ProsExcellent Test Coverage
Good Security Practices
Clean Code Organization
Practical Features
Database Design
Rust Best Practices
Concerns1. Potential Bug in
|
Memory Guardian adds automatic memory discipline that doesn't rely on the agent choosing to follow instructions. It operates at the system level, writing breadcrumbs and injecting checkpoint pressure regardless of agent attention state. ## New Features (cognitive.rs) - Pre-compaction breadcrumbs: Raw state snapshot written synchronously before context compression. Acts as a 'black box recorder' — survives even when the LLM-generated compaction summary is lossy or incomplete. - Auto-breadcrumbs: Every 10 tool calls, automatically append a breadcrumb to daily notes with recent tool names and call counts. Creates a paper trail even if the agent never explicitly writes a checkpoint. - Checkpoint gate: Escalating pressure injected into the system prompt when too many tool calls happen without a checkpoint write: - Gentle (12 calls): 'Consider checkpointing' - Firm (20 calls): 'Checkpoint needed' - Urgent (30 calls): 'STOP and checkpoint NOW' Uses escalation (not hard blocking) because hard blocks would break legitimate multi-step workflows. - Memory search tracking: Counts turns since last memory_search call, reminds agent to search before answering memory questions (default: 8). - Tool calls as metric (not just exchanges): Tool calls are a better proxy for 'work that could be lost' — an agent might have 3 exchanges but 30 tool calls of file reads, shell commands, and code writes. ## Design Decisions - Escalating vs hard blocking: Escalating pressure works with the agent's attention mechanism. Urgent-level messages are emphatic enough to interrupt most workflows without breaking them. - Breadcrumbs don't reset checkpoint counter: If they did, the agent would never feel pressure to write its own checkpoints — the guardian would always relieve the pressure first. - CheckpointTracker on Thread (not Session): Each conversation thread has independent memory state. Not serialized — resets on session reload, which is fine since it's a running tally for the current conversation. ## Integration Points (agent_loop.rs) - Pre-compaction breadcrumb before auto-compaction and manual /compact - Pre-reset breadcrumb before /new thread creation - Tool call tracking after every tool execution - Guardian context (gate + search reminder) injected into system prompt - Exchange tracking on turn start ## Also Fixed - workspace/mod.rs: Fixed normalize_path Option<String> handling, added insert_chunk_with_lines to WorkspaceStorage enum, replaced missing normalize_directory with normalize_path - cli/memory.rs: Fixed missing pattern fields and arguments for Search (full) and Read (from, lines) commands, added Index arm ## Tests 14 new unit tests covering: - Tool call tracking and breadcrumb triggering - Checkpoint detection (writes to daily/*.md) - Checkpoint gate escalation at 12/20/30 thresholds - Memory search reminder timing - Rolling tool window (last 10) - Breadcrumb content generation - Counter reset behavior (breadcrumb vs checkpoint) - Config defaults
ilblackdragon
left a comment
There was a problem hiding this comment.
Review: feat(memory): add citation support, line tracking, and cognitive routines
This is a well-designed PR that brings valuable features (citations, cognitive routines, path traversal prevention). The architecture of the memory guardian with its dual-layer approach (prompt-level guidance + system-level enforcement) is thoughtful and well-documented. However, there are several issues that need to be addressed before merging.
Blocking: Compilation Failure (libSQL Backend)
The PR does not compile. cargo check fails with 3 errors in src/db/libsql_backend.rs:
error[E0063]: missing fields `line_end` and `line_start` in initializer of `MemoryChunk`
--> src/db/libsql_backend.rs:2403:25
error[E0063]: missing fields `line_end`, `line_start` and `path` in initializer of `RankedResult`
--> src/db/libsql_backend.rs:2460:30
error[E0063]: missing fields `line_end`, `line_start` and `path` in initializer of `RankedResult`
--> src/db/libsql_backend.rs:2509:30
The MemoryChunk and RankedResult structs gained new fields (line_start, line_end, path), but the libSQL backend was not updated. Per CLAUDE.md: "All new features that touch persistence MUST support both backends." Additionally, src/db/libsql_migrations.rs needs the line_start and line_end columns added to the memory_chunks table schema.
Fixes needed:
- Add
line_start: None, line_end: Noneto theMemoryChunkconstruction at line 2403 - Add
path: String::new(), line_start: None, line_end: Noneto theRankedResultconstructions at lines 2460 and 2509 (ideally, joinmemory_documentsto fetch the actual path, same as the postgres repository does) - Update the FTS and vector search SQL queries in
libsql_backend.rsto SELECTd.path,c.line_start,c.line_end - Add
line_startandline_endcolumns to thememory_chunksCREATE TABLE inlibsql_migrations.rs
Blocking: Integration Test Won't Compile
tests/cognitive_integration.rs line 115 constructs CognitiveConfig with only 4 of 8 fields:
let config = CognitiveConfig {
pre_game_enabled: false,
checkpointing_enabled: true,
checkpoint_interval: 20,
after_action_enabled: true,
};Missing: guardian_enabled, breadcrumb_interval, gate_thresholds, memory_search_reminder_interval. This won't compile. Either add the missing fields or use ..Default::default().
.unwrap() in Production Code
Per CLAUDE.md: "Never use .unwrap() or .expect() in production code."
In src/workspace/chunker.rs, inside chunk_document_with_positions:
let last: ChunkWithPosition = chunks.pop().unwrap();While this is logically safe (guarded by !chunks.is_empty()), it violates the project convention. Suggestion:
let Some(last) = chunks.pop() else { break; };Feature Gate Issue
The PR removes #[cfg(feature = "postgres")] from pub use repository::Repository (in src/workspace/mod.rs) but the mod repository declaration still has the feature gate. With default features ["postgres", "libsql"] this works, but building with --no-default-features --features libsql would fail because mod repository is not compiled while pub use repository::Repository is unconditional. Either:
- Keep the
#[cfg(feature = "postgres")]on thepub use, or - Also remove it from
mod repository(if Repository should be available for all backends)
Minor Issues
-
Trailing whitespace: Multiple files have trailing whitespace on blank lines (e.g.,
src/cli/memory.rslines with trailing spaces). Runcargo fmtto clean up. -
super::in tests:src/agent/cognitive.rsusesuse super::*;in the test module (line 855 of the diff). Per project conventions,crate::imports are preferred oversuper::. However,use super::*inmod testswithin the same file is a common Rust pattern and could be considered an exception here. -
Security test dependency on PostgreSQL: All tests in
tests/security_integration.rsrequire a PostgreSQL connection (deadpool_postgres::Pool). These tests will fail in CI environments without a running Postgres instance. Consider either:- Gating them behind the
integrationfeature - Adding a connection check that skips the test if Postgres is unavailable
- Gating them behind the
-
GeminiEmbeddings added but not documented: The PR adds a full
GeminiEmbeddingsprovider (~240 lines) which is not mentioned in the PR description. This is a separate feature that could be its own PR. It also doesn't appear in the configuration docs or CLAUDE.md.
Positive Notes
- Excellent documentation throughout
cognitive.rs-- the "Why" sections for design decisions are exactly the right level of documentation - Path traversal prevention is well-implemented with both
..and null byte checks - The escalating checkpoint gate (gentle/firm/urgent) is a smart design choice
- Comprehensive test coverage: 31 new tests covering the core logic
- The sanitization of checkpoint text to prevent markdown/prompt injection is a nice security touch
- Clean separation between prompt-level guidance and system-level enforcement
Please fix the compilation failures and I'm happy to approve.
GitHub PR Review Batch 3 - nearai/ironclawPR #71: Fix undo/redo deadlock and checkpoint behaviorSummaryThis PR addresses critical concurrency issues in the undo/redo system. The main changes are:
Pros
Concerns
Suggestions
PR #66: Update README architecture diagram to use Unicode box drawingSummaryThis PR updates the README.md architecture diagram from ASCII art to Unicode box drawing characters. The change improves readability and visual appeal of the system architecture diagram, using characters like Pros
Concerns
Suggestions
PR #63: Memory Guardian and Cognitive RoutinesSummaryThis is a large PR (894 lines of new code) adding a comprehensive memory management system with two layers: Cognitive Routines (Prompt-Level):
Memory Guardian (System-Level):
The PR also adds line number tracking to memory chunks for citation support (V9 migration), checkpoint tracker to Thread struct, and extensive testing (15 new tests). Pros
Concerns
Suggestions
PR #62: Add Tinfoil private inference providerSummaryThis PR adds support for Tinfoil as a new LLM backend. The changes include:
Pros
Concerns
Suggestions
PR #61: Add PostgreSQL test workflow for CISummaryThis PR adds a PostgreSQL with pgvector service to the GitHub Actions test workflow. The changes enable running tests against a real PostgreSQL database instead of libSQL-only testing. The service is configured with pgvector/pgvector:pg16 image and health checks. Pros
Concerns
Suggestions
PR #57: Sandbox job monitoring and credential grants persistenceSummaryThis is a large PR (~6333 lines touched) introducing several features:
Pros
Concerns
Suggestions
SummaryHighest Quality PRs
Needs Splitting
Minor Concerns
Recommendations
|
- Add line_start, line_end to MemoryChunk in libsql_backend - Add path, line_start, line_end to RankedResult in FTS/vector queries - Add line_start, line_end columns to libsql migrations schema - Add get_opt_i64 helper for nullable integer columns - Fix CognitiveConfig in tests with ..Default::default() - Replace .unwrap() with safe patterns in chunker.rs - Add #[cfg(feature = "postgres")] to pub use repository::Repository
- Add connection check to security tests (skip if Postgres unavailable) - Document GeminiEmbeddings in CLAUDE.md config section - Update feature parity to mention Gemini embeddings provider
Summary
This PR ports key memory system improvements from OpenClaw to IronClaw, focusing on better search citations, checkpointing, and agent cognitive routines.
Changes
1. Memory Citations with Line Numbers
ChunkWithPositionstruct to track line numbers during chunkingchunk_document_with_positions()function preserves source locationsMemoryChunknow includesline_start/line_endfieldspath#L15-L23)2. Cognitive Routines Module (
src/agent/cognitive.rs)CognitiveConfig- configurable checkpointing intervals and feature flagsCheckpointTracker- tracks exchanges, topics, and decisions for periodic checkpointspre_game_instructions()- 5-step checklist before task executionafter_action_template()- structured post-task review formatpost_compaction_recovery()- recovery instructions after context losscognitive_instructions()in Workspace3. Security Hardening
normalize_path()now rejects..sequences and null bytesvalidate_path()wrapper with properInvalidPatherror handlingTesting
tests/cognitive_integration.rs(9 tests)tests/chunker_integration.rs(14 tests)tests/security_integration.rs(8 tests)Motivation
These changes address feature gaps identified when comparing IronClaw to OpenClaw:
Breaking Changes
None. All changes are additive and backward compatible.
Happy to iterate on any feedback!