feat(memory): add file locking to support multi-instance #3286

xz-dev · 2026-02-03T09:14:56Z

Description

This PR implements proper cross-process file locking for the memory server using proper-lockfile to ensure atomic read-modify-write operations across multiple MCP server instances.

Key changes:

Add proper-lockfile dependency for cross-process file locking
Wrap all write operations (createEntities, deleteEntities, createRelations, deleteRelations, addObservations, deleteObservations) in withLock() to ensure atomic read-modify-write
Use write-file-atomic for more robust atomic file writes
Add comprehensive lock tests validating consistency under heavy concurrent writes

Why PR #3060's in-memory lock solution is insufficient

PR #3060 attempts to fix Issue #2579's race condition using an in-memory Promise-based lock. While this works for concurrent operations within a single process, it fundamentally misses the real-world deployment model:

MCP servers using stdio transport are spawned as separate processes per client.

┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│  Claude Code    │   │   Claude Code   │   │   Claude Code   │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         ▼                     ▼                     ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│mcp-server-memory│   │mcp-server-memory│   │mcp-server-memory│
│   (Process A)   │   │   (Process B)   │   │   (Process C)   │
│ [In-memory lock]│   │ [In-memory lock]│   │ [In-memory lock]│
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         └──────────┬──────────┴──────────┬──────────┘
                    ▼                     ▼
              ┌─────────────────────────────────┐
              │         memory.json             │
              │   (STILL VULNERABLE TO RACES!)  │
              └─────────────────────────────────┘

PR #3060 acknowledges this limitation:

"⚠️ Important: This is a Temporary Measure for Single Editor Usage"
"If you use multiple LLM clients or editors concurrently with the same memory file, you may still experience race conditions."

This multi-client scenario is the default use case for many users, not an edge case.

Server Details

Server: memory
Changes to: tools (all write operations now wrapped with cross-process file lock)

Motivation and Context

The memory server experiences file corruption when multiple MCP server instances (spawned by different AI clients) attempt concurrent writes to the same memory.json file.

This is critical because:

Many users run multiple AI clients (Claude Desktop, VS Code, Cursor, CLI) simultaneously
Each client spawns its own mcp-server-memory process via stdio
All processes share the same memory file, causing race conditions and data corruption
The in-memory lock in PR Added in-memory locks for the process, fix for Bug #2579 temporary measure. #3060 only protects within a single process, not across processes

How Has This Been Tested?

1. Single-process concurrent writes (10,000 operations)

File: knowledge-graph.test.ts
Scenario: 10,000 concurrent createEntities calls within a single process
Purpose: Validates that the lock mechanism correctly serializes rapid async operations and prevents race conditions from Promise concurrency

2. Multi-process concurrent writes (5 processes × 2,000 = 10,000 operations)

File: multi-process-lock.test.ts
Scenario: Spawns 5 separate Node.js processes, each writing 2,000 entities simultaneously to the same file
Purpose: Simulates the real-world MCP deployment where multiple AI clients spawn independent mcp-server-memory processes via stdio transport

Additional testing:

Verified lock acquisition and release under heavy contention
Tested with multiple simultaneous MCP client connections to same memory file
All existing tests continue to pass

Breaking Changes

No breaking changes:

Same API and tool signatures
Same file format (JSONL)
No configuration changes required

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

Checklist

I have read the MCP Protocol Documentation
My changes follows MCP security best practices
I have updated the server's README accordingly
I have tested this with an LLM client
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have documented all environment variables and configuration options

Additional context

Related Issues and PRs

Fixes Bug Report and Resolution: mcp-server-memory Failures due to Race Condition and Environment Misconfiguration #2579 - Race condition causing "Unexpected non-whitespace character after JSON" errors
Supersedes Added in-memory locks for the process, fix for Bug #2579 temporary measure. #3060 - In-memory lock is insufficient for multi-process stdio scenarios

Why `proper-lockfile`?

Approach	Cross-process safe	External service	Complexity
In-memory Promise lock (#3060)	No	No	Low
`proper-lockfile` (this PR)	Yes	No	Low
SQLite	Yes	No	Medium
Redis	Yes	Yes	High

proper-lockfile provides the best balance of correctness and simplicity - true cross-process safety without external dependencies or storage migration.

- Add proper-lockfile dependency for cross-process file locking - Wrap all write operations (create/delete entities/relations/observations) in withLock() to ensure atomic read-modify-write - Add chaos test validating lock consistency under 10k concurrent writes

…ile locking Spawns 5 worker processes each writing 2000 entities to verify that proper-lockfile correctly prevents data loss across separate processes. This simulates the real-world scenario where multiple MCP clients spawn separate mcp-server-memory instances via stdio.

- Increase stale timeout to 60s (safe for NFS acregmax) - Add 10s update heartbeat for lock freshness - Refactor lockOptions to use spread for easier customization - Update test to match new lockOptions structure

- Move worker logic into multi-process-lock.test.ts to make the test self-contained - Remove separate worker script - Add tsx to devDependencies to support running the worker process - Fix vitest import issue in worker process

Replace manual temp file + rename implementation with write-file-atomic package, which provides more robust atomic write handling including: - Automatic concurrent write serialization - Better temp file naming with murmur hash - Automatic cleanup on errors - Worker thread support Disabled fsync for better performance.

- Add missing @types/write-file-atomic to fix TypeScript build error - Refactor withLock to use try/finally for clearer control flow - Fix incorrect comment about proper-lockfile requirements

…ck vs business errors

- Use rmdir instead of unlink for lock directory cleanup - Update error message assertion to match new format

xz-dev force-pushed the add-file-locking branch from 53850a2 to 8c72c32 Compare February 3, 2026 14:06

xz-dev added 11 commits February 4, 2026 23:35

refactor(memory): optimize lock config for NFS compatibility

5fd0be3

- Increase stale timeout to 60s (safe for NFS acregmax) - Add 10s update heartbeat for lock freshness - Refactor lockOptions to use spread for easier customization - Update test to match new lockOptions structure

test(memory): merge multi-process-worker into lock test

fc12005

- Move worker logic into multi-process-lock.test.ts to make the test self-contained - Remove separate worker script - Add tsx to devDependencies to support running the worker process - Fix vitest import issue in worker process

refactor(test): rename chaosManager to stressTestManager for clarity

a2a568b

feat(memory): use atomic rename for safer file writes

9f2f95b

fix(memory): add @types/write-file-atomic and improve error handling

7c54ecc

- Add missing @types/write-file-atomic to fix TypeScript build error - Refactor withLock to use try/finally for clearer control flow - Fix incorrect comment about proper-lockfile requirements

refactor(memory): simplify withLock using .catch() and distinguish lo…

69c6554

…ck vs business errors

fix(memory): correct lock cleanup and error message assertion

556f8a4

- Use rmdir instead of unlink for lock directory cleanup - Update error message assertion to match new format

docs(memory): explain lock-free read strategy in KnowledgeGraphManager

ffd7c53

xz-dev force-pushed the add-file-locking branch from 8c72c32 to ffd7c53 Compare February 4, 2026 15:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(memory): add file locking to support multi-instance #3286

feat(memory): add file locking to support multi-instance #3286

xz-dev commented Feb 3, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

feat(memory): add file locking to support multi-instance #3286

Are you sure you want to change the base?

feat(memory): add file locking to support multi-instance #3286

Conversation

xz-dev commented Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Why PR #3060's in-memory lock solution is insufficient

Server Details

Motivation and Context

How Has This Been Tested?

1. Single-process concurrent writes (10,000 operations)

2. Multi-process concurrent writes (5 processes × 2,000 = 10,000 operations)

Additional testing:

Breaking Changes

Types of changes

Checklist

Additional context

Related Issues and PRs

Why proper-lockfile?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

xz-dev commented Feb 3, 2026 •

edited

Loading

Why `proper-lockfile`?