Feature request: Add structured memory primitives for better context engineering by mchockal · Pull Request #791 · cloudflare/agents

mchockal · 2026-01-20T17:53:36Z

feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives

Summary

This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper.

Why?

Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:

Duplicated effort across agent implementations
Inconsistent approaches to memory management
Tight coupling between conversation state and model-specific formats
Complex compaction logic scattered throughout application code

These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and compaction.

What's Added

1. Session - The Ground Truth

A durable, structured log of agent interactions. Sessions are model-agnostic and serve as the source of truth.

import { Session, EventAction, generateEventId } from 'agents/memory';

// Create a new session
const session = new Session('my-agent');

// Add events
session.addEvent({
  id: generateEventId(),
  action: EventAction.USER_MESSAGE,
  timestamp: Date.now(),
  content: 'What is the weather in San Francisco?'
});

// Get conversation turns (user message + agent response + tool calls)
const turns = session.getConversationTurns();

// Serialize for storage
const serialized = session.serialize();
const restored = Session.deserialize(serialized);

2. WorkingContext - The Computed View

An ephemeral, computed view sent to the LLM. Rebuilt for each invocation.

import { WorkingContext } from 'agents/memory';

const context = new WorkingContext('session-123');

// Add system instructions
context.addSystemInstruction('You are a helpful assistant.');

// Add conversation content
context.addContent({ role: 'user', content: 'Hello!' });
context.addContent({ role: 'assistant', content: 'Hi there!' });

// Convert to model format (currently supports workers-ai)
const modelInput = context.toModelFormat('workers-ai', {
  format: 'chat_completions'
});
// { messages: [{ role: 'system', content: '...' }, ...] }

3. Processor Pipeline - Modular Context Building

A pipeline of processors that transform Session state into WorkingContext, inspired by Google ADK's processor pattern.

import {
  ProcessorPipeline,
  basicRequestProcessor,
  instructionsRequestProcessor,
  compactionFilterRequestProcessor
} from 'agents/memory';

// Create pipeline with processors
const pipeline = new ProcessorPipeline();
pipeline.addRequestProcessor('basic', basicRequestProcessor);
pipeline.addRequestProcessor('instructions', instructionsRequestProcessor, {
  instructions: ['You are a helpful assistant.']
});
pipeline.addRequestProcessor('compaction-filter', compactionFilterRequestProcessor, {
  keepCompactionSummaries: true
});

// Execute pipeline to build WorkingContext from Session
const workingContext = await pipeline.executeRequestPipeline(session);

Built-in Request Processors:

Processor	Description
basicRequestProcessor	Initializes context with session metadata
instructionsRequestProcessor	Adds system instructions
identityRequestProcessor	Adds agent identity
contentsRequestProcessor	Transforms events to content
slidingWindowRequestProcessor	Keeps recent conversation turns
compactionFilterRequestProcessor	Filters compacted events, includes summaries
contextCacheRequestProcessor	Marks stable prefixes for caching
tokenLimitRequestProcessor	Truncates to fit token limits

Built-in Response Processors:

Processor	Description
statisticsResponseProcessor	Updates session statistics

Bonus: Compaction Without Model Coupling

The primitives enable context compaction without worrying about underlying model structure:

import { Session, EventAction, generateEventId, type CompactionEvent } from 'agents/memory';

const session = new Session('my-agent');

// Session tracks compaction configuration
session.updateCompactionConfig({
  enabled: true,
  windowSize: 5,        // Keep last 5 turns
  strategy: 'sliding_window'
});

// When context exceeds limits, create a compaction event
const compactionEvent: CompactionEvent = {
  id: generateEventId(),
  action: EventAction.COMPACTION,
  timestamp: Date.now(),
  summary: 'User asked about weather in SF, NY, and London. All responses provided.',
  compactedEventIds: ['event-1', 'event-2', 'event-3'],
  compactionStrategy: 'sliding_window',
  originalTokenCount: 2000,
  compactedTokenCount: 150
};

// Add the compaction event; the compacted events are filtered out later when the context is built
session.addEvent(compactionEvent);

// The compactionFilterRequestProcessor automatically:
// 1. Filters out events that were compacted
// 2. Includes the compaction summary in the context
// 3. Preserves recent turns for continuity
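
The filtering behaviour described in the comments above can be sketched roughly as follows. This is an illustration only, not the PR's implementation: the `SketchEvent` shape and the `filterCompacted` function are hypothetical stand-ins modelled on the fields shown in the `CompactionEvent` example.

```typescript
// Hypothetical, simplified event shapes (not the PR's actual types).
type SketchEvent =
  | { id: string; action: "message"; content: string }
  | { id: string; action: "compaction"; summary: string; compactedEventIds: string[] };

// Drop events that a compaction event covers; optionally keep the summaries.
function filterCompacted(events: SketchEvent[], keepSummaries = true): SketchEvent[] {
  // Collect every event ID that some compaction event has summarized.
  const compacted = new Set<string>();
  for (const e of events) {
    if (e.action === "compaction") {
      for (const id of e.compactedEventIds) compacted.add(id);
    }
  }
  // Keep summaries (if requested) and any event that was not compacted.
  return events.filter((e) => {
    if (e.action === "compaction") return keepSummaries;
    return !compacted.has(e.id);
  });
}

const filtered = filterCompacted([
  { id: "event-1", action: "message", content: "Weather in SF?" },
  { id: "event-2", action: "message", content: "Sunny." },
  {
    id: "c-1",
    action: "compaction",
    summary: "User asked about SF weather.",
    compactedEventIds: ["event-1", "event-2"],
  },
  { id: "event-3", action: "message", content: "And London?" },
]);
```

In this sketch the summary stays at the position of the compaction event in the log, so the remaining recent events follow it in order.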

Why this matters:

Compaction logic is decoupled from model format - the Session stores structured events, not raw messages
The Processor Pipeline handles the transformation - compaction summaries are injected at the right place
Developers can implement custom summarization (e.g., using LLM) while the SDK handles the plumbing
Future model support requires only adding new toModelFormat() adapters, not changing compaction logic

Event Types

Strongly-typed events for all agent interactions:

Event Type	Description
USER_MESSAGE	User input
AGENT_MESSAGE	Agent response
TOOL_CALL	Tool invocation
TOOL_RESULT	Tool execution result
COMPACTION	Context summarization
ERROR	Error occurrence
CONTROL_SIGNAL	Control flow signals
AGENT_TRANSFER	Multi-agent handoff
SYSTEM_INSTRUCTION	Dynamic instruction updates

Current Limitations

⚠️ Note: For now, this only supports the Workers AI chat completions format via toModelFormat('workers-ai'). Support for additional native model providers (OpenAI, Anthropic, etc.) and integration with the Vercel AI SDK will follow shortly.

Files Added

packages/agents/src/memory/
├── index.ts           # Module exports
├── events.ts          # Event types and EventAction enum
├── session.ts         # Session class
├── working-context.ts # WorkingContext class
└── processors.ts      # ProcessorPipeline and built-in processors

changeset-bot · 2026-01-20T17:53:41Z

⚠️ No Changeset found

Latest commit: 8078063

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


pkg-pr-new · 2026-01-20T19:01:54Z

npm i https://pkg.pr.new/cloudflare/agents@791

commit: 8078063

deathbyknowledge

This is not a final review yet, but I'm struggling to see the bigger picture here. It seems quite opinionated, and I'm not sure the benefits outweigh the negatives.

Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?

deathbyknowledge · 2026-01-28T11:01:17Z

packages/agents/src/memory/processors.ts

+/**
+ * Token limit processor - ensures context fits within token limits
+ */
+export const tokenLimitRequestProcessor: RequestProcessor<


This relies on estimateTokenCount, which just guesses tokens. Since tokenizers vary quite a bit depending on the model in use, I'd prefer that we remove this one and let users provide their own.

Maybe we can add "character" truncation instead?

mchockal · 2026-02-03

True. Ideally, the choice of implementation for this processor would be left to the user; that would be step 0. It could be included at a later stage, once context compaction for keeping long conversations going becomes more common, and if there are benefits to shipping it in the SDK at that point. The entire processors.ts can be part of a later PR.
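
For reference, the "character" truncation idea raised above could look roughly like the sketch below. The names `SketchContent` and `truncateByChars` are hypothetical and not part of the PR; the sketch assumes the most recent messages are the ones worth keeping.

```typescript
// Hypothetical content shape for illustration.
interface SketchContent {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keep the most recent messages whose combined length fits within maxChars.
function truncateByChars(contents: SketchContent[], maxChars: number): SketchContent[] {
  const kept: SketchContent[] = [];
  let used = 0;
  // Walk backwards from the newest message.
  for (let i = contents.length - 1; i >= 0; i--) {
    const length = contents[i].content.length; // UTF-16 code units, not tokens
    if (used + length > maxChars) break; // stop at the first message that would overflow
    kept.unshift(contents[i]); // preserve chronological order
    used += length;
  }
  return kept;
}

const recent = truncateByChars(
  [
    { role: "user", content: "aaaaaaaaaa" },
    { role: "assistant", content: "bbbbb" },
    { role: "user", content: "ccc" },
  ],
  8
);
```

A character budget is deterministic and tokenizer-independent, at the cost of being only loosely related to the model's real token limit.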

deathbyknowledge · 2026-01-28T11:08:58Z

packages/agents/src/memory/session.ts

+ */
+export class Session {
+  readonly metadata: SessionMetadata;
+  readonly events: Event[];


Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?

mchockal · 2026-02-03

Good call. SQLite storage would be better and fits the existing pattern. A cf_agents_session table might be a good way to store a user's session: the primary key would be a session_id, and events would be stored in order so that the last N user-agent turns in a given session can be retrieved easily and serve as the working context for the current request. What are your thoughts?

(Apologies for the late response. I missed the email notifications for this set of comments and only noticed them earlier today.)

threepointone · 2026-02-04

Yeah, DO SQLite is fine. You might need to work around the 2 MB row limit, so keep that in mind.
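
One way to stay under a per-row limit is to store one event per row rather than the whole session in a single row, and to check each payload's size before writing. The sketch below only prepares rows in memory; `EventRow`, `toEventRows`, `lastRows`, and the `MAX_ROW_BYTES` value are assumptions for illustration (the limit is taken from the comment above), and no actual storage API is called.

```typescript
// Assumed limit, taken from the "2mb row limitation" mentioned above.
const MAX_ROW_BYTES = 2 * 1024 * 1024;

// Hypothetical row shape: one event per row, ordered by seq within a session.
interface EventRow {
  sessionId: string;
  seq: number;
  payload: string;
}

// Serialize each event to its own row and reject payloads over the limit.
function toEventRows(sessionId: string, events: unknown[], maxBytes = MAX_ROW_BYTES): EventRow[] {
  const encoder = new TextEncoder();
  return events.map((event, seq) => {
    const payload = JSON.stringify(event);
    const bytes = encoder.encode(payload).length; // UTF-8 size, not string length
    if (bytes > maxBytes) {
      throw new Error(`event ${seq} is ${bytes} bytes, over the ${maxBytes}-byte limit`);
    }
    return { sessionId, seq, payload };
  });
}

// Return the last n rows in order, e.g. to rebuild recent context.
function lastRows(rows: EventRow[], n: number): EventRow[] {
  return n > 0 ? rows.slice(-n) : [];
}

const rows = toEventRows("session-123", [{ a: 1 }, { b: "two" }]);
```

A real row also carries the other columns, so a production check would likely leave some headroom below the limit.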

threepointone · 2026-02-02T13:45:42Z

I'll review further on Wednesday

threepointone · 2026-02-04T14:45:19Z

Some thoughts: (that opus cleaned up for me)

Thanks for putting this together - the architecture works and the problem space (context management, compaction, model-agnostic event logging) is real. I think this is worth shipping as an experimental feature with some adjustments.

Overall Take

The Session/WorkingContext/ProcessorPipeline split is reasonable. For users building long-running conversations, multi-agent systems, or agents that need audit trails, this provides useful primitives that don't exist in the SDK today.

That said, most agents-sdk users are building simpler things and won't need this level of abstraction yet. So I'd recommend shipping this as explicitly experimental - opt-in for power users who want to explore these patterns. Maybe under agents/experimental/sessions?

Before Merging

1. Add clear experimental labeling

The exports and docs should make it obvious this is experimental and the API may change:

Consider exporting from agents/experimental/sessions and adding Experimental prefix to class names
Add @experimental JSDoc tags
Docs should lead with a warning about API instability

2. Add at least one complete example

The PR description has good code snippets, but there's no end-to-end example showing how to wire this into an actual agent. Something in examples/ that demonstrates:

Creating a Session, adding events
Running the processor pipeline
Calling an LLM with the resulting WorkingContext
(Bonus) Triggering compaction after N turns

Without this, developers won't know how to connect the pieces.

3. Document known limitations

Be explicit about what this doesn't do (yet):

Only supports Workers AI format (OpenAI/Anthropic adapters planned)
Compaction requires you to provide your own summarizer
No integration with Vercel AI SDK (yet)
Token estimation is approximate (~chars/4); we should probably fix this before landing, tbh
Will probably break as we iterate on it (schemas, apis, what not)

Code-Level Items

A few things to clean up:

Token counting — The chars / 4 estimation can be off significantly. Either document this limitation clearly or consider making the estimator pluggable.
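
A pluggable estimator could be as small as a function type with the current heuristic as the default. The sketch below is an assumption about how that might look; `TokenEstimator`, `charsDiv4`, `fitsTokenBudget`, and `byWords` are hypothetical names, not the PR's API.

```typescript
// Hypothetical estimator signature; not part of the PR.
type TokenEstimator = (text: string) => number;

// The rough heuristic discussed in the review: about four characters per token.
const charsDiv4: TokenEstimator = (text) => Math.ceil(text.length / 4);

// Check a set of texts against a budget using whichever estimator is supplied.
function fitsTokenBudget(
  texts: string[],
  maxTokens: number,
  estimate: TokenEstimator = charsDiv4
): boolean {
  const total = texts.reduce((sum, t) => sum + estimate(t), 0);
  return total <= maxTokens;
}

// A caller could plug in a model-specific tokenizer here; this whitespace
// splitter is only a stand-in to show the substitution.
const byWords: TokenEstimator = (text) => text.split(/\s+/).filter(Boolean).length;

const fitsDefault = fitsTokenBudget(["abcdefgh", "abcd"], 3);
const fitsCustom = fitsTokenBudget(["one two three"], 2, byWords);
```

Keeping the estimator as a plain function means the SDK does not have to bundle any tokenizer, while users who care about accuracy can supply one that matches their model.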

truncateToFit algorithm — The current implementation breaks on the first content that doesn't fit, which may not be optimal. Consider whether this matters for the use cases you're targeting.
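
To make the trade-off concrete, the sketch below contrasts stopping at the first item that does not fit with skipping it and continuing. It does not reproduce the PR's truncateToFit; `Sized`, `fitBreakOnFirst`, and `fitSkipOversized` are hypothetical, and the sketch assumes items are visited in priority order.

```typescript
// Hypothetical item with a precomputed size (tokens, characters, etc.).
interface Sized {
  id: string;
  size: number;
}

// Stops at the first item that does not fit, as described above.
function fitBreakOnFirst(items: Sized[], budget: number): Sized[] {
  const kept: Sized[] = [];
  let used = 0;
  for (const item of items) {
    if (used + item.size > budget) break;
    kept.push(item);
    used += item.size;
  }
  return kept;
}

// Skips items that do not fit and keeps trying later, smaller ones.
function fitSkipOversized(items: Sized[], budget: number): Sized[] {
  const kept: Sized[] = [];
  let used = 0;
  for (const item of items) {
    if (used + item.size > budget) continue;
    kept.push(item);
    used += item.size;
  }
  return kept;
}

const sizedItems: Sized[] = [
  { id: "a", size: 4 },
  { id: "b", size: 10 },
  { id: "c", size: 3 },
];
const broke = fitBreakOnFirst(sizedItems, 8).map((i) => i.id);
const skipped = fitSkipOversized(sizedItems, 8).map((i) => i.id);
```

Skipping uses the budget more fully, but it can leave gaps in the conversation (for example a tool result without its call), so breaking early may still be the right choice for some use cases.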

readonly modifiers — events, statistics, and compactionConfig are marked readonly but then mutated. This is confusing - either use actual immutability or drop the readonly keyword.
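
The confusion comes from what `readonly` covers in TypeScript: it prevents reassigning the property, not mutating the array it points to. The sketch below shows both the current pattern and one possible alternative; the class names are hypothetical and use strings in place of real events.

```typescript
class MutableBehindReadonly {
  // `readonly` only stops reassignment of the property; the array stays mutable.
  readonly events: string[] = [];

  add(event: string): void {
    this.events.push(event); // compiles, and mutates despite `readonly`
  }
}

class ReadonlyViewLog {
  private items: string[] = [];

  // Callers get a read-only view; calling push on it is a compile-time error.
  get events(): ReadonlyArray<string> {
    return this.items;
  }

  add(event: string): void {
    this.items = [...this.items, event]; // replace the array rather than mutate it
  }
}

const mutableLog = new MutableBehindReadonly();
mutableLog.add("first");

const viewLog = new ReadonlyViewLog();
const snapshot = viewLog.events;
viewLog.add("first");
```

With the second pattern, a previously obtained `events` reference is never changed underneath the caller, which is closer to what `readonly` suggests to a reader.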

Event ID generation — Date.now() + random suffix could theoretically collide under high throughput. Consider crypto.randomUUID() for guaranteed uniqueness.

Tests — Would be good to have basic coverage for:

Session serialization/deserialization roundtrip
Processor pipeline ordering
Compaction filtering logic

Summary

Ship it as experimental with:

Clear experimental labeling in exports and docs
At least one working example in examples/
Known limitations documented
Basic test coverage
Fix the readonly confusion

We'll learn more from real usage than from more design iteration. Just make sure the "experimental" signal is loud enough that people don't accidentally depend on API stability.

mchockal-cf and others added 2 commits January 20, 2026 08:40

Add memory primitives v0

4a37d1f

Merge branch 'cloudflare:main' into main

8078063

mchockal wants to merge 2 commits into cloudflare:main from mchockal:main
