Feature request: Add structured memory primitives for better context engineering #791
mchockal wants to merge 2 commits into cloudflare:main from
deathbyknowledge
left a comment
This is not a final review yet, but I'm struggling to see the bigger picture here. It seems quite opinionated, and I'm not sure the benefits outweigh the negatives.
Could you update one or two of the existing examples (or make a new one) to use this approach and see how it feels?
/**
 * Token limit processor - ensures context fits within token limits
 */
export const tokenLimitRequestProcessor: RequestProcessor<
This relies on estimateTokenCount, which just guesses tokens. Since tokenizers vary quite a bit depending on the model in use, I'd prefer that we remove this one and let users provide their own.
Maybe we can add "character" truncation instead?
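One way to act on the character-truncation idea, as a minimal sketch only. The `ContextMessage` shape and the `truncateToBudget` and `SizeFn` names are hypothetical and not part of this PR. The function keeps the most recent messages that fit a budget measured in characters by default, and accepts a caller-supplied size function so a model-specific tokenizer can be plugged in instead of an estimate.

```typescript
// Hypothetical message shape; the PR's actual types may differ.
interface ContextMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Caller-supplied measure of how much budget a piece of text consumes.
type SizeFn = (text: string) => number;

// Default measure: plain character count, no tokenizer guessing.
const charLength: SizeFn = (text) => text.length;

// Keep the newest messages whose combined size fits within `budget`.
// Messages are dropped whole, oldest first; nothing is cut mid-message.
function truncateToBudget(
  messages: readonly ContextMessage[],
  budget: number,
  size: SizeFn = charLength
): ContextMessage[] {
  const kept: ContextMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so recent turns are kept first.
  for (const message of [...messages].reverse()) {
    const cost = size(message.content);
    if (used + cost > budget) break;
    kept.push(message);
    used += cost;
  }
  // Restore chronological order for the caller.
  return kept.reverse();
}
```

Passing a real tokenizer's count function as `size` turns the same code into a token-limit processor without the SDK having to ship a token estimator.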
There was a problem hiding this comment.
True. Ideally, the choice of processor implementation would be left to the user; that would be step 0. It could be included at a later stage, once context compaction to keep long conversations going becomes more common and if there are benefits to having it in the SDK at that point. The entire processors.ts can be part of a later PR.
 */
export class Session {
  readonly metadata: SessionMetadata;
  readonly events: Event[];
Do we want to use the DO sqlite to handle storage here? Or perhaps storing events directly, kind of like an event store?
There was a problem hiding this comment.
Good call. SQLite storage would be better and fits the existing pattern. A cf_agents_session table might be a better way to store a user's session. The primary key would be a session_id, and events can be stored in order for easy retrieval of the last N user-agent turns in a given session, which would serve as the working context for the current request. What are your thoughts?
(Apologies for the late response. I missed the email notification for this set of comments and only noticed it earlier today.)
There was a problem hiding this comment.
yeah do sqlite is fine. might need to work around 2mb row limitation, just consider that.
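To make the storage idea in this thread concrete, here is a rough sketch under stated assumptions. The table name `cf_agents_session_events`, its columns, and the `lastEvents` helper are hypothetical. It stores one row per event, keyed by `(session_id, seq)` rather than one row per session, so a long session never has to fit into a single row, which is one way to stay clear of the row-size limit mentioned above. The SQL is plain SQLite; how it is executed inside a Durable Object is deliberately left out.

```typescript
// Hypothetical schema: one row per event, ordered by `seq` within a session.
const CREATE_EVENTS_TABLE = `
  CREATE TABLE IF NOT EXISTS cf_agents_session_events (
    session_id TEXT    NOT NULL,
    seq        INTEGER NOT NULL,  -- position of the event within the session
    payload    TEXT    NOT NULL,  -- JSON-serialized event
    PRIMARY KEY (session_id, seq)
  )`;

// Newest `limit` events for one session (rows come back newest first).
const SELECT_LAST_EVENTS = `
  SELECT seq, payload FROM cf_agents_session_events
  WHERE session_id = ? ORDER BY seq DESC LIMIT ?`;

interface EventRow {
  session_id: string;
  seq: number;
  payload: string;
}

// In-memory equivalent of SELECT_LAST_EVENTS, flipped back to
// chronological order so it can be fed straight into a working context.
function lastEvents(
  rows: readonly EventRow[],
  sessionId: string,
  limit: number
): EventRow[] {
  return rows
    .filter((row) => row.session_id === sessionId) // filter() copies, so sort() is safe
    .sort((a, b) => b.seq - a.seq)
    .slice(0, limit)
    .reverse();
}
```

A single very large event (for example a big tool result) could still exceed the limit on its own, so that case would need separate handling.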
I'll review further on Wednesday.
Some thoughts (that opus cleaned up for me):
Thanks for putting this together. The architecture works, and the problem space (context management, compaction, model-agnostic event logging) is real. I think this is worth shipping as an experimental feature with some adjustments.
Overall Take
The Session/WorkingContext/ProcessorPipeline split is reasonable. For users building long-running conversations, multi-agent systems, or agents that need audit trails, this provides useful primitives that don't exist in the SDK today.
That said, most agents-sdk users are building simpler things and won't need this level of abstraction yet. So I'd recommend shipping this as explicitly experimental - opt-in for power users who want to explore these patterns. Maybe under agents/experimental/sessions?
Before Merging
1. Add clear experimental labeling
The exports and docs should make it obvious this is experimental and the API may change:
2. Add at least one complete example
The PR description has good code snippets, but there's no end-to-end example showing how to wire this into an actual agent. Something in
Without this, developers won't know how to connect the pieces.
3. Document known limitations
Be explicit about what this doesn't do (yet):
Code-Level Items
A few things to clean up:
Token counting: The
Event ID generation:
Tests: Would be good to have basic coverage for:
Summary
Ship it as experimental with:
We'll learn more from real usage than from more design iteration. Just make sure the "experimental" signal is loud enough that people don't accidentally depend on API stability.
feat(memory): Add Session, WorkingContext, and Processor Pipeline primitives
Summary
This PR introduces structured memory primitives for building context-aware agents, based on the tiered memory architecture principles from Google's ADK whitepaper.
Why?
Currently, agent developers must manually manage conversation history, handle context window limits, and implement their own compaction strategies. This leads to:
These primitives provide a model-agnostic foundation that separates the ground truth (Session) from the computed view (WorkingContext), enabling clean abstractions for context management and compaction.
What's Added
1. Session - The Ground Truth
A durable, structured log of agent interactions. Sessions are model-agnostic and serve as the source of truth.
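A minimal sketch of what an append-only session log could look like. The `readonly metadata` and `readonly events` fields follow the diff fragment quoted in the review above; everything else (the `SessionEvent` shape, the `SessionMetadata` fields, the `append` method, and the ID format) is an assumption made for illustration, not the PR's actual API.

```typescript
// Hypothetical event shape; stands in for the PR's `Event` type.
interface SessionEvent {
  id: string;
  type: "user_message" | "agent_message" | "tool_call" | "tool_result";
  content: string;
  timestamp: number;
}

// Hypothetical metadata fields.
interface SessionMetadata {
  sessionId: string;
  createdAt: number;
}

class Session {
  readonly metadata: SessionMetadata;
  readonly events: SessionEvent[] = [];
  private nextSeq = 0;

  constructor(sessionId: string) {
    this.metadata = { sessionId, createdAt: Date.now() };
  }

  // Append-only: events are added in order and never edited or removed,
  // which is what lets the log act as the source of truth.
  append(type: SessionEvent["type"], content: string): SessionEvent {
    const event: SessionEvent = {
      id: `${this.metadata.sessionId}:${this.nextSeq++}`,
      type,
      content,
      timestamp: Date.now(),
    };
    this.events.push(event);
    return event;
  }
}
```

Nothing in the event shape mentions a particular model's message format, which is the sense in which the log is model-agnostic.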
2. WorkingContext - The Computed View
An ephemeral, computed view sent to the LLM. Rebuilt for each invocation.
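As an illustration of "computed view", the sketch below derives a fresh context from the event log on every call and never writes back to it. The `WorkingContext` fields, the `ContextMessage` shape, and `buildWorkingContext` are hypothetical names chosen for this example.

```typescript
// Hypothetical, simplified event and context shapes.
interface SessionEvent {
  type: "user_message" | "agent_message";
  content: string;
}

interface ContextMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface WorkingContext {
  systemInstructions: string;
  messages: ContextMessage[];
}

// Build a new view from the log for each invocation.
// The event array is only read; the session stays the ground truth.
function buildWorkingContext(
  events: readonly SessionEvent[],
  systemInstructions: string,
  maxEvents: number
): WorkingContext {
  // slice(-0) would return everything, so guard the zero case explicitly.
  const recent = maxEvents > 0 ? events.slice(-maxEvents) : [];
  const messages = recent.map(
    (event): ContextMessage => ({
      role: event.type === "user_message" ? "user" : "assistant",
      content: event.content,
    })
  );
  return { systemInstructions, messages };
}
```

Because the view is rebuilt each time, changing the selection rule (for example a different `maxEvents`) affects only future requests and leaves the stored history intact.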
3. Processor Pipeline - Modular Context Building
A pipeline of processors that transform Session state into WorkingContext, inspired by Google ADK's processor pattern.
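The pipeline idea can be sketched as plain function composition. This is an assumption-laden illustration: the PR's `RequestProcessor` is generic and its real signature is not shown here, so the simplified `RequestProcessor` type below, along with `runPipeline`, `dropEmptyMessages`, and `keepLastMessages`, is hypothetical.

```typescript
interface ContextMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface WorkingContext {
  messages: ContextMessage[];
}

// Simplified stand-in: a processor takes a context and returns a new one.
type RequestProcessor = (ctx: WorkingContext) => WorkingContext;

// Example processor: remove messages that are empty or whitespace only.
const dropEmptyMessages: RequestProcessor = (ctx) => ({
  ...ctx,
  messages: ctx.messages.filter((m) => m.content.trim() !== ""),
});

// Example processor factory: keep only the last `n` messages.
function keepLastMessages(n: number): RequestProcessor {
  return (ctx) => ({
    ...ctx,
    messages: n > 0 ? ctx.messages.slice(-n) : [],
  });
}

// Run processors left to right, feeding each result into the next.
function runPipeline(
  initial: WorkingContext,
  processors: readonly RequestProcessor[]
): WorkingContext {
  return processors.reduce((ctx, processor) => processor(ctx), initial);
}
```

Order matters: filtering before truncating keeps the last `n` non-empty messages, while truncating first can leave fewer than `n` after filtering.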
Built-in Request Processors:
Built-in Response Processors:
Bonus: Compaction Without Model Coupling
The primitives enable context compaction without worrying about underlying model structure:
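A rough sketch of what such compaction could look like, under assumptions: the `summary` event type, the `Summarizer` callback, and the `compact` function are hypothetical names. Older events are folded into one summary event while recent ones pass through unchanged. The summarizer is injected by the caller, so nothing here depends on a specific model or message format. It is synchronous to keep the example short; a real summarizer backed by an LLM call would most likely be asynchronous.

```typescript
// Hypothetical event shape with an extra "summary" kind for compacted history.
interface SessionEvent {
  type: "user_message" | "agent_message" | "summary";
  content: string;
}

// Caller-supplied: turns a run of older events into summary text.
type Summarizer = (events: readonly SessionEvent[]) => string;

// Return a compacted copy of the events: one summary event standing in
// for everything except the `keepRecent` newest events.
// The input array is not modified, so the full log remains available.
function compact(
  events: readonly SessionEvent[],
  keepRecent: number,
  summarize: Summarizer
): SessionEvent[] {
  if (events.length <= keepRecent) return [...events];
  const cut = events.length - keepRecent;
  const older = events.slice(0, cut);
  const recent = events.slice(cut);
  return [{ type: "summary", content: summarize(older) }, ...recent];
}
```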
Why this matters:
Event Types
Strongly-typed events for all agent interactions:
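For illustration, strongly typed events are commonly modeled in TypeScript as a discriminated union. The event kinds and field names below are hypothetical and not necessarily the ones this PR defines; the point of the sketch is that the compiler narrows each case by its `type` tag and flags any unhandled kind.

```typescript
// Hypothetical event kinds, discriminated by the `type` field.
type SessionEvent =
  | { type: "user_message"; id: string; text: string }
  | { type: "agent_message"; id: string; text: string }
  | { type: "tool_call"; id: string; toolName: string; args: Record<string, unknown> }
  | { type: "tool_result"; id: string; toolCallId: string; output: string };

// Inside each case the compiler knows exactly which fields exist.
function describeEvent(event: SessionEvent): string {
  switch (event.type) {
    case "user_message":
      return `user: ${event.text}`;
    case "agent_message":
      return `agent: ${event.text}`;
    case "tool_call":
      return `call ${event.toolName}`;
    case "tool_result":
      return `result for ${event.toolCallId}`;
    default: {
      // Exhaustiveness check: adding a new event kind without handling it
      // above makes this assignment a compile-time error.
      const unhandled: never = event;
      throw new Error(`Unhandled event: ${JSON.stringify(unhandled)}`);
    }
  }
}
```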
Current Limitations
Files Added