Description
The current conversation/response/storage abstraction has some problems.
Gemini, Anthropic and OpenAI have been getting more sophisticated about how they handle longer conversations and tool calling. In particular, some of the more recent models can run tools server-side - like their web search tool - so the model does one or more tool calls before returning from the API.
These tool calls can even take place during the "reasoning" phase. Reasoning is a problem because it can produce text that is not technically part of the final response but still deserves to be stored in SQLite and optionally made available to Python API users.
Sometimes models return encrypted blocks of additional context - these need to be stored too so they can be round-tripped to the API in future calls. This all adds up to much more complex requirements for how the conversation mechanism should work and how ongoing conversations should be persisted.
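One way to think about those requirements: treat a response as an ordered sequence of typed items - reasoning text, tool calls, opaque encrypted blocks, and final text - and persist that sequence so it can be round-tripped back to the API. This is a minimal sketch of that idea, not the library's actual schema; the table name, item types and function names are all hypothetical.

```python
import json
import sqlite3


def init_db(conn):
    # Hypothetical table: one row per item, ordered by position within a
    # response. "content" holds a JSON payload; for "encrypted" items it is
    # an opaque blob we never interpret, only round-trip back to the API.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS response_items (
            response_id TEXT,
            position INTEGER,
            item_type TEXT,   -- text / reasoning / tool_call / encrypted
            content TEXT,     -- JSON payload
            PRIMARY KEY (response_id, position)
        )
        """
    )


def save_items(conn, response_id, items):
    conn.executemany(
        "INSERT INTO response_items VALUES (?, ?, ?, ?)",
        [
            (response_id, i, item["type"], json.dumps(item["content"]))
            for i, item in enumerate(items)
        ],
    )


def load_items(conn, response_id):
    rows = conn.execute(
        "SELECT item_type, content FROM response_items "
        "WHERE response_id = ? ORDER BY position",
        (response_id,),
    )
    return [{"type": t, "content": json.loads(c)} for t, c in rows]


# An illustrative response: reasoning, then a server-side tool call,
# then an encrypted context block, then the visible answer.
items = [
    {"type": "reasoning", "content": "Considering a web search..."},
    {"type": "tool_call", "content": {"tool": "web_search", "query": "llm"}},
    {"type": "encrypted", "content": "opaque-base64-blob"},
    {"type": "text", "content": "Here is the final answer."},
]

conn = sqlite3.connect(":memory:")
init_db(conn)
save_items(conn, "resp-1", items)
assert load_items(conn, "resp-1") == items
```

The point of the ordering is that a later API call can replay the items verbatim - including the encrypted blocks - while the Python API can still filter to just the `text` items for the "final response" view.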
I brainstormed with Claude Opus 4.5 a bit; it was very useful: https://gistpreview.github.io/?8e871d0030a7b28fa40616d32611660e