feat: add multimodal UIMessage support #230
Conversation
📝 Walkthrough

Adds multimodal support for UIMessage by introducing Image, Audio, Video, and Document ContentPart types, updating type unions, and changing conversions between UIMessage and ModelMessage to preserve multimodal content and metadata through round trips.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 of 5 passed)
@coderabbitai review
✅ Actions performed: review triggered.
     * Convert ContentPart array to MessagePart array
     * Preserves all multimodal content (text, image, audio, video, document)
     */
    function contentPartsToMessageParts(
I'm a bit confused: don't these two types match identically? From what I see, what you're doing is just copying the old data into the new one?
I might've gotten carried away here and overcomplicated things.
I initially thought ContentPart (used in ModelMessage.content) and MessagePart (used in UIMessage.parts) were separate type systems for the model and the UI that needed their own definitions.
Pushed changes to simplify and resolve this.
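For anyone following the thread, a minimal sketch of what that simplification could look like, assuming the two unions really are structurally identical; the stand-in types below are illustrative and the pushed commits may differ:

```ts
// Stand-in types for illustration only; the real unions live in @tanstack/ai.
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; source: string }
type ContentPart = TextPart | ImagePart
type MessagePart = ContentPart // identical by assumption

// If the unions match, a field-by-field conversion adds nothing; a shallow copy
// (or passing the array straight through) is enough.
function toMessageParts(content: ContentPart[]): MessagePart[] {
  return [...content]
}
```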
I would also like to send media messages from the client; I need this feature.
@jherr, mind reviewing this one? It looks good to me, but since you were in charge of this piece of code I'd feel much more comfortable if you approved it.
| Command | Status | Duration | Result |
|---|---|---|---|
| `nx affected --targets=test:sherif,test:knip,tes...` | ❌ Failed | 2m 34s | View ↗ |
| `nx run-many --targets=build --exclude=examples/**` | ❌ Failed | 1m 4s | View ↗ |
☁️ Nx Cloud last updated this comment at 2026-01-26 10:38:51 UTC

🎯 Changes
When calling `append()` with a `ModelMessage` containing multimodal content (images, audio, files), the content was stripped during the `ModelMessage → UIMessage` conversion because `modelMessageToUIMessage()` only extracted text via `getTextContent()`. As a result, a message's `parts` don't include multimodal parts, making it impossible to build chat UIs that preserve and display multimodal content.

Added new message part types and updated the conversion functions to preserve multimodal content during round trips:
New Types (`@tanstack/ai` and `@tanstack/ai-client`), sketched below:

- `ImageMessagePart` - preserves image data with source and optional metadata
- `AudioMessagePart` - preserves audio data
- `VideoMessagePart` - preserves video data (NOT TESTED)
- `DocumentMessagePart` - preserves document data (e.g., PDFs) (NOT TESTED)
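A minimal TypeScript sketch of what these part shapes might look like; the field names (`source`, `mediaType`, `metadata`) and the exact union are assumptions for illustration, not the packages' published definitions:

```ts
// Hypothetical shapes for the new UIMessage parts (field names are assumed).
interface ImageMessagePart {
  type: 'image'
  source: string // URL or base64-encoded data
  mediaType?: string // e.g. 'image/png'
  metadata?: Record<string, unknown>
}

interface AudioMessagePart {
  type: 'audio'
  source: string
  mediaType?: string
}

interface DocumentMessagePart {
  type: 'document'
  source: string
  mediaType?: string // e.g. 'application/pdf'
}
// VideoMessagePart would follow the same pattern.

// The part union then gains the new members alongside the existing text part.
type MessagePart =
  | { type: 'text'; text: string }
  | ImageMessagePart
  | AudioMessagePart
  | DocumentMessagePart
```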
Updated Conversion Functions:

- `modelMessageToUIMessage()` - now converts `ContentPart[]` to the corresponding `MessagePart[]` instead of discarding non-text parts
- `uiMessageToModelMessages()` - now builds `ContentPart[]` when multimodal parts are present, preserving part ordering

Example:
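A rough round-trip sketch based on the description above; the message shapes, exact signatures, and the assumption that both helpers are exported from `@tanstack/ai` are illustrative, not confirmed against the package:

```ts
import { modelMessageToUIMessage, uiMessageToModelMessages } from '@tanstack/ai'

// A model message mixing a text part and an image part (shape assumed for illustration).
const modelMessage = {
  role: 'user' as const,
  content: [
    { type: 'text' as const, text: 'What is in this picture?' },
    { type: 'image' as const, source: 'data:image/png;base64,iVBORw0KGgo...' },
  ],
}

// Before this PR the image part was dropped here; now it should survive as an
// ImageMessagePart inside uiMessage.parts, in its original position.
const uiMessage = modelMessageToUIMessage(modelMessage)

// Converting back should rebuild a multimodal ContentPart[] with ordering preserved.
const [roundTripped] = uiMessageToModelMessages(uiMessage)
console.log(roundTripped.content)
```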
Demo
Images:
https://github.com/user-attachments/assets/5f62ab32-9f11-44f7-bfc0-87d00678e265
Audio:
https://github.com/user-attachments/assets/bbbdc2f9-f8d7-4d74-99c2-23d15a3278a3
Closes #200
Note
This contribution touches core message handling. Let me know if the approach doesn't align with the project's vision; I'm happy to iterate on it :)
This PR is not ready to be merged because:
✅ Checklist
- `pnpm run test:pr`

🚀 Release Impact
Summary by CodeRabbit

- New Features
- Public API
- Tests
- Chores