feat: add multimodal UIMessage support #230

Open
jakobhoeg wants to merge 9 commits into TanStack:main from jakobhoeg:feat/multimodal-capabilities
@jakobhoeg jakobhoeg commented Jan 17, 2026

🎯 Changes

When calling append() with a ModelMessage containing multimodal content (images, audio, files), the content was stripped during the ModelMessage → UIMessage conversion because modelMessageToUIMessage() only extracted text via getTextContent(). In addition, a message's parts did not include multimodal part types, which made it impossible to build chat UIs that preserve and display multimodal content.

Added new message part types and updated the conversion functions to preserve multimodal content during round-trips.

New Types (@tanstack/ai and @tanstack/ai-client):

  • ImageMessagePart - preserves image data with source and optional metadata
  • AudioMessagePart - preserves audio data
  • VideoMessagePart - preserves video data - (NOT TESTED)
  • DocumentMessagePart - preserves document data (e.g., PDFs) - (NOT TESTED)

Updated Conversion Functions:

  • modelMessageToUIMessage() - now converts ContentPart[] to corresponding MessagePart[] instead of discarding non-text parts
  • uiMessageToModelMessages() - now builds ContentPart[] when multimodal parts are present, preserving part ordering
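
The two conversions listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Sketch*` types are simplified, hypothetical stand-ins for the library's `ContentPart` and `MessagePart`, the helper names are made up for the example, and collapsing text-only parts back to a plain string is an assumption.

```typescript
// Hypothetical, simplified stand-ins for the library types (assumptions for
// illustration only; the real definitions carry more fields).
type SketchMediaKind = 'image' | 'audio' | 'video' | 'document'
type SketchSource = { type: string; value: string }

type SketchContentPart =
  | { type: 'text'; text: string }
  | { type: SketchMediaKind; source: SketchSource; metadata?: unknown }

type SketchMessagePart =
  | { type: 'text'; content: string }
  | { type: SketchMediaKind; source: SketchSource; metadata?: unknown }

// Model -> UI: keep every part, in order, instead of extracting only the text.
function sketchToMessageParts(content: SketchContentPart[]): SketchMessagePart[] {
  return content.map((part): SketchMessagePart =>
    part.type === 'text' ? { type: 'text', content: part.text } : { ...part },
  )
}

// UI -> Model: emit a part array only when a non-text part is present
// (assumed behaviour: text-only messages collapse to a plain string).
function sketchToModelContent(parts: SketchMessagePart[]): string | SketchContentPart[] {
  const hasMedia = parts.some((part) => part.type !== 'text')
  if (!hasMedia) {
    return parts.map((part) => (part.type === 'text' ? part.content : '')).join('')
  }
  return parts.map((part): SketchContentPart =>
    part.type === 'text' ? { type: 'text', text: part.content } : { ...part },
  )
}

// Round trip: the image part survives and stays after the text part.
// The URL is a placeholder chosen for this example.
const sketchInput: SketchContentPart[] = [
  { type: 'text', text: 'What is in this image?' },
  { type: 'image', source: { type: 'url', value: 'https://example.com/cat.png' } },
]
const sketchUiParts = sketchToMessageParts(sketchInput)
const sketchRoundTrip = sketchToModelContent(sketchUiParts)
```

The only field that changes between the two shapes in this sketch is the text payload (`text` on the model side, `content` on the UI side); media parts are copied through unchanged, which is what keeps ordering and metadata intact.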

Example:

// Input ModelMessage with multimodal content
const message: ModelMessage = {
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this image?' },
    { type: 'image', source: { type: 'url', value: '' } }
  ]
}

// UIMessage now preserves all content
const uiMessage = modelMessageToUIMessage(message)
// uiMessage.parts = [
//   { type: 'text', content: 'What is in this image?' },
//   { type: 'image', source: { type: 'url', value: '' } }
// ]

// UI: branch on part.type when rendering each entry of uiMessage.parts
if (part.type === 'image') { // likewise 'audio', etc.
  // render the UI for this part here
}
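
The `// UI` branch above generalises to one dispatch per part type. The sketch below is framework-free and only decides which HTML element each part would map to; the `Demo*` types, the `demoElementFor` helper, and the tag choices are assumptions for illustration, not part of this PR.

```typescript
// Hypothetical, simplified part shapes (assumptions; not the library types).
type DemoMediaType = 'image' | 'audio' | 'video' | 'document'
type DemoPart =
  | { type: 'text'; content: string }
  | { type: DemoMediaType; source: { type: string; value: string } }

// A plain descriptor of the element a UI layer would create for a part.
type DemoElement = {
  tag: 'span' | 'img' | 'audio' | 'video' | 'a'
  text?: string
  url?: string
}

// Which HTML tag each media part type would be rendered with.
const demoTagByType = {
  image: 'img',
  audio: 'audio',
  video: 'video',
  document: 'a',
} as const

function demoElementFor(part: DemoPart): DemoElement {
  if (part.type === 'text') {
    return { tag: 'span', text: part.content }
  }
  // Here `part` is narrowed to a media part, so `source` is available.
  return { tag: demoTagByType[part.type], url: part.source.value }
}

// The URL is a placeholder chosen for this example.
const demoParts: DemoPart[] = [
  { type: 'text', content: 'What is in this image?' },
  { type: 'image', source: { type: 'url', value: 'https://example.com/cat.png' } },
]
const demoElements = demoParts.map(demoElementFor)
```

In a component-based UI the same branch would presumably return the element itself rather than a descriptor; the lookup table just keeps the per-type decision in one place.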

Demo

Images:
https://github.com/user-attachments/assets/5f62ab32-9f11-44f7-bfc0-87d00678e265

Audio:
https://github.com/user-attachments/assets/bbbdc2f9-f8d7-4d74-99c2-23d15a3278a3

Closes #200

Note

I have not tested this with adapters other than my own community adapter, which I'm currently working on.

This contribution touches core message handling. Let me know if the approach doesn't align with the project's vision; I am happy to iterate on it :)

This PR is not ready to be merged because:

  • Video and document parts are implemented but not yet tested
  • Only tested with my community adapter - needs verification with official adapters (OpenAI, Anthropic, etc.)

✅ Checklist

  • I have followed the steps in the Contributing guide.
    • I followed CLAUDE.md, since the link is broken.
  • I have tested this code locally with pnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated a changeset.
  • This change is docs/CI/dev-only (no release).

Summary by CodeRabbit

  • New Features

    • UI messages now support multimodal content: images, audio, video, and documents alongside text.
  • Public API

    • Added multimodal message part types so messages can include Image/Audio/Video/Document parts.
  • Tests

    • Added comprehensive tests verifying conversion and round-trip preservation of multimodal message content.
  • Chores

    • Added a changeset entry to patch affected packages.

@coderabbitai
Contributor

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough

Adds multimodal support for UIMessage by introducing Image, Audio, Video, and Document ContentPart types, updating type unions, and changing conversions between UIMessage and ModelMessage to preserve multimodal content and metadata through round trips.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Changeset: .changeset/brave-nights-shout.md | New changeset entry enabling a patch release for multimodal UIMessage support. |
| Type Definitions: packages/typescript/ai/src/types.ts, packages/typescript/ai-client/src/types.ts | Introduce ImagePart, AudioPart, VideoPart, DocumentPart interfaces and extend MessagePart union to include these multimodal variants; export/import updates. |
| Message Conversion Logic: packages/typescript/ai/src/activities/chat/messages.ts | Update uiMessageToModelMessages and modelMessageToUIMessage to detect and preserve multimodal parts, emit ContentPart[] when multimodal content exists, and maintain metadata and ordering. |
| Tests: packages/typescript/ai/tests/messages.test.ts | Add tests covering text-only and multimodal conversions, metadata preservation, part ordering, and round-trip fidelity across text, image, audio, video, and document parts. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • jherr

Poem

🐰 I hop and sniff the multimodal trail,

Images, sounds, and docs set sail,
Parts kept safe, in order tight,
Through UI and model, day and night,
A carrot-coded cheer for this new light!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main feature addition of multimodal UIMessage support. |
| Description check | ✅ Passed | The description comprehensively covers changes, includes examples, acknowledges limitations, and completes the required checklist items. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #200 by implementing multimodal part types (ImagePart, AudioPart, VideoPart, DocumentPart) and updating conversion functions to preserve multimodal content. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to adding multimodal support to UIMessage and updating related conversion functions, with no extraneous modifications. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@jakobhoeg
Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Jan 17, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jakobhoeg jakobhoeg marked this pull request as ready for review January 17, 2026 10:07
* Convert ContentPart array to MessagePart array
* Preserves all multimodal content (text, image, audio, video, document)
*/
function contentPartsToMessageParts(
Contributor

I'm a bit confused: don't these two types match identically? From what I see, you're just copying the old data into the new one.

Author

@jakobhoeg jakobhoeg Jan 23, 2026

I might've gotten carried away here and overcomplicated things.
I initially thought ContentPart (used in ModelMessage.content) and MessagePart (used in UIMessage.parts) were separate type systems for the model and the UI that needed their own definitions.
Pushed changes to simplify and resolve this.

@ilbertt

ilbertt commented Jan 23, 2026

I would also like to send media messages from the client; I need this feature.

@AlemTuzlak
Contributor

@jherr mind reviewing this one? It looks good to me, but as you were in charge of this piece of code, I'd feel much more comfortable if you approved it.

@nx-cloud

nx-cloud bot commented Jan 26, 2026

View your CI Pipeline Execution ↗ for commit 273bdc0

| Command | Status | Duration |
| --- | --- | --- |
| nx affected --targets=test:sherif,test:knip,tes... | ❌ Failed | 2m 34s |
| nx run-many --targets=build --exclude=examples/** | ❌ Failed | 1m 4s |

☁️ Nx Cloud last updated this comment at 2026-01-26 10:38:51 UTC

Development

Successfully merging this pull request may close these issues.

useChat's UiMessage.parts doesn't support multimodal parts

3 participants