205 changes: 188 additions & 17 deletions docs/guides/multimodal-content.md
@@ -26,25 +26,27 @@ const textPart: TextPart = {
content: 'What do you see in this image?'
}

- // Image from base64 data
+ // Image from base64 data (mimeType is required for data sources)
const imagePart: ImagePart = {
type: 'image',
source: {
type: 'data',
- value: 'base64EncodedImageData...'
+ value: 'base64EncodedImageData...',
+ mimeType: 'image/jpeg' // Required for data sources
},
metadata: {
// Provider-specific metadata
detail: 'high' // OpenAI detail level
}
}

- // Image from URL
+ // Image from URL (mimeType is optional for URL sources)
const imageUrlPart: ImagePart = {
type: 'image',
source: {
type: 'url',
- value: 'https://example.com/image.jpg'
+ value: 'https://example.com/image.jpg',
+ mimeType: 'image/jpeg' // Optional hint for URL sources
}
}
```
@@ -95,7 +97,7 @@ const message = {
{ type: 'text', content: 'Describe this image' },
{
type: 'image',
- source: { type: 'data', value: imageBase64 },
+ source: { type: 'data', value: imageBase64, mimeType: 'image/jpeg' },
metadata: { detail: 'high' } // 'auto' | 'low' | 'high'
}
]
@@ -115,15 +117,14 @@ import { anthropicText } from '@tanstack/ai-anthropic'

const adapter = anthropicText()

- // Image with media type
+ // Image with mimeType in source
const imageMessage = {
role: 'user',
content: [
{ type: 'text', content: 'What do you see?' },
{
type: 'image',
- source: { type: 'data', value: imageBase64 },
- metadata: { media_type: 'image/jpeg' }
+ source: { type: 'data', value: imageBase64, mimeType: 'image/jpeg' }
}
]
}
@@ -135,7 +136,7 @@ const docMessage = {
{ type: 'text', content: 'Summarize this document' },
{
type: 'document',
- source: { type: 'data', value: pdfBase64 }
+ source: { type: 'data', value: pdfBase64, mimeType: 'application/pdf' }
}
]
}
@@ -154,15 +155,14 @@ import { geminiText } from '@tanstack/ai-gemini'

const adapter = geminiText()

- // Image with mimeType
+ // Image with mimeType in source
const message = {
role: 'user',
content: [
{ type: 'text', content: 'Analyze this image' },
{
type: 'image',
- source: { type: 'data', value: imageBase64 },
- metadata: { mimeType: 'image/png' }
+ source: { type: 'data', value: imageBase64, mimeType: 'image/png' }
}
]
}
@@ -188,7 +188,7 @@ const message = {
{ type: 'text', content: 'What is in this image?' },
{
type: 'image',
- source: { type: 'data', value: imageBase64 }
+ source: { type: 'data', value: imageBase64, mimeType: 'image/jpeg' }
}
]
}
@@ -202,28 +202,39 @@ Content can be provided as either inline data or a URL:

### Data (Base64)

- Use `type: 'data'` for inline base64-encoded content:
+ Use `type: 'data'` for inline base64-encoded content. **The `mimeType` field is required** to ensure providers receive proper content type information:

```typescript
const imagePart = {
type: 'image',
source: {
type: 'data',
- value: 'iVBORw0KGgoAAAANSUhEUgAAAAUA...' // Base64 string
+ value: 'iVBORw0KGgoAAAANSUhEUgAAAAUA...', // Base64 string
+ mimeType: 'image/png' // Required for data sources
}
}

+ const audioPart = {
+ type: 'audio',
+ source: {
+ type: 'data',
+ value: 'base64AudioData...',
+ mimeType: 'audio/mp3' // Required for data sources
+ }
+ }
```

### URL

- Use `type: 'url'` for content hosted at a URL:
+ Use `type: 'url'` for content hosted at a URL. The `mimeType` field is **optional** as providers can often infer it from the URL or response headers:

```typescript
const imagePart = {
type: 'image',
source: {
type: 'url',
- value: 'https://example.com/image.jpg'
+ value: 'https://example.com/image.jpg',
+ mimeType: 'image/jpeg' // Optional hint
}
}
```
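
The required/optional split between the two source kinds can be expressed as a discriminated union. The following is a minimal sketch only: the type names `DataSource`, `UrlSource`, and `ContentSource` and the `describeSource` helper are illustrative assumptions, not the library's exported API.

```typescript
// Hypothetical type names for illustration; the library's real types may differ.
type DataSource = { type: 'data'; value: string; mimeType: string } // mimeType required
type UrlSource = { type: 'url'; value: string; mimeType?: string } // mimeType optional
type ContentSource = DataSource | UrlSource

// Narrowing on `type` tells the compiler whether mimeType is guaranteed.
function describeSource(source: ContentSource): string {
  if (source.type === 'data') {
    return `inline ${source.mimeType}, ${source.value.length} base64 chars`
  }
  return source.mimeType
    ? `url ${source.value} (hint: ${source.mimeType})`
    : `url ${source.value} (type left to the provider)`
}
```

With types shaped like this, omitting `mimeType` on a `'data'` source would be a compile-time error, while a `'url'` source compiles with or without it.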
@@ -315,3 +326,163 @@ const stream = chat({
3. **Check model support**: Not all models support all modalities. Verify the model you're using supports the content types you want to send.

4. **Handle errors gracefully**: When a model doesn't support a particular modality, it may throw an error. Handle these cases in your application.
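
One way to act on points 3 and 4 is to filter content parts before sending. This is a minimal sketch, assuming you maintain your own record of which modalities a model accepts; the `splitBySupport` helper, the `Modality` type, and the supported set below are hypothetical and not part of the library.

```typescript
type Modality = 'text' | 'image' | 'audio' | 'video' | 'document'
type Part = { type: Modality }

// Splits parts into those the target model accepts and those it does not,
// so the caller can warn the user instead of waiting for a provider error.
function splitBySupport<T extends Part>(
  parts: T[],
  supported: ReadonlySet<Modality>,
): { accepted: T[]; rejected: T[] } {
  const accepted: T[] = []
  const rejected: T[] = []
  for (const part of parts) {
    if (supported.has(part.type)) {
      accepted.push(part)
    } else {
      rejected.push(part)
    }
  }
  return { accepted, rejected }
}

// Illustrative only: the supported set for a real model is an assumption here.
const supported = new Set<Modality>(['text', 'image'])
const parts: Part[] = [{ type: 'text' }, { type: 'image' }, { type: 'audio' }]
const { accepted, rejected } = splitBySupport(parts, supported)
```

A pre-flight check like this complements, but does not replace, a `try`/`catch` around the actual request, since providers can still reject content for other reasons.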

## Client-Side Multimodal Messages

When using the `ChatClient` from `@tanstack/ai-client`, you can send multimodal messages directly from your UI using the `sendMessage` method.

### Basic Usage

The `sendMessage` method accepts either a simple string or a `MultimodalContent` object:

```typescript
import { ChatClient, fetchServerSentEvents } from '@tanstack/ai-client'

const client = new ChatClient({
  connection: fetchServerSentEvents('/api/chat'),
})

// Simple text message
await client.sendMessage('Hello!')

// Multimodal message with image
await client.sendMessage({
  content: [
    { type: 'text', content: 'What is in this image?' },
    {
      type: 'image',
      source: { type: 'url', value: 'https://example.com/photo.jpg' }
    }
  ]
})
```

### Custom Message ID

You can provide a custom ID for the message:

```typescript
await client.sendMessage({
  content: 'Hello!',
  id: 'custom-message-id-123'
})
```
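
The call shapes shown so far (a bare string, an object with string content, an object with a parts array) can be thought of as one structure. The sketch below is a hypothetical normalizer, not the client's actual implementation; `Part`, `MessageInput`, and `normalizeInput` are assumed names inferred only from the examples in this guide.

```typescript
// Assumed shapes, inferred from the examples in this guide.
type Part =
  | { type: 'text'; content: string }
  | { type: 'image'; source: { type: 'url' | 'data'; value: string; mimeType?: string } }
type MessageInput = string | { content: string | Part[]; id?: string }

// Turns any accepted input into a parts array plus the caller's id, if one was given.
function normalizeInput(input: MessageInput): { parts: Part[]; id: string | undefined } {
  const message: { content: string | Part[]; id?: string } =
    typeof input === 'string' ? { content: input } : input
  const parts: Part[] =
    typeof message.content === 'string'
      ? [{ type: 'text', content: message.content }]
      : message.content
  return { parts, id: message.id }
}
```

Under these assumptions, `normalizeInput('Hello!')` and `normalizeInput({ content: 'Hello!' })` produce the same single text part, and only the object form can carry an `id`.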

### Per-Message Body Parameters

The second parameter allows you to pass additional body parameters for that specific request. These are shallow-merged with the client's base body configuration, with per-message parameters taking priority:

```typescript
const client = new ChatClient({
  connection: fetchServerSentEvents('/api/chat'),
  body: { model: 'gpt-5' }, // Base body params
})

// Per-message params, merged over the base body for this request only
await client.sendMessage('Analyze this complex problem', {
  model: 'gpt-5',
  temperature: 0.2,
})
```
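
To make "shallow-merged" concrete, here is a minimal sketch using object spread. It assumes the client merges exactly as this section describes; `mergeBody` is a hypothetical stand-in rather than the client's internal function, and the nested `options` key is invented for the example.

```typescript
type Body = Record<string, unknown>

// Shallow merge: top-level keys from perMessage replace those in base.
function mergeBody(base: Body, perMessage: Body = {}): Body {
  return { ...base, ...perMessage }
}

// The `options` key is hypothetical and used only to show the nested case.
const merged = mergeBody(
  { model: 'gpt-5', options: { temperature: 0.7, topP: 0.9 } },
  { options: { temperature: 0.2 } },
)
// merged.model is unchanged; merged.options is replaced wholesale, so topP is dropped.
```

The practical consequence of a shallow merge is that nested objects are not combined key by key, so a per-message override of a nested value needs to repeat any sibling keys it wants to keep.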

### React Example

Here's how to use multimodal messages in a React component:

```tsx
import { useChat } from '@tanstack/ai-react'
import { fetchServerSentEvents } from '@tanstack/ai-client'
import { useState } from 'react'

function ChatWithImages() {
  const [imageUrl, setImageUrl] = useState('')
  const { sendMessage, messages } = useChat({
    connection: fetchServerSentEvents('/api/chat'),
  })

  const handleSendWithImage = () => {
    if (imageUrl) {
      sendMessage({
        content: [
          { type: 'text', content: 'What do you see in this image?' },
          { type: 'image', source: { type: 'url', value: imageUrl } }
        ]
      })
    }
  }

  return (
    <div>
      <input
        type="url"
        placeholder="Image URL"
        value={imageUrl}
        onChange={(e) => setImageUrl(e.target.value)}
      />
      <button onClick={handleSendWithImage}>Send with Image</button>
    </div>
  )
}
```

### File Upload Example

Here's how to handle file uploads and send them as multimodal content:

```tsx
import { useChat } from '@tanstack/ai-react'
import { fetchServerSentEvents } from '@tanstack/ai-client'

function ChatWithFileUpload() {
  const { sendMessage } = useChat({
    connection: fetchServerSentEvents('/api/chat'),
  })

  const handleFileUpload = async (file: File) => {
    // Convert file to base64
    const base64 = await new Promise<string>((resolve) => {
      const reader = new FileReader()
      reader.onload = () => {
        const result = reader.result as string
        // Remove data URL prefix (e.g., "data:image/png;base64,")
        resolve(result.split(',')[1])
      }
      reader.readAsDataURL(file)
    })
Comment on lines +443 to +453
⚠️ Potential issue | 🟡 Minor

File reading promise missing error handling.

The FileReader promise doesn't handle the onerror event. If file reading fails, the promise will never resolve, causing the function to hang.

πŸ“ Suggested improvement
     // Convert file to base64
-    const base64 = await new Promise<string>((resolve) => {
+    const base64 = await new Promise<string>((resolve, reject) => {
       const reader = new FileReader()
       reader.onload = () => {
         const result = reader.result as string
         // Remove data URL prefix (e.g., "data:image/png;base64,")
         resolve(result.split(',')[1])
       }
+      reader.onerror = () => reject(new Error('Failed to read file'))
       reader.readAsDataURL(file)
     })
πŸ“ Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const handleFileUpload = async (file: File) => {
// Convert file to base64
const base64 = await new Promise<string>((resolve) => {
const reader = new FileReader()
reader.onload = () => {
const result = reader.result as string
// Remove data URL prefix (e.g., "data:image/png;base64,")
resolve(result.split(',')[1])
}
reader.readAsDataURL(file)
})
const handleFileUpload = async (file: File) => {
// Convert file to base64
const base64 = await new Promise<string>((resolve, reject) => {
const reader = new FileReader()
reader.onload = () => {
const result = reader.result as string
// Remove data URL prefix (e.g., "data:image/png;base64,")
resolve(result.split(',')[1])
}
reader.onerror = () => reject(new Error('Failed to read file'))
reader.readAsDataURL(file)
})
🤖 Prompt for AI Agents
In `@docs/guides/multimodal-content.md` around lines 432 - 442, The FileReader
promise in handleFileUpload lacks error handling and can hang if reading fails;
update the promise used in handleFileUpload to attach reader.onerror and reject
the promise with the error (and optionally reader.onabort) so callers receive an
error instead of waiting forever, and ensure any cleanup (e.g., removing
handlers) happens on both success and error paths.


    // Determine content type based on file type
    const type = file.type.startsWith('image/')
      ? 'image'
      : file.type.startsWith('audio/')
        ? 'audio'
        : file.type.startsWith('video/')
          ? 'video'
          : 'document'

    await sendMessage({
      content: [
        { type: 'text', content: `Please analyze this ${type}` },
        {
          type,
          source: { type: 'data', value: base64 },
          metadata: { mimeType: file.type }
        }
      ]
    })
Comment on lines +464 to +473
⚠️ Potential issue | 🟡 Minor

Incorrect mimeType placement in file upload example.

The example places mimeType in metadata, but according to the type definitions and all other examples in this document, mimeType should be in the source object for data sources.

πŸ“ Proposed fix
     await sendMessage({
       content: [
         { type: 'text', content: `Please analyze this ${type}` },
         {
           type,
-          source: { type: 'data', value: base64 },
-          metadata: { mimeType: file.type }
+          source: { type: 'data', value: base64, mimeType: file.type }
         }
       ]
     })
🤖 Prompt for AI Agents
In `@docs/guides/multimodal-content.md` around lines 464 - 473, The example passed
to sendMessage places mimeType inside the metadata object incorrectly; update
the content payload so that the file item's mimeType is moved into the source
object (i.e., for the array element with keys type and source, add mimeType
under source rather than metadata) to match the type definitions and other
examples; locate the sendMessage call and adjust the file element structure (the
object with fields type, source, metadata) so source includes mimeType and
metadata is either removed or left for other metadata only.

  }

  return (
    <input
      type="file"
      accept="image/*,audio/*,video/*,.pdf"
      onChange={(e) => {
        const file = e.target.files?.[0]
        if (file) handleFileUpload(file)
      }}
    />
  )
}
```
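
The pure parts of the upload handler can be pulled out and checked without a browser. The sketch below is one possible arrangement with hypothetical helper names (`modalityFor`, `base64FromDataUrl`, `toFilePart`). Following the other examples in this guide, it puts `mimeType` on `source`, and it throws on malformed input rather than silently producing `undefined`.

```typescript
type Modality = 'image' | 'audio' | 'video' | 'document'

// Maps a MIME type to a content part type; anything unrecognized is treated as a document.
function modalityFor(mimeType: string): Modality {
  if (mimeType.startsWith('image/')) return 'image'
  if (mimeType.startsWith('audio/')) return 'audio'
  if (mimeType.startsWith('video/')) return 'video'
  return 'document'
}

// Strips the "data:<mime>;base64," prefix from a data URL and returns the payload.
function base64FromDataUrl(dataUrl: string): string {
  const comma = dataUrl.indexOf(',')
  if (!dataUrl.startsWith('data:') || comma === -1) {
    throw new Error('Expected a data URL such as "data:image/png;base64,..."')
  }
  return dataUrl.slice(comma + 1)
}

// Builds a content part with mimeType on the source, as required for data sources.
function toFilePart(dataUrl: string, mimeType: string) {
  return {
    type: modalityFor(mimeType),
    source: { type: 'data' as const, value: base64FromDataUrl(dataUrl), mimeType },
  }
}
```

In the component, the string produced by `readAsDataURL` would be passed as `dataUrl` and `file.type` as `mimeType`; the `FileReader` promise itself would still need an `onerror` handler that rejects.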
