
feat: multiple modalities from the client #263

Open
AlemTuzlak wants to merge 3 commits into main from feat/multiple-modalities

Conversation

@AlemTuzlak (Contributor) commented Feb 2, 2026

🎯 Changes

  • Added the ability to send multimodal messages from the client
  • Added the ability to send extra body data with the sendMessage API (see the sketch below)
  • Added the ability to set custom message IDs from the client
  • Added new e2e tests to the smoke test harness
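
Illustrative sketch of the new client calls (the first mirrors the updated multimodal guide; the `id` field in the second is indicative only, see the guide and tests for the exact shape):

// Per-message body params merged into the request for this message only
await client.sendMessage('Analyze this complex problem', {
  model: 'gpt-5',
  temperature: 0.2,
})

// Client-supplied message ID alongside structured content
// ('client' and the 'id' field name are assumptions for illustration)
await client.sendMessage({
  id: 'msg-123',
  content: [{ type: 'text', content: 'Hello with a custom ID' }],
})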

✅ Checklist

  • I have followed the steps in the Contributing guide.
  • I have tested this code locally with pnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated a changeset.
  • This change is docs/CI/dev-only (no release).

Summary by CodeRabbit

  • New Features

    • Multimodal messaging: send text with images, audio, video, and documents in one message
    • Image attachment UI with previews, removal, and Enter-to-send support
    • Per-message options (custom IDs and per-message metadata) and MIME-aware handling
  • Documentation

    • Expanded multimodal guides and client-side examples, including React and file upload flows
  • Tests

    • New multimodal test suites covering image, audio, video, and document scenarios

coderabbitai bot (Contributor) commented Feb 2, 2026

📝 Walkthrough

Walkthrough

This PR adds first-class multimodal message support: types, client API changes to accept multimodal payloads, UI/example updates for attachments, adapter updates to handle mimeType and data URIs, message conversion/streaming changes, expanded tests, and new smoke tests for multimodal scenarios.
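
Concretely, a single sendMessage call can now mix text with attachments. A rough sketch (part shapes taken from the docs and test excerpts quoted further down; field names are indicative rather than exact):

// One user message containing text plus two images.
// 'data' sources carry base64 with a mimeType; 'url' sources reference a remote file.
// 'client' and 'base64Png' are placeholders for illustration.
await client.sendMessage({
  content: [
    { type: 'text', content: 'What is in these images?' },
    {
      type: 'image',
      source: { type: 'data', value: base64Png, mimeType: 'image/png' },
    },
    {
      type: 'image',
      source: { type: 'url', value: 'https://example.com/photo.jpg' },
    },
  ],
})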

Changes

Cohort / File(s) / Summary
Docs & Example App
docs/guides/multimodal-content.md, examples/ts-react-chat/src/routes/index.tsx
Added client-side multimodal guide and React example with image attachment UI, base64/mimeType handling, previews, removal, and send flow supporting multimodal parts.
Chat Client Core
packages/typescript/ai-client/src/chat-client.ts, packages/typescript/ai-client/src/events.ts, packages/typescript/ai-client/src/types.ts, packages/typescript/ai-client/src/index.ts
sendMessage now accepts string or structured multimodal content, plus per-message options (custom message id, extra body params); client events and exported types extended to carry multimodal parts.
Core Types & Utils
packages/typescript/ai/src/types.ts, packages/typescript/ai/src/utils.ts, packages/typescript/ai/src/index.ts
Introduced discriminated ContentPart source types (data vs url) with mimeType rules, extended MessagePart with image/audio/video/document, and added/exported detectImageMimeType utility.
Message Conversion & Stream
packages/typescript/ai/src/activities/chat/messages.ts, packages/typescript/ai/src/activities/chat/stream/processor.ts
UI↔Model conversion updated to preserve ordered ContentPart arrays for multimodal content; addUserMessage signature widened to accept ContentPart[] and optional id.
Provider Adapters
packages/typescript/ai-anthropic/src/adapters/text.ts, packages/typescript/ai-gemini/src/adapters/text.ts, packages/typescript/ai-grok/src/adapters/text.ts, packages/typescript/ai-openai/src/adapters/text.ts, packages/typescript/ai-openrouter/src/adapters/text.ts
Adapters now prefer part.source.mimeType for base64/URL sources and construct data URIs when needed; removed older metadata-based mime fallbacks and standardized data URI creation.
DevTools
packages/typescript/ai-devtools/src/store/ai-context.tsx
Devtools MessagePart extended with multimodal variants and source/metadata fields; mapping updated to include/skip multimodal parts.
Tests — Unit
packages/typescript/ai-client/tests/chat-client.test.ts, packages/typescript/ai/tests/message-converters.test.ts, packages/typescript/ai-react/tests/use-chat.test.ts
Added multimodal sendMessage and message-conversion test suites covering image/audio/video/document parts, mime handling, per-message body merging, and id propagation.
Smoke Tests
packages/typescript/smoke-tests/adapters/src/tests/index.ts, .../mmi-multimodal-image.ts, .../mms-multimodal-structured.ts
Added four multimodal adapter smoke tests (MMJ, MMP, MMS, MMT) for JPEG/PNG image flows and structured JSON validation; minor type-signature generics changes.
API Surface Cleanups
packages/typescript/ai-anthropic/src/index.ts, packages/typescript/ai-gemini/src/index.ts
Removed some provider-specific exported mime/media-type type aliases from public barrels.
Misc
packages/typescript/smoke-tests/adapters/src/adapters/index.ts
Updated default GROK_MODEL from 'grok-3' to 'grok-4'.

Sequence Diagram

sequenceDiagram
    participant React as React Component
    participant ChatClient
    participant StreamProc as StreamProcessor
    participant Converter as MessageConverter
    participant Adapter as LLMAdapter
    participant LLM as LLM Provider

    React->>ChatClient: sendMessage(MultimodalContent)
    ChatClient->>ChatClient: normalizeMessageInput() / store pendingMessageBody
    ChatClient->>StreamProc: addUserMessage(parts[], id?)
    StreamProc-->>ChatClient: UIMessage(parts[])
    ChatClient->>ChatClient: emit messageSent(messageId, parts[])
    ChatClient->>Converter: uiMessageToModelMessages(UIMessage)
    Converter-->>ChatClient: ModelMessage with ContentPart[]
    ChatClient->>Adapter: convertContentParts(ContentPart[])
    Adapter->>Adapter: detect mimeType / build data: URIs for base64
    Adapter-->>ChatClient: provider-specific payload
    ChatClient->>LLM: API request (merged body + conversationId)
    LLM-->>ChatClient: response stream

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • AlemTuzlak
  • jherr

Poem

🐰 I nibble bytes and sniff the mime,

Base64 carrots, images sublime.
From React burrow to adapters’ lair,
I hop your parts and keep their care.
Multimodal treats — hooray, let's share! 🎉

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 68.42%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check | ❓ Inconclusive | The description covers the main changes (multimodal messages, extra body data, message IDs, e2e tests), but the checklist items remain unchecked, indicating incomplete adherence to contribution requirements. | Check the boxes in the checklist section to confirm you followed the Contributing guide, tested locally, and addressed changeset/release impact requirements.
✅ Passed checks (1 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title 'feat: multiple modalities from the client' is specific and directly relates to the main change: enabling multimodal content to be sent from client-side APIs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai bot left a comment (Contributor)

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@docs/guides/multimodal-content.md`:
- Around line 432-442: The FileReader promise in handleFileUpload lacks error
handling and can hang if reading fails; update the promise used in
handleFileUpload to attach reader.onerror and reject the promise with the error
(and optionally reader.onabort) so callers receive an error instead of waiting
forever, and ensure any cleanup (e.g., removing handlers) happens on both
success and error paths.

In `@examples/ts-react-chat/src/routes/index.tsx`:
- Around line 362-364: The forEach callback currently uses a concise arrow body
that implicitly returns the value of URL.revokeObjectURL, which static analysis
flags; update the callback to a block body that does not return anything — e.g.,
change the call on attachedImages (attachedImages.forEach((img) =>
URL.revokeObjectURL(img.preview))) to use a statement body like
attachedImages.forEach((img) => { URL.revokeObjectURL(img.preview); }); to
ensure no value is returned from the forEach callback.
- Around line 292-300: The FileReader promise for producing `base64` lacks error
handling and can hang on read failures; change the constructor to new
Promise<string>((resolve, reject) => { ... }) and add `reader.onerror = (e) =>
reject(e)` (and optionally `reader.onabort = () => reject(new Error('File read
aborted'))`) alongside the existing `reader.onload` handler; also consider
removing/clearing handlers after resolution/rejection to avoid leaks and keep
using `reader.readAsDataURL(file)` to start the read.
- Around line 159-174: The image data URL is hardcoded to "image/png" when
part.source.type !== 'url'; update the construction of imageUrl to use the
actual MIME type from the part metadata (e.g., read a mime/type field such as
part.source.mediaType or part.source.mimeType or part.metadata.mimeType) instead
of "image/png" so the prefix becomes
`data:{actualMime};base64,${part.source.value}` when rendering in the branch
that handles part.source.type !== 'url'; keep the existing branch for URL
sources unchanged.

In `@packages/typescript/smoke-tests/adapters/src/tests/index.ts`:
- Around line 129-157: The multimodal tests (MMJ, MMP, MMS, MMT) declare
requires: ['text'] but send image content; update the test metadata and
capability enum by adding a new AdapterCapability value (e.g., 'vision') to the
AdapterCapability enum, then change the four tests (identifiable by id:
'MMJ','MMP','MMS','MMT' in the tests array) to requires: ['text','vision'];
alternatively, if you prefer the existing IMG/TTS/TRN pattern, set
skipByDefault: true on those test objects instead of changing requires — ensure
references to AdapterCapability and the test objects are updated consistently.
🧹 Nitpick comments (10)
packages/typescript/ai-openai/src/adapters/text.ts (1)

816-824: Consider using detectImageMimeType for consistent MIME type detection.

The Anthropic and Gemini adapters use detectImageMimeType to infer the actual image format from base64 magic bytes, but this adapter hardcodes image/jpeg. While data URIs with incorrect MIME types often still work, using the utility would provide more accurate MIME types.

♻️ Proposed fix to use detectImageMimeType

First, add the import at the top of the file:

import { detectImageMimeType } from '@tanstack/ai'

Then update the data URI construction:

         // For base64 data, construct a data URI if not already one
         const imageValue = part.source.value
+        const detectedMimeType = detectImageMimeType(imageValue) ?? 'image/jpeg'
         const imageUrl = imageValue.startsWith('data:')
           ? imageValue
-          : `data:image/jpeg;base64,${imageValue}`
+          : `data:${detectedMimeType};base64,${imageValue}`
packages/typescript/ai-grok/src/adapters/text.ts (1)

505-517: Consider using detectImageMimeType for consistent MIME type detection.

Similar to the OpenAI adapter, this hardcodes image/jpeg for the data URI MIME type. Using detectImageMimeType (as done in Anthropic and Gemini adapters) would provide more accurate MIME types based on the actual image format.

♻️ Proposed fix to use detectImageMimeType

First, add the import at the top of the file:

import { detectImageMimeType } from '@tanstack/ai'

Then update the data URI construction:

         // For base64 data, construct a data URI if not already one
         const imageValue = part.source.value
+        const detectedMimeType = detectImageMimeType(imageValue) ?? 'image/jpeg'
         const imageUrl =
           part.source.type === 'data' && !imageValue.startsWith('data:')
-            ? `data:image/jpeg;base64,${imageValue}`
+            ? `data:${detectedMimeType};base64,${imageValue}`
             : imageValue
packages/typescript/ai/src/utils.ts (1)

17-41: Consider handling data URI prefixes gracefully.

The function assumes raw base64 input, but callers might accidentally pass a full data URI (e.g., data:image/png;base64,iVBOR...). This would return undefined since the string starts with data: rather than the magic bytes.

♻️ Proposed enhancement to handle data URI input
 export function detectImageMimeType(
   base64Data: string,
 ): 'image/jpeg' | 'image/png' | 'image/gif' | 'image/webp' | undefined {
+  // Strip data URI prefix if present
+  const data = base64Data.includes(',')
+    ? base64Data.split(',')[1] ?? base64Data
+    : base64Data
+
   // Get first few bytes (base64 encoded)
-  const prefix = base64Data.substring(0, 20)
+  const prefix = data.substring(0, 20)
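
For background, the detection works because common image formats start with fixed magic bytes whose base64 encodings are themselves fixed prefixes. A self-contained sketch of the idea (not the shipped detectImageMimeType implementation):

// JPEG (FF D8 FF) encodes to "/9j/", PNG's 8-byte signature to "iVBORw0KGgo",
// GIF ("GIF8...") to "R0lGOD", and WebP's RIFF header to "UklGR".
function sketchDetectImageMimeType(
  base64Data: string,
): 'image/jpeg' | 'image/png' | 'image/gif' | 'image/webp' | undefined {
  if (base64Data.startsWith('/9j/')) return 'image/jpeg'
  if (base64Data.startsWith('iVBORw0KGgo')) return 'image/png'
  if (base64Data.startsWith('R0lGOD')) return 'image/gif'
  if (base64Data.startsWith('UklGR')) return 'image/webp'
  return undefined
}
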
packages/typescript/ai-anthropic/src/adapters/text.ts (1)

312-337: Minor optimization: detect MIME type only for base64 data sources.

detectImageMimeType is called unconditionally, but for URL sources the detection is wasteful since the value is a URL string (not base64) and the result is unused.

♻️ Proposed optimization
       case 'image': {
         const metadata = part.metadata as AnthropicImageMetadata | undefined
-        // Detect mime type from base64 magic bytes if not provided
-        const detectedMimeType = detectImageMimeType(part.source.value)
         const imageSource: Base64ImageSource | URLImageSource =
           part.source.type === 'data'
             ? {
                 type: 'base64',
                 data: part.source.value,
                 media_type:
-                  metadata?.mediaType ?? detectedMimeType ?? 'image/jpeg',
+                  metadata?.mediaType ??
+                  detectImageMimeType(part.source.value) ??
+                  'image/jpeg',
               }
             : {
                 type: 'url',
                 url: part.source.value,
               }
packages/typescript/ai-openrouter/src/adapters/text.ts (1)

593-605: Consider using detectImageMimeType for consistent MIME type detection.

Similar to OpenAI and Grok adapters, this hardcodes image/jpeg for the data URI MIME type. For consistency with Anthropic and Gemini adapters, consider using detectImageMimeType.

♻️ Proposed fix to use detectImageMimeType

First, add the import at the top of the file:

import { detectImageMimeType } from '@tanstack/ai'

Then update the data URI construction:

         case 'image': {
           const meta = part.metadata as OpenRouterImageMetadata | undefined
           // For base64 data, construct a data URI if not already one
           const imageValue = part.source.value
+          const detectedMimeType = detectImageMimeType(imageValue) ?? 'image/jpeg'
           const imageUrl =
             part.source.type === 'data' && !imageValue.startsWith('data:')
-              ? `data:image/jpeg;base64,${imageValue}`
+              ? `data:${detectedMimeType};base64,${imageValue}`
               : imageValue
examples/ts-react-chat/src/routes/index.tsx (2)

24-29: Use the exported generateMessageId from @tanstack/ai instead of duplicating.

This function duplicates generateMessageId which is already exported from @tanstack/ai (visible in the re-exports at packages/typescript/ai-client/src/index.ts line 41). Consider importing and using the shared implementation for consistency.

♻️ Suggested change
-/**
- * Generate a random message ID
- */
-function generateMessageId(): string {
-  return `msg-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`
-}
+import { generateMessageId } from '@tanstack/ai-react'

Remove the local function and add generateMessageId to the existing imports from @tanstack/ai-react.


347-354: Redundant metadata fields: both mediaType and mimeType are set to the same value.

Consider using a single field name for consistency. Based on the type definitions, metadata is provider-specific, but having both fields with the same value adds no benefit.

♻️ Suggested simplification
         contentParts.push({
           type: 'image',
           source: { type: 'data', value: img.base64 },
-          metadata: { mediaType: img.mimeType, mimeType: img.mimeType },
+          metadata: { mimeType: img.mimeType },
         })
docs/guides/multimodal-content.md (1)

370-377: Model override example doesn't demonstrate the override.

The example shows model: 'gpt-5' in both the base body and the per-message override, which doesn't clearly demonstrate the override behavior.

📝 Suggested improvement
 const client = new ChatClient({
   connection: fetchServerSentEvents('/api/chat'),
   body: { model: 'gpt-5' }, // Base body params
 })

 // Override model for this specific message
 await client.sendMessage('Analyze this complex problem', {
-  model: 'gpt-5',
+  model: 'gpt-5-turbo', // Overrides base model for this request
   temperature: 0.2,
 })
packages/typescript/smoke-tests/adapters/src/tests/mms-multimodal-structured.ts (2)

10-25: Extract duplicated getMimeType to a shared utility.

This function is duplicated verbatim in mmi-multimodal-image.ts. Consider extracting it to a shared module (e.g., test-utils.ts) to follow DRY principles.

♻️ Proposed refactor

Create a new file packages/typescript/smoke-tests/adapters/src/tests/utils.ts:

/**
 * Detect image mime type from file extension
 */
export function getMimeType(filename: string): string {
  const ext = filename.toLowerCase().split('.').pop()
  switch (ext) {
    case 'jpg':
    case 'jpeg':
      return 'image/jpeg'
    case 'png':
      return 'image/png'
    case 'gif':
      return 'image/gif'
    case 'webp':
      return 'image/webp'
    default:
      return 'image/jpeg'
  }
}

Then import from both test files:

-function getMimeType(filename: string): string {
-  // ... implementation
-}
+import { getMimeType } from './utils'

99-118: Consider extracting shared JSON validation logic.

The JSON parsing and validation logic in runMMS (lines 99-118) and runMMT (lines 201-220) are nearly identical. For better maintainability, consider extracting a helper function.

♻️ Proposed helper extraction
function parseAndValidateImageDescription(
  response: string
): { 
  parsed: ImageDescription | null
  error?: string
  validationMeta: Record<string, boolean>
} {
  let parsed: ImageDescription | null = null
  try {
    const jsonMatch = response.match(/```(?:json)?\s*([\s\S]*?)```/)
    const jsonStr = jsonMatch && jsonMatch[1] ? jsonMatch[1].trim() : response.trim()
    parsed = JSON.parse(jsonStr)
  } catch {
    try {
      parsed = JSON.parse(response)
    } catch {
      return {
        parsed: null,
        error: `Failed to parse response as JSON: ${response.substring(0, 200)}`,
        validationMeta: {}
      }
    }
  }

  const hasDescription = typeof parsed?.description === 'string' && parsed.description.length > 0
  const hasMainSubject = typeof parsed?.mainSubject === 'string' && parsed.mainSubject.length > 0
  const hasColors = Array.isArray(parsed?.colors) && parsed.colors.length > 0
  const hasTextBoolean = typeof parsed?.hasText === 'boolean'

  return {
    parsed,
    validationMeta: { hasDescription, hasMainSubject, hasColors, hasTextBoolean }
  }
}

Also applies to: 201-220

Comment on lines +432 to +442
const handleFileUpload = async (file: File) => {
  // Convert file to base64
  const base64 = await new Promise<string>((resolve) => {
    const reader = new FileReader()
    reader.onload = () => {
      const result = reader.result as string
      // Remove data URL prefix (e.g., "data:image/png;base64,")
      resolve(result.split(',')[1])
    }
    reader.readAsDataURL(file)
  })

⚠️ Potential issue | 🟡 Minor

File reading promise missing error handling.

The FileReader promise doesn't handle the onerror event. If file reading fails, the promise will never resolve, causing the function to hang.

📝 Suggested improvement
     // Convert file to base64
-    const base64 = await new Promise<string>((resolve) => {
+    const base64 = await new Promise<string>((resolve, reject) => {
       const reader = new FileReader()
       reader.onload = () => {
         const result = reader.result as string
         // Remove data URL prefix (e.g., "data:image/png;base64,")
         resolve(result.split(',')[1])
       }
+      reader.onerror = () => reject(new Error('Failed to read file'))
       reader.readAsDataURL(file)
     })

Comment on lines +292 to +300
const base64 = await new Promise<string>((resolve) => {
  const reader = new FileReader()
  reader.onload = () => {
    const result = reader.result as string
    // Remove data URL prefix (e.g., "data:image/png;base64,")
    resolve(result.split(',')[1])
  }
  reader.readAsDataURL(file)
})

⚠️ Potential issue | 🟡 Minor

Add error handling for FileReader.

The FileReader promise doesn't handle the onerror case, which could cause the promise to hang indefinitely if the file read fails.

🛡️ Proposed fix to add error handling
       const base64 = await new Promise<string>((resolve, reject) => {
         const reader = new FileReader()
         reader.onload = () => {
           const result = reader.result as string
           // Remove data URL prefix (e.g., "data:image/png;base64,")
           resolve(result.split(',')[1])
         }
+        reader.onerror = () => reject(reader.error)
         reader.readAsDataURL(file)
       })

Comment on lines +362 to +364
// Clean up image previews
attachedImages.forEach((img) => URL.revokeObjectURL(img.preview))
setAttachedImages([])

⚠️ Potential issue | 🟡 Minor

Fix the forEach callback to not return a value.

The static analysis tool flagged this: the callback passed to forEach() should not return a value. URL.revokeObjectURL returns undefined, but using it in an arrow function expression body implies a return.

🐛 Proposed fix
       // Clean up image previews
-      attachedImages.forEach((img) => URL.revokeObjectURL(img.preview))
+      for (const img of attachedImages) {
+        URL.revokeObjectURL(img.preview)
+      }
       setAttachedImages([])
🧰 Tools
🪛 Biome (2.3.13)

[error] 363-363: This callback passed to forEach() iterable method should not return a value.

Either remove this return or remove the returned value.

(lint/suspicious/useIterableCallbackReturn)


Comment on lines +129 to +157
{
  id: 'MMJ',
  name: 'Multimodal JPEG',
  description:
    'Describe a JPEG image (meme with man, React icon, code/email text)',
  run: runMMJ,
  requires: ['text'],
},
{
  id: 'MMP',
  name: 'Multimodal PNG',
  description: 'Describe a PNG image (beach scene with AG UI text)',
  run: runMMP,
  requires: ['text'],
},
{
  id: 'MMS',
  name: 'Multimodal Structured JPEG',
  description: 'Describe a JPEG image with structured JSON output',
  run: runMMS,
  requires: ['text'],
},
{
  id: 'MMT',
  name: 'Multimodal Structured PNG',
  description: 'Describe a PNG image with structured JSON output',
  run: runMMT,
  requires: ['text'],
},

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the file structure and locate the test file
git ls-files packages/typescript/smoke-tests/adapters/ | head -20

Repository: TanStack/ai

Length of output: 1368


🏁 Script executed:

# Find AdapterCapability definition
rg -n "type AdapterCapability|enum AdapterCapability|AdapterCapability\s*=" --type ts -g '**/*.ts'

Repository: TanStack/ai

Length of output: 149


🏁 Script executed:

# Check the full test file to see all test definitions and how capabilities are used
wc -l packages/typescript/smoke-tests/adapters/src/tests/index.ts

Repository: TanStack/ai

Length of output: 118


🏁 Script executed:

# Search for vision/multimodal capability or references
rg -n "vision|multimodal" --type ts -g '**/adapters/**'

Repository: TanStack/ai

Length of output: 2253


🏁 Script executed:

# Check IMG, TTS, TRN tests mentioned in the review to see skipByDefault pattern
rg -n "'IMG'|'TTS'|'TRN'" --type ts -g '**/tests/**'

Repository: TanStack/ai

Length of output: 291


🏁 Script executed:

# Look at how ContentPart is used in adapters
rg -n "ContentPart" --type ts -g '**/adapters/**'

Repository: TanStack/ai

Length of output: 4726


🏁 Script executed:

# Read the AdapterCapability definition and context
sed -n '20,50p' packages/typescript/smoke-tests/adapters/src/tests/index.ts

Repository: TanStack/ai

Length of output: 785


🏁 Script executed:

# Read the test definitions including IMG, TTS, TRN to see skipByDefault pattern
sed -n '100,160p' packages/typescript/smoke-tests/adapters/src/tests/index.ts

Repository: TanStack/ai

Length of output: 1512


🏁 Script executed:

# Check if there's a skipByDefault property mentioned anywhere
rg -n "skipByDefault" --type ts -g '**/adapters/**'

Repository: TanStack/ai

Length of output: 883


🏁 Script executed:

# Read the mmi-multimodal-image.ts test implementation to see what happens
cat packages/typescript/smoke-tests/adapters/src/tests/mmi-multimodal-image.ts

Repository: TanStack/ai

Length of output: 5918


🏁 Script executed:

# Check how adapters handle image content that they may not support
# Look for error handling in adapter implementations
rg -n "type.*image|image.*support|unsupported.*image" --type ts -g '**/ai-*/src/adapters/**' -A 2

Repository: TanStack/ai

Length of output: 2916


🏁 Script executed:

# Check the harness to understand test execution and error handling
cat packages/typescript/smoke-tests/adapters/src/harness.ts

Repository: TanStack/ai

Length of output: 11441


Add 'vision' capability to AdapterCapability enum and update multimodal tests accordingly.

Multimodal tests (MMJ, MMP, MMS, MMT) declare requires: ['text'] but actually send image content that requires vision support. The AdapterCapability enum lacks a 'vision' or 'multimodal' option to properly declare this dependency. This mismatch means these tests will attempt to run on all text adapters, failing on those without vision support (e.g., text-only models).

Recommended approach: Add 'vision' to the AdapterCapability enum and update these four tests to requires: ['text', 'vision']. Alternatively, follow the IMG/TTS/TRN pattern by adding skipByDefault: true to reduce noise from unsupported adapters.
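
A sketch of the first option (only 'text' is known from the excerpt above; the other union members and the exact test object shape are assumed):

// Sketch only: add 'vision' to the capability union (other existing members elided)
// and require it on the four multimodal tests.
export type AdapterCapability = 'text' | 'vision' /* | ...other existing capabilities */

// In the tests array:
{
  id: 'MMJ',
  name: 'Multimodal JPEG',
  description:
    'Describe a JPEG image (meme with man, React icon, code/email text)',
  run: runMMJ,
  requires: ['text', 'vision'],
},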


@nx-cloud bot commented Feb 4, 2026

View your CI Pipeline Execution ↗ for commit d332304

Command | Status | Duration | Result
nx affected --targets=test:sherif,test:knip,tes... | ✅ Succeeded | 2m 58s | View ↗
nx run-many --targets=build --exclude=examples/** | ✅ Succeeded | 1m 12s | View ↗

☁️ Nx Cloud last updated this comment at 2026-02-04 15:37:12 UTC

@pkg-pr-new bot commented Feb 4, 2026

Open in StackBlitz

@tanstack/ai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai@263

@tanstack/ai-anthropic

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-anthropic@263

@tanstack/ai-client

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-client@263

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-devtools-core@263

@tanstack/ai-gemini

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-gemini@263

@tanstack/ai-grok

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-grok@263

@tanstack/ai-ollama

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-ollama@263

@tanstack/ai-openai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openai@263

@tanstack/ai-openrouter

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openrouter@263

@tanstack/ai-preact

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-preact@263

@tanstack/ai-react

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react@263

@tanstack/ai-react-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react-ui@263

@tanstack/ai-solid

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid@263

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid-ui@263

@tanstack/ai-svelte

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-svelte@263

@tanstack/ai-vue

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue@263

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue-ui@263

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/preact-ai-devtools@263

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/react-ai-devtools@263

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/solid-ai-devtools@263

commit: d332304

@coderabbitai bot left a comment (Contributor)

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docs/guides/multimodal-content.md`:
- Around line 464-473: The example passed to sendMessage places mimeType inside
the metadata object incorrectly; update the content payload so that the file
item's mimeType is moved into the source object (i.e., for the array element
with keys type and source, add mimeType under source rather than metadata) to
match the type definitions and other examples; locate the sendMessage call and
adjust the file element structure (the object with fields type, source,
metadata) so source includes mimeType and metadata is either removed or left for
other metadata only.
🧹 Nitpick comments (4)
packages/typescript/ai-react/tests/use-chat.test.ts (1)

1313-1345: Consider adding audio URL test for completeness.

The tests cover audio with data source but not with URL source. For consistency with image, video, and document tests (which cover both URL and data sources), consider adding a test for audio URLs.

📝 Suggested test case
it('should send a multimodal message with audio URL', async () => {
  const chunks = createTextChunks('The audio says hello')
  const adapter = createMockConnectionAdapter({ chunks })
  const { result } = renderUseChat({ connection: adapter })

  await result.current.sendMessage({
    content: [
      { type: 'text', content: 'Transcribe this audio' },
      {
        type: 'audio',
        source: { type: 'url', value: 'https://example.com/audio.mp3' },
      },
    ],
  })

  await waitFor(() => {
    expect(result.current.messages.length).toBeGreaterThan(0)
  })

  const userMessage = result.current.messages.find((m) => m.role === 'user')
  expect(userMessage?.parts[1]).toEqual({
    type: 'audio',
    source: { type: 'url', value: 'https://example.com/audio.mp3' },
  })
})
examples/ts-react-chat/src/routes/index.tsx (1)

24-29: Consider using the exported generateMessageId from @tanstack/ai instead of duplicating it.

This local implementation duplicates the utility already exported from packages/typescript/ai/src/activities/chat/messages.ts. Using the exported version would reduce code duplication and ensure consistency across the codebase.

Note: There's a subtle difference - the exported version uses substring(7) while this uses substring(2, 9). If the 7-character output is intentional for consistency, import from @tanstack/ai.

♻️ Suggested change
-/**
- * Generate a random message ID
- */
-function generateMessageId(): string {
-  return `msg-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`
-}
+import { generateMessageId } from '@tanstack/ai'
packages/typescript/smoke-tests/adapters/src/tests/mms-multimodal-structured.ts (2)

30-45: Rename STRUCTURED_PROMPT to camelCase for consistency.
This keeps variable naming aligned with the codebase convention.

♻️ Proposed rename
-const STRUCTURED_PROMPT = `Analyze this image and provide a structured description. Return ONLY valid JSON (no markdown code blocks) matching this schema:
+const structuredPrompt = `Analyze this image and provide a structured description. Return ONLY valid JSON (no markdown code blocks) matching this schema:
 {
   "description": "A brief description of what the image shows",
   "hasText": true/false,
   "textContent": "The text content visible in the image, if any",
   "mainSubject": "The main subject or focal point of the image",
   "colors": ["array", "of", "primary", "colors"]
 }`
@@
-      content: STRUCTURED_PROMPT,
+      content: structuredPrompt,
@@
-      content: STRUCTURED_PROMPT,
+      content: structuredPrompt,

As per coding guidelines: **/*.{ts,tsx,js,jsx}: Use camelCase for function and variable names throughout the codebase.


55-147: Consider extracting shared validation/payload logic to reduce duplication with runMMT.
Both runners repeat the same fixture loading, content construction, and JSON validation; a shared helper would make future changes safer and smaller.

Comment on lines +464 to +473
await sendMessage({
  content: [
    { type: 'text', content: `Please analyze this ${type}` },
    {
      type,
      source: { type: 'data', value: base64 },
      metadata: { mimeType: file.type }
    }
  ]
})

⚠️ Potential issue | 🟡 Minor

Incorrect mimeType placement in file upload example.

The example places mimeType in metadata, but according to the type definitions and all other examples in this document, mimeType should be in the source object for data sources.

📝 Proposed fix
     await sendMessage({
       content: [
         { type: 'text', content: `Please analyze this ${type}` },
         {
           type,
-          source: { type: 'data', value: base64 },
-          metadata: { mimeType: file.type }
+          source: { type: 'data', value: base64, mimeType: file.type }
         }
       ]
     })
