feat: Add social media video import (YouTube, TikTok, Instagram)#6764

Open
AurelienPautet wants to merge 17 commits into mealie-recipes:mealie-next from AurelienPautet:video-parser

Conversation

Contributor

@AurelienPautet AurelienPautet commented Dec 22, 2025

What this PR does / why we need it:

This PR introduces the ability to import recipes directly from social media video URLs (Instagram Reels, TikTok, Facebook, and YouTube).

Currently, Mealie excels at importing from blogs, but many modern recipes exist primarily in video format where the instructions are spoken rather than written. This feature bridges that gap by using AI to transcribe and parse video content into structured recipe data.

Technical Implementation

I opted for a native implementation using yt-dlp and ffmpeg rather than third-party scraping APIs (like Apify) to keep dependencies local and avoid vendor lock-in.

Following advice from michael-genson, the video URL import now lives on the same page as the classic web import, so the modified workflow is:

  1. Try to scrape the URL with the classic web scraper
  2. If it fails, scrape it with the video URL scraper

Moreover, this also works with the bulk importer.

The video URL scraper workflow operates as follows:

  1. Metadata Extraction: yt-dlp fetches the video title, description, and thumbnail.
  2. Smart Transcription:
    • Priority 1: If official subtitles/captions are available, we download and use them (fastest).
    • Priority 2: If no subtitles exist, we download the audio stream.
    • Processing: The audio is processed via ffmpeg into a lightweight, mono-channel MP3 to minimize bandwidth.
    • AI Transcription: The audio file is sent to a transcription provider (default: OpenAI Whisper, configurable via .env).
  3. Recipe Generation: The video metadata (title, description) and the transcript are sent to an LLM, which parses the unstructured data into a valid Mealie recipe JSON.
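The extraction and transcription steps above can be sketched with yt-dlp's Python options. The option keys are real yt-dlp settings, but this helper is an illustration of the approach, not the PR's actual code:

```python
def build_ydl_opts(output_template: str, prefer_subtitles: bool = True) -> dict:
    """Build yt-dlp options for the pipeline described above: fetch
    metadata plus official/auto captions when available, otherwise pull
    the audio stream and transcode it to a lightweight mono MP3."""
    return {
        "outtmpl": output_template,        # no extension; yt-dlp appends one
        "format": "bestaudio/best",        # audio-only stream keeps downloads small
        "writesubtitles": prefer_subtitles,     # official captions (Priority 1)
        "writeautomaticsub": prefer_subtitles,  # auto-generated captions
        "postprocessors": [{
            "key": "FFmpegExtractAudio",   # hands the downloaded stream to ffmpeg
            "preferredcodec": "mp3",
        }],
        "postprocessor_args": ["-ac", "1"],  # downmix to mono to cut upload size
    }
```

With these options, `yt_dlp.YoutubeDL(build_ydl_opts(...)).extract_info(url)` returns the title/description/thumbnail metadata and triggers the download and ffmpeg post-processing in one call.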

Docker Changes:
I added ffmpeg to the Dockerfile. This is a standard, lightweight tool required for yt-dlp's audio post-processing. It allows us to standardize audio input from various platforms and consumes zero system resources when idle.

Here is a demo of the new import from video URL page:

Enregistrement.de.l.ecran.2025-12-22.a.13.53.05.mp4

Special notes for your reviewer:

This is my biggest contribution to Mealie yet, and I’m not sure whether my code is structured perfectly.

In particular, I don’t know whether my functions always live in the best-matching folders and files.

Testing

I have added unit tests using mock responses to verify the new API routes without hitting external services.

I also performed extensive manual testing of the full flow using:

  • YouTube, Instagram, Facebook & TikTok links: Tested videos with and without hardcoded subtitles.
  • Providers: Validated successful imports using both:
    • OpenAI (Whisper model)
    • Google Gemini (Gemini 2.5 Flash)

Both providers successfully generated valid recipes, with Gemini showing slightly faster processing times during my tests.

Co-authored-by: Maxime Louward <61564950+mlouward@users.noreply.github.com>
@michael-genson
Collaborator

At a high level this looks good, I like the usage of ffmpeg and whisper to process video/audio, great work! I still need to dive into the implementation details.

Would it be better to build this into the URL import, instead of having a dedicated page for it? I think it would be nicer to have a single "URL" entrypoint for users (and the UI is cleaner, our import page is already a bit bloated). I haven't looked into the mechanics of how you locate the video before downloading/processing, if we're unable to do that automatically then I see a good reason to keep it as a separate page.


BinEP commented Feb 6, 2026

Having it under the same URL import might make API clients easier to implement, e.g. the iOS Shortcut or Home Assistant. The iOS Shortcut is the primary way I get videos into Mealie, and its interface isn't ideal for figuring out whether something is a video URL.
Maybe there could be a configurable list of domains that decides whether a URL matches a video (if "auto" detection isn't possible, that is).
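That configurable list could be as simple as a hostname check. The domain set and function name here are hypothetical, not an existing Mealie setting:

```python
from urllib.parse import urlparse

# Hypothetical configurable setting; Mealie does not currently expose this.
VIDEO_DOMAINS = {"youtube.com", "youtu.be", "tiktok.com", "instagram.com", "facebook.com"}

def is_probable_video_url(url: str, domains: set[str] = VIDEO_DOMAINS) -> bool:
    """Return True if the URL's host matches a configured video domain,
    so API clients (e.g. an iOS Shortcut) need not decide themselves."""
    host = (urlparse(url).hostname or "").removeprefix("www.")
    return any(host == d or host.endswith("." + d) for d in domains)
```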

I used another repo, but I also added a "choose best thumbnail" AI step so the Mealie thumbnails would be better, and I added a link to the original video in the description.

export async function selectBestFoodThumbnail(
  thumbnailUrls: string[],
): Promise<string> {
  if (thumbnailUrls.length <= 1) {
    return thumbnailUrls[0] || '';
  }

  try {
    const { text } = await generateText({
      model: textModel,
      messages: [
        {
          role: 'user',
          content: [
            {
              type: 'text',
              text: `You are analyzing thumbnails from a cooking video to select the best one for a recipe.
Please analyze these ${thumbnailUrls.length} thumbnails and return ONLY the index number (0-${thumbnailUrls.length - 1}) of the thumbnail that:
1. Shows food most prominently
2. Has the best visual quality/clarity
3. Would be most appealing as a recipe thumbnail

Return only a single number (the index), no other text.`,
            },
            ...thumbnailUrls.map(url => ({
              type: 'image' as const,
              image: url,
            })),
          ],
        },
      ],
    });

    const selectedIndex = parseInt(text.trim(), 10);
    if (
      isNaN(selectedIndex) ||
      selectedIndex < 0 ||
      selectedIndex >= thumbnailUrls.length
    ) {
      console.warn(
        'AI returned invalid thumbnail index, using first thumbnail',
      );
      return thumbnailUrls[0];
    }

    console.log(
      `AI selected thumbnail ${selectedIndex} out of ${thumbnailUrls.length} options`,
    );
    return thumbnailUrls[selectedIndex];
  } catch (error) {
    console.error('Error selecting best thumbnail with AI:', error);
    // Fallback to first thumbnail
    return thumbnailUrls[0];
  }
}

@AurelienPautet
Contributor Author

> Would it be better to build this into the URL import, instead of having a dedicated page for it? I think it would be nicer to have a single "URL" entrypoint for users (and the UI is cleaner, our import page is already a bit bloated). I haven't looked into the mechanics of how you locate the video before downloading/processing, if we're unable to do that automatically then I see a good reason to keep it as a separate page.

I totally agree with you. I've updated the PR so that both web and video URL scraping are handled through a single URL entrypoint (I've retained the video URL route exclusively for API use).

Now, when calling recipe/create/url:

  1. It first attempts to scrape the URL using the classic web scraping method.
  2. If that fails, it falls back to the new video URL scraping method.

This way, all websites supported by the yt-dlp library can now be used to import recipes into Mealie.
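A minimal sketch of that fallback flow, with function and exception names as placeholders standing in for Mealie's actual route/service code:

```python
import asyncio

class ScrapeError(Exception):
    """Stands in for the HTTPException(400) the real web scraper raises."""
    def __init__(self, status_code: int):
        super().__init__(status_code)
        self.status_code = status_code

async def create_recipe_from_url(url, scrape_web, scrape_video, video_fallback_enabled=True):
    """Try the classic web scraper first; on a scraping failure (400),
    fall back to the video scraper if transcription is configured."""
    try:
        return await scrape_web(url)
    except ScrapeError as e:
        if e.status_code != 400 or not video_fallback_enabled:
            raise  # not a scraping failure, or no transcription provider available
        return await scrape_video(url)
```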

@chunkychode

this is so much better than my share to email, n8n, social to mealie automation :)
is this PR kinda dead now?!?

@michael-genson
Collaborator

I'm quite excited to get this one in, just haven't had the time to properly review and provide feedback yet!

Collaborator

@michael-genson michael-genson left a comment


Overall looks great! I made a few small tweaks:

  • updated docs to include version tags
  • simplified the prompt a bit to be more in-line with our new prompts

I provided some feedback, only one major issue. I want to test a few different video sources and see how well it works but otherwise this is pretty close to being ready.

Comment on lines -451 to +455
"url-form-hint": "Copy and paste a link from your favorite recipe website",
"url-form-hint": "Copy and paste a link from your favorite recipe website or a link to a social media video",

Let's simplify this to "Copy and paste a link from your favorite website" (drop the word recipe from the original). More on this below.

Comment on lines -639 to +643
"scrape-recipe-description": "Scrape a recipe by url. Provide the url for the site you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection.",
"scrape-recipe-description": "Scrape a recipe by url. Provide the url for the site or the video you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection.",

Since these options are only available if transcriptions are enabled, can we separate this out? Something like:

"Scrape a recipe by url. Provide the url for the site you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection."
(if transcriptions) "You can also provide the url to a video and Mealie will attempt to transcribe it into a recipe."

"error-title": "Looks Like We Couldn't Find Anything",
"error-title-rate-limit": "Rate Limit Exceeded",
"error-details-rate-limit": "The AI service is currently rate-limited. Please wait a moment and try again.",
"error-title-server": "Something Went Wrong",

Can we use events.something-went-wrong instead?

"error-title-rate-limit": "Rate Limit Exceeded",
"error-details-rate-limit": "The AI service is currently rate-limited. Please wait a moment and try again.",
"error-title-server": "Something Went Wrong",
"error-details-server": "An unexpected error occurred while processing your request. Please try again later.",

Can we switch this to "an-unexpected-error-occurred-request": "...same-text" and move it to general?


Actually, if this is for server errors (500 errors) we can probably drop this entirely and just stick with "Something went wrong". We use this pattern elsewhere in the app

Comment on lines +150 to +162
video_fallback_enabled = self.settings.OPENAI_ENABLED and self.settings.OPENAI_ENABLE_TRANSCRIPTION_SERVICES

try:
    return await self._create_recipe_from_web(req)
except HTTPException as e:
    if e.status_code != 400:
        raise
    # If OpenAI transcription is not available, re-raise the original error
    if not video_fallback_enabled:
        raise

    # Normal scraping failed, so try parsing as a video URL
    return await self._create_recipe_from_video_url(req.url, translate_language=translate_language)

We have multiple scraper strategies prioritized in mealie.services.scraper.recipe_scraper. Particularly the RecipeScraperOpenGraph strategy works on most websites, so waiting for an exception and falling back to video processing won't work (try a YT link, e.g. https://www.youtube.com/watch?v=Cyskqnp1j64).

Can we add some logic in the OpenAI scraper which does this? I imagine it goes something like:

  1. Attempt to download the video. If successful, process it like a video
  2. If that fails, process the HTML (the existing way)
  3. If that fails, assume OpenAI cannot process the recipe

Open to better suggestions than that, that's just my gut, but we definitely shouldn't rely on route-level exception handling to trigger the fallback.

@michael-genson michael-genson Feb 20, 2026

Alternatively we can create a new scraper strategy called OpenAIVideo or something, and that inherits from the existing OpenAI service, then just register that before the existing one. This is probably cleaner.
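The strategy-registration idea can be sketched as an ordered chain where the first scraper to return a recipe wins. Class and function names below are illustrative placeholders, not Mealie's real `mealie.services.scraper` API:

```python
from typing import Callable, Optional

class RecipeScraper:
    """Tries each registered strategy in order; first success wins."""
    def __init__(self, strategies: list[Callable[[str], Optional[dict]]]):
        self.strategies = strategies

    def scrape(self, url: str) -> Optional[dict]:
        for strategy in self.strategies:
            recipe = strategy(url)
            if recipe is not None:
                return recipe
        return None

def openai_video(url):   # would attempt a yt-dlp download + transcription
    return {"name": "from video"} if "youtube.com" in url else None

def open_graph(url):     # stand-in for RecipeScraperOpenGraph, which matches most sites
    return {"name": "from html"}

# Registering the video strategy *before* OpenGraph ensures video URLs are
# transcribed instead of being caught by the generic HTML scraper.
scraper = RecipeScraper([openai_video, open_graph])
```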

Comment on lines +615 to +616
temp_id = os.getpid()
output_template = f"/tmp/mealie_{temp_id}" # No extension here

Change this to use get_temporary_path (from mealie.core.dependencies.dependencies import get_temporary_path)

Comment on lines 742 to 744
for line in subtitle_content.split("\n"):
    if line.strip() and not line.startswith("WEBVTT") and "-->" not in line and not line.isdigit():
        lines.append(line.strip())

Is there a better way to parse this? I'm okay leaving this for a future PR if there's not a quick solution.

For instance, from my YT video, all the text is wrapped in XML:
<00:02:58.000><c> beef</c>
Which adds a lot of unneeded tokens/cost to the OpenAI request.
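A slightly more robust pass that also strips the inline timing and `<c>` tags, a sketch of one possible improvement rather than code from the PR:

```python
import re

def clean_vtt(subtitle_content: str) -> str:
    """Strip WEBVTT headers, cue timings, cue numbers, and inline tags
    like <00:02:58.000><c> ... </c>, then drop consecutive duplicate
    lines (auto-captions often repeat text across overlapping cues)."""
    lines: list[str] = []
    for line in subtitle_content.split("\n"):
        line = re.sub(r"<[^>]+>", "", line).strip()  # remove inline XML/timing tags
        if not line or line.startswith(("WEBVTT", "Kind:", "Language:")):
            continue
        if "-->" in line or line.isdigit():
            continue
        if lines and lines[-1] == line:
            continue  # de-duplicate repeated caption lines
        lines.append(line)
    return " ".join(lines)
```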

Comment on lines 61 to 68
<BaseButton
:disabled="recipeUrl === null"
rounded
block
type="submit"
:loading="loading"
/>
</div>

This is probably not in scope of this PR, but just wanted to comment on it. Right now we have a single loading state for all import strategies, and video processing takes waaaayyyy longer than other strategies, so users might start to think something's broken.

I don't think there's a quick solution to this (since the backend determines the strategy and doesn't communicate it until the very end), but something to keep in mind for a follow-up PR if you (or anyone) thinks of something.
