feat: Add social media video import (YouTube, TikTok, Instagram)#6764

Open
AurelienPautet wants to merge 17 commits into mealie-recipes:mealie-next from AurelienPautet:video-parser

Conversation

Contributor

@AurelienPautet AurelienPautet commented Dec 22, 2025

What this PR does / why we need it:

This PR introduces the ability to import recipes directly from social media video URLs (Instagram Reels, TikTok, Facebook, and YouTube).

Currently, Mealie excels at importing from blogs, but many modern recipes exist primarily in video format where the instructions are spoken rather than written. This feature bridges that gap by using AI to transcribe and parse video content into structured recipe data.

Technical Implementation

I opted for a native implementation using yt-dlp and ffmpeg rather than third-party scraping APIs (like Apify) to keep dependencies local and avoid vendor lock-in.

Following advice from michael-genson, the video URL import now lives on the same page as the classic web import, so the modified workflow is:

  1. Try to scrape the URL with the classic web scraper
  2. If it fails, scrape it with the video URL scraper

Moreover, this also works with the bulk importer.

The video URL scraper workflow operates as follows:

  1. Metadata Extraction: yt-dlp fetches the video title, description, and thumbnail.
  2. Smart Transcription:
    • Priority 1: If official subtitles/captions are available, we download and use them (fastest).
    • Priority 2: If no subtitles exist, we download the audio stream.
    • Processing: The audio is processed via ffmpeg into a lightweight, mono-channel MP3 to minimize bandwidth.
    • AI Transcription: The audio file is sent to a transcription provider (default: OpenAI Whisper, configurable via .env).
  3. Recipe Generation: The video metadata (title, description) and the transcript are sent to an LLM, which parses the unstructured data into a valid Mealie recipe JSON.
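The extraction and transcription steps above can be sketched with yt-dlp's Python options. The option keys are real yt-dlp settings, but this helper is an illustration of the approach, not the PR's actual code:

```python
def build_ydl_opts(output_template: str, prefer_subtitles: bool = True) -> dict:
    """Build yt-dlp options for the pipeline described above: fetch
    metadata plus official/auto captions when available, otherwise pull
    the audio stream and transcode it to a lightweight mono MP3."""
    return {
        "outtmpl": output_template,        # no extension; yt-dlp appends one
        "format": "bestaudio/best",        # audio-only stream keeps downloads small
        "writesubtitles": prefer_subtitles,     # official captions (Priority 1)
        "writeautomaticsub": prefer_subtitles,  # auto-generated captions
        "postprocessors": [{
            "key": "FFmpegExtractAudio",   # hands the downloaded stream to ffmpeg
            "preferredcodec": "mp3",
        }],
        "postprocessor_args": ["-ac", "1"],  # downmix to mono to cut upload size
    }
```

With these options, `yt_dlp.YoutubeDL(build_ydl_opts(...)).extract_info(url)` returns the title/description/thumbnail metadata and triggers the download and ffmpeg post-processing in one call.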

Docker Changes:
I added ffmpeg to the Dockerfile. This is a standard, lightweight tool required for yt-dlp's audio post-processing. It allows us to standardize audio input from various platforms and consumes zero system resources when idle.

Here is a demo of the new import from video URL page:

Enregistrement.de.l.ecran.2025-12-22.a.13.53.05.mp4

Special notes for your reviewer:

This is my biggest contribution to Mealie yet, and I’m not sure whether my code is structured perfectly.

In particular, I don’t know whether my functions always live in the best-matching folders and files.

Testing

I have added unit tests using mock responses to verify the new API routes without hitting external services.

I also performed extensive manual testing of the full flow using:

  • YouTube, Instagram, Facebook & TikTok links: Tested videos with and without hardcoded subtitles.
  • Providers: Validated successful imports using both:
    • OpenAI (Whisper model)
    • Google Gemini (Gemini 2.5 Flash)

Both providers successfully generated valid recipes, with Gemini showing slightly faster processing times during my tests.

Co-authored-by: Maxime Louward <61564950+mlouward@users.noreply.github.com>
@michael-genson
Collaborator

At a high level this looks good, I like the usage of ffmpeg and whisper to process video/audio, great work! I still need to dive into the implementation details.

Would it be better to build this into the URL import, instead of having a dedicated page for it? I think it would be nicer to have a single "URL" entrypoint for users (and the UI is cleaner, our import page is already a bit bloated). I haven't looked into the mechanics of how you locate the video before downloading/processing, if we're unable to do that automatically then I see a good reason to keep it as a separate page.


BinEP commented Feb 6, 2026

Having it under the same URL import might make API clients easier to implement, e.g. the iOS Shortcut or Home Assistant. The iOS Shortcut is the primary way I get videos into Mealie, and its interface isn't ideal for figuring out whether something is a video URL.
Maybe there could be a configurable list of domains that decides whether a URL matches a video (if "auto" detection isn't possible, that is).
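That configurable list could be as simple as a hostname check. The domain set and function name here are hypothetical, not an existing Mealie setting:

```python
from urllib.parse import urlparse

# Hypothetical configurable setting; Mealie does not currently expose this.
VIDEO_DOMAINS = {"youtube.com", "youtu.be", "tiktok.com", "instagram.com", "facebook.com"}

def is_probable_video_url(url: str, domains: set[str] = VIDEO_DOMAINS) -> bool:
    """Return True if the URL's host matches a configured video domain,
    so API clients (e.g. an iOS Shortcut) need not decide themselves."""
    host = (urlparse(url).hostname or "").removeprefix("www.")
    return any(host == d or host.endswith("." + d) for d in domains)
```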

I used another repo, but I also added a "choose best thumbnail" AI step so the Mealie thumbnails would be better, and I added a link to the original video in the description.

export async function selectBestFoodThumbnail(
  thumbnailUrls: string[],
): Promise<string> {
  if (thumbnailUrls.length <= 1) {
    return thumbnailUrls[0] || '';
  }

  try {
    const { text } = await generateText({
      model: textModel,
      messages: [
        {
          role: 'user',
          content: [
            {
              type: 'text',
              text: `You are analyzing thumbnails from a cooking video to select the best one for a recipe.
Please analyze these ${thumbnailUrls.length} thumbnails and return ONLY the index number (0-${thumbnailUrls.length - 1}) of the thumbnail that:
1. Shows food most prominently
2. Has the best visual quality/clarity
3. Would be most appealing as a recipe thumbnail

Return only a single number (the index), no other text.`,
            },
            ...thumbnailUrls.map(url => ({
              type: 'image' as const,
              image: url,
            })),
          ],
        },
      ],
    });

    const selectedIndex = parseInt(text.trim(), 10);
    if (
      isNaN(selectedIndex) ||
      selectedIndex < 0 ||
      selectedIndex >= thumbnailUrls.length
    ) {
      console.warn(
        'AI returned invalid thumbnail index, using first thumbnail',
      );
      return thumbnailUrls[0];
    }

    console.log(
      `AI selected thumbnail ${selectedIndex} out of ${thumbnailUrls.length} options`,
    );
    return thumbnailUrls[selectedIndex];
  } catch (error) {
    console.error('Error selecting best thumbnail with AI:', error);
    // Fallback to first thumbnail
    return thumbnailUrls[0];
  }
}

@AurelienPautet
Contributor Author

> Would it be better to build this into the URL import, instead of having a dedicated page for it? I think it would be nicer to have a single "URL" entrypoint for users (and the UI is cleaner, our import page is already a bit bloated). I haven't looked into the mechanics of how you locate the video before downloading/processing, if we're unable to do that automatically then I see a good reason to keep it as a separate page.

I totally agree with you. I've updated the PR so that both web and video URL scraping are handled through a single URL entrypoint (I've retained the video URL route exclusively for API use).

Now, when calling recipe/create/url:

  1. It first attempts to scrape the URL using the classic web scraping method.
  2. If that fails, it falls back to the new video URL scraping method.

This way, all websites supported by the yt-dlp library can now be used to import recipes into Mealie.
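A minimal sketch of that fallback flow, with function and exception names as placeholders standing in for Mealie's actual route/service code:

```python
import asyncio

class ScrapeError(Exception):
    """Stands in for the HTTPException(400) the real web scraper raises."""
    def __init__(self, status_code: int):
        super().__init__(status_code)
        self.status_code = status_code

async def create_recipe_from_url(url, scrape_web, scrape_video, video_fallback_enabled=True):
    """Try the classic web scraper first; on a scraping failure (400),
    fall back to the video scraper if transcription is configured."""
    try:
        return await scrape_web(url)
    except ScrapeError as e:
        if e.status_code != 400 or not video_fallback_enabled:
            raise  # not a scraping failure, or no transcription provider available
        return await scrape_video(url)
```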

@chunkychode

this is so much better than my share to email, n8n, social to mealie automation :)
is this PR kinda dead now?!?

@michael-genson
Collaborator

I'm quite excited to get this one in, just haven't had the time to properly review and provide feedback yet!

Collaborator

@michael-genson michael-genson left a comment


Overall looks great! I made a few small tweaks:

  • updated docs to include version tags
  • simplified the prompt a bit to be more in-line with our new prompts

I provided some feedback, only one major issue. I want to test a few different video sources and see how well it works but otherwise this is pretty close to being ready.

Comment on lines -451 to +455
"url-form-hint": "Copy and paste a link from your favorite recipe website",
"url-form-hint": "Copy and paste a link from your favorite recipe website or a link to a social media video",

Let's simplify this to "Copy and paste a link from your favorite website" (drop the word recipe from the original). More on this below.

Comment on lines -639 to +643
"scrape-recipe-description": "Scrape a recipe by url. Provide the url for the site you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection.",
"scrape-recipe-description": "Scrape a recipe by url. Provide the url for the site or the video you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection.",

Since these options are only available if transcriptions are enabled, can we separate this out? Something like:

"Scrape a recipe by url. Provide the url for the site you want to scrape, and Mealie will attempt to scrape the recipe from that site and add it to your collection."
(if transcriptions) "You can also provide the url to a video and Mealie will attempt to transcribe it into a recipe."

"error-title": "Looks Like We Couldn't Find Anything",
"error-title-rate-limit": "Rate Limit Exceeded",
"error-details-rate-limit": "The AI service is currently rate-limited. Please wait a moment and try again.",
"error-title-server": "Something Went Wrong",

Can we use events.something-went-wrong instead?

"error-title-rate-limit": "Rate Limit Exceeded",
"error-details-rate-limit": "The AI service is currently rate-limited. Please wait a moment and try again.",
"error-title-server": "Something Went Wrong",
"error-details-server": "An unexpected error occurred while processing your request. Please try again later.",

Can we switch this to "an-unexpected-error-occurred-request": "...same-text" and move it to general?


Actually, if this is for server errors (500 errors) we can probably drop this entirely and just stick with "Something went wrong". We use this pattern elsewhere in the app

Comment on lines +150 to +162
video_fallback_enabled = self.settings.OPENAI_ENABLED and self.settings.OPENAI_ENABLE_TRANSCRIPTION_SERVICES

try:
    return await self._create_recipe_from_web(req)
except HTTPException as e:
    if e.status_code != 400:
        raise
    # If OpenAI transcription is not available, re-raise the original error
    if not video_fallback_enabled:
        raise

    # Normal scraping failed, so try parsing as a video URL
    return await self._create_recipe_from_video_url(req.url, translate_language=translate_language)

We have multiple scraper strategies prioritized in mealie.services.scraper.recipe_scraper. Particularly the RecipeScraperOpenGraph strategy works on most websites, so waiting for an exception and falling back to video processing won't work (try a YT link, e.g. https://www.youtube.com/watch?v=Cyskqnp1j64).

Can we add some logic in the OpenAI scraper which does this? I imagine it goes something like:

  1. Attempt to download the video. If successful, process it like a video
  2. If that fails, process the HTML (the existing way)
  3. If that fails, assume OpenAI cannot process the recipe

Open to better suggestions than that, that's just my gut, but we definitely shouldn't rely on route-level exception handling to trigger the fallback.

@michael-genson michael-genson Feb 20, 2026

Alternatively we can create a new scraper strategy called OpenAIVideo or something, and that inherits from the existing OpenAI service, then just register that before the existing one. This is probably cleaner.
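The strategy-registration idea can be sketched as an ordered chain where the first scraper to return a recipe wins. Class and function names below are illustrative placeholders, not Mealie's real `mealie.services.scraper` API:

```python
from typing import Callable, Optional

class RecipeScraper:
    """Tries each registered strategy in order; first success wins."""
    def __init__(self, strategies: list[Callable[[str], Optional[dict]]]):
        self.strategies = strategies

    def scrape(self, url: str) -> Optional[dict]:
        for strategy in self.strategies:
            recipe = strategy(url)
            if recipe is not None:
                return recipe
        return None

def openai_video(url):   # would attempt a yt-dlp download + transcription
    return {"name": "from video"} if "youtube.com" in url else None

def open_graph(url):     # stand-in for RecipeScraperOpenGraph, which matches most sites
    return {"name": "from html"}

# Registering the video strategy *before* OpenGraph ensures video URLs are
# transcribed instead of being caught by the generic HTML scraper.
scraper = RecipeScraper([openai_video, open_graph])
```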

Comment on lines +615 to +616
temp_id = os.getpid()
output_template = f"/tmp/mealie_{temp_id}" # No extension here

Change this to use get_temporary_path (from mealie.core.dependencies.dependencies import get_temporary_path)

Comment on lines 742 to 744
for line in subtitle_content.split("\n"):
    if line.strip() and not line.startswith("WEBVTT") and "-->" not in line and not line.isdigit():
        lines.append(line.strip())

Is there a better way to parse this? I'm okay leaving this for a future PR if there's not a quick solution.

For instance, from my YT video, all the text is wrapped in XML:
<00:02:58.000><c> beef</c>
Which adds a lot of unneeded tokens/cost to the OpenAI request.
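A slightly more robust pass that also strips the inline timing and `<c>` tags, a sketch of one possible improvement rather than code from the PR:

```python
import re

def clean_vtt(subtitle_content: str) -> str:
    """Strip WEBVTT headers, cue timings, cue numbers, and inline tags
    like <00:02:58.000><c> ... </c>, then drop consecutive duplicate
    lines (auto-captions often repeat text across overlapping cues)."""
    lines: list[str] = []
    for line in subtitle_content.split("\n"):
        line = re.sub(r"<[^>]+>", "", line).strip()  # remove inline XML/timing tags
        if not line or line.startswith(("WEBVTT", "Kind:", "Language:")):
            continue
        if "-->" in line or line.isdigit():
            continue
        if lines and lines[-1] == line:
            continue  # de-duplicate repeated caption lines
        lines.append(line)
    return " ".join(lines)
```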

Comment on lines 61 to 68
<BaseButton
:disabled="recipeUrl === null"
rounded
block
type="submit"
:loading="loading"
/>
</div>

This is probably not in scope of this PR, but just wanted to comment on it. Right now we have a single loading state for all import strategies, and video processing takes waaaayyyy longer than other strategies, so users might start to think something's broken.

I don't think there's a quick solution to this (since the backend determines the strategy and doesn't communicate it until the very end), but something to keep in mind for a follow-up PR if you (or anyone) thinks of something.
