An AI-powered nutrition assistant that identifies food items from images and estimates calorie intake using Google’s Gemini Vision model — all through an interactive Streamlit app.
This project demonstrates how to integrate Google’s Gemini Vision model into a Streamlit web app to perform multimodal reasoning — combining text prompts and image understanding.
Upload a photo of food, and the app will analyze its contents and estimate calorie intake for each item.
- 🖼️ Image Understanding: Processes and interprets uploaded images using Gemini Vision.
- 🍽️ Calorie Estimation: Provides estimated calorie counts for each food item detected.
- 💬 Prompt Customization: Accepts user input for custom queries or contextual instructions.
- ⚙️ Streamlit UI: Simple and lightweight frontend for easy interaction.
- ⚠️ Input Validation: Ensures both an image and a prompt are provided before generating a response.
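The input-validation step above can be sketched as a small pure helper (the function name and messages are illustrative, not taken from the app's source):

```python
from typing import Optional

def validate_inputs(prompt: str, image) -> Optional[str]:
    """Return an error message if either input is missing, else None."""
    if not prompt or not prompt.strip():
        return "Please enter a prompt before analyzing."
    if image is None:
        return "Please upload an image before analyzing."
    return None
```

In the app, this check would run on the “Analyze Image” button click, surfacing the message via `st.warning` before any Gemini request is made.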
- Python 3.10+
- Streamlit
- Google Generative AI (Gemini)
- Pillow (PIL)
- python-dotenv
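A `requirements.txt` covering this stack might look like the following (versions unpinned here; pin them as needed):

```
streamlit
google-generativeai
Pillow
python-dotenv
```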
```
📂 gemini-nutrition-analyzer
├── app.py
├── requirements.txt
├── .env
└── README.md
```

Using conda:
```bash
conda create -p venv python=3.10 -y
conda activate ./venv
```

Make sure you have a `requirements.txt` file in your project root, then run:
```bash
pip install -r requirements.txt
```

Create a `.env` file in your project directory and add your Google API key:
```
GOOGLE_API_KEY=your_google_api_key_here
```

Launch the app locally using:
```bash
streamlit run app.py
```

- Enter a text prompt (e.g., “Identify and estimate calories in this image”).
- Upload an image (JPEG, PNG).
- Click “Analyze Image” to process the inputs.
- The app validates that both a prompt and image are provided.
- View the AI-generated calorie breakdown and total estimate.
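Under the hood, the flow above reduces to a single multimodal `generate_content` call combining the text prompt and the uploaded image. A minimal sketch, assuming the `google-generativeai` package and a vision-capable model name (the model string and helper names are assumptions, not taken from the app's source):

```python
import os
from typing import List

def build_contents(prompt: str, image) -> List:
    """Order the multimodal request: instruction text first, then the PIL image."""
    return [prompt, image]

def analyze_food(prompt: str, image) -> str:
    """Send the prompt and image to Gemini Vision and return its text answer."""
    import google.generativeai as genai  # lazy import keeps build_contents testable offline

    genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))  # key loaded from .env via python-dotenv
    model = genai.GenerativeModel("gemini-1.5-flash")     # model name is an assumption
    response = model.generate_content(build_contents(prompt, image))
    return response.text
```

Once validation passes, the app would display the result with something like `st.write(analyze_food(prompt, Image.open(uploaded_file)))`.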
This project was inspired by the tutorial by Krish Naik:
🔗 Complete Langchain GEN AI Crash Course With 6 End To End LLM Projects With OPENAI, LLAMA2, Gemini Pro