
AI StoryTeller is a multimodal AI application that converts images into creative short stories by combining computer vision and natural language generation. The system uses a pretrained image captioning model to understand visual content and Google Gemini to generate context-aware narratives grounded in the image.


SouravLenka/AI_StoryTeller


✨ AI StoryTeller: Transforming Images into Magic 📖

AI StoryTeller is a premium web application that breathes life into your photos. By leveraging the power of BLIP (Image Captioning) and Google Gemini 2.0, it transforms visual moments into enchanting, human-like stories with just a few clicks.

🚀 Features

  • 🎨 Premium Dark UI: A modern, immersive interface featuring glassmorphism, smooth animations, and a "Magic" aesthetic.
  • 🖼️ Instant Image Preview: See your moments immediately before they are transformed.
  • 🤖 Gemini 2.0 Powered: Uses the gemini-2.0-flash model for intelligent and creative storytelling.
  • 🎭 Genre & Mood Control: Guide the AI's creativity by selecting specific genres (Fantasy, Sci-Fi, Mystery) and moods (Whimsical, Cinematic, Tense).
  • ⚡ Real-time Feedback: Engaging loading states and refined error handling ("Magic Interrupted") for a seamless experience.
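As a rough illustration of how the genre and mood selections could steer the story, the sketch below folds both into the text prompt sent to the language model. The function name `build_story_prompt`, the constant names, and the prompt wording are all hypothetical; the project's actual prompt may differ.

```python
# Hypothetical helper: names and prompt wording are illustrative,
# not taken from the project's source code.
GENRES = ("Fantasy", "Sci-Fi", "Mystery")
MOODS = ("Whimsical", "Cinematic", "Tense")


def build_story_prompt(caption: str, genre: str, mood: str) -> str:
    """Combine an image caption with the chosen genre and mood into one prompt."""
    if genre not in GENRES:
        raise ValueError(f"Unsupported genre: {genre!r}")
    if mood not in MOODS:
        raise ValueError(f"Unsupported mood: {mood!r}")
    caption = caption.strip()
    if not caption:
        raise ValueError("Caption must not be empty")
    return (
        f"Write a short {genre} story with a {mood.lower()} mood. "
        f"Ground the story in this image description: {caption}"
    )


if __name__ == "__main__":
    print(build_story_prompt("a dog running on a beach", "Fantasy", "Whimsical"))
```

Validating the selections before building the prompt keeps unexpected form values out of the model request.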

🛠️ Tech Stack

  • Backend: FastAPI (Python)
  • AI Models:
    • Image Captioning: Salesforce BLIP
    • Storytelling: Google Gemini 2.0 Flash
  • Frontend: Vanilla HTML5, CSS3 (Modern Glassmorphism Design), JavaScript (ES6+)
  • Environment: Python Dotenv for secure key management.
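The stack above suggests a two-stage flow: the captioning model describes the image, and the language model turns that description into a story. The sketch below shows that flow with both models passed in as plain callables, so it runs without model downloads or an API key. `image_to_story`, `Captioner`, and `Storyteller` are hypothetical names, and the real application may wire BLIP and Gemini together differently.

```python
from typing import Callable

# Hypothetical type aliases: a captioner maps image bytes to a caption,
# a storyteller maps a text prompt to a story.
Captioner = Callable[[bytes], str]
Storyteller = Callable[[str], str]


def image_to_story(image: bytes, caption_image: Captioner, write_story: Storyteller) -> dict:
    """Run the two-stage pipeline: caption the image, then narrate the caption."""
    if not image:
        raise ValueError("Empty image upload")
    caption = caption_image(image).strip()
    prompt = f"Write a short story grounded in this image description: {caption}"
    return {"caption": caption, "story": write_story(prompt)}


if __name__ == "__main__":
    # Stand-ins for the real models (assumed to be BLIP and Gemini in the app).
    def fake_captioner(data: bytes) -> str:
        return "a lighthouse at dusk"

    def fake_storyteller(prompt: str) -> str:
        return f"[story for: {prompt}]"

    print(image_to_story(b"\x89PNG", fake_captioner, fake_storyteller))
```

Keeping the models behind simple callables also makes the pipeline easy to test with stubs like the ones shown.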

📦 Installation & Setup

1. Clone the repository

git clone https://github.com/SouravLenka/AI_StoryTeller.git
cd AI_StoryTeller

2. Create a Virtual Environment

python -m venv venv
# On Windows:
.\venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Configuration

Create a .env file in the root directory and add your Google Gemini API key:

GEMINI_API_KEY=your_actual_api_key_here
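One way the application might read this key at startup is sketched below. It assumes the `.env` values have already been copied into the process environment (for example by calling python-dotenv's `load_dotenv()` first). The helper name `require_gemini_key` is hypothetical and does not come from the project.

```python
import os
from typing import Mapping


def require_gemini_key(env: Mapping[str, str] = os.environ) -> str:
    """Return GEMINI_API_KEY, or fail early with a readable message."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the README placeholder as "not configured".
    if not key or key == "your_actual_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is missing or still set to the placeholder; "
            "check your .env file."
        )
    return key


if __name__ == "__main__":
    # With python-dotenv installed, load_dotenv() would normally run before this call.
    try:
        require_gemini_key()
        print("Gemini API key found.")
    except RuntimeError as err:
        print(f"Configuration error: {err}")
```

Failing at startup gives a clearer error than a rejected request later, when the first story is generated.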

5. Run the Application

python -m app.main

Head over to http://127.0.0.1:8000 to start crafting your stories!

🛡️ Security & Privacy

  • Secure Keys: The .env file is protected and ignored by Git.
  • Private Media: The uploads/ directory and temporary image files are excluded from commits to ensure your privacy.

📜 License

This project is licensed under the MIT License.
