Image Generation and Analysis with DIAL API

A Python implementation task to work with AI image generation and analysis capabilities via DIAL API

🎓 Learning Goals

By completing these tasks, you will learn:

How to generate images from text prompts using DALL-E 3
How to analyze images using different AI models (GPT-4o, Claude-3-Sonnet)
Two different approaches for handling images in AI systems:
- OpenAI approach: Base64 encoding for direct embedding
- DIAL approach: Bucket storage with attachment references
How to work with file uploads, downloads with DIAL bucket
How DIAL adapts requests for different AI model vendors

📋 Requirements

Python 3.11+
pip
API key for DIAL service

🔧 Setup

Install dependencies:
```
pip install -r requirements.txt
```
Set your API key:
- Ensure that you are connected to the EPAM VPN
- Get the DIAL API key here: https://support.epam.com/ess?id=sc_cat_item&table=sc_cat_item&sys_id=910603f1c3789e907509583bb001310c
- Update the API_KEY constant in task/_utils/constants.py
- Get available models from: https://ai-proxy.lab.epam.com/openai/models

Project structure:

task/
├── _models/
│   ├── conversation.py          # ✅ Complete
│   ├── message.py               # ✅ Complete  
│   ├── role.py                  # ✅ Complete
│   └── custom_content.py        # ✅ Complete
├── _utils/
│   ├── model_client.py          # ✅ Complete
│   ├── bucket_client.py         # ✅ Complete
│   ├── constants.py             # ✅ Complete
│   └── request.py               # ✅ Complete
├── image_to_text/
│   ├── openai/
│   │   ├── message.py           # ✅ Complete
│   │   └── task_openai_itt.py   # 🚧 TODO
│   └── task_dial_itt.py         # 🚧 TODO
└── text_to_image/
    └── task_tti.py              # 🚧 TODO
dialx-banner.png                 # 📁 Sample image

📝 Your Tasks

If the task in the main branch is hard for you, then switch to the `with-detailed-description` branch

Complete the implementation of these three practice files:

1. task_openai_itt.py - OpenAI-Style Image Analysis

Goal: Analyze an image using base64 encoding approach

Create DialModelClient with GPT-4o model (and other models)
Encode image as base64 data URL
Send ContentedMessage with text and image content
Key Learning: Direct image embedding in messages

2. task_dial_itt.py - DIAL-Style Image Analysis

Goal: Analyze an image using bucket storage approach

Upload image to DIAL bucket storage
Create message with attachment reference
Test with different AI models
Key Learning: File storage and attachment handling

3. task_tti.py - Text-to-Image Generation

Goal: Generate images from text prompts

Create text prompt for image generation
Use DALL-E 3 model for generation
Download and save generated images
Experiment with size, quality and style of output via custom_fields configuration parameter
Key Learning: AI image generation and file handling

🎯 Expected Outputs

Text-to-Image Task

Generated image file saved locally with timestamp
Console output showing request/response details

Image-to-Text Tasks

AI description of the dialx-banner.png image
Comparison between different models' responses

🛠️ Troubleshooting

Common Issues:

Empty API key: Update constants.py with your DIAL API key
VPN connection: Ensure EPAM VPN is active
File not found: Verify dialx-banner.png exists in project root
Network errors: Check DIAL service status and connectivity

🌟 Bonus Challenges

Once you complete the basic tasks, try these extensions:

Multi-model comparison: Run the same image analysis with different models
Custom prompts: Create your own text-to-image prompts
Batch processing: Analyze multiple images at once
Error handling: Add robust error handling and retries

📚 Key Concepts Covered

Multimodal AI: Working with both text and images
API Design Patterns: Different approaches to handle media content
Async Programming: File operations and HTTP requests
Model Abstraction: How DIAL Core adapts requests across vendors
File Management: Upload, download, and storage operations

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
task		task
README.md		README.md
dialx-banner.png		dialx-banner.png
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Image Generation and Analysis with DIAL API

🎓 Learning Goals

📋 Requirements

🔧 Setup

📝 Your Tasks

If the task in the main branch is hard for you, then switch to the `with-detailed-description` branch

1. task_openai_itt.py - OpenAI-Style Image Analysis

2. task_dial_itt.py - DIAL-Style Image Analysis

3. task_tti.py - Text-to-Image Generation

🎯 Expected Outputs

Text-to-Image Task

Image-to-Text Tasks

🛠️ Troubleshooting

🌟 Bonus Challenges

📚 Key Concepts Covered

About

Uh oh!

Languages

khshanovskyi/ai-dial-content-generation

Folders and files

Latest commit

History

Repository files navigation

Image Generation and Analysis with DIAL API

🎓 Learning Goals

📋 Requirements

🔧 Setup

📝 Your Tasks

If the task in the main branch is hard for you, then switch to the with-detailed-description branch

1. task_openai_itt.py - OpenAI-Style Image Analysis

2. task_dial_itt.py - DIAL-Style Image Analysis

3. task_tti.py - Text-to-Image Generation

🎯 Expected Outputs

Text-to-Image Task

Image-to-Text Tasks

🛠️ Troubleshooting

🌟 Bonus Challenges

📚 Key Concepts Covered

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages

If the task in the main branch is hard for you, then switch to the `with-detailed-description` branch