feat: Adding Gemma3 inference server #6

lukebrady · 2025-09-03T04:04:51Z

Infrastructure Changes:

- Add Gemma 3 27B model deployment (google/gemma-3-27b-it)
- Configure g6.12xlarge instance type for 4-GPU tensor parallelism
- Set tensor-parallel-size=4 for distributed inference
- Increase timeout to 900s for large model loading
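
The 4-GPU sizing can be sanity-checked with rough arithmetic. This is a minimal sketch, not the deployment's actual logic. It assumes bf16 weights (2 bytes per parameter), a nominal 27 billion parameters, and 24 GB of memory per GPU (a g6.12xlarge has four NVIDIA L4 GPUs). The function names are hypothetical, and KV cache, activations, and runtime overhead are ignored, so real headroom is smaller than shown.

```python
def weight_gb_per_gpu(params_billions: float, bytes_per_param: int, tp_size: int) -> float:
    """Approximate weight memory per GPU in decimal GB, assuming an even shard."""
    if tp_size < 1:
        raise ValueError("tp_size must be >= 1")
    # 1e9 params * bytes_per_param bytes = that many decimal GB, split across GPUs.
    return params_billions * bytes_per_param / tp_size


def weights_fit(params_billions: float, bytes_per_param: int,
                tp_size: int, gpu_memory_gb: float) -> bool:
    """True if the sharded weights alone fit in one GPU's memory (no overhead counted)."""
    return weight_gb_per_gpu(params_billions, bytes_per_param, tp_size) <= gpu_memory_gb


if __name__ == "__main__":
    # Assumed inputs: 27B parameters, bf16, 24 GB per GPU.
    for tp in (1, 2, 4):
        per_gpu = weight_gb_per_gpu(27, 2, tp)
        print(f"tp={tp}: {per_gpu:.1f} GB per GPU, fits={weights_fit(27, 2, tp, 24)}")
```

Under these assumptions the weights alone need about 54 GB, which does not fit on a single 24 GB GPU. Sharding across four GPUs brings the weight footprint to about 13.5 GB each, leaving room for KV cache.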

Documentation Updates:

- Update main README to reflect three-model deployment (Qwen 3 0.6B, GPT-OSS 20B, Gemma 3 27B)
- Create comprehensive OpenTofu README with detailed deployment guide
- Add cost considerations and instance type selection guidance
- Include tensor parallelism configuration examples
- Document security features and customization options

Model Configuration Details:

- Qwen 3 0.6B: Lightweight testing model on g5.2xlarge
- GPT-OSS 20B: Medium production model on g5.2xlarge
- Gemma 3 27B: High-performance model on g6.12xlarge with 4-GPU parallelism
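
The model-to-instance mapping above can be expressed as data and checked before deployment. This is a hedged sketch rather than the repository's OpenTofu configuration: the `Deployment` class and `validate` helper are hypothetical, and the tensor-parallel size of 1 for the two g5.2xlarge models is an assumption (this PR only states a size for Gemma 3 27B). The GPU counts reflect one A10G on g5.2xlarge and four L4s on g6.12xlarge.

```python
from dataclasses import dataclass

# GPUs per instance type: g5.2xlarge has 1x A10G, g6.12xlarge has 4x L4.
GPUS_PER_INSTANCE = {"g5.2xlarge": 1, "g6.12xlarge": 4}


@dataclass(frozen=True)
class Deployment:
    name: str
    instance_type: str
    tensor_parallel_size: int


DEPLOYMENTS = [
    Deployment("Qwen 3 0.6B", "g5.2xlarge", 1),   # TP size assumed, not stated in the PR
    Deployment("GPT-OSS 20B", "g5.2xlarge", 1),   # TP size assumed, not stated in the PR
    Deployment("Gemma 3 27B", "g6.12xlarge", 4),  # tensor-parallel-size=4 per this PR
]


def validate(deployments):
    """Return a list of error strings; an empty list means every deployment is consistent."""
    errors = []
    for d in deployments:
        gpus = GPUS_PER_INSTANCE.get(d.instance_type)
        if gpus is None:
            errors.append(f"{d.name}: unknown instance type {d.instance_type}")
        elif d.tensor_parallel_size > gpus:
            errors.append(
                f"{d.name}: tensor_parallel_size={d.tensor_parallel_size} "
                f"exceeds {gpus} GPU(s) on {d.instance_type}"
            )
    return errors


if __name__ == "__main__":
    problems = validate(DEPLOYMENTS)
    print("OK" if not problems else "\n".join(problems))
```

A check like this catches a tensor-parallel size larger than the GPU count of the chosen instance type before any infrastructure is provisioned.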

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

lukebrady and others added 2 commits September 3, 2025 00:04

Adding Gemma3 inference server

9d34443

lukebrady changed the title ~~Adding Gemma3 inference server~~ feat: Adding Gemma3 inference server Sep 5, 2025

Removing readme

3a76ac5

lukebrady merged commit 097799e into main Sep 5, 2025
6 checks passed

lukebrady deleted the gemma-27b branch September 5, 2025 03:49
