feat: Add AI inference infrastructure and OSS agent example #3

lukebrady · 2025-08-22T18:24:46Z

🚀 Overview

This PR introduces a comprehensive AI inference infrastructure solution with
modular architecture, automated AMI building, and practical OSS agent examples. The
implementation follows Infrastructure as Code best practices and provides an
"executable codebase" approach where all functionality is discoverable through a
unified Makefile interface.

🏗️ Infrastructure Components

Custom AMI Building (Packer)

Ubuntu 24.04 base optimized for AI workloads
GPU support with NVIDIA drivers and container toolkit
Docker integration with systemd service management
Automated provisioning scripts for consistent deployments

Modular OpenTofu Configuration

IAM Module: Dedicated roles, policies, and instance profiles for secure EC2
access
Inference Module: EC2 instances with vLLM server deployment and container
lifecycle management
Security: EBS encryption, security groups, and IAM best practices
Scalability: Configurable instance types and storage volumes

🤖 Agent Examples

OSS Agent Implementation

vLLM Integration: OpenAI-compatible API for local model inference
Function Calling: Wikipedia search tools with structured Pydantic schemas
Interactive REPL: Rich formatting with markdown rendering
Comprehensive Documentation: Inline comments explaining agent patterns and
OSS model usage

🛠️ Automation & Tooling

Makefile Interface (Executable Codebase)

# Infrastructure Operations
make ami-build                    # Build custom Ubuntu 24.04 AI AMI
make tofu-iam-apply              # Deploy IAM resources
make tofu-inference-apply        # Deploy inference infrastructure
make tofu-apply                  # Deploy all infrastructure

# Agent Operations
make agent-oss-run               # Run OSS agent interactively
make agent-oss-install           # Install dependencies
make agent-oss-check             # Validate environment

# Development & Validation
make check-prerequisites         # Verify required tools
make setup                       # Complete project initialization

CI/CD Pipeline

GitHub Actions: Automated validation of Makefile commands
Infrastructure Testing: Packer configuration validation and OpenTofu planning
Quality Assurance: Continuous integration for infrastructure changes

🔧 Key Features

Modular Architecture: Separate IAM and inference deployments for better security
and maintainability
Container Management: systemd integration with Docker for reliable service
lifecycle
GPU Acceleration: L4-equivalent instances (g5.2xlarge) with NVIDIA container
runtime
Agent Framework: Practical examples of building agents with OSS models
Documentation: Comprehensive inline documentation and README updates
Executable Interface: Unified Makefile serving as a discoverable tool catalog

📊 Changes Summary

26 files changed: 2,297 insertions, 2 deletions
New Infrastructure: Complete OpenTofu modules for IAM and inference
Custom AMI: Packer configuration for GPU-enabled Ubuntu 24.04
Agent Examples: Full OSS agent implementation with rich documentation
Automation: 20+ Makefile commands for streamlined operations
CI/CD: GitHub Actions workflow for continuous validation

🚀 Quick Start

  # Setup and validate environment
  make setup

  # Build custom AMI
  make ami-build

  # Deploy infrastructure
  make tofu-apply

  # Run OSS agent
  make agent-oss-run

🎯 Design Philosophy

This implementation embodies the "executable codebase" principle where:

All functionality is discoverable through make help
Both humans and AI agents can interact with the same interface
Infrastructure and code are equally documented and maintainable
Every command is self-contained and includes necessary context

🔄 Future Enhancements

Configurable hostname and vLLM arguments in user_data template
Auto-scaling groups for production workloads
Monitoring and observability stack integration
Additional agent examples and patterns

- Added detailed comments explaining OSS model connection setup - Documented function calling patterns and tool creation - Clarified vLLM-specific configuration with Ollama alternatives - Added usage examples and extension suggestions for developers 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

- Split monolithic OpenTofu config into separate iam and inference modules - Created iam module for IAM roles, policies, and instance profiles - Created inference module for EC2 instances and related resources - Simplified user_data.sh to focus only on vLLM systemd service setup - Updated Makefile with module-specific commands (tofu-iam-*, tofu-inference-*) - Maintained backward compatibility with existing tofu-* commands - Applied IAM infrastructure successfully 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Infrastructure Updates: - Increased systemd service timeout to 6 minutes for vLLM container pulls - Updated inference module to use data source for IAM instance profile - Simplified Packer provision script with better organization - Removed GPU driver installation from user_data (now in AMI) Agent Updates: - Enhanced OSS agent with Rich formatting and interactive REPL - Changed tool_choice from required to auto for better UX - Added comprehensive inline documentation for agent patterns Documentation Updates: - Updated README with modular OpenTofu structure and OSS agent - Added new Makefile commands for agent operations (run, install, check) - Documented vLLM integration and systemd service management - Updated project structure and technology stack TODO: Consider making hostname and vLLM args configurable in user_data template 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

The AWS CLI is pre-installed on GitHub Actions runners, so we can remove the installation steps and cache paths to simplify the workflow. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

OpenTofu installs to /usr/bin/tofu when using the deb method, so we need to include both /usr/bin/tofu and /usr/local/bin/tofu in the cache paths to ensure it's found correctly after cache restoration. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Copy tofu binary from /usr/bin to /usr/local/bin after installation to avoid permission issues with cache restoration. The /usr/bin path cannot be cached due to permission restrictions. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Set PATH globally to include /usr/local/bin so that the tofu binary installed to that location can be found by all jobs in the workflow. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Remove invalid workflow-level env.PATH reference and add PATH environment variable at job level for all jobs that need tofu. Also add explicit PATH export in verification step. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Remove all OpenTofu-related code from the workflow: - Remove OpenTofu installation step - Remove OpenTofu from cache paths - Remove tofu version check from verification - Remove entire opentofu-validation job - Remove OpenTofu references from logs-and-output job - Remove PATH environment variables that were added for tofu This simplifies the workflow to focus on Packer validation only. We can add OpenTofu back later when needed. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

lukebrady and others added 11 commits August 22, 2025 14:24

vibe coding ai infra

bc41baf

Vibe coding more infrastructure code :)

32a43fa

Updating Makefile and README

9e47851

Uninstall aws cli silently

bbf76df

Adding DevOps Engineer subagent

c5cc7ca

Adding subagents and updating infra code

c452738

Adding an oss-agent to use with the inference server infra

fc49e62

Add basic repl

3c9b55f

lukebrady changed the title ~~vibe coding ai infra~~ feat: Add AI inference infrastructure and OSS agent example Aug 25, 2025

lukebrady and others added 7 commits August 25, 2025 23:34

Add newline to sources.pkr.hcl for proper file formatting

ee263b4

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

lukebrady merged commit c725b8e into main Aug 26, 2025
6 checks passed

lukebrady deleted the feat/infra-ai-inference-ami branch August 26, 2025 03:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add AI inference infrastructure and OSS agent example #3

feat: Add AI inference infrastructure and OSS agent example #3

Uh oh!

lukebrady commented Aug 22, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

feat: Add AI inference infrastructure and OSS agent example #3

feat: Add AI inference infrastructure and OSS agent example #3

Uh oh!

Conversation

lukebrady commented Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🚀 Overview

🏗️ Infrastructure Components

Custom AMI Building (Packer)

Modular OpenTofu Configuration

🤖 Agent Examples

OSS Agent Implementation

🛠️ Automation & Tooling

Makefile Interface (Executable Codebase)

CI/CD Pipeline

🔧 Key Features

📊 Changes Summary

🚀 Quick Start

🎯 Design Philosophy

🔄 Future Enhancements

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lukebrady commented Aug 22, 2025 •

edited

Loading