Skip to content

Conversation

@lukebrady
Copy link
Owner

@lukebrady lukebrady commented Aug 22, 2025

🚀 Overview

This PR introduces a comprehensive AI inference infrastructure solution with
modular architecture, automated AMI building, and practical OSS agent examples. The
implementation follows Infrastructure as Code best practices and provides an
"executable codebase" approach where all functionality is discoverable through a
unified Makefile interface.

🏗️ Infrastructure Components

Custom AMI Building (Packer)

  • Ubuntu 24.04 base optimized for AI workloads
  • GPU support with NVIDIA drivers and container toolkit
  • Docker integration with systemd service management
  • Automated provisioning scripts for consistent deployments

Modular OpenTofu Configuration

  • IAM Module: Dedicated roles, policies, and instance profiles for secure EC2
    access
  • Inference Module: EC2 instances with vLLM server deployment and container
    lifecycle management
  • Security: EBS encryption, security groups, and IAM best practices
  • Scalability: Configurable instance types and storage volumes

🤖 Agent Examples

OSS Agent Implementation

  • vLLM Integration: OpenAI-compatible API for local model inference
  • Function Calling: Wikipedia search tools with structured Pydantic schemas
  • Interactive REPL: Rich formatting with markdown rendering
  • Comprehensive Documentation: Inline comments explaining agent patterns and
    OSS model usage

🛠️ Automation & Tooling

Makefile Interface (Executable Codebase)

# Infrastructure Operations
make ami-build                    # Build custom Ubuntu 24.04 AI AMI
make tofu-iam-apply              # Deploy IAM resources
make tofu-inference-apply        # Deploy inference infrastructure
make tofu-apply                  # Deploy all infrastructure

# Agent Operations
make agent-oss-run               # Run OSS agent interactively
make agent-oss-install           # Install dependencies
make agent-oss-check             # Validate environment

# Development & Validation
make check-prerequisites         # Verify required tools
make setup                       # Complete project initialization

CI/CD Pipeline

  • GitHub Actions: Automated validation of Makefile commands
  • Infrastructure Testing: Packer configuration validation and OpenTofu planning
  • Quality Assurance: Continuous integration for infrastructure changes

🔧 Key Features

  • Modular Architecture: Separate IAM and inference deployments for better security
    and maintainability
  • Container Management: systemd integration with Docker for reliable service
    lifecycle
  • GPU Acceleration: L4-equivalent instances (g5.2xlarge) with NVIDIA container
    runtime
  • Agent Framework: Practical examples of building agents with OSS models
  • Documentation: Comprehensive inline documentation and README updates
  • Executable Interface: Unified Makefile serving as a discoverable tool catalog

📊 Changes Summary

  • 26 files changed: 2,297 insertions, 2 deletions
  • New Infrastructure: Complete OpenTofu modules for IAM and inference
  • Custom AMI: Packer configuration for GPU-enabled Ubuntu 24.04
  • Agent Examples: Full OSS agent implementation with rich documentation
  • Automation: 20+ Makefile commands for streamlined operations
  • CI/CD: GitHub Actions workflow for continuous validation

🚀 Quick Start

  # Setup and validate environment
  make setup

  # Build custom AMI
  make ami-build

  # Deploy infrastructure
  make tofu-apply

  # Run OSS agent
  make agent-oss-run

🎯 Design Philosophy

This implementation embodies the "executable codebase" principle where:

  • All functionality is discoverable through make help
  • Both humans and AI agents can interact with the same interface
  • Infrastructure and code are equally documented and maintainable
  • Every command is self-contained and includes necessary context

🔄 Future Enhancements

  • Configurable hostname and vLLM arguments in user_data template
  • Auto-scaling groups for production workloads
  • Monitoring and observability stack integration
  • Additional agent examples and patterns

lukebrady and others added 11 commits August 22, 2025 14:24
- Added detailed comments explaining OSS model connection setup
- Documented function calling patterns and tool creation
- Clarified vLLM-specific configuration with Ollama alternatives
- Added usage examples and extension suggestions for developers

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Split monolithic OpenTofu config into separate iam and inference modules
- Created iam module for IAM roles, policies, and instance profiles
- Created inference module for EC2 instances and related resources
- Simplified user_data.sh to focus only on vLLM systemd service setup
- Updated Makefile with module-specific commands (tofu-iam-*, tofu-inference-*)
- Maintained backward compatibility with existing tofu-* commands
- Applied IAM infrastructure successfully

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Infrastructure Updates:
- Increased systemd service timeout to 6 minutes for vLLM container pulls
- Updated inference module to use data source for IAM instance profile
- Simplified Packer provision script with better organization
- Removed GPU driver installation from user_data (now in AMI)

Agent Updates:
- Enhanced OSS agent with Rich formatting and interactive REPL
- Changed tool_choice from required to auto for better UX
- Added comprehensive inline documentation for agent patterns

Documentation Updates:
- Updated README with modular OpenTofu structure and OSS agent
- Added new Makefile commands for agent operations (run, install, check)
- Documented vLLM integration and systemd service management
- Updated project structure and technology stack

TODO: Consider making hostname and vLLM args configurable in user_data template

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@lukebrady lukebrady changed the title vibe coding ai infra feat: Add AI inference infrastructure and OSS agent example Aug 25, 2025
lukebrady and others added 7 commits August 25, 2025 23:34
The AWS CLI is pre-installed on GitHub Actions runners, so we can remove
the installation steps and cache paths to simplify the workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
OpenTofu installs to /usr/bin/tofu when using the deb method, so we need
to include both /usr/bin/tofu and /usr/local/bin/tofu in the cache paths
to ensure it's found correctly after cache restoration.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Copy tofu binary from /usr/bin to /usr/local/bin after installation
to avoid permission issues with cache restoration. The /usr/bin path
cannot be cached due to permission restrictions.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Set PATH globally to include /usr/local/bin so that the tofu binary
installed to that location can be found by all jobs in the workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove invalid workflow-level env.PATH reference and add PATH
environment variable at job level for all jobs that need tofu.
Also add explicit PATH export in verification step.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove all OpenTofu-related code from the workflow:
- Remove OpenTofu installation step
- Remove OpenTofu from cache paths
- Remove tofu version check from verification
- Remove entire opentofu-validation job
- Remove OpenTofu references from logs-and-output job
- Remove PATH environment variables that were added for tofu

This simplifies the workflow to focus on Packer validation only.
We can add OpenTofu back later when needed.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@lukebrady lukebrady merged commit c725b8e into main Aug 26, 2025
6 checks passed
@lukebrady lukebrady deleted the feat/infra-ai-inference-ami branch August 26, 2025 03:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant