feat: Add AI inference infrastructure and OSS agent example #3
Merged
- Added detailed comments explaining OSS model connection setup
- Documented function calling patterns and tool creation
- Clarified vLLM-specific configuration with Ollama alternatives
- Added usage examples and extension suggestions for developers

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Split monolithic OpenTofu config into separate iam and inference modules
- Created iam module for IAM roles, policies, and instance profiles
- Created inference module for EC2 instances and related resources
- Simplified user_data.sh to focus only on vLLM systemd service setup
- Updated Makefile with module-specific commands (tofu-iam-*, tofu-inference-*)
- Maintained backward compatibility with existing tofu-* commands
- Applied IAM infrastructure successfully

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

Infrastructure Updates:
- Increased systemd service timeout to 6 minutes for vLLM container pulls
- Updated inference module to use data source for IAM instance profile
- Simplified Packer provision script with better organization
- Removed GPU driver installation from user_data (now in AMI)

Agent Updates:
- Enhanced OSS agent with Rich formatting and interactive REPL
- Changed tool_choice from required to auto for better UX
- Added comprehensive inline documentation for agent patterns

Documentation Updates:
- Updated README with modular OpenTofu structure and OSS agent
- Added new Makefile commands for agent operations (run, install, check)
- Documented vLLM integration and systemd service management
- Updated project structure and technology stack

TODO: Consider making hostname and vLLM args configurable in user_data template

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

The AWS CLI is pre-installed on GitHub Actions runners, so we can remove the installation steps and cache paths to simplify the workflow. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
OpenTofu installs to /usr/bin/tofu when using the deb method, so we need to include both /usr/bin/tofu and /usr/local/bin/tofu in the cache paths to ensure it's found correctly after cache restoration. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Copy tofu binary from /usr/bin to /usr/local/bin after installation to avoid permission issues with cache restoration. The /usr/bin path cannot be cached due to permission restrictions. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Set PATH globally to include /usr/local/bin so that the tofu binary installed to that location can be found by all jobs in the workflow. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Remove invalid workflow-level env.PATH reference and add PATH environment variable at job level for all jobs that need tofu. Also add explicit PATH export in verification step. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Remove all OpenTofu-related code from the workflow:
- Remove OpenTofu installation step
- Remove OpenTofu from cache paths
- Remove tofu version check from verification
- Remove entire opentofu-validation job
- Remove OpenTofu references from logs-and-output job
- Remove PATH environment variables that were added for tofu

This simplifies the workflow to focus on Packer validation only. We can add OpenTofu back later when needed.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
🚀 Overview
This PR introduces AI inference infrastructure with a modular architecture,
automated AMI building, and practical OSS agent examples. The implementation
follows Infrastructure as Code practices and takes an "executable codebase"
approach in which all functionality is discoverable through a unified Makefile
interface.
🏗️ Infrastructure Components
Custom AMI Building (Packer)
Modular OpenTofu Configuration
access
lifecycle management
🤖 Agent Examples
OSS Agent Implementation
OSS model usage
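
The commit notes mention function calling, tool creation, and a switch of `tool_choice` from `required` to `auto`. A minimal sketch of that pattern follows, assuming the vLLM server exposes an OpenAI-compatible chat completions endpoint. The URL, model name, and `get_weather` tool are hypothetical placeholders, not values taken from this PR, and the sketch only builds the request and dispatches a tool call without contacting any server.

```python
import json

# Hypothetical placeholders: the PR does not state the endpoint or the model name.
CHAT_URL = "http://localhost:8000/v1/chat/completions"
MODEL_NAME = "example-oss-model"


def get_weather(city: str) -> str:
    """Toy tool used only for illustration."""
    return f"Weather in {city}: sunny"


# Registry mapping tool names to the Python callables that implement them.
TOOLS = {"get_weather": get_weather}

# JSON Schema description of each tool, in the OpenAI "tools" format.
TOOL_SCHEMAS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return a short weather summary for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def build_request(messages: list) -> dict:
    """Build the JSON body that would be POSTed to CHAT_URL."""
    return {
        "model": MODEL_NAME,
        "messages": messages,
        "tools": TOOL_SCHEMAS,
        # "auto" lets the model answer directly or call a tool;
        # "required" would force a tool call on every turn.
        "tool_choice": "auto",
    }


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call from a model response and wrap the result as a tool message."""
    name = tool_call["function"]["name"]
    # The arguments arrive as a JSON-encoded string, not as a dict.
    args = json.loads(tool_call["function"]["arguments"])
    result = TOOLS[name](**args)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}
```

In a REPL loop, the tool message returned by `run_tool_call` would be appended to `messages` and the request sent again so the model can produce its final answer. With `tool_choice` set to `auto`, plain conversational turns pass through without a forced tool call, which fits the UX motivation given in the commit message.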
🛠️ Automation & Tooling
Makefile Interface (Executable Codebase)
CI/CD Pipeline
🔧 Key Features
and maintainability
lifecycle
runtime
📊 Changes Summary
🚀 Quick Start
🎯 Design Philosophy
This implementation embodies the "executable codebase" principle where:
🔄 Future Enhancements