Note: This repository showcases a complete DevOps implementation of the OpenTelemetry Demo Application. The original application is licensed under Apache-2.0. This implementation adds production-grade CI/CD pipelines, infrastructure-as-code using Terraform, GitOps deployment with ArgoCD, and secure cloud infrastructure on AWS.
- Project Overview
- Architecture
- Demo Video
- Technologies Used
- Project Structure
- Infrastructure Details
- CI/CD Pipeline
- Prerequisites
- Getting Started
- Deployment Process
- Challenges & Solutions
- Cost Estimation
- Key Learnings
- Acknowledgments
- License
This project demonstrates a complete DevOps transformation of a microservices-based e-commerce application. The OpenTelemetry Demo application consists of 20+ microservices written in different programming languages (Go, Python, Java, Node.js, C#, etc.).
Project Duration: December 2024 - January 2025 (1 month)
Key Achievements:
- Containerized multiple microservices with Docker best practices
- Built production-ready CI/CD pipelines using GitHub Actions
- Implemented Infrastructure-as-Code with Terraform custom modules and remote state management
- Deployed highly available EKS cluster across multiple availability zones
- Implemented GitOps workflow with ArgoCD
- Secured applications with TLS certificates and AWS ALB
- Configured custom domain with Route53 DNS management
Complete AWS infrastructure showing VPC, EKS, ALB, and other configurations
GitHub Actions workflow with GitOps deployment using ArgoCD
Full project walkthrough and implementation demonstration
Video Contents:
- CI/CD pipeline execution
- GitOps deployment with ArgoCD
- Application demonstration
- Troubleshooting and monitoring
- Cloud Provider: AWS (EKS, VPC, ALB, Route53, ACM, S3, DynamoDB)
- Infrastructure as Code: Terraform (v1.x) with custom modules and Remote State (S3 + DynamoDB)
- Container Orchestration: Kubernetes (EKS)
- Compute: EC2 t3.medium instances (EKS Node Groups)
- CI/CD Platform: GitHub Actions
- GitOps Tool: ArgoCD
- Container Registry: DockerHub
- Version Control: Git & GitHub
- Containerization: Docker (Multi-stage builds)
- Ingress Controller: AWS Load Balancer Controller (formerly AWS ALB Ingress Controller)
- TLS/SSL: AWS Certificate Manager (ACM)
- DNS Management: AWS Route53
- Languages: Go (1.22), Python, Java
- Code Quality: static code analysis tools for respective languages
- CLI Tools: kubectl, eksctl, AWS CLI, terraform
.
├── src/
│   ├── product-catalog/            # Go microservice
│   │   ├── Dockerfile
│   │   ├── main.go
│   │   └── products/
│   ├── recommendation-service/     # Python microservice
│   └── ad-service/                 # Java microservice
│   ...                             # other services
│
├── infra/
│   ├── backend/                    # S3 + DynamoDB remote state
│   │   ├── main.tf
│   │   └── outputs.tf
│   ├── modules/
│   │   ├── eks/                    # Custom EKS module
│   │   └── vpc/                    # Custom VPC module
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
│
├── kubernetes/
│   ├── argocd/
│   │   ├── app-of-apps.yml
│   │   ├── argocd-ingress.yml
│   │   └── applications/
│   │       ├── ad-app.yml
│   │       ├── product-catalog-app.yml
│   │       └── recommendation-app.yml
│   ├── productcatalog/
│   │   ├── deploy.yml
│   │   └── svc.yml
│   ├── recommendation/
│   │   ├── deploy.yml
│   │   └── svc.yml
│   ├── adservice/
│   │   ├── deploy.yml
│   │   └── svc.yml
│   └── frontendproxy/
│       ├── deploy.yml
│       ├── svc.yml
│       └── frontendproxy-ingress.yml
│
├── .github/
│   └── workflows/
│       ├── ci-product-catalog.yml
│       ├── ci-recommendation.yml
│       └── ci-ad-service.yml
│
├── README.md                       # This file
├── ORIGINAL_README.md              # Original OpenTelemetry demo docs
└── LICENSE                         # Apache-2.0 License
Network Architecture:
- VPC: Custom VPC with CIDR block
- Subnets: 4 subnets across 2 availability zones (us-east-1a, us-east-1b)
  - 2 Public subnets (for NAT Gateways and ALB)
  - 2 Private subnets (for EKS nodes)
- NAT Gateways: 2 NAT Gateways for high availability
- Internet Gateway: For public internet access
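As a rough illustration of the subnet layout above, the following Python sketch carves four subnets out of a VPC range and assigns one public and one private subnet to each availability zone. The 10.0.0.0/16 CIDR and the /24 subnet size are assumptions chosen for the example; this README does not state the actual values used by the Terraform VPC module.

```python
import ipaddress
from itertools import islice

# Hypothetical VPC CIDR; the real value lives in the Terraform variables.
VPC_CIDR = "10.0.0.0/16"
AZS = ["us-east-1a", "us-east-1b"]


def plan_subnets(vpc_cidr, azs, new_prefix=24):
    """Return (tier, az, cidr) tuples: one public and one private subnet per AZ."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # Take only as many blocks as needed: public subnets first, then private.
    blocks = islice(vpc.subnets(new_prefix=new_prefix), 2 * len(azs))
    tiers = ["public"] * len(azs) + ["private"] * len(azs)
    return [(tier, az, str(block)) for tier, az, block in zip(tiers, azs * 2, blocks)]


layout = plan_subnets(VPC_CIDR, AZS)
for tier, az, cidr in layout:
    print(f"{tier:<8} {az}  {cidr}")
```

Keeping the public and private ranges contiguous is only one possible convention; the module could just as well interleave them per availability zone.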
EKS Cluster Configuration:
- Node Type: EC2 t3.medium instances
- Node Group: Managed Node Group in private subnets
- Kubernetes Version: 1.31 (avoid extended support versions for cost optimization)
- Security: Private API endpoint with controlled access
DNS & Certificate Management:
- Domain: vishukumarpatel.com
- DNS Records:
  - www.vishukumarpatel.com - Application frontend
  - api.vishukumarpatel.com - (For testing) Application frontend
  - argocd.vishukumarpatel.com - ArgoCD dashboard
- TLS Certificate: Wildcard certificate *.vishukumarpatel.com from ACM
Load Balancing:
- AWS ALB: Application Load Balancer with TLS termination
- Target Type: IP-based targeting for direct pod communication
- Health Checks: Configured for HTTP/HTTPS endpoints
Remote State Management:
# Backend configuration (the backend block must sit inside a terraform block)
terraform {
  backend "s3" {
    bucket         = "devops-terraform-otel-eks-state-s3-bucket"
    key            = "terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-eks-state-locks"
    encrypt        = true
  }
}

Custom Modules:
- VPC Module: Reusable VPC with configurable subnets, NAT, IGW
- EKS Module: EKS cluster with managed node groups and IAM roles
The walkthrough below covers the product-catalog microservice, which is written in Go. For the pipelines of the other services, see the workflow files under .github/workflows/ in the repository root. The GitHub Actions CI/CD pipeline consists of four main stages:
Build and Test:
- Checkout source code
- Setup Go 1.22 environment
- Download dependencies (go mod download)
- Compile application binary
- Run unit tests (go test ./...)

Code Quality:
- Run golangci-lint for static code analysis
- Check for:
  - Unused variables and imports
  - Code formatting issues
  - Potential bugs and code smells
  - Best practice violations

Docker Build and Push:
- Build multi-stage Docker image
- Tag with unique identifier (github.run_id)
- Push to DockerHub repository
- Image naming: <username>/devops-otel-product-catalog:<run_id>

Update Kubernetes Manifest:
- Update Kubernetes deployment manifest
- Replace image tag with new build ID
- Commit and push changes to repository
- ArgoCD detects changes and syncs automatically
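The manifest update step is essentially a text substitution on the image line of the deployment file. The sketch below shows one way this could be done in Python; it is an illustration only, since the actual workflow may well use a shell one-liner instead. The function name set_image_tag, the example-user repository, and the run IDs are all hypothetical.

```python
import re


def set_image_tag(manifest: str, repository: str, new_tag: str) -> str:
    """Return the manifest text with the tag of `repository` replaced by `new_tag`."""
    # Match "image: <repository>:<old-tag>" and keep everything before the tag.
    pattern = re.compile(r"(image:\s*" + re.escape(repository) + r"):[\w.\-]+")
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", manifest)


# Hypothetical excerpt of kubernetes/productcatalog/deploy.yml.
manifest = """\
spec:
  containers:
    - name: product-catalog
      image: example-user/devops-otel-product-catalog:1234567890
"""

updated = set_image_tag(
    manifest,
    repository="example-user/devops-otel-product-catalog",
    new_tag="9876543210",  # stands in for github.run_id
)
print(updated)
```

Because the tag is unique per run, every successful build produces a new commit to the manifest, which is the change ArgoCD picks up.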
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
    paths:
      - 'src/product-catalog/**'
      - '.github/workflows/ci-product-catalog.yml'

- Developer commits code changes to GitHub
- GitHub Actions pipeline executes automatically
- Pipeline builds, tests, and pushes Docker image
- Pipeline updates Kubernetes manifest with new image tag
- ArgoCD detects manifest changes in Git repository
- ArgoCD automatically syncs and deploys to EKS cluster
- Kubernetes performs rolling update with zero downtime
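The paths filter in the workflow trigger means that only changes to the product-catalog sources or to the workflow file itself start this pipeline on a push. The sketch below approximates that decision with Python's fnmatch module. GitHub's own glob rules differ in detail (for example in how a single * treats path separators), so treat this as an illustration rather than an exact reimplementation; the file names passed in are hypothetical.

```python
from fnmatch import fnmatchcase

# Path filters copied from the workflow trigger.
PATH_FILTERS = [
    "src/product-catalog/**",
    ".github/workflows/ci-product-catalog.yml",
]


def triggers_workflow(changed_files):
    """Return True if at least one changed file matches a path filter."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in PATH_FILTERS
    )


print(triggers_workflow(["src/product-catalog/main.go"]))
print(triggers_workflow(["README.md", "src/ad-service/build.gradle"]))
```

Scoping each workflow to its own service directory keeps a change in one microservice from rebuilding the images of the others.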
- Docker: Latest version
- Docker Compose: v2.0.0 or higher
- System Requirements:
  - 6 GB RAM minimum
  - 14 GB disk space
- Make: (Optional) for build automation
- AWS Account: With appropriate permissions
- AWS CLI: Configured with credentials
- eksctl: For EKS cluster management
- kubectl: Kubernetes command-line tool
- Terraform: v1.x or higher
- GitHub Account: For CI/CD pipelines
- DockerHub Account: For container registry
- Domain Name: For production deployment (optional)
Ensure your AWS user/role has the following policies:
- AmazonEKSClusterPolicy
- AmazonEKSServicePolicy
- AmazonEC2ContainerRegistryReadOnly ⚠️ Critical for EKS Nodegroup
- AmazonEKS_CNI_Policy ⚠️ Critical for EKS Nodegroup
- AmazonEKSWorkerNodePolicy ⚠️ Critical for EKS Nodegroup
- AmazonVPCFullAccess
- Custom policies for S3, DynamoDB (for Terraform state)
git clone https://github.com/VishuPatel-27/ultimate-devops-project.git
cd ultimate-devops-project

aws configure
# Enter your AWS Access Key ID, Secret Access Key, and Region

cd infra/
cd backend/
terraform init
terraform plan
terraform apply
cd ..  # go back to the infra directory
terraform init
terraform plan
terraform apply

aws eks update-kubeconfig --name <cluster-name> --region us-east-1
kubectl get nodes

Note: Before installing ArgoCD on your cluster, read the official ArgoCD installation guide to make sure the deployment is done safely.
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Apply custom ingress configuration
kubectl apply -f kubernetes/argocd/argocd-ingress.yml

Note: Before installing the AWS Load Balancer Controller (formerly the AWS ALB Ingress Controller) on your cluster, read the official installation guide to make sure the deployment is done safely.
# Create IAM service account for ALB controller
eksctl create iamserviceaccount \
--cluster=<cluster-name> \
--namespace=kube-system \
--name=aws-load-balancer-controller \
--attach-policy-arn=arn:aws:iam::<account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
--approve
# Install the AWS Load Balancer Controller using Helm
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=<cluster-name> \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller

Add the following secrets to your GitHub repository:
- DOCKER_TOKEN: DockerHub access token
- ULTIMATE_DEVOPS_PROJECT_GITHUB_TOKEN: GitHub PAT with repo access
- GIT_USER_EMAIL: Your git email

Add variables:
- DOCKER_USERNAME: Your DockerHub username
- GIT_USER_NAME: Your git username
# Apply Kubernetes manifests
kubectl apply -f kubernetes/
# Or use ArgoCD UI to sync applications

Note: For testing, you can also commit a trivial change (such as a comment) to the GitHub repository. This triggers the CI/CD pipeline, and ArgoCD then deploys the change.
# Build Docker image locally
cd src/product-catalog
docker build -t <username>/product-catalog:<tag> .
docker push <username>/product-catalog:<tag>
# Deploy to Kubernetes
kubectl apply -f kubernetes/productcatalog/deploy.yml

- Make code changes in src/product-catalog/
- Commit and push to the main branch
- GitHub Actions pipeline triggers automatically
- Pipeline builds, tests, and pushes Docker image
- Pipeline updates Kubernetes manifest
- ArgoCD detects changes and syncs to cluster
Note: These endpoints are no longer live because the infrastructure has been deleted.
- Application: https://www.vishukumarpatel.com
- API: https://api.vishukumarpatel.com
- ArgoCD Dashboard: https://argocd.vishukumarpatel.com
Error:
FailedScheduling: 0/2 nodes are available: 2 Too many pods.
preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
Root Cause: EKS nodes running out of available pod slots. Each instance type has a maximum pod limit based on ENI and IP constraints.
Solution:
- Analyzed pod distribution across nodes
- Increased node count in the node group
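The pod limit behind this error comes from how the AWS VPC CNI hands out IP addresses by default: each network interface keeps one address for itself, and a constant of 2 is added on top. A minimal sketch of that arithmetic follows, assuming the default CNI mode without prefix delegation and the published t3.medium limits of 3 ENIs with 6 IPv4 addresses each. The node counts of 2 and 3 are taken from the error message and the cost table in this README.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Default EKS pod limit per node: ENIs * (IPs per ENI - 1) + 2."""
    return enis * (ipv4_per_eni - 1) + 2


# Published limits for t3.medium (default VPC CNI mode, no prefix delegation).
T3_MEDIUM_ENIS = 3
T3_MEDIUM_IPV4_PER_ENI = 6

per_node = max_pods(T3_MEDIUM_ENIS, T3_MEDIUM_IPV4_PER_ENI)
print(per_node)      # 17 pods per t3.medium node
print(per_node * 2)  # 34 pod slots with 2 nodes
print(per_node * 3)  # 51 pod slots with 3 nodes
```

These totals include system pods, so the capacity left for application pods is lower than the raw number suggests.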
Error:
AccessDenied: AssumeRoleWithWebIdentity
Root Cause: IAM service account for AWS ALB Ingress Controller was not created properly or had stale configuration.
Solution:
# Delete existing service account
eksctl delete iamserviceaccount --cluster=<cluster-name> --namespace=kube-system --name=aws-load-balancer-controller
# Recreate with proper OIDC trust relationship
eksctl create iamserviceaccount \
--cluster=<cluster-name> \
--namespace=kube-system \
--name=aws-load-balancer-controller \
--attach-policy-arn=arn:aws:iam::<account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
--role-name=AmazonEKSLoadBalancerControllerRole \
--override-existing-serviceaccounts \
--approve

Error: ALB health checks returning 307 Redirect, marking targets as unhealthy.
Root Cause:
- The health check was hitting the root path /, which redirects to HTTPS
- ArgoCD expects HTTPS health checks on the /healthz endpoint
Solution: Updated ArgoCD Ingress annotations (argocd-ingress.yml):
annotations:
  alb.ingress.kubernetes.io/healthcheck-protocol: HTTPS
  alb.ingress.kubernetes.io/healthcheck-path: /healthz
  alb.ingress.kubernetes.io/backend-protocol: HTTPS

Error: Terraform apply failed during EKS node group creation.
Root Cause: The AmazonEC2ContainerRegistryReadOnly policy had been accidentally removed from the node IAM role. Nodes need this policy to pull container images from ECR. Even when application images come from DockerHub, AWS-managed images such as the VPC CNI are pulled from ECR.
Solution:
- Identified the missing policy through CloudTrail and Terraform logs
- Re-attached the policy to the node IAM role:
aws iam attach-role-policy \
  --role-name <eks-node-role> \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

Total Project Cost: ~$5-10 CAD (may vary based on your configuration and runtime duration)
| Service | Estimated Cost | Notes |
|---|---|---|
| EKS Cluster Control Plane | ~$3.60 | $0.10/hour × ~36 hours |
| EC2 Instances (t3.medium × 3) | ~$3-5 | Depends on runtime |
| NAT Gateway | ~$2-3 | Data transfer costs |
| ALB | ~$1 | Minimal traffic |
| Route53 | ~$0.50 | Hosted zone |
| S3 (Terraform State) | <$0.10 | Minimal storage |
| ACM Certificate | Free | |
Cost Optimization Tips:
- Avoid Kubernetes versions that have moved into EKS extended support, since they add significant control plane cost
- Use a version still in standard support (e.g., 1.31 at the time of this project)
- Destroy resources when not in use: terraform destroy
- Use t3.medium instead of larger instances for learning projects
- Consider AWS Free Tier credits if available
Note: Costs may vary based on region, runtime duration, and data transfer. This implementation was run twice for testing, accumulating the mentioned costs.
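The control plane line in the table is simple hourly arithmetic, and the same calculation shows why extended support matters. The sketch below reproduces the $0.10/hour × 36 hours figure from the table. The $0.60/hour extended support rate is an assumption based on AWS's published pricing at the time of writing, so check the current EKS pricing page before relying on it.

```python
from decimal import Decimal


def control_plane_cost(hourly_rate: Decimal, hours: int) -> Decimal:
    """Cost of one EKS control plane for the given number of hours."""
    return hourly_rate * hours


STANDARD_RATE = Decimal("0.10")  # per cluster-hour, from the cost table
EXTENDED_RATE = Decimal("0.60")  # assumed extended support rate, verify before use
HOURS = 36                       # approximate runtime from the cost table

standard = control_plane_cost(STANDARD_RATE, HOURS)
extended = control_plane_cost(EXTENDED_RATE, HOURS)

print(standard)             # 3.60
print(extended)             # 21.60
print(extended - standard)  # 18.00
```

Decimal is used here instead of float so that the currency values print exactly as written.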
- Infrastructure as Code:
  - Terraform custom module development
  - Remote state management with S3 and DynamoDB
  - AWS resource provisioning and management
- Container Orchestration:
  - Kubernetes deployment strategies
  - Pod resource management and scheduling
  - Ingress configuration
- CI/CD Best Practices:
  - GitHub Actions workflow design
  - Multi-stage Docker builds
  - Automated testing and deployment
- GitOps Methodology:
  - Declarative app-of-apps approach
  - ArgoCD application configuration
  - Declarative deployment management
  - Automated synchronization
- Cloud Security:
  - IAM roles and service accounts
  - TLS/SSL certificate management
  - Network security with VPCs and security groups
- Troubleshooting:
  - Debugging Kubernetes scheduling issues
  - Resolving IAM permission problems
  - Fixing health check configurations

- Multi-stage Docker builds for smaller images
- Non-root container users for security
- Immutable infrastructure with IaC
- GitOps for deployment automation
- Separation of concerns (CI vs CD)
- Proper secrets management
- High availability across AZs
- TLS encryption for all endpoints
- OpenTelemetry Community: For the excellent demo application
  - Original repository: https://github.com/open-telemetry/opentelemetry-demo
  - Licensed under Apache-2.0
- AWS: For comprehensive cloud services and documentation
- ArgoCD & CNCF: For GitOps tooling and Kubernetes ecosystem
This DevOps implementation maintains the original Apache-2.0 license from the OpenTelemetry Demo project.
- Project: OpenTelemetry Demo
- License: Apache License 2.0
- Copyright: OpenTelemetry Authors
- Author: Vishu Patel (@VishuPatel-27)
- License: Apache License 2.0
- Modifications: Added CI/CD pipelines, infrastructure code, and deployment configurations
See LICENSE file for full license text.
Vishu Patel
- GitHub: @VishuPatel-27
- LinkedIn: LinkedIn Profile
Note: This project was created as a portfolio piece to demonstrate hands-on DevOps skills and practical implementation experience. For the original OpenTelemetry Demo documentation, please refer to ORIGINAL_README.md.
⭐ If you find this project helpful, please consider giving it a star!
