A production-grade distributed API health monitoring system with real-time alerting, incident management, and full observability.
- Real-time Monitoring: Continuous health checks at configurable intervals
- Incident Detection: Automatic threshold-based incident creation and tracking
- Multi-channel Alerts: Email and webhook notifications with retry mechanisms
- Full Observability: Prometheus metrics, structured logging, health endpoints
- Horizontal Scaling: Microservices architecture with BullMQ job queues
- Modern Dashboard: Next.js web interface for monitor management

System architecture (infrastructure: AWS / Docker):

- Frontend: Next.js web app (:3000). It sends HTTP/REST requests to the backend and receives JSON responses.
- Backend: Express API (:4000), followed by the Scheduler Service (:4002). The backend hands work to the workers through a job queue.
- Workers: Worker Process (:4001), followed by the Incident Manager and the Notifier Service (:4003).
- Data layer:
  - PostgreSQL: users and authentication, monitor configuration, health check results, incidents and alerts, notification logs
  - Redis: BullMQ job queues (`health-check-queue`, `notification-queue`), dead letter queue (DLQ), rate limiting (future)

Data flow:

1. Monitor creation: the user sends `POST /monitors` to the API, which stores the monitor in PostgreSQL and hands it to the Scheduler.
2. Job scheduling: the Scheduler enqueues a job in Redis.
3. Health check execution: a Worker picks up the job from Redis, sends an HTTP GET to the target API, and saves the result to PostgreSQL.
4. Incident detection: on a threshold breach, the Worker creates an incident, which is stored in PostgreSQL and queued in Redis for notification.
5. Notification delivery: the Notifier dequeues the job from Redis and delivers it by email (Resend) or webhook (HTTP POST); Slack/Discord is planned. Failed deliveries are retried with exponential backoff (1s, 2s, 4s) and then moved to the DLQ.
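
The retry schedule in step 5 (1s, 2s, 4s, then the dead letter queue) can be sketched as a small pure function. This is an illustration only, not the notifier's actual code: the names `retryDelayMs` and `nextAction`, the 1000 ms base delay, and the limit of three retries are assumptions read off the diagram.

```typescript
// Assumed constants, read off the "1s -> 2s -> 4s -> DLQ" schedule.
const BASE_DELAY_MS = 1000;
const MAX_RETRIES = 3;

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter" };

// Delay before the given retry (1-based): 1000, 2000, 4000, ...
function retryDelayMs(retry: number): number {
  if (!Number.isInteger(retry) || retry < 1) {
    throw new RangeError("retry must be a positive integer");
  }
  return BASE_DELAY_MS * 2 ** (retry - 1);
}

// Given how many retries have already failed, either schedule the next
// retry or give up and route the job to the dead letter queue.
function nextAction(failedRetries: number): RetryDecision {
  if (failedRetries >= MAX_RETRIES) {
    return { action: "dead-letter" };
  }
  return { action: "retry", delayMs: retryDelayMs(failedRetries + 1) };
}
```

BullMQ also exposes `attempts` and `backoff` job options that produce an exponential schedule of this kind; whether this project relies on them or on custom logic is not stated here.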

State machine: HEALTHY --[failures >= threshold]--> INCIDENT --[successes >= threshold]--> RESOLVED
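
The state machine can be sketched as a reducer over individual check results. This is a minimal sketch under stated assumptions: the names `Snapshot`, `Thresholds`, `evaluate`, and `replay` are hypothetical, the thresholds are taken to count consecutive results, and a RESOLVED monitor is treated like a HEALTHY one when new failures arrive. The worker's real `stateEvaluator.ts` may differ on all of these points.

```typescript
type MonitorState = "HEALTHY" | "INCIDENT" | "RESOLVED";

interface Snapshot {
  state: MonitorState;
  consecutiveFailures: number;
  consecutiveSuccesses: number;
}

// Hypothetical configuration: both thresholds count consecutive results.
interface Thresholds {
  failureThreshold: number;
  recoveryThreshold: number;
}

const initial: Snapshot = {
  state: "HEALTHY",
  consecutiveFailures: 0,
  consecutiveSuccesses: 0,
};

// Apply one health check result and return the next snapshot.
function evaluate(prev: Snapshot, checkPassed: boolean, t: Thresholds): Snapshot {
  const consecutiveFailures = checkPassed ? 0 : prev.consecutiveFailures + 1;
  const consecutiveSuccesses = checkPassed ? prev.consecutiveSuccesses + 1 : 0;

  let state = prev.state;
  if (prev.state === "INCIDENT") {
    // An open incident resolves only after enough consecutive successes.
    if (consecutiveSuccesses >= t.recoveryThreshold) state = "RESOLVED";
  } else if (consecutiveFailures >= t.failureThreshold) {
    // HEALTHY or RESOLVED: enough consecutive failures open an incident.
    state = "INCIDENT";
  }
  return { state, consecutiveFailures, consecutiveSuccesses };
}

// Replay a sequence of results from a fresh HEALTHY monitor.
function replay(results: boolean[], t: Thresholds): MonitorState {
  return results.reduce((snap, ok) => evaluate(snap, ok, t), initial).state;
}
```

Counting consecutive results means a single passing check resets the failure streak, so brief blips below the threshold never open an incident.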

Service mesh:

- Services: Web (:3000), API (:4000), Scheduler (:4002), Worker (:4001), Notifier (:4003)
- Observability layer:
  - `/metrics` -> Prometheus -> Grafana dashboards
  - `/health`, `/ready`, and `/live` endpoints
  - Pino logs -> structured JSON -> log aggregator
- Persistence layer:
  - PostgreSQL (primary DB): `users`, `monitors`, `health_check_results`, `incidents`, `notifications`
  - Redis (message broker): `health-check-queue`, `notification-queue`, `dead-letter-queue`

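
To make the `/metrics` to Prometheus path concrete, the sketch below renders one counter in the Prometheus text exposition format, which is what a scrape of `/metrics` returns. It is illustrative only: `renderCounter`, the metric name `health_checks_total`, and its labels are hypothetical, and a real service would normally delegate this to a Prometheus client library rather than build strings by hand.

```typescript
interface CounterSample {
  labels: Record<string, string>;
  value: number;
}

// Label values must escape backslash, double quote, and newline.
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one counter family: HELP line, TYPE line, then one line per sample.
// The help text is assumed to contain no backslashes or newlines.
function renderCounter(name: string, help: string, samples: CounterSample[]): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const s of samples) {
    const labelText = Object.keys(s.labels)
      .map((k) => `${k}="${escapeLabelValue(s.labels[k])}"`)
      .join(",");
    lines.push(labelText ? `${name}{${labelText}} ${s.value}` : `${name} ${s.value}`);
  }
  return lines.join("\n") + "\n";
}

// Hypothetical metric: checks executed, split by outcome.
const exposition = renderCounter("health_checks_total", "Total health checks executed", [
  { labels: { status: "up" }, value: 3 },
  { labels: { status: "down" }, value: 1 },
]);
```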
| Category | Technology |
|---|---|
| Runtime | Node.js 18+, TypeScript |
| Backend | Express.js |
| Frontend | Next.js, React, Tailwind CSS |
| Database | PostgreSQL, Prisma ORM |
| Queue | Redis, BullMQ |
| Observability | Prometheus, Pino |
| Auth | JWT, bcrypt |
| Infrastructure | Docker, Terraform (AWS) |
- Node.js 18+
- PostgreSQL
- Redis
# Clone and install
git clone https://github.com/sAchin-680/HyperVerge-api-health-monitor-system.git
cd HyperVerge-api-health-monitor-system
npm install
# Setup database
npx prisma migrate deploy --schema=packages/db/prisma/schema.prisma
npx prisma generate --schema=packages/db/prisma/schema.prisma
# Copy environment template
cp .env.example .env

# Database
DATABASE_URL="postgresql://user:password@localhost:5432/healthmonitor"
# Redis
REDIS_URL="redis://localhost:6379"
# Auth
JWT_SECRET="your-secret-key"

# Using Make (recommended)
make dev
# Or individually
npm run dev --workspace=apps/api # :4000
npm run dev --workspace=apps/scheduler # :4002
npm run dev --workspace=apps/worker # :4001
npm run dev --workspace=apps/notifier # :4003
npm run dev --workspace=apps/web # :3000

docker-compose up

├── apps/
│   ├── api/                  # REST API service
│   │   ├── src/
│   │   │   ├── controllers/  # Request handlers
│   │   │   ├── routes/       # API route definitions
│   │   │   ├── services/     # Business logic
│   │   │   ├── middlewares/  # Auth, validation, logging
│   │   │   ├── validators/   # Zod schemas
│   │   │   └── lib/          # Shared utilities
│   │   └── prisma/           # Database schema
│   │
│   ├── scheduler/            # Job scheduling service
│   │   └── src/
│   │       ├── index.ts      # Scheduler entry point
│   │       └── lib/          # Scheduling logic
│   │
│   ├── worker/               # Health check executor
│   │   └── src/
│   │       ├── worker.ts     # BullMQ worker
│   │       ├── checkExecutor.ts
│   │       ├── stateEvaluator.ts
│   │       └── incidentManager.ts
│   │
│   ├── notifier/             # Notification service
│   │   └── src/
│   │       ├── providers/    # Email, webhook providers
│   │       ├── queue/        # Notification queue
│   │       └── services/     # Delivery logic
│   │
│   └── web/                  # Next.js dashboard
│       └── src/
│           ├── app/          # App router pages
│           ├── components/   # React components
│           └── hooks/        # Custom hooks
│
├── packages/
│   ├── db/                   # Shared Prisma client
│   │   └── prisma/schema.prisma
│   └── shared/               # Shared types
│       └── src/types.ts
│
├── architecture/             # Architecture documentation
├── runbooks/                 # Operational runbooks
├── terraform/                # AWS infrastructure (ECS, RDS, etc.)
│   ├── modules/
│   │   ├── vpc/
│   │   ├── ecs/
│   │   ├── rds/
│   │   ├── alb/
│   │   └── security/
│   └── envs/                 # Environment configs
│
├── docker-compose.yml
├── Makefile
└── package.json

Authentication endpoints:

| Method | Endpoint | Description |
|---|---|---|
| POST | `/auth/register` | Register user |
| POST | `/auth/login` | Login user |

Monitor endpoints:

| Method | Endpoint | Description |
|---|---|---|
| GET | `/monitors` | List monitors |
| POST | `/monitors` | Create monitor |
| GET | `/monitors/:id` | Get monitor |
| PATCH | `/monitors/:id` | Update monitor |
| DELETE | `/monitors/:id` | Delete monitor |
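
As an illustration of what validating a `POST /monitors` body involves, here is a dependency-free sketch. The project tree lists Zod schemas for this job; the hand-written checks below are a stand-in for them, and the field names `url`, `intervalSeconds`, and `failureThreshold` are assumptions rather than the API's documented payload.

```typescript
// Hypothetical request shape; the real fields live in the project's Zod schemas.
interface CreateMonitorInput {
  url: string;
  intervalSeconds: number;
  failureThreshold: number;
}

type ValidationResult =
  | { kind: "valid"; value: CreateMonitorInput }
  | { kind: "invalid"; errors: string[] };

const isPositiveInt = (v: unknown): v is number =>
  typeof v === "number" && Number.isInteger(v) && v > 0;

// Collect every problem instead of failing on the first one, so a client
// can fix its request in a single round trip.
function validateCreateMonitor(body: unknown): ValidationResult {
  const b: Record<string, unknown> =
    typeof body === "object" && body !== null ? (body as Record<string, unknown>) : {};
  const { url, intervalSeconds, failureThreshold } = b;
  const errors: string[] = [];

  if (typeof url !== "string" || !/^https?:\/\/\S+$/.test(url)) {
    errors.push("url must be an http(s) URL");
  }
  if (!isPositiveInt(intervalSeconds)) {
    errors.push("intervalSeconds must be a positive integer");
  }
  if (!isPositiveInt(failureThreshold)) {
    errors.push("failureThreshold must be a positive integer");
  }

  if (errors.length > 0) {
    return { kind: "invalid", errors };
  }
  return {
    kind: "valid",
    value: {
      url: url as string,
      intervalSeconds: intervalSeconds as number,
      failureThreshold: failureThreshold as number,
    },
  };
}
```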

Health and metrics endpoints:

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check |
| GET | `/ready` | Readiness probe |
| GET | `/live` | Liveness probe |
| GET | `/metrics` | Prometheus metrics |

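
The difference between the three probe endpoints can be sketched as a pure function from path and dependency status to an HTTP response. This assumes the usual probe semantics (liveness means the process answers, readiness means its dependencies are reachable). `handleProbe`, the response bodies, and the choice of PostgreSQL and Redis as the checked dependencies are hypothetical and not taken from the service code.

```typescript
// Hypothetical shape: which dependencies each probe inspects is an assumption.
interface DependencyStatus {
  postgres: boolean;
  redis: boolean;
}

interface ProbeResult {
  statusCode: number;
  body: { status: string; checks?: DependencyStatus };
}

function handleProbe(path: string, deps: DependencyStatus): ProbeResult {
  const allUp = deps.postgres && deps.redis;
  switch (path) {
    case "/live":
      // Liveness: the process is running and able to answer at all.
      return { statusCode: 200, body: { status: "alive" } };
    case "/ready":
      // Readiness: accept traffic only when dependencies are reachable.
      return { statusCode: allUp ? 200 : 503, body: { status: allUp ? "ready" : "not_ready" } };
    case "/health":
      // Health: same verdict as readiness, plus per-dependency detail.
      return {
        statusCode: allUp ? 200 : 503,
        body: { status: allUp ? "ok" : "degraded", checks: deps },
      };
    default:
      return { statusCode: 404, body: { status: "not_found" } };
  }
}
```

Keeping `/live` independent of PostgreSQL and Redis avoids restarting a healthy process just because a dependency is briefly unavailable.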
| Document | Description |
|---|---|
| Architecture Overview | System design and data flow |
| Engineering Log | Design decisions and tradeoffs |
| Runbooks | Operational procedures |
| Deployment Guide | Deployment instructions |
| Terraform | Infrastructure as Code |
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Commit your changes (`git commit -m 'feat: add feature'`)
- Push to your branch (`git push origin feature/your-feature`)
- Open a Pull Request

MIT License - see LICENSE for details


