SaaS Multi-Tenant Backend Industry 4.0 Platform


A Production-Ready Foundation for Industry 4.0 SaaS Applications

SaaS Backend for Industry 4.0

This platform is specifically designed to support the development of backend systems for Industry 4.0 SaaS applications. As of today, it provides:

  • Mobile Application Foundation: Serves as the backbone for modern mobile and web applications in industrial contexts
  • Flexible Cloud Deployment: Can be hosted on French sovereign cloud, European cloud providers, or GAFAM platforms (AWS, Azure, GCP)
  • Multi-Tenant Architecture: Complete data and file isolation between tenants for maximum security and compliance
  • Enterprise Security Standards: Uses HashiCorp Vault for secrets management and data encryption at rest and in transit
  • Enterprise SSO Integration: Seamlessly integrates with Microsoft Entra ID (Azure SSO) for enterprise authentication
  • Comprehensive API Documentation: Fully documented RESTful APIs with OpenAPI/Swagger specification

A production-ready, scalable multi-tenant SaaS backend platform built with Flask, PostgreSQL, Kafka, and S3 storage. Features isolated tenant databases, JWT authentication, asynchronous document processing, and RESTful APIs.



Features

✅ User Management

  • User registration with email validation
  • Secure login with JWT tokens (15-min access, 7-day refresh)
  • Password hashing with bcrypt
  • User profile management
  • Token refresh and logout (blacklist)

✅ Multi-Tenant System

  • Dynamic tenant creation with isolated databases
  • Automatic database provisioning
  • Role-based access control (admin, user, viewer)
  • User-tenant associations
  • Tenant member management
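
As an illustration of the role checks listed above, a route decorator along these lines could enforce tenant roles with Flask-JWT-Extended. This is a hypothetical sketch, not the contents of the platform's decorators.py, and the association lookup is a stub:

from functools import wraps

from flask import abort
from flask_jwt_extended import get_jwt_identity, jwt_required


def lookup_user_role(user_id, tenant_id):
    """Hypothetical helper: read the role from user_tenant_associations."""
    ...


def role_required(*allowed_roles):
    def decorator(view):
        @wraps(view)
        @jwt_required()
        def wrapper(tenant_id, *args, **kwargs):
            role = lookup_user_role(get_jwt_identity(), tenant_id)
            if role not in allowed_roles:
                abort(403)  # authenticated, but not authorized for this tenant
            return view(tenant_id, *args, **kwargs)
        return wrapper
    return decorator

# Usage sketch: decorating a tenant route with @role_required("admin") limits it to tenant admins.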

✅ Enterprise Authentication (SSO)

  • Azure AD / Microsoft Entra ID integration
  • Per-tenant SSO configuration
  • Confidential Application mode (client_secret REQUIRED)
  • NOT using PKCE - requires client_secret for secure authentication
  • Auto-provisioning with configurable rules
  • Azure AD group to role mapping
  • Hybrid authentication modes (local, SSO, or both)
  • Encrypted token storage via HashiCorp Vault
  • Multi-tenant identity mapping (different Azure IDs per tenant)
  • Automatic Azure AD token refresh via a Celery worker

✅ Document Management

  • Document upload with multipart/form-data
  • MD5-based file deduplication (storage optimization)
  • S3-compatible storage with sharded paths
  • Document metadata management
  • Pre-signed URL generation for downloads
  • Pagination and filtering
  • RFC 3161 TSA timestamping for legal proof of existence
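
To make the deduplication and sharded-path ideas above concrete, here is a minimal sketch; the two-level sharding layout shown is an assumption, not necessarily the platform's exact scheme:

import hashlib


def md5_of(path: str) -> str:
    """Stream the file so large uploads do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sharded_key(md5_hex: str) -> str:
    # e.g. "ab/cd/abcdef..." so no single S3 prefix accumulates every object
    return f"{md5_hex[:2]}/{md5_hex[2:4]}/{md5_hex}"

If two documents hash to the same MD5, they can point at one stored file; that shared reference is what the reference counting under File Management tracks.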

✅ File Management

  • Immutable file storage
  • Reference counting (shared files across documents)
  • Orphaned file detection and cleanup
  • Storage statistics per tenant

✅ TSA Timestamping (RFC 3161)

  • Per-tenant configuration: Enable/disable TSA per organization
  • Automatic timestamping: New file uploads automatically timestamped
  • DigiCert Public TSA: Free, no authentication required
  • SHA-256 fingerprints: Cryptographically secure hashing
  • Asynchronous processing: Non-blocking Celery tasks
  • Complete certificate chain: Stored for long-term verification
  • OpenSSL verification: Independent timestamp validation
  • Legally binding: RFC 3161 compliant timestamps
  • Download as .tsr: Standard format compatible with all tools

✅ Async Processing

  • Kafka: Event streaming for real-time data processing

    • Event topics: tenant.created, document.uploaded, etc.
    • Background worker for event consumption
  • Celery: Distributed task queue for scheduled jobs

    • SSO token refresh (automatic renewal before expiry)
    • Expired token cleanup
    • Encryption key rotation
    • Scheduled maintenance tasks
  • Flower: Real-time monitoring dashboard for Celery tasks

  • celery-worker-sso: runs SSO token refresh tasks

  • celery-beat: schedules periodic tasks

  • celery-worker-tsa: runs timestamping tasks

  • celery-worker-monitoring: runs health checks

  • flower: monitoring dashboard (http://localhost:5555)

✅ API Features

  • RESTful API design
  • OpenAPI 3.0 specification (Swagger)
  • Standardized response formats
  • Comprehensive error handling
  • Request validation with Marshmallow schemas
  • CORS support

✅ Security

  • JWT-based authentication
  • Password strength validation
  • Azure AD / Microsoft Entra ID SSO support
  • Multi-factor authentication (via Azure AD)
  • Rate limiting (configurable)
  • SQL injection prevention (SQLAlchemy ORM)
  • XSS protection
  • HTTPS/TLS support (production)

✅ DevOps Ready

  • Docker and Docker Compose support
  • Multi-stage Docker builds
  • Health check endpoints
  • Logging and monitoring hooks
  • Environment-based configuration
  • Database migration system

✅ Health Monitoring (Healthchecks.io)

  • Self-Hosted Monitoring: Healthchecks.io integration for service uptime
  • Comprehensive Coverage: Monitoring for all critical services
    • PostgreSQL, Redis, Flask API
    • Celery workers and Beat scheduler
    • Kafka, MinIO, Vault
  • Automated Checks: Celery-based periodic health verifications
  • Alert Channels: Email, Slack, webhook notifications
  • Grace Periods: Smart timeout configuration to avoid false alerts
  • Dashboard: Web UI at http://localhost:8000 for monitoring status
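
A periodic task can report to the self-hosted instance with a simple HTTP ping. The sketch below assumes the requests package and uses a placeholder check UUID taken from the Healthchecks dashboard:

import requests

# Replace <check-uuid> with the UUID shown in the Healthchecks dashboard
PING_URL = "http://localhost:8000/ping/<check-uuid>"


def report_healthy():
    requests.get(PING_URL, timeout=10)              # signals "still alive"


def report_failure(reason: str):
    requests.post(PING_URL + "/fail", data=reason, timeout=10)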

✅ IoT (ThingsBoard Integration)

For all IoT functionalities, this platform integrates ThingsBoard Community Edition, an open-source, enterprise-grade IoT platform that provides comprehensive device management, data collection, processing, and visualization capabilities.

Why ThingsBoard Community Edition?

  • Open Source & Free: Community Edition is fully open-source (Apache 2.0 license) with no licensing costs
  • Production-Ready: Battle-tested platform used by thousands of companies worldwide
  • Comprehensive IoT Stack: Complete solution for device connectivity, data processing, and visualization
  • Multi-Tenant Native: Built-in multi-tenancy aligns perfectly with our SaaS architecture
  • Scalable Architecture: Supports millions of devices and messages per second
  • Rich Ecosystem: Extensive protocol support, widgets, rule engine, and integrations
  • Protocol Support: MQTT, CoAP, HTTP, LwM2M, SNMP, Modbus, OPC-UA, BLE, LoRaWAN, Sigfox, NB-IoT
  • IoT Gateway: Pre-built gateways for Modbus, OPC-UA, BACnet, CAN bus, BLE, MQTT
  • Bi-directional Communication: RPC (Remote Procedure Calls) for device control

Table of Contents



Project Overview

This platform provides a complete SaaS backend solution with the following capabilities:

  • Multi-Tenant Architecture: Each tenant (organization) gets an isolated PostgreSQL database for data isolation and security
  • User Management: User registration, authentication, and profile management with JWT tokens
  • Tenant Management: Create organizations, manage members with role-based access control (admin, user, viewer)
  • Document Management: Upload, store, and manage documents with MD5-based deduplication
  • File Storage: S3-compatible storage (MinIO) with sharded path strategy for efficient file organization
  • Async Processing: Kafka-based message queue for asynchronous event processing
  • RESTful APIs: Well-documented REST APIs with OpenAPI/Swagger specification

Use Cases

  • SaaS applications requiring data isolation per customer
  • Document management systems with multi-tenant support
  • Enterprise applications with organization-based access control
  • B2B platforms with separate data domains per client

Industry 4.0 Target Functionalities

This platform is designed as a foundation to support advanced Industry 4.0 capabilities.

A) Security and Compliance

Industrial Security (OT):

  • Ensure security of critical data and operational technology systems
  • Identity and Access Management (IAM) for industrial environments
  • Network segmentation and zero-trust architecture
  • Compliance with IEC 62443 industrial security standards

Traceability and Audit:

  • Log all operations with immutable audit trails
  • Compliance with industrial standards and regulations (ISO 50001, ISO 9001, ISO 27001)
  • ESG (Environmental, Social, Governance) reporting capabilities
  • GDPR, HIPAA, and sector-specific compliance support

B) Platform Management (SaaS)

Multi-Tenancy (✅ Currently Implemented):

  • Complete data isolation for multiple clients on shared infrastructure
  • Per-tenant customization and configuration
  • Tenant-specific database schemas for regulatory compliance

Scalability (✅ Currently Implemented):

  • Rapid scaling of compute and storage capacity based on load
  • Cloud-native architecture with Kubernetes support
  • Elastic resource allocation

DevOps/SRE Operations (✅ Currently Implemented):

  • Continuous Integration/Continuous Deployment (CI/CD) pipelines
  • Infrastructure as Code (IaC) with Docker and Docker Compose
  • Monitoring, alerting, and observability (ready for Prometheus, Grafana)
  • High availability and disaster recovery strategies

Current Implementation Status

| Category          | Capability                     | Status              |
|-------------------|--------------------------------|---------------------|
| Foundation        | Multi-tenant architecture      | ✅ Production-ready |
|                   | User authentication & RBAC     | ✅ Production-ready |
|                   | Azure AD SSO integration       | ✅ Production-ready |
|                   | RESTful API with documentation | ✅ Production-ready |
|                   | Vault secrets management       | ✅ Production-ready |
|                   | File storage with S3           | ✅ Production-ready |
| Async Processing  | Kafka event streaming          | ✅ Production-ready |
|                   | Celery task queue              | ✅ Production-ready |
|                   | Background workers             | ✅ Production-ready |
| Monitoring        | Healthchecks.io integration    | ✅ Production-ready |
|                   | Service health checks          | ✅ Production-ready |
|                   | Automated alerting             | ✅ Production-ready |
| IoT & Big Data    | High-throughput data ingestion | ✅ Production-ready |
|                   | Time-series database           | ✅ Production-ready |
|                   | Industrial protocol support    | ✅ Production-ready |
| AI/ML             | Predictive maintenance         | 🚧 Roadmap          |
|                   | Real-time analytics            | 🚧 Roadmap          |
|                   | Digital twin integration       | 🚧 Roadmap          |
| Advanced Features | Edge computing gateway         | 🚧 Roadmap          |
|                   | Closed-loop control            | 🚧 Roadmap          |
|                   | ESG reporting                  | 🚧 Roadmap          |

Legend:

  • ✅ Production-ready: Implemented and tested
  • 🚧 Roadmap: Planned for future releases

Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                          Client Layer                           │
│                     (Web/Mobile/Desktop)                        │
└────────────────────────────┬────────────────────────────────────┘
                             │ HTTPS/REST
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Load Balancer / CDN                        │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Flask API Server (Gunicorn)                  │
│  ┌─────────────┐  ┌────────────────┐  ┌─────────────────────┐   │
│  │   Routes    │→ │   Services     │→ │  Models/Schemas     │   │
│  │ (REST APIs) │  │(Business Logic)│  │  (Validation)       │   │
│  └─────────────┘  └────────────────┘  └─────────────────────┘   │
└────────────────────────────┬────────────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┬────────────────┐
         │                   │                   │                │
         ▼                   ▼                   ▼                ▼
┌─────────────────┐  ┌──────────────┐  ┌─────────────────┐  ┌───────────────┐
│   PostgreSQL    │  │    Kafka     │  │  MinIO (S3)     │  │HashiCorp Vault│
│                 │  │              │  │                 │  │               │
│  Main Database: │  │  Message     │  │  File Storage:  │  │Secrets Mgmt:  │
│  - Users        │  │  Broker:     │  │  - Documents    │  │  - DB Creds   │
│  - Tenants      │  │  - Events    │  │  - Uploads      │  │  - JWT Keys   │
│  - Associations │  │  - Async Jobs│  │  - Backups      │  │  - S3 Keys    │
│                 │  │              │  │                 │  │  - Encryption │
│  Tenant DBs:    │  └──────┬───────┘  └─────────────────┘  │  - Audit Log  │
│  - Documents    │         │                               └───────────────┘
│  - Files        │         │         ┌───────────────────┐
│  (Isolated)     │         │         │     Redis         │
└─────────────────┘         │         │                   │
                            │         │  Cache & Session  │
                     ┌──────▼───────┐ │  - Token Blacklist│
                     │Kafka Consumer│ │  - SSO Sessions   │
                     │   Worker     │ │  - API Cache      │
                     │ (Background) │ └───────────────────┘
                     └──────────────┘

Multi-Tenant Database Strategy

Each tenant has an isolated PostgreSQL database:

  • Main Database (saas_platform): Stores users, tenants, and user-tenant associations
  • Tenant Databases (tenant_<name>_<uuid>): Each tenant gets a separate database for documents and files
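
For example, combining this naming convention with the TENANT_DATABASE_URL_TEMPLATE variable described under Environment Variables looks roughly like the sketch below; the name normalization step is an assumption, not the platform's exact rule:

import re
import uuid

TEMPLATE = "postgresql://user:password@host:5432/{database_name}"


def tenant_database_url(tenant_name: str) -> str:
    slug = re.sub(r"[^a-z0-9]+", "_", tenant_name.lower()).strip("_")
    database_name = f"tenant_{slug}_{uuid.uuid4().hex}"   # tenant_<name>_<uuid>
    return TEMPLATE.format(database_name=database_name)


print(tenant_database_url("Acme Corp"))
# postgresql://user:password@host:5432/tenant_acme_corp_1a2b3c...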

This approach provides:

  • Strong Data Isolation: Complete database separation per tenant
  • Security: No risk of cross-tenant data leakage
  • Scalability: Easy to scale individual tenant databases
  • Compliance: Meets strict data isolation requirements (GDPR, HIPAA, etc.)

Tech Stack

Backend Framework

  • Flask 3.0: Lightweight Python web framework
  • Gunicorn 21.2: Production WSGI HTTP server
  • SQLAlchemy 2.0: SQL toolkit and ORM
  • Flask-Migrate 4.0: Database migration management (Alembic)

Authentication & Security

  • Flask-JWT-Extended 4.6: JWT token management
  • bcrypt 4.1: Password hashing
  • cryptography 42.0: Encryption utilities
  • HashiCorp Vault: Centralized secrets management and encryption

Database

  • PostgreSQL 14+: Primary database (multi-database support)
  • psycopg2-binary 2.9: PostgreSQL adapter

Cache & Session Store

  • Redis 7.0: High-performance cache and session store
  • redis-py 5.0: Python Redis client
  • Token blacklist storage with TTL
  • SSO session management
  • API response caching (planned)

Message Queue & Task Processing

  • Apache Kafka: Event streaming and async processing
  • kafka-python 2.0: Python Kafka client
  • Zookeeper: Kafka coordination
  • Celery 5.3: Distributed task queue for scheduled jobs
  • Flower 2.0: Real-time Celery monitoring dashboard

Object Storage

  • MinIO: S3-compatible object storage
  • boto3 1.34: AWS SDK for Python (S3 client)

Data Validation

  • marshmallow 3.20: Object serialization and validation
  • marshmallow-sqlalchemy: SQLAlchemy integration

Development Tools

  • pytest 7.4: Testing framework
  • black 24.1: Code formatter
  • flake8 7.0: Linting
  • mypy 1.8: Static type checking

Containerization

  • Docker 20.10+: Containerization
  • Docker Compose 2.0+: Multi-container orchestration

Prerequisites

For Docker Deployment (Recommended)

  • Docker: 20.10 or higher
  • Docker Compose: 2.0 or higher
  • System Requirements: 4GB RAM minimum
  • Ports: 4999, 5432, 6379, 9000, 9001, 9092, 9093 available

For Local Development

  • Python: 3.11 or higher
  • PostgreSQL: 14 or higher
  • Redis: 7.0 or higher
  • Kafka: 3.0+ with Zookeeper
  • MinIO: Latest version (or AWS S3 account)
  • virtualenv: For Python virtual environment

Quick Start

Get the platform running in 5 minutes with Docker.

# ============================================================================
# STEP 1: Initial configuration
# ============================================================================

# 1.1. Clone the repository
git clone https://github.com/ysimonx/SaaS-Industry4.0-Backend.git
cd SaaS-Industry4.0-Backend

# 1.2. Copy minimal environment file (NO SECRETS - configuration only)
cp .env.docker.minimal .env
cp .env.healthchecks.example .env.healthchecks
# Note: Secrets will be managed by Vault, NOT by .env file

# 1.3. Create the secrets file for Vault (REQUIRED)
# The Vault scripts are already in the repo (vault/config/, vault/scripts/)
mkdir -p docker/volumes/vault/init-data/
# Unquoted EOF below so JWT_SECRET_KEY is generated via command substitution
cat > docker/volumes/vault/init-data/docker.env <<EOF
DATABASE_URL=postgresql://postgres:postgres@postgres:5432/saas_platform
TENANT_DATABASE_URL_TEMPLATE=postgresql://postgres:postgres@postgres:5432/{database_name}
JWT_SECRET_KEY=$(head -c 32 /dev/urandom | xxd -p -c 64)
JWT_ACCESS_TOKEN_EXPIRES=900
S3_ENDPOINT_URL=http://minio:9000
S3_ACCESS_KEY_ID=minioadmin
S3_SECRET_ACCESS_KEY=minioadmin
S3_BUCKET=saas-documents
S3_REGION=us-east-1
EOF

# ============================================================================
# STEP 2: Start Vault with auto-unseal
# ============================================================================

# 2.0. (Optional) Completely reset Vault
# ⚠️  WARNING: This operation deletes ALL Vault data!
# Use this only if you want to start over from scratch
rm -Rf docker/volumes/vault/data
docker-compose down vault vault-unseal

# 2.1. Start Vault and auto-unseal services
docker-compose up -d vault vault-unseal

# 2.2. Wait for Vault to initialize and unseal (about 30 seconds)
sleep 30
docker logs saas-vault-unseal

# 2.3. Verify Vault is unsealed and ready
docker exec saas-vault vault status
# Expected: "Sealed: false"

# 2.4. IMPORTANT: Save the root token (first time only)
cat docker/volumes/vault/data/root-token.txt
# ⚠️  Store this token in a password manager!

# ============================================================================
# STEP 3: Initialize secrets in Vault
# ============================================================================

# 3.1. Start vault-init service to inject secrets
docker-compose up -d vault-init

# 3.2. Wait for initialization (about 20 seconds)
sleep 20
docker logs saas-vault-init

# 3.3. Verify AppRole credentials were created
cat .env.vault
# This file contains VAULT_ROLE_ID and VAULT_SECRET_ID

# 3.4. (Optional) Verify secrets are stored in Vault
VAULT_TOKEN=$(cat docker/volumes/vault/data/root-token.txt)
docker exec -e VAULT_TOKEN=$VAULT_TOKEN saas-vault vault kv get secret/saas-project/docker/database

# ============================================================================
# STEP 4: Start the application
# ============================================================================

# 4.0. Recreate empty Docker volume directories (⚠️ deletes any existing local MinIO/PostgreSQL data)
rm -Rf docker/volumes/minio/data
rm -Rf docker/volumes/postgres/data
mkdir -p docker/volumes/minio/data
mkdir -p docker/volumes/postgres/data

# 4.1. Start all remaining services (API, Worker, PostgreSQL, Kafka, MinIO)
docker-compose up -d

# 4.2. Wait for services to be healthy (about 30 seconds)
sleep 30
docker-compose ps

# ============================================================================
# STEP 5: Initialize the database
# ============================================================================

# 5.0. (IMPORTANT) Remove old migration files if any exist
rm -Rf backend/migrations/versions/*

# 5.1. Create main database
docker-compose exec postgres psql -U postgres -c "CREATE DATABASE saas_platform;"

# 5.2. Generate the initial migration
docker-compose exec api /app/flask-wrapper.sh db migrate -m "Initial migration"

# 5.3. Run database migrations (using Vault secrets)
docker-compose exec api /app/flask-wrapper.sh db upgrade

# 5.4. Create admin user and test tenant
docker-compose exec api python scripts/init_db.py --create-admin --create-test-tenant

# 5.5. (Optional) Migrate tenant databases if needed
docker-compose exec api python scripts/migrate_all_tenants.py

# ============================================================================
# STEP 6: Verification
# ============================================================================

# 6.1. Check API health
curl http://localhost:4999/health

# 6.2. View application logs
docker-compose logs -f api

# 6.3. Check all services status
docker-compose ps


Default Admin Credentials (change immediately!):

  • Email: admin@example.com
  • Password: 12345678


# ============================================================================
# STEP 7: Healthchecks
# ============================================================================

# 7.1 Create the admin@example.com account, the API keys, and the checks in Healthchecks

if [ ! -f ".env.healthchecks" ]; then
    echo ".env.healthchecks does not exist, copying .env.healthchecks.example to .env.healthchecks"
    cp .env.healthchecks.example .env.healthchecks
fi

bash scripts/healthcheck/start-healthchecks-enhanced.sh

Default Healthcheck Credentials (change immediately!):

  • Email: admin@example.com
  • Password: 12345678
  • Health Monitoring: See specs/6 - healthcheck/ for detailed monitoring documentation

Access Services:

Important Vault Files (NEVER commit these):

  • docker/volumes/vault/data/unseal-keys.json - Keys to unseal Vault
  • docker/volumes/vault/data/root-token.txt - Vault root (administrator) token
  • .env.vault - AppRole credentials for the application
  • vault/init-data/docker.env - Secrets injected into Vault

On the next restart:

# Everything restarts automatically with docker-compose up -d
docker-compose up -d

# Vault unseals automatically (vault-unseal)
# Secrets are NOT re-injected (vault-init is idempotent)
# The application automatically retrieves its secrets from Vault

Important notes:

  • vault-unseal: runs on every startup and automatically unseals Vault
  • vault-init: also runs, but changes NOTHING if the secrets already exist (idempotent)
  • Protected secrets: existing secrets are never accidentally overwritten

For more details on Vault, see:


Azure SSO Configuration

The platform supports Azure AD / Microsoft Entra ID Single Sign-On (SSO) for enterprise authentication. Each tenant can independently configure SSO.

Setting up Azure AD Application

  1. Register an Application in Azure AD:

    # In Azure Portal (portal.azure.com):
    1. Go to Azure Active Directory → App registrations → New registration
    2. Name: "Your SaaS Platform"
    3. Supported account types: "Accounts in this organizational directory only"
    4. Redirect URI:
       - Type: Web
       - URI: http://localhost:4999/api/auth/sso/azure/callback (development)
       - URI: https://yourapp.com/api/auth/sso/azure/callback (production)
    5. Register the application
  2. Configure the Azure Application (Confidential Mode):

    # In Azure App Registration:
    1. Authentication tab:
       - ⚠️ IMPORTANT: Do NOT enable "Public client flows"
       - Add redirect URIs for all environments
       - Enable ID tokens and Access tokens
    
    2. Certificates & secrets tab:
       - ⚠️ CRITICAL: Click "New client secret"
       - Description: "SaaS Platform Secret"
       - Expiration: Choose appropriate duration (12-24 months)
       - Copy the secret value immediately (shown only once!)
    
    3. API Permissions tab:
       - Microsoft Graph → User.Read (default)
       - Microsoft Graph → email (optional)
       - Microsoft Graph → profile (optional)
       - Grant admin consent if required
    
    4. Copy these values:
       - Application (client) ID
       - Directory (tenant) ID
       - Client secret (from step 2)

Configuring SSO for a Tenant

  1. Create SSO Configuration via API:

    # As tenant admin, configure SSO:
    # ⚠️ IMPORTANT: client_secret is REQUIRED
    curl -X POST http://localhost:4999/api/tenants/{tenant_id}/sso/config \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "client_id": "your-azure-app-client-id",
        "client_secret": "your-azure-client-secret",
        "provider_tenant_id": "your-azure-tenant-id",
        "enable": true,
        "config_metadata": {
          "auto_provisioning": {
            "enabled": true,
            "default_role": "viewer",
            "sync_attributes_on_login": true,
            "allowed_email_domains": ["@yourcompany.com"],
            "allowed_azure_groups": ["All-Employees"],
            "group_role_mapping": {
              "IT-Admins": "admin",
              "Developers": "user",
              "Support": "viewer"
            }
          }
        }
      }'
  2. Set Authentication Mode:

    # Enable SSO for the tenant:
    curl -X POST http://localhost:4999/api/tenants/{tenant_id}/sso/config/enable \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"auth_method": "both"}'   # auth_method: "sso", "local", or "both"

SSO Login Flow

  1. Initiate SSO Login:

    // Frontend redirects user to SSO login:
    window.location.href = `${API_URL}/api/auth/sso/azure/login/${tenantId}`;
  2. Handle Callback:

    // After Azure AD authentication, user returns with tokens:
    // GET /api/auth/sso/azure/callback?code=...&state=...
    
    // Response includes JWT tokens:
    {
      "access_token": "eyJ...",
      "refresh_token": "eyJ...",
      "user": {
        "id": "user-uuid",
        "email": "user@company.com",
        "first_name": "John",
        "last_name": "Doe"
      }
    }

Auto-Provisioning Configuration

The platform can automatically create user accounts during SSO login:

{
  "auto_provisioning": {
    "enabled": true,
    "default_role": "viewer",
    "sync_attributes_on_login": true,
    "allowed_email_domains": ["@company.com", "@partner.com"],
    "allowed_azure_groups": ["All-Employees", "Contractors"],
    "group_role_mapping": {
      "IT-Admins": "admin",
      "Developers": "user",
      "Support": "viewer",
      "Contractors": "viewer"
    }
  }
}
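
As a rough sketch of how such a mapping could be applied at login (a hypothetical helper, not the platform's actual provisioning code), the highest-privilege role wins when a user belongs to several mapped groups:

ROLE_PRECEDENCE = {"admin": 3, "user": 2, "viewer": 1}


def resolve_role(azure_groups, auto_provisioning):
    mapping = auto_provisioning.get("group_role_mapping", {})
    roles = [mapping[group] for group in azure_groups if group in mapping]
    if not roles:
        return auto_provisioning.get("default_role", "viewer")
    return max(roles, key=lambda role: ROLE_PRECEDENCE.get(role, 0))


# resolve_role(["Developers", "IT-Admins"], config["auto_provisioning"]) -> "admin"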

Security Features

  • Confidential Application Mode: MANDATORY use of client_secret for secure OAuth2 flow
  • Client Secret Required: Application configured as confidential (NOT public client)
  • NO PKCE: Platform does NOT use Proof Key for Code Exchange (PKCE is for public clients only)
  • State Token: CSRF protection during OAuth flow
  • Encrypted Token Storage: Azure tokens encrypted via HashiCorp Vault
  • Token Refresh: Automatic token refresh before expiration via Celery workers
  • Multi-Factor Authentication: Inherited from Azure AD configuration

SSO Management Endpoints

# Configuration Management
GET    /api/tenants/{id}/sso/config          # Get current configuration
POST   /api/tenants/{id}/sso/config          # Create configuration
PUT    /api/tenants/{id}/sso/config          # Update configuration
DELETE /api/tenants/{id}/sso/config          # Remove configuration

# Enable/Disable SSO
POST   /api/tenants/{id}/sso/config/enable   # Enable SSO
POST   /api/tenants/{id}/sso/config/disable  # Disable SSO
GET    /api/tenants/{id}/sso/config/validate # Validate configuration

# Authentication Flow
GET    /api/auth/sso/azure/login/{tenant_id}       # Initiate login
GET    /api/auth/sso/azure/callback                # OAuth callback
POST   /api/auth/sso/azure/refresh                 # Refresh tokens
POST   /api/auth/sso/azure/logout/{tenant_id}      # SSO logout

# User Information
GET    /api/auth/sso/azure/user-info              # Get Azure profile
GET    /api/auth/sso/identities                   # List user's Azure identities
GET    /api/auth/sso/check-availability/{tenant_id} # Check SSO status

# Statistics
GET    /api/tenants/{id}/sso/statistics           # SSO usage stats

Testing SSO Integration

A test script is provided to verify SSO configuration:

# Run the SSO setup test:
cd backend
python scripts/setup_sso_test.py

# The script will:
# 1. Create a test tenant with SSO configuration
# 2. Display the Azure AD login URL
# 3. Guide you through the authentication flow
# 4. Verify token exchange and user provisioning

Troubleshooting SSO

Common issues and solutions:

  1. "redirect_uri_mismatch" error:

    • Ensure the callback URL in Azure AD matches exactly
    • Check for trailing slashes and protocol (http vs https)
  2. "invalid_client" error:

    • Verify the client_id is correct
    • Ensure client_secret is provided and valid
    • Check that "Public client flows" is DISABLED (we use confidential mode)
  3. Auto-provisioning not working:

    • Check email domain is in allowed list
    • Verify Azure AD group membership
    • Ensure tenant has auto_provisioning enabled
  4. Token refresh failing:

    • Check if refresh token has expired (90 days)
    • Verify Azure AD app permissions haven't changed

TSA Timestamping Configuration

The platform supports RFC 3161 compliant timestamping using DigiCert's public TSA service for legal proof of document existence at a specific time.

What is TSA Timestamping?

TSA (Time-Stamp Authority) timestamping provides cryptographic proof that a document existed at a specific moment in time. This is legally binding and compliant with RFC 3161 standards, making it admissible in court and regulatory proceedings.

Use Cases:

  • Legal document archiving
  • Compliance and audit trails
  • Intellectual property protection
  • Contract signing timestamps
  • Data integrity verification

Enabling TSA for a Tenant

TSA can be enabled/disabled per tenant:

# Enable TSA for a tenant (Admin only)
curl -X PUT http://localhost:4999/api/tenants/{tenant_id}/tsa/enable \
  -H "Authorization: Bearer $ADMIN_TOKEN"

# Check TSA status
curl -X GET http://localhost:4999/api/tenants/{tenant_id}/tsa/status \
  -H "Authorization: Bearer $TOKEN"

Or programmatically in Python:

from app.models.tenant import Tenant
from app.extensions import db

tenant = Tenant.query.filter_by(name='Acme Corp').first()
tenant.tsa_enabled = True
db.session.commit()

How It Works

  1. File Upload: User uploads a file to the tenant
  2. Hashing: System calculates MD5 (for dedup) + SHA-256 (for TSA)
  3. Storage: File record created in tenant database
  4. Timestamping: If TSA enabled, Celery task scheduled (5s delay)
  5. TSA Request: Worker sends SHA-256 hash to DigiCert TSA
  6. Token Storage: RFC 3161 timestamp token + certificate chain stored in metadata

Architecture:

  • Asynchronous: Non-blocking Celery tasks (doesn't slow down uploads)
  • Idempotent: Won't re-timestamp if already done
  • Retry Logic: Automatic retry with exponential backoff
  • Rate Limited: 100 timestamps/hour per worker (configurable)
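
The timestamping step itself boils down to an RFC 3161 query sent over HTTP. The sketch below approximates it with the openssl CLI and the requests package; it illustrates the protocol flow described above, not the platform's actual Celery task:

import subprocess

import requests

DIGICERT_TSA_URL = "http://timestamp.digicert.com"


def request_timestamp(path: str) -> bytes:
    # Build an RFC 3161 timestamp query (SHA-256 digest, ask the TSA to include its certificate)
    subprocess.run(
        ["openssl", "ts", "-query", "-data", path, "-sha256", "-cert", "-out", "request.tsq"],
        check=True,
    )
    with open("request.tsq", "rb") as f:
        query = f.read()
    # POST the DER-encoded query; the response body is the timestamp token (.tsr content)
    response = requests.post(
        DIGICERT_TSA_URL,
        data=query,
        headers={"Content-Type": "application/timestamp-query"},
        timeout=30,
    )
    response.raise_for_status()
    return response.content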

Downloading and Verifying Timestamps

Step 1: Download Timestamp Token

# Download timestamp as .tsr file (RFC 3161 format)
curl -o timestamp.tsr \
  -H "Authorization: Bearer $TOKEN" \
  http://localhost:4999/api/tenants/{tenant_id}/files/{file_id}/timestamp/download

Step 2: Download Original File

# Download the original file to verify against
curl -o original_file.pdf \
  -H "Authorization: Bearer $TOKEN" \
  http://localhost:4999/api/tenants/{tenant_id}/documents/{doc_id}/download

Step 3: Download DigiCert Certificate Chain

# Download root certificate
curl -o digicert_root.pem https://cacerts.digicert.com/DigiCertAssuredIDRootCA.crt.pem

# Download intermediate certificate
curl -o digicert_intermediate.pem https://cacerts.digicert.com/DigiCertSHA2AssuredIDTimestampingCA.crt.pem

# Create complete certificate chain
cat digicert_intermediate.pem digicert_root.pem > digicert_chain.pem

Step 4: Verify with OpenSSL

# Perform cryptographic verification
openssl ts -verify \
  -data original_file.pdf \
  -in timestamp.tsr \
  -CAfile digicert_chain.pem

# Expected output:
# Using configuration from /opt/homebrew/etc/openssl@3/openssl.cnf
# Verification: OK

Important Notes on Certificates

⚠️ Critical: Use the correct DigiCert certificate chain:

  • Root Certificate: DigiCertAssuredIDRootCA.crt.pem (NOT Global Root G2)
  • Intermediate Certificate: DigiCertSHA2AssuredIDTimestampingCA.crt.pem
  • Chain Order: Intermediate + Root (concatenated in this order)

Without the complete chain, you'll encounter the error:

error:17800064:time stamp routines:ts_verify_cert:certificate verify error
Verify error:unable to get local issuer certificate

Testing TSA Integration

A complete test script is provided:

# Run the TSA upload and verification test
./test_tsa_upload.sh

# The script will:
# 1. Login and get JWT token
# 2. Upload a test file
# 3. Wait for TSA task to complete
# 4. Download timestamp token
# 5. Download DigiCert certificates
# 6. Verify timestamp with OpenSSL
# 7. Display verification result

TSA Endpoints

# Enable/Disable TSA
PUT    /api/tenants/{id}/tsa/enable          # Enable TSA
PUT    /api/tenants/{id}/tsa/disable         # Disable TSA

# Status and Information
GET    /api/tenants/{id}/tsa/status          # TSA configuration status
GET    /api/tenants/{id}/files/{id}/timestamp  # Get timestamp metadata

# Download and Verify
GET    /api/tenants/{id}/files/{id}/timestamp/download  # Download .tsr file
POST   /api/tenants/{id}/files/{id}/timestamp/verify    # Verify timestamp

# Retroactive Timestamping (Admin only)
POST   /api/tenants/{id}/files/{id}/timestamp/create     # Timestamp single file
POST   /api/tenants/{id}/files/timestamp/bulk            # Bulk timestamping

Configuration Options

Environment variables for TSA:

# DigiCert TSA Configuration
DIGICERT_TSA_URL=http://timestamp.digicert.com
TSA_REQUEST_TIMEOUT=30          # Request timeout in seconds
TSA_MAX_RETRIES=3               # Maximum retry attempts

# Celery Configuration
CELERY_TSA_QUEUE=tsa_timestamping
CELERY_TSA_CONCURRENCY=4
CELERY_TSA_RATE_LIMIT=100/h     # Timestamps per hour

Security Features

  • SHA-256 Hashing: Cryptographically secure (MD5 is only for deduplication)
  • Complete Certificate Chain: Stored for long-term verification
  • Immutable Timestamps: Cannot be modified after creation
  • Independent Verification: Can be verified years later without the platform
  • Legally Binding: RFC 3161 compliant, admissible in court
  • Free Service: DigiCert public TSA requires no authentication

Long-Term Verification

One of the key advantages of RFC 3161 timestamps is independent, long-term verification:

  • ✅ Verification possible 10+ years after timestamp creation
  • ✅ No dependency on the original platform
  • ✅ Works even if DigiCert's servers are offline
  • ✅ Compatible with any RFC 3161 compliant tool
  • ✅ Certificate chain ensures trust even after cert expiration

Example: Verifying a 10-year-old timestamp:

Even in 2035, you can verify a 2025 timestamp using:

  1. Original file (must be preserved)
  2. Timestamp token (.tsr file from database)
  3. DigiCert public certificates (available online)
  4. OpenSSL or any RFC 3161 tool

Monitoring TSA Tasks

# View TSA worker logs
docker-compose logs -f celery-worker-tsa

# Access Flower monitoring dashboard
open http://localhost:5555

# Check TSA queue status
docker-compose exec celery-worker-tsa celery -A app.celery_app inspect active_queues

Troubleshooting TSA

Common issues and solutions:

  1. "Timestamp download failed":

    • Check if file has been timestamped (may still be processing)
    • Wait 10-15 seconds after upload for async task to complete
    • Check Celery worker logs for errors
  2. "Verification: FAILED" with OpenSSL:

    • Ensure you're using the correct certificate chain (Assured ID Root CA)
    • Verify file hasn't been modified since timestamp
    • Check that intermediate certificate is included in chain
  3. "Unable to get local issuer certificate":

    • Missing intermediate certificate in chain
    • Download both root + intermediate certificates
    • Concatenate in correct order: intermediate first, then root
  4. TSA task not running:

    • Check Celery worker is running: docker-compose ps celery-worker-tsa
    • Verify tenant has TSA enabled: tenant.tsa_enabled == True
    • Check Celery logs: docker-compose logs celery-worker-tsa

Environment Variables

The platform uses environment variables for configuration. Three environment files are provided:

.env.development - Local Development

Safe defaults for local development with localhost services.

.env.docker - Docker Compose

Configuration for containerized deployment with service discovery.

.env.production - Production Template

Security-hardened template with checklists and placeholders.

Key Variables

Flask Configuration

FLASK_APP=run.py                    # Flask application entry point
FLASK_ENV=development               # Environment: development/production
FLASK_DEBUG=1                       # Debug mode: 1=enabled, 0=disabled
FLASK_PORT=4999                     # Server port
SECRET_KEY=your-secret-key          # Flask secret key (change in production!)

Database Configuration

# Main database for users and tenants
DATABASE_URL=postgresql://user:password@host:5432/saas_platform

# Tenant database template (dynamic substitution)
TENANT_DATABASE_URL_TEMPLATE=postgresql://user:password@host:5432/{database_name}

# Connection pooling
DATABASE_POOL_SIZE=10               # Connection pool size
DATABASE_POOL_TIMEOUT=30            # Pool timeout in seconds
DATABASE_MAX_OVERFLOW=20            # Max overflow connections

JWT Configuration

JWT_SECRET_KEY=your-jwt-secret      # CRITICAL: Generate secure key!
JWT_ACCESS_TOKEN_EXPIRES=900        # Access token expiry (15 minutes)
JWT_REFRESH_TOKEN_EXPIRES=604800   # Refresh token expiry (7 days)
JWT_ALGORITHM=HS256                 # JWT signing algorithm

Kafka Configuration

KAFKA_BOOTSTRAP_SERVERS=kafka:9092         # Kafka broker address
KAFKA_CLIENT_ID=saas-platform              # Client identifier
KAFKA_CONSUMER_GROUP_ID=saas-consumer-group # Consumer group
KAFKA_AUTO_OFFSET_RESET=earliest           # Offset reset strategy
KAFKA_ENABLE_AUTO_COMMIT=true              # Auto-commit offsets
KAFKA_MAX_POLL_RECORDS=100                 # Max records per poll
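
With these settings, a background consumer built on kafka-python might look like the following sketch; the topic names are taken from the event topics listed under Features, and the handler is a placeholder:

import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "tenant.created",
    "document.uploaded",
    bootstrap_servers="kafka:9092",
    group_id="saas-consumer-group",
    auto_offset_reset="earliest",
    enable_auto_commit=True,
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    # Placeholder handler: the real worker would dispatch on message.topic
    print(message.topic, message.value)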

S3/MinIO Configuration

S3_ENDPOINT_URL=http://minio:9000   # S3 endpoint (blank for AWS S3)
S3_ACCESS_KEY_ID=minioadmin         # Access key
S3_SECRET_ACCESS_KEY=minioadmin     # Secret key
S3_BUCKET=saas-documents            # Bucket name
S3_REGION=us-east-1                 # Region
S3_USE_SSL=false                    # Use SSL/TLS (true for production)
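
The pre-signed download URLs mentioned under Document Management can be produced against these settings with boto3, roughly as follows (the object key is a placeholder):

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
    region_name="us-east-1",
)

url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "saas-documents", "Key": "<object-key>"},
    ExpiresIn=3600,  # link stays valid for one hour
)
print(url)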

Redis Configuration

REDIS_URL=redis://redis:6379/0      # Redis connection URL
REDIS_MAX_CONNECTIONS=20            # Maximum connection pool size
REDIS_DECODE_RESPONSES=true         # Auto-decode responses to strings
REDIS_TOKEN_BLACKLIST_EXPIRE=86400  # Token blacklist TTL (24 hours)
REDIS_SESSION_EXPIRE=600            # SSO session TTL (10 minutes)
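
For example, the JWT blacklist described under User Management maps naturally onto Redis keys with a TTL. This is a hedged sketch using redis-py; the key prefix is an assumption:

import redis

r = redis.Redis.from_url("redis://redis:6379/0", decode_responses=True)


def blacklist_token(jti: str, expires_in: int = 86400) -> None:
    # Keep the revoked token's unique id (jti) only as long as it could still be replayed
    r.setex(f"token_blacklist:{jti}", expires_in, "revoked")


def is_token_revoked(jti: str) -> bool:
    return r.exists(f"token_blacklist:{jti}") == 1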

CORS Configuration

CORS_ORIGINS=http://localhost:3000,http://localhost:4999
CORS_SUPPORTS_CREDENTIALS=true
CORS_MAX_AGE=3600

Logging Configuration

LOG_LEVEL=DEBUG                     # Logging level: DEBUG/INFO/WARNING/ERROR
LOG_FORMAT=%(asctime)s - %(name)s - %(levelname)s - %(message)s
LOG_FILE=logs/app.log              # Log file path

Security Best Practices

NEVER commit .env files with actual credentials to version control!

  1. Generate Strong Secrets:

    # JWT Secret (64 characters)
    python -c "import secrets; print(secrets.token_urlsafe(64))"
    
    # Database Password (32 characters)
    python -c "import secrets; print(secrets.token_urlsafe(32))"
  2. Use Environment-Specific Files:

    • Development: .env.development
    • Production: .env.production (never commit with real values!)
  3. Enable SSL/TLS in Production:

    • Set S3_USE_SSL=true
    • Use sslmode=require in DATABASE_URL
    • Configure HTTPS for API server
  4. Restrict CORS Origins:

    • Development: Allow localhost
    • Production: Only allow production domains

HashiCorp Vault Integration

The platform supports HashiCorp Vault for centralized, secure secrets management as an alternative to environment variables.

Vault Features

  • Centralized Secrets: All secrets stored in one secure location
  • Audit Logging: Track all secret access and modifications
  • Dynamic Rotation: Automatically rotate secrets without downtime
  • Encryption: Secrets encrypted at rest and in transit
  • Access Control: Fine-grained policies for secret access

Vault Setup with Docker

The platform includes Vault as an optional service in Docker Compose:

# Start all services including Vault
docker-compose up -d

# Vault will be available at http://localhost:8201
# Default token (dev mode): root-token

Storing Secrets in Vault

# Initialize secrets in Vault (run once)
docker-compose exec vault sh -c '
  vault kv put secret/saas-platform \
    jwt_secret="$(head -c 32 /dev/urandom | xxd -p -c 64)" \
    db_password="secure_password_here" \
    db_user="postgres" \
    aws_access_key="AKIA..." \
    aws_secret="secret_key_here"
'

# View stored secrets
docker-compose exec vault vault kv get secret/saas-platform

# Update a specific secret
docker-compose exec vault vault kv patch secret/saas-platform \
  jwt_secret="new_secret_value"

Using Vault with Flask Commands

Important: When using Vault, Flask commands in Docker must use the wrapper script /app/flask-wrapper.sh:

# Database migrations with Vault
docker-compose exec api /app/flask-wrapper.sh db upgrade

# Create new migration with Vault
docker-compose exec api /app/flask-wrapper.sh db migrate -m "Add new field"

# Any Flask command with Vault secrets
docker-compose exec api /app/flask-wrapper.sh <command>

The wrapper script automatically:

  1. Connects to Vault
  2. Retrieves secrets
  3. Sets them as environment variables
  4. Executes the Flask command

Migrating from .env to Vault

If you have existing .env files, you can migrate them to Vault:

# Use the migration script
./backend/scripts/migrate_to_vault.sh

# Or manually migrate specific secrets
docker-compose exec vault vault kv put secret/saas-platform \
  jwt_secret="$JWT_SECRET_KEY" \
  db_password="$DATABASE_PASSWORD"

Vault in Production

For production deployments:

  1. Use Production Mode: Don't use dev mode

    vault:
      command: server  # Remove -dev flag
  2. Configure Storage Backend: Use Consul, etcd, or integrated storage

  3. Enable TLS: Secure all Vault communications

  4. Use AppRole Authentication: Instead of root tokens

  5. Set Up Policies: Restrict access to specific paths

  6. Enable Audit Logging: Track all operations

  7. Backup Regularly: Ensure you can recover secrets

Example production policy:

# backend/vault/policies/app-policy.hcl
path "secret/data/saas-platform/*" {
  capabilities = ["read", "list"]
}

Application Integration

The application automatically detects and uses Vault if configured:

# backend/app/config.py
# Automatically tries Vault first, falls back to env vars
if vault_available():
    load_secrets_from_vault()
else:
    load_from_environment()
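
A concrete version of those helpers could use the hvac client. The sketch below is an assumption about how such a loader might look, with secret names mirroring the `vault kv put secret/saas-platform ...` example above, not the platform's actual config.py:

import os

import hvac

VAULT_ADDR = os.getenv("VAULT_ADDR", "http://localhost:8201")


def vault_available() -> bool:
    try:
        return hvac.Client(url=VAULT_ADDR).sys.is_initialized()
    except Exception:
        return False


def load_secrets_from_vault() -> None:
    client = hvac.Client(url=VAULT_ADDR, token=os.environ["VAULT_TOKEN"])
    secret = client.secrets.kv.v2.read_secret_version(path="saas-platform")
    data = secret["data"]["data"]
    os.environ.setdefault("JWT_SECRET_KEY", data["jwt_secret"])
    os.environ.setdefault("DATABASE_PASSWORD", data["db_password"])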

Vault UI Access

Vault provides a web UI for managing secrets:


Database Migrations

The platform uses two different migration systems depending on the tables:

1. Main Database Migrations (Alembic)

For the main database tables (User, Tenant, UserTenantAssociation), we use Flask-Migrate (Alembic).

IMPORTANT: The documents and files tables are NOT migrated in the main database. These tables are tenant-specific and only exist in individual tenant databases. The migration system is configured to automatically exclude them from main database migrations via the include_object filter in backend/migrations/env.py.

Initialize Migrations (First Time Only)

# Navigate to backend directory
cd backend

# Initialize migrations folder
flask db init

Create Migration

# Auto-generate migration from model changes
flask db migrate -m "Description of changes"

# Example
flask db migrate -m "Add email verification field to User"

Apply Migration

# Upgrade to latest version
flask db upgrade

# Upgrade to specific version
flask db upgrade <revision>

# Downgrade to previous version
flask db downgrade

# Show current version
flask db current

# Show migration history
flask db history

Docker Migrations with Vault

⚠️ IMPORTANT: With the Vault integration, use the wrapper script /app/flask-wrapper.sh for all Flask commands in Docker:

# Run migrations in Docker container (with Vault)
docker-compose exec api /app/flask-wrapper.sh db upgrade

# Create new migration in Docker (with Vault)
docker-compose exec api /app/flask-wrapper.sh db migrate -m "Add new field"

# Show current migration version
docker-compose exec api /app/flask-wrapper.sh db current

# Show migration history
docker-compose exec api /app/flask-wrapper.sh db history

The wrapper script automatically loads the Vault environment variables before running the Flask commands.

Verify Table Exclusion

To verify that documents and files tables are correctly excluded from the main database:

# Check main database tables (should NOT include documents/files)
docker-compose exec postgres psql -U postgres -d saas_platform -c "\dt"

# Expected output: Only users, tenants, user_tenant_associations, alembic_version
# NOT documents or files

# When creating a migration, you should see exclusion logs:
docker-compose exec api /app/flask-wrapper.sh db migrate -m "Test migration"
# Expected logs:
# INFO  [alembic.env] Excluding tenant-specific table 'documents' from main database migration
# INFO  [alembic.env] Excluding tenant-specific table 'files' from main database migration

If you accidentally have documents or files tables in the main database (from old migrations), remove them:

# Remove incorrect tables from main database
docker-compose exec postgres psql -U postgres -d saas_platform -c "DROP TABLE IF EXISTS documents CASCADE;"
docker-compose exec postgres psql -U postgres -d saas_platform -c "DROP TABLE IF EXISTS files CASCADE;"

2. Tenant-Specific Migrations (Manual)

For tenant-specific tables (File, Document), which are created dynamically in each tenant's database, we use a custom versioning system.

Why Not Alembic for Tenant Tables?

  • Tenant databases are created dynamically when a tenant is created
  • Each tenant has isolated File and Document tables
  • Alembic doesn't manage these tables in the main database
  • We need a system to evolve these schemas across all tenant databases

Creating a Tenant Migration

Edit backend/app/tenant_db/tenant_migrations.py and add your migration:

from app.tenant_db.tenant_migrations import register_migration
from sqlalchemy import text

@register_migration(2)  # Use next sequential version number
def add_file_metadata_column(db):
    """Ajoute une colonne metadata JSONB à la table files"""
    db.execute(text("""
        ALTER TABLE files
        ADD COLUMN IF NOT EXISTS metadata JSONB DEFAULT '{}'::jsonb
    """))

@register_migration(3)
def add_document_version_column(db):
    """Ajoute une colonne version à la table documents"""
    db.execute(text("""
        ALTER TABLE documents
        ADD COLUMN IF NOT EXISTS version INTEGER DEFAULT 1 CHECK (version > 0)
    """))

Applying Tenant Migrations

Option 1: Automatic (for new tenants)

  • Migrations are automatically applied when creating a new tenant
  • No action needed

Option 2: Manual (for existing tenants)

# Dry-run mode (preview changes without applying)
python backend/scripts/migrate_all_tenants.py --dry-run

# Apply migrations to all tenants
python backend/scripts/migrate_all_tenants.py

# Migrate a specific tenant
python backend/scripts/migrate_all_tenants.py --tenant-id <tenant_id>

# View migration history
python backend/scripts/migrate_all_tenants.py --history

Docker:

# Dry-run
docker-compose exec api python scripts/migrate_all_tenants.py --dry-run

# Apply to all tenants
docker-compose exec api python scripts/migrate_all_tenants.py

# View history
docker-compose exec api python scripts/migrate_all_tenants.py --history

Migration Examples

See backend/app/tenant_db/tenant_migrations_examples.py for 13+ examples including:

  1. Adding simple columns
  2. Adding columns with constraints
  3. Creating indexes
  4. Adding PostgreSQL arrays
  5. Modifying column types
  6. Data migrations
  7. Creating related tables
  8. Soft delete implementation
  9. And more...

Tenant Migration Best Practices

  1. Always use IF NOT EXISTS: Migrations must be idempotent
  2. Test in dry-run mode first: Use --dry-run to preview changes
  3. Sequential version numbers: Use v1, v2, v3... without gaps
  4. Document each migration: Add clear docstrings
  5. Backup before migrating: Tenant data is critical
  6. Test on one tenant first: Use --tenant-id to test on a single tenant
  7. Handle failures gracefully: Failed migrations don't affect other tenants

Complete Database Reset (Development Only)

⚠️ WARNING: This will delete ALL data including all tenant databases!

Use this when you need to completely reset your development environment:

  • After deleting the database manually
  • When migrations are corrupted or out of sync
  • To regenerate migrations from model changes
  • To start fresh with a clean slate
# Option 1: Use automated reset script (recommended)
# IMPORTANT: Run this from the HOST machine, NOT from inside a container
./backend/scripts/reset_db.sh

# Option 2: Manual reset
# Step 1: Drop and recreate database
docker-compose exec postgres psql -U postgres -c "DROP DATABASE IF EXISTS saas_platform;"
docker-compose exec postgres psql -U postgres -c "CREATE DATABASE saas_platform;"

# Step 2: Generate migration from models
docker-compose exec api /app/flask-wrapper.sh db migrate -m "Initial migration"

# Step 3: Apply migration
docker-compose exec api /app/flask-wrapper.sh db upgrade

# Step 4: (Optional) Create admin user and test tenant
docker-compose exec api python scripts/init_db.py --create-admin --create-test-tenant

Note: The reset script automatically handles:

  • Database drop and recreation
  • Migration generation from current models
  • Table creation
  • Excluding tenant-specific models (File, Document) from main database

Migration Architecture Summary

┌─────────────────────────────────────────────────────────────────┐
│                     Database Migrations                          │
└─────────────────────────────────────────────────────────────────┘

Main Database (saas_platform)
├── User, Tenant, UserTenantAssociation
├── Managed by: Alembic/Flask-Migrate
├── Location: backend/migrations/versions/
└── Commands: flask db migrate, flask db upgrade

Tenant Databases (tenant_xyz_*)
├── File, Document (per tenant)
├── Managed by: Custom Migration System
├── Location: backend/app/tenant_db/tenant_migrations.py
└── Commands: python scripts/migrate_all_tenants.py

Migration Best Practices

  1. Always Review Auto-Generated Migrations: Alembic may not capture all changes correctly
  2. Test Migrations: Test upgrade and downgrade paths before production
  3. Backup Before Migrating: Always backup databases before running migrations
  4. Version Control: Commit migration files to Git
  5. Production Migrations: Use maintenance windows for production migrations
  6. Tenant Migrations: Always test in dry-run mode first

API Documentation

Interactive Swagger UI

The API includes an interactive Swagger UI interface for easy exploration and testing:

The Swagger UI provides:

  • Complete endpoint documentation with request/response examples
  • Interactive API testing directly from your browser
  • JWT authentication support (use the "Authorize" button)
  • Schema validation and example payloads
  • Download OpenAPI specification

OpenAPI/Swagger Specification

Complete API documentation is available in OpenAPI 3.0 format:

  • File: backend/swagger.yaml
  • Format: OpenAPI 3.0.3
  • Endpoints: 40+ documented endpoints
  • Schemas: 20+ reusable components

API Overview

Authentication Endpoints (/api/auth)

  • POST /api/auth/register - Register new user
  • POST /api/auth/login - User login (returns JWT tokens)
  • POST /api/auth/refresh - Refresh access token
  • POST /api/auth/logout - Logout (blacklist token)

User Endpoints (/api/users)

  • GET /api/users/me - Get current user profile
  • PUT /api/users/me - Update user profile
  • GET /api/users/me/tenants - Get user's tenants with roles

Tenant Endpoints (/api/tenants)

  • GET /api/tenants - List user's tenants
  • POST /api/tenants - Create new tenant
  • GET /api/tenants/{id} - Get tenant details with members
  • PUT /api/tenants/{id} - Update tenant (admin only)
  • DELETE /api/tenants/{id} - Soft delete tenant (admin only)
  • POST /api/tenants/{id}/users - Add user to tenant (admin only)
  • DELETE /api/tenants/{id}/users/{user_id} - Remove user from tenant (admin only)

Document Endpoints (/api/tenants/{tenant_id}/documents)

  • GET /documents - List documents (paginated)
  • POST /documents - Upload document
  • GET /documents/{id} - Get document details
  • PUT /documents/{id} - Update document metadata
  • DELETE /documents/{id} - Delete document
  • GET /documents/{id}/download - Get download URL

File Endpoints (/api/files/{tenant_id}/files)

  • GET /files - List files (paginated, with stats)
  • GET /files/{id} - Get file details
  • DELETE /files/{id} - Delete orphaned file (admin only)

Authentication

All protected endpoints require a JWT access token:

# Include token in Authorization header
Authorization: Bearer <access_token>
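
For example, from Python (a sketch assuming the requests package is installed; the response field names follow the token payload shown in the SSO callback example above):

import requests

API = "http://localhost:4999"

tokens = requests.post(f"{API}/api/auth/login", json={
    "email": "john.doe@example.com",
    "password": "SecurePass123",
}).json()

profile = requests.get(
    f"{API}/api/users/me",
    headers={"Authorization": f"Bearer {tokens['access_token']}"},
)
print(profile.json())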

Example API Calls

Register User

curl -X POST http://localhost:4999/api/auth/register \
  -H "Content-Type: application/json" \
  -d '{
    "first_name": "John",
    "last_name": "Doe",
    "email": "john.doe@example.com",
    "password": "SecurePass123"
  }'

Login

curl -X POST http://localhost:4999/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "john.doe@example.com",
    "password": "SecurePass123"
  }'

Create Tenant

curl -X POST http://localhost:4999/api/tenants \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <access_token>" \
  -d '{
    "name": "Acme Corp"
  }'

Upload Document

curl -X POST http://localhost:4999/api/tenants/<tenant_id>/documents \
  -H "Authorization: Bearer <access_token>" \
  -F "file=@/path/to/document.pdf"

For complete API documentation, see backend/swagger.yaml or use Swagger UI.


Testing

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=app tests/

# Run specific test file
pytest tests/unit/test_auth.py

# Run with verbose output
pytest -v

# Run and show print statements
pytest -s

Test Structure

tests/
├── unit/               # Unit tests (isolated, mocked dependencies)
│   ├── test_models.py
│   ├── test_schemas.py
│   └── test_services.py
├── integration/        # Integration tests (with database)
│   ├── test_auth_api.py
│   ├── test_tenants_api.py
│   └── test_documents_api.py
└── conftest.py        # Pytest fixtures and configuration

Writing Tests

# Example unit test
def test_user_password_hashing():
    user = User(email="test@example.com")
    user.set_password("12345678")

    assert user.check_password("12345678") is True
    assert user.check_password("wrongpassword") is False

# Example integration test
def test_register_user(client):
    response = client.post('/api/auth/register', json={
        'first_name': 'John',
        'last_name': 'Doe',
        'email': 'john@example.com',
        'password': 'SecurePass123'
    })

    assert response.status_code == 201
    assert 'user' in response.json

Test Coverage

Aim for 80%+ code coverage. Check coverage report:

# Generate coverage report
pytest --cov=app --cov-report=html tests/

# Open report in browser
open htmlcov/index.html

Project Structure

SaaS-Industry4.0-Backend/
├── backend/                      # Backend application
│   ├── app/                      # Main application package
│   │   ├── __init__.py           # App factory
│   │   ├── config.py             # Configuration classes
│   │   ├── extensions.py         # Flask extensions (db, jwt, etc.)
│   │   ├── models/               # SQLAlchemy models
│   │   │   ├── __init__.py
│   │   │   ├── base.py           # Base model with common fields
│   │   │   ├── user.py           # User model
│   │   │   ├── tenant.py         # Tenant model
│   │   │   ├── user_tenant_association.py  # Many-to-many
│   │   │   ├── document.py       # Document model (tenant DB)
│   │   │   └── file.py           # File model (tenant DB)
│   │   ├── schemas/              # Marshmallow schemas
│   │   │   ├── __init__.py
│   │   │   ├── user_schema.py
│   │   │   ├── tenant_schema.py
│   │   │   ├── document_schema.py
│   │   │   └── file_schema.py
│   │   ├── routes/               # API routes (blueprints)
│   │   │   ├── __init__.py
│   │   │   ├── auth.py           # Authentication routes
│   │   │   ├── users.py          # User routes
│   │   │   ├── tenants.py        # Tenant routes
│   │   │   ├── documents.py      # Document routes
│   │   │   ├── files.py          # File routes
│   │   │   └── kafka_demo.py     # Kafka demo routes
│   │   ├── utils/                # Utility modules
│   │   │   ├── __init__.py
│   │   │   ├── responses.py      # Response helpers
│   │   │   ├── decorators.py     # Custom decorators
│   │   │   └── database.py       # Database utilities
│   │   └── worker/               # Background workers
│   │       ├── __init__.py
│   │       ├── consumer.py       # Kafka consumer
│   │       └── producer.py       # Kafka producer
│   ├── migrations/               # Database migrations (Alembic)
│   ├── scripts/                  # Utility scripts
│   │   └── init_db.py            # Database initialization
│   ├── tests/                    # Test suite
│   │   ├── unit/                 # Unit tests
│   │   ├── integration/          # Integration tests
│   │   └── conftest.py           # Pytest fixtures
│   ├── swagger.yaml              # OpenAPI specification
│   ├── requirements.txt          # Python dependencies
│   └── run.py                    # Application entry point
├── docker/                       # Docker configuration
│   ├── Dockerfile.api            # API server Dockerfile
│   ├── Dockerfile.worker         # Worker Dockerfile
│   ├── build-api.sh              # API build script
│   └── build-worker.sh           # Worker build script
├── logs/                         # Application logs
├── uploads/                      # Temporary upload directory
├── .env.development              # Development environment
├── .env.docker                   # Docker environment
├── .env.production               # Production environment (template)
├── .dockerignore                 # Docker ignore file
├── .gitignore                    # Git ignore file
├── docker-compose.yml            # Docker Compose configuration
├── DOCKER.md                     # Docker deployment guide
├── README.md                     # This file
└── plan.md                       # Implementation plan

Key Directories

  • backend/app/models/: Database models using SQLAlchemy ORM
  • backend/app/schemas/: Request/response validation with Marshmallow
  • backend/app/routes/: API endpoints organized by blueprint
  • backend/app/utils/: Helper functions, decorators, and utilities
  • backend/app/worker/: Kafka consumer for background processing
  • backend/migrations/: Database migration files (Alembic)
  • backend/scripts/: Administrative scripts (DB init, seeding, etc.)
  • backend/tests/: Unit and integration tests
  • docker/: Dockerfiles and build scripts
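
To show how these directories fit together at startup, here is a minimal sketch of an app factory matching this layout; the extension and blueprint names are assumptions based on the tree above, not the exact source.

# backend/app/__init__.py -- illustrative app factory; extension and blueprint
# names are assumptions based on the project tree, not the exact source.
from flask import Flask

from app.config import DevelopmentConfig
from app.extensions import db, jwt
from app.routes.auth import auth_bp
from app.routes.tenants import tenants_bp


def create_app(config_class=DevelopmentConfig):
    app = Flask(__name__)
    app.config.from_object(config_class)

    # Initialize Flask extensions (SQLAlchemy, JWT, ...)
    db.init_app(app)
    jwt.init_app(app)

    # Register one blueprint per resource
    app.register_blueprint(auth_bp, url_prefix="/api/auth")
    app.register_blueprint(tenants_bp, url_prefix="/api/tenants")

    return app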

Deployment

Production Checklist

Before deploying to production, ensure the following (a configuration sketch follows the checklist):

  • Change JWT_SECRET_KEY to a strong random value (64+ characters)
  • Use strong database passwords (16+ characters, mixed case, numbers, symbols)
  • Enable SSL/TLS for all connections (database, Redis, Kafka, S3, Vault)
  • Restrict CORS origins to production domains only
  • Set FLASK_ENV=production and FLASK_DEBUG=0
  • Configure Redis for production (persistence, authentication, clustering if needed)
  • Configure rate limiting with Redis
  • Set up external logging service (Sentry, CloudWatch, etc.)
  • Configure monitoring and alerting (Prometheus, Grafana, etc.)
  • Implement backup strategy for databases, S3, and Vault
  • Set up CDN for static assets
  • Configure load balancer with health checks
  • Enable auto-scaling policies
  • Document disaster recovery plan
  • Configure HashiCorp Vault for production (no dev mode, proper backend, TLS)
  • Set up Vault policies and audit logging
  • Implement secret rotation strategy with Vault
  • Configure security headers (CSP, HSTS, etc.)
  • Set up email service (SendGrid, Mailgun, etc.)
  • Test all endpoints with production data volumes
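
As a sketch of how several checklist items map onto configuration, the snippet below shows an illustrative ProductionConfig that reads secrets from environment variables; the class and variable names are assumptions, adapt them to backend/app/config.py.

# Illustrative ProductionConfig; names are assumptions -- adapt to the actual
# classes in backend/app/config.py. All secrets come from the environment.
import os


class ProductionConfig:
    DEBUG = False
    TESTING = False

    # Strong random secret (64+ characters), never committed to the repository
    JWT_SECRET_KEY = os.environ["JWT_SECRET_KEY"]

    # PostgreSQL over TLS, e.g. postgresql+psycopg2://...?sslmode=require
    SQLALCHEMY_DATABASE_URI = os.environ["DATABASE_URL"]

    # Restrict CORS to production domains only
    CORS_ORIGINS = os.environ.get("CORS_ORIGINS", "https://app.example.com").split(",")

    # Redis used for rate limiting and the JWT blacklist
    REDIS_URL = os.environ["REDIS_URL"]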

Docker Production Deployment

See DOCKER.md for detailed production deployment instructions with Docker Compose.

Cloud Deployment Options

AWS

  • Compute: ECS/Fargate for containers, EC2 for VMs
  • Database: RDS for PostgreSQL (Multi-AZ for HA)
  • Storage: S3 for file storage
  • Message Queue: MSK (Managed Kafka) or Amazon MQ
  • Load Balancer: ALB with health checks
  • Monitoring: CloudWatch, X-Ray
  • Secrets: AWS Secrets Manager

Google Cloud Platform

  • Compute: GKE for Kubernetes, Cloud Run for containers
  • Database: Cloud SQL for PostgreSQL
  • Storage: Cloud Storage
  • Message Queue: Pub/Sub or Confluent Cloud
  • Load Balancer: Cloud Load Balancing
  • Monitoring: Cloud Monitoring, Cloud Logging
  • Secrets: Secret Manager

Azure

  • Compute: AKS for Kubernetes, Container Instances
  • Database: Azure Database for PostgreSQL
  • Storage: Blob Storage
  • Message Queue: Event Hubs (Kafka-compatible)
  • Load Balancer: Azure Load Balancer
  • Monitoring: Azure Monitor, Application Insights
  • Secrets: Key Vault

Kubernetes Deployment

For Kubernetes deployment:

  1. Create Kubernetes manifests from Docker Compose:

    kompose convert -f docker-compose.yml
  2. Adjust generated manifests for production requirements

  3. Set up Helm charts for easier management

  4. Configure Ingress for external access

  5. Set up persistent volumes for databases

Performance Optimization

  • Database: Use connection pooling, read replicas, query optimization
  • Caching with Redis (see the sketch after this list):
    • Session storage for horizontal scaling
    • API response caching to reduce database load
    • Token blacklist management across instances
    • Rate limiting implementation
  • CDN: Use CloudFront, Cloudflare for static assets
  • Load Balancing: Distribute traffic across multiple API instances
  • Async Processing: Offload heavy operations to Kafka workers
  • Monitoring: Track API response times, database queries, error rates, Redis memory usage
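
A minimal sketch of Redis-backed response caching with redis-py; the key naming, TTL, and the decorated list_documents() function are illustrative, not the project's actual caching layer.

# Illustrative Redis-backed response cache using redis-py; key naming, TTL,
# and list_documents() are assumptions, not the project's actual caching layer.
import functools
import json

import redis

cache = redis.Redis(host="localhost", port=6379, db=0)


def cached(ttl_seconds=60):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = f"cache:{fn.__name__}:{json.dumps([args, kwargs], default=str)}"
            hit = cache.get(key)
            if hit is not None:
                return json.loads(hit)                       # cache hit
            result = fn(*args, **kwargs)                     # cache miss
            cache.setex(key, ttl_seconds, json.dumps(result, default=str))
            return result
        return wrapper
    return decorator


@cached(ttl_seconds=30)
def list_documents(tenant_id):
    # Placeholder for the real database query
    return [{"tenant_id": tenant_id, "documents": []}]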

Contributing

We welcome contributions! Please follow these guidelines:

Development Workflow

  1. Fork the Repository

    git clone https://github.com/ysimonx/SaaS-Industry4.0-Backend.git
    cd SaaS-Industry4.0-Backend

  2. Create Feature Branch

    git checkout -b feature/your-feature-name

  3. Make Changes

    • Follow existing code style
    • Add tests for new features
    • Update documentation

  4. Run Tests

    pytest
    black backend/
    flake8 backend/

  5. Commit Changes

    git add .
    git commit -m "feat: Add your feature description"

  6. Push and Create PR

    git push origin feature/your-feature-name

Code Style

  • Python: Follow PEP 8, use black for formatting
  • Imports: Group by standard library, third-party, local
  • Docstrings: Use Google-style docstrings
  • Type Hints: Use type hints for function parameters and return values
  • Comments: Explain "why", not "what"

Commit Messages

Follow Conventional Commits:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • style: Code style changes (formatting, etc.)
  • refactor: Code refactoring
  • test: Adding or updating tests
  • chore: Maintenance tasks

Pull Request Process

  1. Ensure all tests pass
  2. Update documentation if needed
  3. Add entry to CHANGELOG.md
  4. Request review from maintainers
  5. Address review feedback
  6. Squash commits before merge

License

This project is licensed under the MIT License - see the LICENSE file for details.

MIT License

Copyright (c) 2025 SaaS Platform Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Support

Documentation

Getting Help

Reporting Bugs

When reporting bugs, please include:

  1. Environment details (OS, Python version, Docker version)
  2. Steps to reproduce the issue
  3. Expected behavior
  4. Actual behavior
  5. Error messages and logs
  6. Screenshots (if applicable)

Suggesting Features

Feature requests are welcome! Please:

  1. Check existing issues first
  2. Describe the use case
  3. Explain the expected behavior
  4. Provide examples if possible

Acknowledgments

  • Flask framework and ecosystem
  • PostgreSQL community
  • Apache Kafka project
  • MinIO team
  • All open-source contributors

Roadmap

Version 1.1 (Upcoming)

  • Swagger UI integration
  • Advanced search and filtering
  • Bulk operations API
  • Audit logging UI
  • Email notifications

Version 1.2 (Future)

  • Real-time notifications (WebSockets)
  • Advanced analytics dashboard
  • Multi-region support
  • Data export/import tools
  • OAuth2 provider integration

Version 2.0 (Long-term)

  • GraphQL API
  • Mobile SDK
  • Workflow automation
  • AI/ML integrations
  • Advanced reporting

Built with ❤️ using Flask, PostgreSQL, Kafka, and S3
