Infrastructure Guide

This guide documents the complete infrastructure for the Orion platform, from development to high-end production.

Philosophy: We prioritize debuggability and operational simplicity over complexity. Every component should be directly accessible for troubleshooting.



Architecture Overview

System Components

┌─────────────────────────────────────────────────────────────────────────┐
│                              CLIENTS                                     │
│  (Browsers, Mobile Apps, API Consumers)                                  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         LOAD BALANCER / PROXY                            │
│  (Nginx, Caddy, or Cloud LB)                                            │
│  - SSL termination                                                       │
│  - Static file serving                                                   │
│  - Rate limiting                                                         │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                    ┌───────────────┼───────────────┐
                    ▼               ▼               ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         APPLICATION SERVERS                              │
│  (FastAPI + Uvicorn)                                                     │
│  - API endpoints                                                         │
│  - HTML rendering (Jinja2)                                              │
│  - WebSocket connections                                                 │
└─────────────────────────────────────────────────────────────────────────┘
                    │               │               │
                    ▼               ▼               ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│   PostgreSQL     │ │      Redis       │ │   File Storage   │
│   (Primary DB)   │ │  (Cache/Queue)   │ │  (S3/Local)      │
└──────────────────┘ └──────────────────┘ └──────────────────┘
                                │
                                ▼
                    ┌──────────────────┐
                    │  Celery Workers  │
                    │ (Background Jobs)│
                    └──────────────────┘

Data Flow

  1. Request → Nginx → Uvicorn → FastAPI → Service Layer → Database
  2. Background Job → API creates task → Redis Queue → Celery Worker → Database
  3. Static Files → Nginx serves directly (or CDN in production)
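
The background-job flow (item 2) can be sketched with the standard library alone. This is a minimal simulation, not the platform's code: an in-process `queue.Queue` stands in for the Redis queue, a thread stands in for the Celery worker, a list stands in for the database, and the `run_demo` and `api_create_task` names are hypothetical.

```python
import queue
import threading


def run_demo() -> list[str]:
    """Simulate flow 2: the API enqueues a task, a worker processes it."""
    task_queue: queue.Queue = queue.Queue()  # stand-in for the Redis queue
    database: list[str] = []                 # stand-in for PostgreSQL

    def api_create_task(order_id: int) -> dict:
        # The request handler only enqueues the job and returns at once.
        task_queue.put({"task": "send_order_confirmation", "order_id": order_id})
        return {"status": "accepted", "order_id": order_id}

    def worker() -> None:
        # The worker drains the queue until it sees the None sentinel.
        while (job := task_queue.get()) is not None:
            database.append(f"{job['task']}:{job['order_id']}")

    thread = threading.Thread(target=worker)
    thread.start()
    api_create_task(1)
    api_create_task(2)
    task_queue.put(None)  # tell the worker to stop
    thread.join()
    return database
```

The point of the pattern is that the request returns before the slow work runs; the worker writes the result to the database on its own schedule.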

Current State

What We Have Now

| Component | Technology | Dev Required | Prod Required | Status |
|---|---|---|---|---|
| Web Framework | FastAPI + Uvicorn | | | Production Ready |
| Database | PostgreSQL 15 | | | Production Ready |
| ORM | SQLAlchemy 2.0 | | | Production Ready |
| Migrations | Alembic | | | Production Ready |
| Templates | Jinja2 + Tailwind CSS | | | Production Ready |
| Authentication | JWT (PyJWT) | | | Production Ready |
| Email | SMTP/SendGrid/Mailgun/SES | | | Production Ready |
| Payments | Stripe | | | Production Ready |
| Task Queue | Celery 5.3 + Redis | | | Production Ready |
| Task Scheduler | Celery Beat | | | Production Ready |
| Task Monitoring | Flower | | Optional | Production Ready |
| Caching | Redis 7 | | | Production Ready |
| File Storage | Local / Cloudflare R2 | Local | R2 | Production Ready |
| Error Tracking | Sentry | | Recommended | Production Ready |
| CDN / WAF | Cloudflare | | Recommended | Production Ready |

Legend: Required | Optional/Recommended | Not needed

Development vs Production

Development requires only:

  • PostgreSQL (via Docker: make docker-up)
  • Python 3.11+ with dependencies

Production adds:

  • Redis (for Celery task queue)
  • Celery workers (for background tasks)
  • Reverse proxy (Nginx)
  • SSL certificates

Optional but recommended for Production:

  • Sentry (error tracking) - Set SENTRY_DSN to enable
  • Cloudflare R2 (cloud storage) - Set STORAGE_BACKEND=r2 to enable
  • Cloudflare CDN (caching/DDoS) - Set CLOUDFLARE_ENABLED=true to enable
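
A minimal sketch of how these three switches could be read at startup. The variable names come from the list above; the `OptionalServices` class, the `load_optional_services` helper, and the `"local"` default for `STORAGE_BACKEND` are assumptions made for illustration, not the application's actual settings code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OptionalServices:
    sentry_enabled: bool
    storage_backend: str
    cloudflare_enabled: bool


def load_optional_services(env=None) -> OptionalServices:
    """Derive the optional-service toggles from environment variables."""
    env = os.environ if env is None else env
    return OptionalServices(
        # Sentry is considered enabled as soon as a non-empty DSN is set.
        sentry_enabled=bool(env.get("SENTRY_DSN", "").strip()),
        # Assumed default: local file storage when the variable is unset.
        storage_backend=env.get("STORAGE_BACKEND", "local").strip().lower(),
        cloudflare_enabled=env.get("CLOUDFLARE_ENABLED", "false").strip().lower() == "true",
    )
```

With no variables set, all three services stay off and storage stays local, which matches the development defaults described above.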

What We Need for Enterprise (Future Growth)

| Component | Priority | When Needed | Estimated Users |
|---|---|---|---|
| Load Balancer | Medium | Horizontal scaling | 1,000+ concurrent |
| Database Replica | Medium | Read-heavy workloads | 1,000+ concurrent |
| Redis Sentinel | Low | Cache redundancy | 5,000+ concurrent |
| Prometheus/Grafana | Low | Advanced metrics | Any (nice to have) |
| Kubernetes | Low | Multi-region/HA | 10,000+ concurrent |

Development Environment

# 1. Start PostgreSQL and Redis
make docker-up

# 2. Run migrations
make migrate-up

# 3. Initialize data
make init-prod

# 4. Start development server
make dev

# 5. (Optional) Start Celery worker for background tasks
make celery-dev  # Worker + Beat together

# 6. (Optional) Run tests
make test

Services Running Locally

| Service | Host | Port | Purpose |
|---|---|---|---|
| FastAPI | localhost | 8000 | Main application |
| PostgreSQL | localhost | 5432 | Development database |
| PostgreSQL (test) | localhost | 5433 | Test database |
| Redis | localhost | 6380 | Cache and task broker |
| Celery Worker | - | - | Background task processing |
| Celery Beat | - | - | Scheduled task scheduler |
| Flower | localhost | 5555 | Task monitoring dashboard |
| MkDocs | localhost | 9991 | Documentation |
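
To confirm which of these services are actually listening, a small standard-library check can help. This is a sketch: the port numbers are taken from the table above, while the `LOCAL_SERVICES` mapping and both helper names are hypothetical.

```python
import socket

# Ports copied from the local services table.
LOCAL_SERVICES = {
    "FastAPI": 8000,
    "PostgreSQL": 5432,
    "PostgreSQL (test)": 5433,
    "Redis": 6380,
    "Flower": 5555,
    "MkDocs": 9991,
}


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_local_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in LOCAL_SERVICES.items()}
```

Calling `check_local_services()` returns a name-to-boolean mapping. An open port only shows that something is listening there, not that it is the expected service.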

Docker Compose Services

# docker-compose.yml
services:
  db:            # PostgreSQL 15 for development
  redis:         # Redis 7 for cache/queue
  api:           # FastAPI application (profile: full)
  celery-worker: # Background task processor (profile: full)
  celery-beat:   # Scheduled task scheduler (profile: full)
  flower:        # Task monitoring UI (profile: full)

Celery Commands

# Start worker only
make celery-worker

# Start scheduler only
make celery-beat

# Start worker + scheduler together (development)
make celery-dev

# Start Flower monitoring
make flower

# Check worker status
make celery-status

# Purge pending tasks
make celery-purge

Production Options

Option 1: Traditional VPS

Best for: Teams that want direct server access and are familiar with Linux administration.

┌─────────────────────────────────────────────────────────────┐
│                         VPS (4GB+ RAM)                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   Nginx     │  │  Uvicorn    │  │ PostgreSQL  │          │
│  │  (reverse   │  │  (4 workers)│  │  (local)    │          │
│  │   proxy)    │  │             │  │             │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│         │                │                │                  │
│         └────────────────┼────────────────┘                  │
│                          │                                   │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │   Redis     │  │   Celery    │                           │
│  │  (local)    │  │  (workers)  │                           │
│  └─────────────┘  └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘

Setup:

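# On Ubuntu 22.04+ VPS

# 1. Install system packages
# Note: Ubuntu's default repositories ship a different PostgreSQL major
# version (14 on 22.04), so postgresql-15 requires adding the PostgreSQL
# apt repository (PGDG) first.
sudo apt update
sudo apt install -y nginx postgresql-15 redis-server python3.11 python3.11-venv

# 2. Create application user
sudo useradd -m -s /bin/bash orion
sudo su - orion

# 3. Clone and setup
git clone <repo> /home/orion/app
cd /home/orion/app
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 4. Configure environment
cp .env.example .env
nano .env  # Edit with production values

# 5. Setup database
# The two sudo commands must run from an account with sudo rights,
# not from the orion shell opened in step 2.
sudo -u postgres createuser --pwprompt orion_user  # prompts for the password used in DATABASE_URL
sudo -u postgres createdb orion_db -O orion_user
# Back in the orion shell, with the virtualenv active:
alembic upgrade head
python scripts/seed/init_production.py

# 6. Create systemd service (from the sudo-capable account)
sudo nano /etc/systemd/system/orion.service

# 7. After saving the unit files, load and start them
sudo systemctl daemon-reload
sudo systemctl enable --now orion orion-celery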

Systemd Service:

# /etc/systemd/system/orion.service
[Unit]
Description=Orion API
After=network.target postgresql.service redis.service

[Service]
User=orion
Group=orion
WorkingDirectory=/home/orion/app
Environment="PATH=/home/orion/app/.venv/bin"
EnvironmentFile=/home/orion/app/.env
ExecStart=/home/orion/app/.venv/bin/uvicorn main:app --host 127.0.0.1 --port 8000 --workers 4
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Celery Workers:

# /etc/systemd/system/orion-celery.service
[Unit]
Description=Orion Celery Worker
After=network.target redis.service

[Service]
User=orion
Group=orion
WorkingDirectory=/home/orion/app
Environment="PATH=/home/orion/app/.venv/bin"
EnvironmentFile=/home/orion/app/.env
ExecStart=/home/orion/app/.venv/bin/celery -A app.celery worker --loglevel=info --concurrency=4
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Nginx Configuration:

# /etc/nginx/sites-available/orion
server {
    listen 80;
    server_name yourdomain.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;

    # Static files (served directly by Nginx)
    location /static {
        alias /home/orion/app/static;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }

    # Uploaded files
    location /uploads {
        alias /home/orion/app/uploads;
        expires 7d;
    }

    # API and application
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (for future real-time features)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

Troubleshooting Commands:

# Check service status
sudo systemctl status orion
sudo systemctl status orion-celery
sudo systemctl status postgresql
sudo systemctl status redis

# View logs
sudo journalctl -u orion -f
sudo journalctl -u orion-celery -f

# Connect to database directly
sudo -u postgres psql orion_db

# Check Redis
redis-cli ping
redis-cli monitor  # Watch commands in real-time

# Restart services
sudo systemctl restart orion
sudo systemctl restart orion-celery

Option 2: Docker Compose Production

Best for: Consistent environments, easy rollbacks, container familiarity.

# docker-compose.prod.yml
services:
  api:
    build: .
    restart: always
    ports:
      - "127.0.0.1:8000:8000"
    environment:
      DATABASE_URL: postgresql://orion_user:${DB_PASSWORD}@db:5432/orion_db
      REDIS_URL: redis://redis:6379/0
      CELERY_BROKER_URL: redis://redis:6379/1
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - ./uploads:/app/uploads
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  celery:
    build: .
    restart: always
    command: celery -A app.celery worker --loglevel=info --concurrency=4
    environment:
      DATABASE_URL: postgresql://orion_user:${DB_PASSWORD}@db:5432/orion_db
      REDIS_URL: redis://redis:6379/0
      CELERY_BROKER_URL: redis://redis:6379/1
    depends_on:
      - db
      - redis
    volumes:
      - ./logs:/app/logs

  celery-beat:
    build: .
    restart: always
    command: celery -A app.celery beat --loglevel=info
    environment:
      DATABASE_URL: postgresql://orion_user:${DB_PASSWORD}@db:5432/orion_db
      CELERY_BROKER_URL: redis://redis:6379/1
    depends_on:
      - redis

  db:
    image: postgres:15
    restart: always
    environment:
      POSTGRES_DB: orion_db
      POSTGRES_USER: orion_user
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U orion_user -d orion_db"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    restart: always
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  nginx:
    image: nginx:alpine
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./static:/app/static:ro
      - ./uploads:/app/uploads:ro
      - /etc/letsencrypt:/etc/letsencrypt:ro
    depends_on:
      - api

volumes:
  postgres_data:
  redis_data:

Troubleshooting Commands:

# View all containers
docker compose -f docker-compose.prod.yml ps

# View logs
docker compose -f docker-compose.prod.yml logs -f api
docker compose -f docker-compose.prod.yml logs -f celery

# Access container shell
docker compose -f docker-compose.prod.yml exec api bash
docker compose -f docker-compose.prod.yml exec db psql -U orion_user -d orion_db

# Restart specific service
docker compose -f docker-compose.prod.yml restart api

# View resource usage
docker stats
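
A deploy script can wait for the API with the same logic as the `api` healthcheck above (3 retries, 10s timeout, 30s interval). The sketch below uses only the standard library. The `/health` path comes from the compose file; the `wait_for_health` name and its defaults are assumptions.

```python
import time
import urllib.error
import urllib.request

# Bypass any system proxy settings, since the target is a local service.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_health(url: str, retries: int = 3, interval: float = 30.0,
                    timeout: float = 10.0) -> bool:
    """Return True as soon as `url` answers with a 2xx status."""
    for attempt in range(retries):
        try:
            with _OPENER.open(url, timeout=timeout) as response:
                if 200 <= response.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # connection refused, timeout, or HTTP error status
        if attempt < retries - 1:
            time.sleep(interval)
    return False
```

For example, `wait_for_health("http://localhost:8000/health")` returns `False` if the API never answers with a success status, which a script can turn into a non-zero exit code.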

Option 3: Managed Services (Minimal Ops)

Best for: Small teams, focus on product not infrastructure.

| Component | Service | Cost (approx) |
|---|---|---|
| App Hosting | Railway / Render / Fly.io | $5-25/mo |
| Database | Neon / Supabase / PlanetScale | $0-25/mo |
| Redis | Upstash / Redis Cloud | $0-10/mo |
| File Storage | Cloudflare R2 / AWS S3 | $0-5/mo |
| Email | Resend / SendGrid | $0-20/mo |

Example: Railway + Neon

# Deploy to Railway
railway login
railway init
railway up

# Configure environment
railway variables set DATABASE_URL="postgresql://..."
railway variables set REDIS_URL="redis://..."

Future High-End Architecture

Target Production Architecture

                            ┌─────────────────┐
                            │   Cloudflare    │
                            │   (CDN + WAF)   │
                            └────────┬────────┘
                                     │
                            ┌────────▼────────┐
                            │  Load Balancer  │
                            │  (HAProxy/ALB)  │
                            └────────┬────────┘
                                     │
              ┌──────────────────────┼──────────────────────┐
              │                      │                      │
     ┌────────▼────────┐   ┌────────▼────────┐   ┌────────▼────────┐
     │   API Server 1  │   │   API Server 2  │   │   API Server N  │
     │   (Uvicorn)     │   │   (Uvicorn)     │   │   (Uvicorn)     │
     └────────┬────────┘   └────────┬────────┘   └────────┬────────┘
              │                      │                      │
              └──────────────────────┼──────────────────────┘
                                     │
         ┌───────────────────────────┼───────────────────────────┐
         │                           │                           │
┌────────▼────────┐        ┌────────▼────────┐        ┌────────▼────────┐
│   PostgreSQL    │        │     Redis       │        │   S3 / MinIO    │
│   (Primary)     │        │   (Cluster)     │        │   (Files)       │
│        │        │        │                 │        │                 │
│   ┌────▼────┐   │        │   ┌─────────┐   │        │                 │
│   │ Replica │   │        │   │ Sentinel│   │        │                 │
│   └─────────┘   │        │   └─────────┘   │        │                 │
└─────────────────┘        └─────────────────┘        └─────────────────┘
                                     │
              ┌──────────────────────┼──────────────────────┐
              │                      │                      │
     ┌────────▼────────┐   ┌────────▼────────┐   ┌────────▼────────┐
     │ Celery Worker 1 │   │ Celery Worker 2 │   │ Celery Beat     │
     │ (General)       │   │ (Import Jobs)   │   │ (Scheduler)     │
     └─────────────────┘   └─────────────────┘   └─────────────────┘

                    ┌─────────────────────────────┐
                    │       Monitoring Stack      │
                    │  ┌─────────┐ ┌───────────┐  │
                    │  │Prometheus│ │  Grafana  │  │
                    │  └─────────┘ └───────────┘  │
                    │  ┌─────────┐ ┌───────────┐  │
                    │  │  Sentry │ │  Loki     │  │
                    │  └─────────┘ └───────────┘  │
                    └─────────────────────────────┘

Celery Task Queues

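# app/celery.py (to be implemented)
from celery import Celery

# Assumed import path for the settings object; adjust to the project's layout.
from app.core.config import settings

celery_app = Celery(
    "orion",
    broker=settings.celery_broker_url,
    backend=settings.celery_result_backend,
)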

celery_app.conf.task_queues = {
    "default": {"exchange": "default", "routing_key": "default"},
    "imports": {"exchange": "imports", "routing_key": "imports"},
    "emails": {"exchange": "emails", "routing_key": "emails"},
    "reports": {"exchange": "reports", "routing_key": "reports"},
}

celery_app.conf.task_routes = {
    "app.tasks.import_letzshop_products": {"queue": "imports"},
    "app.tasks.send_email": {"queue": "emails"},
    "app.tasks.generate_report": {"queue": "reports"},
}

Background Tasks to Implement

| Task | Queue | Priority | Description |
|---|---|---|---|
| import_letzshop_products | imports | High | Marketplace product sync |
| import_letzshop_orders | imports | High | Order sync from Letzshop |
| send_order_confirmation | emails | High | Order emails |
| send_password_reset | emails | High | Auth emails |
| send_invoice_email | emails | Medium | Invoice delivery |
| generate_sales_report | reports | Low | Analytics reports |
| cleanup_expired_sessions | default | Low | Maintenance |
| sync_stripe_subscriptions | default | Medium | Billing sync |
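
The task-to-queue mapping in this table can be turned into a routing dictionary of the same shape as `task_routes`. This is a sketch: the `app.tasks.` module prefix and the `build_task_routes` and `queue_for` helpers are assumptions, and the fallback to `"default"` mirrors the table rather than any confirmed Celery setting.

```python
# Task names and queues copied from the table of tasks to implement.
TASK_QUEUES = {
    "import_letzshop_products": "imports",
    "import_letzshop_orders": "imports",
    "send_order_confirmation": "emails",
    "send_password_reset": "emails",
    "send_invoice_email": "emails",
    "generate_sales_report": "reports",
    "cleanup_expired_sessions": "default",
    "sync_stripe_subscriptions": "default",
}


def build_task_routes(task_queues: dict[str, str], module: str = "app.tasks") -> dict:
    """Build a task_routes-style mapping: full task name -> {"queue": name}."""
    return {f"{module}.{name}": {"queue": q} for name, q in task_queues.items()}


def queue_for(task_name: str, routes: dict, default: str = "default") -> str:
    """Look up the queue for a task, falling back to the default queue."""
    return routes.get(task_name, {}).get("queue", default)
```

Generating the routes from one table keeps the documentation and the configuration from drifting apart as tasks are added.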

Component Deep Dives

PostgreSQL Configuration

Production Settings (postgresql.conf):

# Memory (adjust based on server RAM)
shared_buffers = 256MB          # 25% of RAM for dedicated DB server
effective_cache_size = 768MB    # 75% of RAM
work_mem = 16MB
maintenance_work_mem = 128MB

# Connections
max_connections = 100

# Write-Ahead Log
wal_level = replica
max_wal_senders = 3

# Query Planning
random_page_cost = 1.1          # For SSD storage
effective_io_concurrency = 200  # For SSD storage

# Logging
log_min_duration_statement = 1000  # Log queries > 1 second
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
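
The two memory comments above (25% and 75% of RAM) can be applied to any server size. A minimal sketch, assuming a dedicated database server and a hypothetical `suggest_pg_memory` helper; the values shown in the config correspond to 1 GB of RAM.

```python
def suggest_pg_memory(ram_mb: int) -> dict[str, str]:
    """Apply the 25% / 75% rules of thumb to a RAM size given in MB."""
    if ram_mb <= 0:
        raise ValueError("ram_mb must be positive")
    return {
        "shared_buffers": f"{ram_mb // 4}MB",            # 25% of RAM
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",  # 75% of RAM
    }
```

On a server that also runs the application, Redis, and Celery, the real budget for PostgreSQL is smaller than total RAM, so these numbers are an upper bound rather than a recommendation.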

Backup Strategy:

#!/bin/bash
# Daily backup script
set -euo pipefail  # abort if pg_dump (or any pipeline stage) fails

BACKUP_DIR=/backups/postgresql
DATE=$(date +%Y%m%d_%H%M%S)
pg_dump -U orion_user orion_db | gzip > "$BACKUP_DIR/orion_$DATE.sql.gz"

# Keep last 7 days
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 -delete
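
The retention step can also be expressed in Python, which makes it easy to test before pointing it at real backups. This is a sketch with hypothetical function names. It uses a strict 7-day cutoff on modification time, which is close to, but not byte-for-byte identical with, the rounding that `find -mtime +7` applies.

```python
import time
from pathlib import Path


def expired_backups(backup_dir, max_age_days: int = 7, now=None) -> list[Path]:
    """List *.sql.gz files whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(backup_dir).glob("*.sql.gz") if p.stat().st_mtime < cutoff
    )


def prune_backups(backup_dir, max_age_days: int = 7, now=None) -> list[Path]:
    """Delete expired backups and return the paths that were removed."""
    removed = expired_backups(backup_dir, max_age_days, now)
    for path in removed:
        path.unlink()
    return removed
```

Calling `expired_backups` first gives a dry run: it shows what would be deleted without touching any file.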

Redis Configuration

Use Cases:

| Use Case | Database | TTL | Description |
|---|---|---|---|
| Session Cache | 0 | 24h | User sessions |
| API Rate Limiting | 0 | 1h | Request counters |
| Celery Broker | 1 | - | Task queue |
| Celery Results | 2 | 24h | Task results |
| Feature Flags | 3 | 5m | Feature gate cache |
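
Each use case maps to a Redis database number, which becomes the last path segment of the connection URL. A small sketch, where the database numbers come from the table and the purpose keys, the `redis_url` helper, and the `redis` host default are assumptions chosen to match the Docker Compose example.

```python
# Database numbers from the use-case table; sessions and rate limiting share DB 0.
REDIS_DATABASES = {
    "cache": 0,
    "celery_broker": 1,
    "celery_results": 2,
    "feature_flags": 3,
}


def redis_url(purpose: str, host: str = "redis", port: int = 6379) -> str:
    """Build the redis:// URL for one of the documented use cases."""
    return f"redis://{host}:{port}/{REDIS_DATABASES[purpose]}"
```

With the defaults, `redis_url("celery_broker")` yields the same value as `CELERY_BROKER_URL` in the production compose file.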

Configuration (redis.conf):

maxmemory 256mb
maxmemory-policy allkeys-lru
appendonly yes
appendfsync everysec

Nginx Tuning

# /etc/nginx/nginx.conf
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

http {
    # Buffers
    client_body_buffer_size 10K;
    client_header_buffer_size 1k;
    client_max_body_size 50M;
    large_client_header_buffers 2 1k;  # tight: raise toward the nginx default (4 8k) if long JWT or cookie headers get HTTP 400

    # Timeouts
    client_body_timeout 12;
    client_header_timeout 12;
    keepalive_timeout 15;
    send_timeout 10;

    # Gzip
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml application/json application/javascript;
}
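
For capacity planning, the `worker_processes` and `worker_connections` values bound how many clients Nginx can hold at once. The sketch below applies a common rule of thumb: `worker_connections` counts every connection, including those to the upstream, so a proxied client uses about two. The helper name and the example of 4 worker processes (what `auto` would pick on a 4-core machine) are assumptions.

```python
def max_client_connections(worker_processes: int, worker_connections: int,
                           reverse_proxy: bool = True) -> int:
    """Estimate the ceiling on simultaneous clients for an Nginx instance."""
    total = worker_processes * worker_connections
    # When proxying, each client also holds one upstream connection.
    return total // 2 if reverse_proxy else total
```

With 4 workers and `worker_connections 4096`, the estimate is 8192 proxied clients, well above what a single Uvicorn host is likely to serve, so this setting is rarely the first bottleneck.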

Troubleshooting Guide

Quick Diagnostics

# Check all services
systemctl status orion orion-celery postgresql redis nginx

# Check ports
ss -tlnp | grep -E '(8000|5432|6379|80|443)'

# Check disk space
df -h

# Check memory
free -h

# Check CPU/processes
htop

Database Issues

# Connect to database
sudo -u postgres psql orion_db

-- The statements below run inside the psql session

-- Check active connections
SELECT count(*) FROM pg_stat_activity;

-- Find slow queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;

-- Kill stuck query (replace <pid> with a pid returned by the query above)
SELECT pg_terminate_backend(<pid>);

-- Check table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Analyze query performance
EXPLAIN ANALYZE SELECT ...;

Redis Issues

# Check connectivity
redis-cli ping

# Monitor real-time commands
redis-cli monitor

# Check memory usage
redis-cli info memory

# List all keys (careful in production!)
redis-cli --scan

# Check queue lengths (the Celery broker uses database 1; "celery" is Celery's default queue name)
redis-cli -n 1 llen celery

# Flush specific database
redis-cli -n 1 flushdb  # Flush Celery broker

Celery Issues

# Check worker status
celery -A app.celery inspect active
celery -A app.celery inspect reserved
celery -A app.celery inspect stats

# Purge all pending tasks
celery -A app.celery purge

# List registered tasks
celery -A app.celery inspect registered

Application Issues

# Check API health
curl -s http://localhost:8000/health | jq

# View recent logs
journalctl -u orion --since "10 minutes ago"

# Check for Python errors
journalctl -u orion | grep -i error | tail -20

# Test database connection
python -c "from app.core.database import engine; print(engine.connect())"

Common Problems & Solutions

| Problem | Diagnosis | Solution |
|---|---|---|
| 502 Bad Gateway | systemctl status orion | Restart app: systemctl restart orion |
| Database connection refused | pg_isready | Start PostgreSQL: systemctl start postgresql |
| High memory usage | free -h, ps aux --sort=-%mem | Restart app, check for memory leaks |
| Slow queries | PostgreSQL slow query log | Add indexes, optimize queries |
| Celery tasks stuck | celery inspect active | Restart workers, check Redis |
| Disk full | df -h | Clean logs, backups, temp files |

Decision Matrix

When to Use Each Option

| Scenario | Recommended | Reason |
|---|---|---|
| Solo developer, MVP | Managed (Railway) | Focus on product |
| Small team, budget conscious | Traditional VPS | Full control, low cost |
| Need direct DB access for debugging | Traditional VPS | Direct psql access |
| Familiar with Docker, want consistency | Docker Compose | Reproducible environments |
| High availability required | Docker + Orchestration | Easy scaling |
| Enterprise, compliance requirements | Kubernetes | Full orchestration |

Cost Comparison (Monthly)

| Setup | Low Traffic | Medium | High |
|---|---|---|---|
| Managed (Railway + Neon) | $10 | $50 | $200+ |
| VPS (Hetzner/DigitalOcean) | $5 | $20 | $80 |
| Docker on VPS | $5 | $20 | $80 |
| AWS/GCP Full Stack | $50 | $200 | $1000+ |

Migration Path

Phase 1: Development COMPLETE

  • PostgreSQL 15 (Docker)
  • FastAPI + Uvicorn
  • Local file storage

Phase 2: Production MVP COMPLETE

  • PostgreSQL (managed or VPS)
  • FastAPI + Uvicorn (systemd or Docker)
  • Redis 7 (cache + task broker)
  • Celery 5.3 (background jobs)
  • Celery Beat (scheduled tasks)
  • Flower (task monitoring)
  • Cloudflare R2 (cloud file storage)
  • Sentry (error tracking)
  • Cloudflare CDN (caching + DDoS protection)

Phase 3: Scale (1,000+ Users)

  • Load balancer (Nginx/HAProxy/ALB)
  • Horizontal app scaling (2-4 Uvicorn instances)
  • PostgreSQL read replica
  • Dedicated Celery workers per queue

Phase 4: Enterprise (5,000+ Users)

  • Redis Sentinel/cluster
  • Database connection pooling (PgBouncer)
  • Full monitoring stack (Prometheus/Grafana)
  • Log aggregation (Loki/ELK)

Phase 5: High Availability (10,000+ Users)

  • Multi-region deployment
  • Database failover (streaming replication)
  • Container orchestration (Kubernetes)
  • Global CDN with edge caching
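
The user thresholds in the phase headings can be read as a simple lookup. This is only an illustration of the roadmap above: the `target_phase` helper is hypothetical, and it treats the user counts as concurrent users, as the enterprise table does.

```python
# Thresholds and names taken from the phase headings, highest first.
PHASES = [
    (10_000, "Phase 5: High Availability"),
    (5_000, "Phase 4: Enterprise"),
    (1_000, "Phase 3: Scale"),
]


def target_phase(concurrent_users: int) -> str:
    """Return the migration phase that matches a given user load."""
    for threshold, name in PHASES:
        if concurrent_users >= threshold:
            return name
    return "Phase 2: Production MVP"
```

User count is a rough proxy; measured latency, queue backlog, and database load are better signals for when to move to the next phase.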

Enterprise Upgrade Checklist

When you're ready to scale beyond 1,000 concurrent users:

Infrastructure

  • Load Balancer - Add Nginx/HAProxy in front of API servers

    • Enables horizontal scaling
    • Health checks and automatic failover
    • SSL termination at edge
  • Multiple API Servers - Run 2-4 Uvicorn instances

    • Scale horizontally instead of vertically
    • Blue-green deployments possible
  • Database Read Replica - Add PostgreSQL replica

    • Offload read queries from primary
    • Backup without impacting production
  • Connection Pooling - Add PgBouncer

    • Reduce database connection overhead
    • Handle connection spikes

Monitoring & Observability

  • Prometheus + Grafana - Metrics dashboards

    • Request latency, error rates, saturation
    • Database connection pool metrics
    • Celery queue lengths
  • Log Aggregation - Loki or ELK stack

    • Centralized logs from all services
    • Search and alerting
  • Alerting - PagerDuty/OpsGenie integration

    • On-call rotation
    • Escalation policies

Security

  • WAF Rules - Cloudflare or AWS WAF

    • SQL injection protection
    • Rate limiting at edge
    • Bot protection
  • Secrets Management - HashiCorp Vault

    • Rotate credentials automatically
    • Audit access to secrets

Next Steps

You're production-ready now! Optional improvements:

  1. Enable Sentry - Add SENTRY_DSN for error tracking (free tier)
  2. Enable R2 - Set STORAGE_BACKEND=r2 for cloud storage (~$5/mo)
  3. Enable Cloudflare - Proxy domain for CDN + DDoS protection (free tier)
  4. Add load balancer - When you need horizontal scaling
