# Capacity Monitoring

Detailed guide for monitoring and managing platform capacity, including growth forecasting and scaling recommendations.

## Overview

The Capacity Monitoring system provides insights into resource consumption and helps plan infrastructure scaling. It includes:

- **Real-time metrics**: Current resource usage and health status
- **Subscription capacity**: Theoretical vs actual capacity based on store subscriptions
- **Growth forecasting**: Historical trends and future projections
- **Scaling recommendations**: Automated advice for infrastructure planning

## API Endpoints

All capacity endpoints are under `/api/v1/admin/platform-health`:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Full platform health report |
| `/capacity` | GET | Capacity-focused metrics |
| `/subscription-capacity` | GET | Subscription-based capacity analysis |
| `/trends` | GET | Growth trends over specified period |
| `/recommendations` | GET | Prioritized scaling recommendations |
| `/snapshot` | POST | Manually capture capacity snapshot |

## Key Metrics

### Client Metrics

| Metric | Description | Threshold Indicator |
|--------|-------------|---------------------|
| Active Clients | Stores with activity in last 30 days | Scale planning |
| Total Products | Sum across all stores | Storage/DB sizing |
| Products per Client | Average products per store | Tier compliance |
| Monthly Orders | Order volume this month | Performance impact |

### Subscription Capacity

Track theoretical vs actual capacity based on all store subscriptions:

```python
# GET /api/v1/admin/platform-health/subscription-capacity
{
    "total_subscriptions": 150,
    "tier_distribution": {
        "essential": 80,
        "professional": 50,
        "business": 18,
        "enterprise": 2
    },
    "products": {
        "actual": 125000,
        "theoretical_limit": 500000,
        "utilization_percent": 25.0,
        "headroom": 375000
    },
    "orders_monthly": {
        "actual": 45000,
        "theoretical_limit": 300000,
        "utilization_percent": 15.0,
        "headroom": 255000
    },
    "team_members": {
        "actual": 320,
        "theoretical_limit": 1500,
        "utilization_percent": 21.3,
        "headroom": 1180
    }
}
```

### Storage Metrics

| Metric | Description | Warning | Critical |
|--------|-------------|---------|----------|
| Image Files | Total files in storage | 80% of limit | 95% of limit |
| Image Storage (GB) | Total size in gigabytes | 80% of disk | 95% of disk |
| Database Size (GB) | PostgreSQL data size | 80% of allocation | 95% of allocation |
| Backup Size (GB) | Latest backup size | Informational | N/A |

### Performance Metrics

| Metric | Good | Warning | Critical |
|--------|------|---------|----------|
| Avg Response Time | < 100ms | 100-300ms | > 300ms |
| DB Query Time (p95) | < 50ms | 50-200ms | > 200ms |
| Cache Hit Rate | > 90% | 70-90% | < 70% |
| Connection Pool Usage | < 70% | 70-90% | > 90% |

## Growth Forecasting

### Capacity Snapshots

Daily snapshots are captured automatically by the `capture_capacity_snapshot` background task:

```python
from datetime import datetime
from decimal import Decimal


# Captured daily at midnight
class CapacitySnapshot:
    snapshot_date: datetime

    # Store metrics
    total_stores: int
    active_stores: int
    trial_stores: int

    # Subscription metrics
    total_subscriptions: int
    active_subscriptions: int

    # Resource metrics
    total_products: int
    total_orders_month: int
    total_team_members: int

    # Storage metrics
    storage_used_gb: Decimal
    db_size_mb: Decimal

    # Capacity metrics
    theoretical_products_limit: int
    theoretical_orders_limit: int
    theoretical_team_limit: int

    # Tier distribution
    tier_distribution: dict
```

### Growth Trends

Analyze growth over any period:

```python
# GET /api/v1/admin/platform-health/trends?days=30
{
    "period_days": 30,
    "snapshots_available": 30,
    "start_date": "2025-11-26",
    "end_date": "2025-12-26",
    "trends": {
        "stores": {
            "start_value": 140,
            "current_value": 150,
            "change": 10,
            "growth_rate_percent": 7.14,
            "daily_growth_rate": 0.238,
            "monthly_projection": 161
        },
        "products": {
            "start_value": 115000,
            "current_value": 125000,
            "change": 10000,
            "growth_rate_percent": 8.7,
            "daily_growth_rate": 0.29,
            "monthly_projection": 136000
        },
        "orders": {
            "start_value": 40000,
            "current_value": 45000,
            "change": 5000,
            "growth_rate_percent": 12.5,
            "monthly_projection": 51000
        },
        "team_members": {...},
        "storage_gb": {
            "start_value": 150.5,
            "current_value": 165.2,
            "change": 14.7
        }
    }
}
```

### Days Until Threshold

Calculate when a metric will reach a specific threshold:

```python
# Service method
days = capacity_forecast_service.get_days_until_threshold(
    db,
    metric="total_products",
    threshold=500000
)
# Returns: 120 (days until products reach 500K)
```

## Scaling Recommendations

The system generates automated recommendations based on current capacity and growth:

```python
# GET /api/v1/admin/platform-health/recommendations
[
    {
        "category": "capacity",
        "severity": "warning",
        "title": "Product capacity approaching limit",
        "description": "Currently at 85% of theoretical product capacity",
        "action": "Consider upgrading store tiers or adding capacity"
    },
    {
        "category": "infrastructure",
        "severity": "info",
        "title": "Current tier: Medium",
        "description": "Next upgrade trigger: 300 stores",
        "action": "Monitor growth and plan for infrastructure scaling"
    },
    {
        "category": "growth",
        "severity": "info",
        "title": "High store growth rate",
        "description": "Store base growing at 15.2% over last 30 days",
        "action": "Ensure infrastructure can scale to meet demand"
    },
    {
        "category": "storage",
        "severity": "warning",
        "title": "Storage usage high",
        "description": "Image storage at 850 GB",
        "action": "Plan for storage expansion or implement cleanup policies"
    }
]
```

### Severity Levels

| Severity | Description | Action Required |
|----------|-------------|-----------------|
| `critical` | Immediate action needed | Within 24 hours |
| `warning` | Plan action soon | Within 1-2 weeks |
| `info` | Informational | Monitor and plan |

## Threshold Configuration

Edit thresholds in the admin settings or via environment variables:

```python
# Capacity thresholds (can be
# configured per deployment)
CAPACITY_THRESHOLDS = {
    # Products
    "products_total": {
        "warning": 400_000,
        "critical": 475_000,
        "limit": 500_000,
    },
    # Storage (GB)
    "storage_gb": {
        "warning": 800,
        "critical": 950,
        "limit": 1000,
    },
    # Database (GB)
    "db_size_gb": {
        "warning": 20,
        "critical": 24,
        "limit": 25,
    },
    # Monthly orders
    "monthly_orders": {
        "warning": 250_000,
        "critical": 280_000,
        "limit": 300_000,
    },
}
```

## Infrastructure Scaling Reference

| Clients | vCPU | RAM | Storage | Database | Monthly Cost |
|---------|------|-----|---------|----------|--------------|
| 1-50 | 2 | 4GB | 100GB | SQLite | €30 |
| 50-100 | 4 | 8GB | 250GB | PostgreSQL | €80 |
| 100-300 | 4 | 16GB | 500GB | PostgreSQL | €150 |
| 300-500 | 8 | 32GB | 1TB | PostgreSQL + Redis | €350 |
| 500-1000 | 16 | 64GB | 2TB | PostgreSQL + Redis | €700 |
| 1000+ | 32+ | 128GB+ | 4TB+ | PostgreSQL cluster | €1,500+ |

## Background Tasks

### Capacity Snapshot Task

```python
# app/tasks/subscription_tasks.py

async def capture_capacity_snapshot():
    """
    Capture a daily snapshot of platform capacity metrics.

    Should run daily at midnight.
    """
    from app.services.capacity_forecast_service import capacity_forecast_service

    db = SessionLocal()
    try:
        snapshot = capacity_forecast_service.capture_daily_snapshot(db)
        db.commit()
        return {
            "snapshot_id": snapshot.id,
            "snapshot_date": snapshot.snapshot_date.isoformat(),
            "total_stores": snapshot.total_stores,
            "total_products": snapshot.total_products,
        }
    finally:
        # Always release the session, even if the capture fails
        db.close()
```

### Manual Snapshot

Capture a snapshot on demand:

```bash
# Via API
curl -X POST /api/v1/admin/platform-health/snapshot \
  -H "Authorization: Bearer $TOKEN"

# Response
{
    "id": 42,
    "snapshot_date": "2025-12-26T00:00:00Z",
    "total_stores": 150,
    "total_products": 125000,
    "message": "Snapshot captured successfully"
}
```

## Alerts

Capacity alerts trigger at these levels:

1. **Warning (Yellow)**: a metric reaches 80% of its limit
2. **Critical (Red)**: a metric reaches 95% of its limit
3. **Exceeded**: a metric is at or above 100% of its limit (immediate action)

## Historical Data

### Viewing Historical Trends

Use the `/trends` endpoint with different day ranges:

```bash
# Last 7 days
GET /api/v1/admin/platform-health/trends?days=7

# Last 30 days (default)
GET /api/v1/admin/platform-health/trends?days=30

# Last 90 days
GET /api/v1/admin/platform-health/trends?days=90
```

### Data Retention

- Snapshots are stored indefinitely by default
- Consider implementing cleanup for snapshots older than 2 years
- At minimum, keep monthly aggregates for long-term trending

## Export Reports

Generate capacity reports for planning:

- **Weekly summary**: PDF or CSV
- **Monthly capacity report**: Detailed analysis
- **Projection report**: 3/6/12 month forecasts

## Usage Examples

### Check Current Capacity

```python
from app.services.platform_health_service import platform_health_service
from app.services.capacity_forecast_service import capacity_forecast_service

# Get subscription capacity
capacity = platform_health_service.get_subscription_capacity(db)
print(f"Products: {capacity['products']['actual']} / {capacity['products']['theoretical_limit']}")
print(f"Utilization: {capacity['products']['utilization_percent']}%")

# Get growth trends
trends = capacity_forecast_service.get_growth_trends(db, days=30)
print(f"Store growth: {trends['trends']['stores']['growth_rate_percent']}%")

# Get recommendations
recommendations = capacity_forecast_service.get_scaling_recommendations(db)
for rec in recommendations:
    print(f"[{rec['severity']}] {rec['title']}: {rec['action']}")
```

### Project Future Capacity

```python
# Calculate days until product limit
days = capacity_forecast_service.get_days_until_threshold(
    db,
    metric="total_products",
    threshold=500000
)
# Compare against None so that a result of 0 days is still reported
if days is not None:
    print(f"Products will reach 500K in approximately {days} days")
else:
    print("Insufficient data or no growth detected")
```

## Related Documentation

- [Subscription & Billing](../features/subscription-billing.md) - Complete billing system
- [Capacity Planning](../architecture/capacity-planning.md) - Full sizing guide
- [Platform Health](platform-health.md) - Real-time health monitoring
- [Image Storage](image-storage.md) - Image system details
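
## Appendix: Forecast and Alert Arithmetic (Sketch)

The fields returned by `/trends`, the result of `get_days_until_threshold`, and the alert levels can all be derived with simple arithmetic. The sketch below is one plausible way to do it, assuming linear growth between the first and last snapshot of a period and the 80% / 95% / 100% alert levels described under Alerts. The function names `summarize_trend`, `days_until_threshold`, and `alert_level` are hypothetical and are not part of `capacity_forecast_service`; the real service may use a different model.

```python
from __future__ import annotations

import math


def summarize_trend(start_value: float, current_value: float, period_days: int) -> dict:
    """Derive trend fields from the first and last snapshot of a period (hypothetical helper)."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    change = current_value - start_value
    # Percentage growth relative to the starting value (0 when there is no baseline)
    growth_rate = (change / start_value * 100) if start_value else 0.0
    return {
        "start_value": start_value,
        "current_value": current_value,
        "change": change,
        "growth_rate_percent": round(growth_rate, 2),
        "daily_growth_rate": round(growth_rate / period_days, 3),
        # Assumption: the average daily rate simply continues for the next 30 days
        "monthly_projection": round(current_value * (1 + growth_rate * 30 / period_days / 100)),
    }


def days_until_threshold(
    start_value: float, current_value: float, threshold: float, period_days: int
) -> int | None:
    """Linear extrapolation of days until current_value reaches threshold (hypothetical helper)."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    if current_value >= threshold:
        return 0  # already at or past the threshold
    gained = current_value - start_value
    if gained <= 0:
        return None  # flat or shrinking: the threshold is never reached
    remaining = threshold - current_value
    # Multiply before dividing to limit floating-point drift before rounding up
    return math.ceil(remaining * period_days / gained)


def alert_level(value: float, limit: float) -> str:
    """Map usage against a limit to an alert level (assumed 80/95/100% cut-offs)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Cross-multiplied comparisons avoid rounding issues at the exact cut-offs
    if value >= limit:
        return "exceeded"
    if value * 100 >= limit * 95:
        return "critical"
    if value * 100 >= limit * 80:
        return "warning"
    return "ok"
```

With the store figures from the Growth Trends example, `summarize_trend(140, 150, 30)` yields a growth rate of 7.14%, a daily rate of 0.238, and a monthly projection of 161. A straight-line projection is easy to reason about but will underestimate compounding growth, so treat the day counts as rough planning estimates.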