orion/docs/architecture/capacity-planning.md
Samir Boulahtit dc7fb5ca19 feat: add capacity planning docs, image upload system, and platform health monitoring
Documentation:
- Add comprehensive capacity planning guide (docs/architecture/capacity-planning.md)
- Add operations docs: platform-health, capacity-monitoring, image-storage
- Link pricing strategy to capacity planning documentation
- Update mkdocs.yml with new Operations section

Image Upload System:
- Add ImageService with WebP conversion and sharded directory structure
- Generate multiple size variants (original, 800px, 200px)
- Add storage stats endpoint for monitoring
- Add Pillow dependency for image processing

Platform Health Monitoring:
- Add /admin/platform-health page with real-time metrics
- Show CPU, memory, disk usage with progress bars
- Display capacity thresholds with status indicators
- Generate scaling recommendations automatically
- Determine infrastructure tier based on usage
- Add psutil dependency for system metrics

Admin UI:
- Add Capacity Monitor to Platform Health section in sidebar
- Create platform-health.html template with stats cards
- Create platform-health.js for Alpine.js state management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 17:17:09 +01:00

# Capacity Planning & Infrastructure Sizing
This document provides comprehensive capacity planning guidelines for the Wizamart platform, including resource requirements, scaling thresholds, and monitoring recommendations.
> **Related:** [Pricing Strategy](../marketing/pricing.md) for tier definitions and limits
---
## Tier Resource Allocations
Based on our [pricing tiers](../marketing/pricing.md), here are the expected resource requirements per client:
| Metric | Essential (€49) | Professional (€99) | Business (€199) | Enterprise (€399+) |
|--------|-----------------|--------------------|-----------------|--------------------|
| Products | 200 | 500 | 2,000 | Unlimited |
| Images per product | 3 | 5 | 8 | 10+ |
| Orders per month | 100 | 500 | 2,000 | Unlimited |
| SKU variants | 1.2x | 1.5x | 2x | 3x |
| Team members | 1 | 3 | 10 | Unlimited |
| API requests/day | 1,000 | 5,000 | 20,000 | Unlimited |
---
## Scale Projections
### Target: 1,000 Business Clients (€149/month tier)
This is our primary growth target. The projection assumes per-client averages of 500 products, 5 images per product, 300 orders per month, and 10,000 API requests per day:
| Resource | Calculation | Total |
|----------|-------------|-------|
| **Products** | 1,000 clients × 500 products | **500,000** |
| **Product Translations** | 500,000 × 4 languages | **2,000,000 rows** |
| **Images (files)** | 500,000 × 5 images × 3 sizes | **7,500,000 files** |
| **Image Storage** | 7.5M files × 200KB avg | **1.5 TB** |
| **Database Size** | Products + translations + orders + indexes | **15-25 GB** |
| **Monthly Orders** | 1,000 clients × 300 orders | **300,000 orders** |
| **Order Items** | 300,000 × 2.5 avg items | **750,000 items/month** |
| **Monthly API Requests** | 1,000 × 10,000 req/day × 30 | **300M requests** |
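
The arithmetic in the table above can be reproduced with a short script. This is a minimal sketch: `project_scale` and its parameter names are illustrative and not part of the codebase, and the defaults are simply the per-client averages the table uses.

```python
def project_scale(
    clients: int = 1_000,
    products_per_client: int = 500,
    languages: int = 4,
    images_per_product: int = 5,
    variants_per_image: int = 3,
    avg_file_kb: int = 200,
    orders_per_client: int = 300,
    items_per_order: float = 2.5,
    api_requests_per_day: int = 10_000,
) -> dict:
    """Reproduce the scale projection table from per-client averages."""
    products = clients * products_per_client
    image_files = products * images_per_product * variants_per_image
    monthly_orders = clients * orders_per_client
    return {
        "products": products,
        "translation_rows": products * languages,
        "image_files": image_files,
        # Decimal units: 1 TB = 1,000,000,000 KB
        "image_storage_tb": image_files * avg_file_kb / 1_000_000_000,
        "monthly_orders": monthly_orders,
        "monthly_order_items": int(monthly_orders * items_per_order),
        "monthly_api_requests": clients * api_requests_per_day * 30,
    }


projection = project_scale()
for name, value in projection.items():
    print(f"{name}: {value:,}")
```

Changing a single argument, for example `project_scale(images_per_product=8)`, shows how sensitive storage is to the image count per product.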
### Multi-Tier Mix (Realistic Scenario)
More realistic distribution across tiers:
| Tier | Clients | Products Each | Total Products | Monthly Orders |
|------|---------|---------------|----------------|----------------|
| Essential | 500 | 100 | 50,000 | 50,000 |
| Professional | 300 | 300 | 90,000 | 150,000 |
| Business | 150 | 1,000 | 150,000 | 300,000 |
| Enterprise | 50 | 3,000 | 150,000 | 200,000 |
| **Total** | **1,000** | - | **440,000** | **700,000** |
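
The totals row can be checked by summing the per-tier figures. The sketch below hard-codes the rows from the table; the `TIER_MIX` name and tuple layout are assumptions made for illustration only.

```python
# (tier, clients, average products per client, monthly orders for the whole tier)
TIER_MIX = [
    ("Essential", 500, 100, 50_000),
    ("Professional", 300, 300, 150_000),
    ("Business", 150, 1_000, 300_000),
    ("Enterprise", 50, 3_000, 200_000),
]

total_clients = sum(clients for _, clients, _, _ in TIER_MIX)
total_products = sum(clients * products for _, clients, products, _ in TIER_MIX)
total_orders = sum(orders for _, _, _, orders in TIER_MIX)

print(total_clients, total_products, total_orders)  # 1000 440000 700000
```
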
---
## Server Sizing Recommendations
### Infrastructure Tiers
| Scale | Clients | vCPU | RAM | Storage | Database | Monthly Cost |
|-------|---------|------|-----|---------|----------|--------------|
| **Starter** | 1-50 | 2 | 4GB | 100GB SSD | SQLite | €20-40 |
| **Small** | 50-100 | 4 | 8GB | 250GB SSD | PostgreSQL | €60-100 |
| **Medium** | 100-300 | 4 | 16GB | 500GB SSD | PostgreSQL | €100-180 |
| **Large** | 300-500 | 8 | 32GB | 1TB SSD | PostgreSQL + Redis | €250-400 |
| **Scale** | 500-1000 | 16 | 64GB | 2TB SSD + CDN | PostgreSQL + Redis | €500-900 |
| **Enterprise** | 1000+ | 32+ | 128GB+ | 4TB+ + CDN | PostgreSQL cluster | €1,500+ |
### Recommended Configurations
#### Starter (1-50 clients)
```
Single Server Setup:
- Hetzner CX22 or similar (2 vCPU, 4GB RAM)
- 100GB SSD storage
- SQLite database
- nginx for static files + reverse proxy
- Estimated cost: €20-40/month
```
#### Small-Medium (50-300 clients)
```
Two-Server Setup:
- App Server: 4 vCPU, 8-16GB RAM
- Database: Managed PostgreSQL (basic tier)
- Storage: Local SSD + backup
- Optional: Redis for sessions/caching
- Estimated cost: €80-180/month
```
#### Large-Scale (300-1000 clients)
```
Multi-Component Setup:
- Load Balancer: nginx or cloud LB
- App Servers: 2-4 × (4 vCPU, 8GB RAM)
- Database: Managed PostgreSQL (production tier)
- Cache: Redis (managed or self-hosted)
- Storage: Object storage (S3-compatible) + CDN
- Estimated cost: €400-900/month
```
#### Enterprise (1000+ clients)
```
Full Production Setup:
- CDN: Cloudflare or similar
- Load Balancer: Cloud-native with health checks
- App Servers: 4-8 × (4 vCPU, 16GB RAM) with auto-scaling
- Database: PostgreSQL with read replicas
- Cache: Redis cluster
- Storage: S3 + CloudFront or equivalent
- Monitoring: Prometheus + Grafana
- Estimated cost: €1,500+/month
```
---
## Image Storage Architecture
### Capacity Calculations
| Image Size (optimized) | Files per 25GB | Files per 100GB | Files per 1TB |
|------------------------|----------------|-----------------|---------------|
| 100KB (thumbnails) | 250,000 | 1,000,000 | 10,000,000 |
| 200KB (web-ready) | 125,000 | 500,000 | 5,000,000 |
| 300KB (high quality) | 83,000 | 333,000 | 3,330,000 |
| 500KB (original) | 50,000 | 200,000 | 2,000,000 |
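
The table follows from dividing capacity by average file size in decimal units (1 GB = 1,000,000 KB). A minimal sketch, with `files_per_capacity` as a hypothetical helper name:

```python
def files_per_capacity(capacity_gb: float, avg_file_kb: float) -> int:
    """Return how many files of the given average size fit in the capacity."""
    # Decimal units: 1 GB = 1,000,000 KB
    return int(capacity_gb * 1_000_000 // avg_file_kb)


for size_kb in (100, 200, 300, 500):
    row = [files_per_capacity(gb, size_kb) for gb in (25, 100, 1_000)]
    print(f"{size_kb}KB: {row}")
```

The 300KB row in the table is rounded down slightly (83,000 rather than 83,333); the other rows match exactly.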
### Image Sizes Generated
Each uploaded image generates 3 variants:
| Variant | Dimensions | Typical Size | Use Case |
|---------|------------|--------------|----------|
| `thumb` | 200×200 | 10-20KB | List views, grids |
| `medium` | 800×800 | 80-150KB | Product cards, previews |
| `original` | As uploaded | 200-500KB | Detail views, zoom |
**Storage per product:** ~600KB (with 3 sizes for main image + 2 additional images)
### Directory Structure (Sharded)
To prevent filesystem performance degradation, images are stored in a sharded directory structure:
```
/uploads/
└── products/
    ├── 00/                      # First 2 chars of hash
    │   ├── 1a/                  # Next 2 chars
    │   │   ├── 001a2b3c_original.webp
    │   │   ├── 001a2b3c_800.webp
    │   │   └── 001a2b3c_200.webp
    │   └── 2b/
    │       └── ...
    ├── 01/
    └── ...
```
This structure ensures:
- At most 256 subdirectories per level (two hexadecimal characters)
- ~16 files per leaf directory at 1M stored files (256 × 256 = 65,536 leaf directories)
- Fast filesystem operations even at scale
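
The path layout above can be sketched as follows. This is an illustration only, not the actual `ImageService` code: the helper names are hypothetical, and deriving the hash from the first 8 hex characters of a SHA-256 digest of the file content is an assumption, since this document does not say how the hash is produced.

```python
import hashlib
from pathlib import PurePosixPath

UPLOAD_ROOT = PurePosixPath("/uploads/products")
VARIANT_SUFFIXES = ("original", "800", "200")


def image_hash(content: bytes, length: int = 8) -> str:
    """Hypothetical hash: first `length` hex chars of the SHA-256 of the content."""
    return hashlib.sha256(content).hexdigest()[:length]


def sharded_paths(file_hash: str) -> dict:
    """Map each variant to <root>/<hash[0:2]>/<hash[2:4]>/<hash>_<variant>.webp."""
    if len(file_hash) < 4:
        raise ValueError("hash must have at least 4 characters")
    leaf = UPLOAD_ROOT / file_hash[:2] / file_hash[2:4]
    return {variant: leaf / f"{file_hash}_{variant}.webp" for variant in VARIANT_SUFFIXES}


paths = sharded_paths("001a2b3c")
print(paths["original"])  # /uploads/products/00/1a/001a2b3c_original.webp
```

Because both shard levels come from the hash itself, a file's location can be recomputed from its name without a database lookup.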
### Performance Thresholds
| Files per Directory | Performance | Required Action |
|---------------------|-------------|-----------------|
| < 10,000 | Excellent | None |
| 10,000 - 100,000 | Good | Monitor, plan sharding |
| 100,000 - 500,000 | Degraded | **Implement sharding** |
| > 500,000 | Poor | **Migrate to object storage** |
---
## Database Performance
### Table Size Guidelines
| Rows per Table | Query Time | Status |
|----------------|------------|--------|
| < 10,000 | < 1ms | Excellent |
| 10,000 - 100,000 | 1-10ms | Good |
| 100,000 - 1,000,000 | 10-50ms | **Add indexes, optimize queries** |
| 1,000,000 - 10,000,000 | 50-200ms | **Consider partitioning** |
| > 10,000,000 | Variable | **Sharding or dedicated DB** |
### Critical Indexes
Ensure these indexes exist at scale:
```sql
-- Products
CREATE INDEX idx_product_vendor_active ON products(vendor_id, is_active);
CREATE INDEX idx_product_gtin ON products(gtin);
CREATE INDEX idx_product_vendor_sku ON products(vendor_id, vendor_sku);
-- Orders
CREATE INDEX idx_order_vendor_status ON orders(vendor_id, status);
CREATE INDEX idx_order_created ON orders(created_at DESC);
CREATE INDEX idx_order_customer ON orders(customer_id);
-- Inventory
CREATE INDEX idx_inventory_product_location ON inventory(product_id, warehouse, bin_location);
CREATE INDEX idx_inventory_vendor ON inventory(vendor_id);
```
### Database Size Estimates
| Component | Size per 100K Products | Size per 1M Products |
|-----------|------------------------|----------------------|
| Products table | 100 MB | 1 GB |
| Translations (4 langs) | 400 MB | 4 GB |
| Orders (1 year) | 500 MB | 5 GB |
| Order items | 200 MB | 2 GB |
| Inventory | 50 MB | 500 MB |
| Indexes | 300 MB | 3 GB |
| **Total** | **~1.5 GB** | **~15 GB** |
---
## Bandwidth & Network
### Monthly Bandwidth Estimates (1000 clients)
| Traffic Type | Calculation | Monthly Volume |
|--------------|-------------|----------------|
| Image views | 500K products × 10 views × 500KB | **2.5 TB** |
| API requests | 10K req/client/day × 1000 × 30 × 2KB | **600 GB** |
| Static assets | CSS/JS cached, minimal | **50 GB** |
| **Total Egress** | | **~3 TB/month** |
### Bandwidth Costs (Approximate)
| Provider | First 1TB | Additional per TB |
|----------|-----------|-------------------|
| Hetzner | Included | €1/TB |
| AWS | $90 | $85/TB |
| DigitalOcean | 1TB free | $10/TB |
| Cloudflare | Unlimited (CDN) | Free |
**Recommendation:** Serve images through a CDN such as Cloudflare so that cached image traffic does not incur origin egress costs.
---
## Scaling Triggers & Thresholds
### When to Scale Up
| Metric | Warning | Critical | Action |
|--------|---------|----------|--------|
| CPU Usage | > 70% avg | > 85% avg | Add app server |
| Memory Usage | > 75% | > 90% | Upgrade RAM or add server |
| Disk Usage | > 70% | > 85% | Expand storage |
| DB Query Time (p95) | > 100ms | > 500ms | Optimize queries, add indexes |
| API Response Time (p95) | > 500ms | > 2000ms | Scale horizontally |
| DB Connections | > 80% max | > 95% max | Add connection pooling |
| Error Rate | > 1% | > 5% | Investigate and fix |
### Architecture Transition Points
```
STARTER → SMALL (50 clients)
├── Trigger: SQLite becomes bottleneck
├── Action: Migrate to PostgreSQL
└── Cost increase: +€40-60/month
SMALL → MEDIUM (100 clients)
├── Trigger: Single server at 70%+ CPU
├── Action: Separate DB server
└── Cost increase: +€50-80/month
MEDIUM → LARGE (300 clients)
├── Trigger: Need for caching, higher availability
├── Action: Add Redis, consider multiple app servers
└── Cost increase: +€150-200/month
LARGE → SCALE (500 clients)
├── Trigger: Storage >500GB, high traffic
├── Action: Object storage + CDN, load balancing
└── Cost increase: +€200-400/month
SCALE → ENTERPRISE (1000+ clients)
├── Trigger: High availability requirements, SLA
├── Action: Full redundancy, read replicas, auto-scaling
└── Cost increase: +€600-1000/month
```
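The transition points can be expressed as a lookup over client counts. This is a sketch under stated assumptions: the function names are hypothetical, a boundary value (for example exactly 50 clients) is assigned to the higher tier, and only the client count is considered, whereas the triggers above also depend on CPU, storage, and availability requirements.

```python
from bisect import bisect_right
from typing import Optional

# Client counts at which the transition list above moves to the next tier.
TIER_BOUNDARIES = [50, 100, 300, 500, 1_000]
TIER_NAMES = ["Starter", "Small", "Medium", "Large", "Scale", "Enterprise"]


def infrastructure_tier(clients: int) -> str:
    """Return the tier for a client count; boundary values map to the higher tier."""
    if clients < 0:
        raise ValueError("clients must be non-negative")
    return TIER_NAMES[bisect_right(TIER_BOUNDARIES, clients)]


def clients_until_next_tier(clients: int) -> Optional[int]:
    """Return how many more clients reach the next transition, or None at the top tier."""
    index = bisect_right(TIER_BOUNDARIES, clients)
    if index == len(TIER_BOUNDARIES):
        return None
    return TIER_BOUNDARIES[index] - clients


print(infrastructure_tier(420), clients_until_next_tier(420))  # Large 80
```
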
---
## Monitoring Requirements
### Essential Metrics
Track these metrics for capacity planning:
#### Infrastructure
- CPU utilization (per server)
- Memory utilization
- Disk I/O and usage
- Network throughput
#### Application
- Request latency (p50, p95, p99)
- Request rate (per endpoint)
- Error rate by type
- Active sessions
#### Database
- Query execution time
- Connection pool usage
- Table sizes
- Index usage
#### Business
- Active clients
- Products per client
- Orders per day
- API calls per client
### Monitoring Dashboard
The admin platform includes a **Capacity Monitoring** page at `/admin/platform-health` with:
1. **Current Usage** - Real-time resource utilization
2. **Growth Trends** - Historical charts for planning
3. **Threshold Alerts** - Warning and critical indicators
4. **Scaling Recommendations** - Automated suggestions
See the [Platform Health Monitoring](#platform-health-monitoring) section below.
---
## Cost Analysis
### Infrastructure Cost per Client
| Scale | Clients | Monthly Infra | Cost/Client |
|-------|---------|---------------|-------------|
| Starter | 25 | €30 | €1.20 |
| Small | 75 | €80 | €1.07 |
| Medium | 200 | €150 | €0.75 |
| Large | 400 | €350 | €0.88 |
| Scale | 800 | €700 | €0.88 |
| Enterprise | 1500 | €1,800 | €1.20 |
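
The cost-per-client column is monthly infrastructure cost divided by the client count at each sample point. A small sketch that recomputes it, with `SCALE_POINTS` as an illustrative name:

```python
# scale -> (clients, monthly infrastructure cost in EUR), taken from the table above
SCALE_POINTS = {
    "Starter": (25, 30),
    "Small": (75, 80),
    "Medium": (200, 150),
    "Large": (400, 350),
    "Scale": (800, 700),
    "Enterprise": (1_500, 1_800),
}

cost_per_client = {
    scale: monthly_eur / clients
    for scale, (clients, monthly_eur) in SCALE_POINTS.items()
}

for scale, cost in cost_per_client.items():
    print(f"{scale}: €{cost:.2f} per client")
```

The per-client cost is lowest at the Medium point and rises again at Enterprise, where redundancy and replicas add fixed overhead.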
### Revenue vs Infrastructure Cost
At 1,000 Business tier clients (€149/month):
| Item | Monthly |
|------|---------|
| **Revenue** | €149,000 |
| Infrastructure | €700-900 |
| Support (est.) | €3,000 |
| Development (est.) | €5,000 |
| **Gross Margin** | **~94%** |
---
## Disaster Recovery
### Backup Strategy by Scale
| Scale | Database Backup | File Backup | RTO | RPO |
|-------|----------------|-------------|-----|-----|
| Starter | Daily SQLite copy | Daily rsync | 4h | 24h |
| Small | Daily pg_dump | Daily sync | 2h | 12h |
| Medium | Managed backups | S3 versioning | 1h | 6h |
| Large | Point-in-time | S3 + cross-region | 30m | 1h |
| Enterprise | Streaming replicas | Multi-region | 5m | 5m |
---
## Platform Health Monitoring
The admin dashboard includes a dedicated capacity monitoring page that tracks:
### Metrics Displayed
1. **Client Growth**
- Total active clients
- New clients this month
- Churn rate
2. **Resource Usage**
- Total products across all vendors
- Total images stored
- Database size
- Storage usage
3. **Performance Indicators**
- Average API response time
- Database query latency
- Error rate
4. **Threshold Status**
- Current infrastructure tier
- Distance to next threshold
- Recommended actions
### Alert Configuration
Configure alerts for proactive scaling:
```python
# Warning and critical limits per metric.
CAPACITY_THRESHOLDS = {
    "products_total": {
        "warning": 400_000,   # 80% of 500K
        "critical": 475_000,  # 95% of 500K
    },
    "storage_gb": {
        "warning": 800,   # 80% of 1TB
        "critical": 950,  # 95% of 1TB
    },
    "db_size_gb": {
        "warning": 20,
        "critical": 24,
    },
    "avg_response_ms": {
        "warning": 200,
        "critical": 500,
    },
}
```
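One way to use such a configuration is to classify each current metric against its limits. The sketch below is not the platform's alerting code: `evaluate_capacity` is a hypothetical helper, the thresholds are repeated so the example runs on its own, and treating a value equal to a limit as having reached it is an assumption.

```python
from typing import Dict, Mapping

# Same values as the configuration above, repeated for a self-contained example.
CAPACITY_THRESHOLDS = {
    "products_total": {"warning": 400_000, "critical": 475_000},
    "storage_gb": {"warning": 800, "critical": 950},
    "db_size_gb": {"warning": 20, "critical": 24},
    "avg_response_ms": {"warning": 200, "critical": 500},
}


def evaluate_capacity(metrics: Mapping[str, float]) -> Dict[str, str]:
    """Classify each known metric as 'ok', 'warning', or 'critical'."""
    statuses = {}
    for name, value in metrics.items():
        limits = CAPACITY_THRESHOLDS.get(name)
        if limits is None:
            continue  # no threshold configured for this metric
        if value >= limits["critical"]:
            statuses[name] = "critical"
        elif value >= limits["warning"]:
            statuses[name] = "warning"
        else:
            statuses[name] = "ok"
    return statuses


current = {"products_total": 440_000, "storage_gb": 620, "db_size_gb": 24.5}
print(evaluate_capacity(current))
```
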
---
## Quick Reference
### TL;DR Sizing Guide
| Clients | Server | RAM | Storage | Database | Monthly Cost |
|---------|--------|-----|---------|----------|--------------|
| 1-50 | 2 vCPU | 4GB | 100GB | SQLite | €30 |
| 50-100 | 4 vCPU | 8GB | 250GB | PostgreSQL | €80 |
| 100-300 | 4 vCPU | 16GB | 500GB | PostgreSQL | €150 |
| 300-500 | 8 vCPU | 32GB | 1TB | PostgreSQL + Redis | €350 |
| 500-1000 | 16 vCPU | 64GB | 2TB + CDN | PostgreSQL + Redis | €700 |
| 1000+ | 32+ vCPU | 128GB+ | 4TB+ + CDN | PostgreSQL cluster | €1,500+ |
### Key Formulas
```
Storage (GB) = (Products × Images × 3 sizes × 200KB) / 1,000,000
DB Size (GB) = Products × 0.00003 + Orders × 0.00002
Bandwidth (TB/mo) = Products × Daily Views × 500KB × 30 / 1,000,000,000
```
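The same formulas as Python functions, as a rough sketch. The function and parameter names are illustrative, and the defaults (3 sizes, 200KB average file, 500KB per image view, 30 days) are the constants written into the formulas above. With the 500,000 products and 300,000 orders from the 1,000-client projection, the database formula gives about 21 GB, inside the 15-25 GB range quoted there.

```python
def storage_gb(products: int, images_per_product: int,
               variants: int = 3, avg_file_kb: int = 200) -> float:
    """Image storage in GB (decimal units: 1 GB = 1,000,000 KB)."""
    return products * images_per_product * variants * avg_file_kb / 1_000_000


def db_size_gb(products: int, orders: int) -> float:
    """Approximate database size in GB using the per-row constants above."""
    return products * 0.00003 + orders * 0.00002


def bandwidth_tb_per_month(products: int, daily_views_per_product: float,
                           avg_image_kb: int = 500, days: int = 30) -> float:
    """Monthly image egress in TB (decimal units: 1 TB = 1,000,000,000 KB)."""
    return products * daily_views_per_product * avg_image_kb * days / 1_000_000_000


print(storage_gb(500_000, 5))                 # 1500.0
print(round(db_size_gb(500_000, 300_000), 1))
print(bandwidth_tb_per_month(500_000, 1))     # 7.5
```
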
---
## See Also
- [Pricing Strategy](../marketing/pricing.md) - Tier definitions and limits
- [Multi-Tenant Architecture](multi-tenant.md) - How client isolation works
- [Background Tasks](background-tasks.md) - Task queue scaling
- [Production Deployment](../deployment/production.md) - Deployment guidelines