feat: add capacity planning docs, image upload system, and platform health monitoring

Documentation:
- Add comprehensive capacity planning guide (docs/architecture/capacity-planning.md)
- Add operations docs: platform-health, capacity-monitoring, image-storage
- Link pricing strategy to capacity planning documentation
- Update mkdocs.yml with new Operations section

Image Upload System:
- Add ImageService with WebP conversion and sharded directory structure
- Generate multiple size variants (original, 800px, 200px)
- Add storage stats endpoint for monitoring
- Add Pillow dependency for image processing

Platform Health Monitoring:
- Add /admin/platform-health page with real-time metrics
- Show CPU, memory, disk usage with progress bars
- Display capacity thresholds with status indicators
- Generate scaling recommendations automatically
- Determine infrastructure tier based on usage
- Add psutil dependency for system metrics

Admin UI:
- Add Capacity Monitor to Platform Health section in sidebar
- Create platform-health.html template with stats cards
- Create platform-health.js for Alpine.js state management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
454
docs/architecture/capacity-planning.md
Normal file
@@ -0,0 +1,454 @@
|
||||
# Capacity Planning & Infrastructure Sizing
|
||||
|
||||
This document provides comprehensive capacity planning guidelines for the Wizamart platform, including resource requirements, scaling thresholds, and monitoring recommendations.
|
||||
|
||||
> **Related:** [Pricing Strategy](../marketing/pricing.md) for tier definitions and limits
|
||||
|
||||
---
|
||||
|
||||
## Tier Resource Allocations
|
||||
|
||||
Based on our [pricing tiers](../marketing/pricing.md), here are the expected resource requirements per client:
|
||||
|
||||
| Metric | Essential (€49) | Professional (€99) | Business (€199) | Enterprise (€399+) |
|
||||
|--------|-----------------|--------------------|-----------------|--------------------|
|
||||
| Products | 200 | 500 | 2,000 | Unlimited |
|
||||
| Images per product | 3 | 5 | 8 | 10+ |
|
||||
| Orders per month | 100 | 500 | 2,000 | Unlimited |
|
||||
| SKU variants | 1.2x | 1.5x | 2x | 3x |
|
||||
| Team members | 1 | 3 | 10 | Unlimited |
|
||||
| API requests/day | 1,000 | 5,000 | 20,000 | Unlimited |
|
||||
|
||||
---
|
||||
|
||||
## Scale Projections
|
||||
|
||||
### Target: 1,000 Business Clients (€149/month tier)
|
||||
|
||||
This represents our primary growth target. Here's the infrastructure impact:
|
||||
|
||||
| Resource | Calculation | Total |
|
||||
|----------|-------------|-------|
|
||||
| **Products** | 1,000 clients × 500 products | **500,000** |
|
||||
| **Product Translations** | 500,000 × 4 languages | **2,000,000 rows** |
|
||||
| **Images (files)** | 500,000 × 5 images × 3 sizes | **7,500,000 files** |
|
||||
| **Image Storage** | 7.5M files × 200KB avg | **1.5 TB** |
|
||||
| **Database Size** | Products + translations + orders + indexes | **15-25 GB** |
|
||||
| **Monthly Orders** | 1,000 clients × 300 orders | **300,000 orders** |
|
||||
| **Order Items** | 300,000 × 2.5 avg items | **750,000 items/month** |
|
||||
| **Monthly API Requests** | 1,000 × 10,000 req/day × 30 | **300M requests** |
|
||||
|
||||
### Multi-Tier Mix (Realistic Scenario)
|
||||
|
||||
More realistic distribution across tiers:
|
||||
|
||||
| Tier | Clients | Products Each | Total Products | Monthly Orders |
|
||||
|------|---------|---------------|----------------|----------------|
|
||||
| Essential | 500 | 100 | 50,000 | 50,000 |
|
||||
| Professional | 300 | 300 | 90,000 | 150,000 |
|
||||
| Business | 150 | 1,000 | 150,000 | 300,000 |
|
||||
| Enterprise | 50 | 3,000 | 150,000 | 200,000 |
|
||||
| **Total** | **1,000** | - | **440,000** | **700,000** |
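
The totals in the tier-mix table can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic; the `TierMix` class and its field names are illustrative and not part of the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierMix:
    """One row of the tier-mix table (names are illustrative)."""
    name: str
    clients: int
    products_each: int
    monthly_orders: int  # total for the whole tier, as in the table


MIX = [
    TierMix("Essential", 500, 100, 50_000),
    TierMix("Professional", 300, 300, 150_000),
    TierMix("Business", 150, 1_000, 300_000),
    TierMix("Enterprise", 50, 3_000, 200_000),
]


def totals(mix):
    """Return (clients, products, monthly_orders) summed over all tiers."""
    clients = sum(t.clients for t in mix)
    products = sum(t.clients * t.products_each for t in mix)
    orders = sum(t.monthly_orders for t in mix)
    return clients, products, orders


print(totals(MIX))  # (1000, 440000, 700000)
```

Changing the client counts in `MIX` gives a quick way to test other distributions before committing to a server size.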
|
||||
|
||||
---
|
||||
|
||||
## Server Sizing Recommendations
|
||||
|
||||
### Infrastructure Tiers
|
||||
|
||||
| Scale | Clients | vCPU | RAM | Storage | Database | Monthly Cost |
|
||||
|-------|---------|------|-----|---------|----------|--------------|
|
||||
| **Starter** | 1-50 | 2 | 4GB | 100GB SSD | SQLite | €20-40 |
|
||||
| **Small** | 50-100 | 4 | 8GB | 250GB SSD | PostgreSQL | €60-100 |
|
||||
| **Medium** | 100-300 | 4 | 16GB | 500GB SSD | PostgreSQL | €100-180 |
|
||||
| **Large** | 300-500 | 8 | 32GB | 1TB SSD | PostgreSQL + Redis | €250-400 |
|
||||
| **Scale** | 500-1000 | 16 | 64GB | 2TB SSD + CDN | PostgreSQL + Redis | €500-900 |
|
||||
| **Enterprise** | 1000+ | 32+ | 128GB+ | 4TB+ + CDN | PostgreSQL cluster | €1,500+ |
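
As a rough illustration, the client ranges in the table above can be turned into a lookup. This sketch assumes each upper bound is exclusive (so exactly 50 clients already counts as Small, matching the "STARTER → SMALL (50 clients)" transition); the function name is hypothetical and the real tier detection may also weigh CPU, storage, or database size.

```python
from bisect import bisect_right

# Upper client bounds from the table; each bound is treated as exclusive,
# so exactly 50 clients maps to "Small" (an assumption, see lead-in).
_BOUNDS = [50, 100, 300, 500, 1000]
_TIERS = ["Starter", "Small", "Medium", "Large", "Scale", "Enterprise"]


def infrastructure_tier(active_clients: int) -> str:
    """Map an active-client count to an infrastructure tier name."""
    if active_clients < 0:
        raise ValueError("active_clients must be non-negative")
    return _TIERS[bisect_right(_BOUNDS, active_clients)]


print(infrastructure_tier(49))    # Starter
print(infrastructure_tier(50))    # Small
print(infrastructure_tier(1000))  # Enterprise
```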
|
||||
|
||||
### Recommended Configurations
|
||||
|
||||
#### Starter (1-50 clients)
|
||||
```
|
||||
Single Server Setup:
|
||||
- Hetzner CX22 or similar (2 vCPU, 4GB RAM)
|
||||
- 100GB SSD storage
|
||||
- SQLite database
|
||||
- nginx for static files + reverse proxy
|
||||
- Estimated cost: €20-40/month
|
||||
```
|
||||
|
||||
#### Small-Medium (50-300 clients)
|
||||
```
|
||||
Two-Server Setup:
|
||||
- App Server: 4 vCPU, 8-16GB RAM
|
||||
- Database: Managed PostgreSQL (basic tier)
|
||||
- Storage: Local SSD + backup
|
||||
- Optional: Redis for sessions/caching
|
||||
- Estimated cost: €80-180/month
|
||||
```
|
||||
|
||||
#### Large (300-1000 clients)
|
||||
```
|
||||
Multi-Component Setup:
|
||||
- Load Balancer: nginx or cloud LB
|
||||
- App Servers: 2-4 × (4 vCPU, 8GB RAM)
|
||||
- Database: Managed PostgreSQL (production tier)
|
||||
- Cache: Redis (managed or self-hosted)
|
||||
- Storage: Object storage (S3-compatible) + CDN
|
||||
- Estimated cost: €400-900/month
|
||||
```
|
||||
|
||||
#### Enterprise (1000+ clients)
|
||||
```
|
||||
Full Production Setup:
|
||||
- CDN: Cloudflare or similar
|
||||
- Load Balancer: Cloud-native with health checks
|
||||
- App Servers: 4-8 × (4 vCPU, 16GB RAM) with auto-scaling
|
||||
- Database: PostgreSQL with read replicas
|
||||
- Cache: Redis cluster
|
||||
- Storage: S3 + CloudFront or equivalent
|
||||
- Monitoring: Prometheus + Grafana
|
||||
- Estimated cost: €1,500+/month
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Image Storage Architecture
|
||||
|
||||
### Capacity Calculations
|
||||
|
||||
| Image Size (optimized) | Files per 25GB | Files per 100GB | Files per 1TB |
|
||||
|------------------------|----------------|-----------------|---------------|
|
||||
| 100KB (thumbnails) | 250,000 | 1,000,000 | 10,000,000 |
|
||||
| 200KB (web-ready) | 125,000 | 500,000 | 5,000,000 |
|
||||
| 300KB (high quality) | 83,000 | 333,000 | 3,330,000 |
|
||||
| 500KB (original) | 50,000 | 200,000 | 2,000,000 |
|
||||
|
||||
### Image Sizes Generated
|
||||
|
||||
Each uploaded image generates 3 variants:
|
||||
|
||||
| Variant | Dimensions | Typical Size | Use Case |
|
||||
|---------|------------|--------------|----------|
|
||||
| `thumb` | 200×200 | 10-20KB | List views, grids |
|
||||
| `medium` | 800×800 | 80-150KB | Product cards, previews |
|
||||
| `original` | As uploaded | 200-500KB | Detail views, zoom |
|
||||
|
||||
**Storage per product:** ~600KB (with 3 sizes for main image + 2 additional images)
|
||||
|
||||
### Directory Structure (Sharded)
|
||||
|
||||
To prevent filesystem performance degradation, images are stored in a sharded directory structure:
|
||||
|
||||
```
|
||||
/uploads/
└── products/
    ├── 00/                          # First 2 chars of hash
    │   ├── 1a/                      # Next 2 chars
    │   │   ├── 001a2b3c_original.webp
    │   │   ├── 001a2b3c_800.webp
    │   │   └── 001a2b3c_200.webp
    │   └── 2b/
    │       └── ...
    ├── 01/
    └── ...
|
||||
```
|
||||
|
||||
This structure ensures:
|
||||
- Maximum ~256 subdirectories per level
|
||||
- ~16 files per leaf directory at 1M total images
|
||||
- Fast filesystem operations even at scale
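
The figures in this list follow from simple arithmetic: two hex characters give 256 directories per level, and two levels give 65,536 leaf directories. The sketch below works this through, assuming the hash distributes files uniformly; the names are illustrative.

```python
HEX_CHARS_PER_LEVEL = 2
LEVELS = 2

dirs_per_level = 16 ** HEX_CHARS_PER_LEVEL   # 256 subdirectories per level
leaf_dirs = dirs_per_level ** LEVELS         # 65,536 leaf directories


def files_per_leaf(total_files: int) -> float:
    """Average files per leaf directory, assuming a uniform hash."""
    return total_files / leaf_dirs


print(dirs_per_level, leaf_dirs)            # 256 65536
print(round(files_per_leaf(1_000_000), 1))  # 15.3
print(round(files_per_leaf(7_500_000), 1))  # 114.4
```

Even at the 7.5M files projected for 1,000 clients, the average leaf directory stays far below the 10,000-file mark.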
|
||||
|
||||
### Performance Thresholds
|
||||
|
||||
| Files per Directory | Performance | Required Action |
|
||||
|---------------------|-------------|-----------------|
|
||||
| < 10,000 | Excellent | None |
|
||||
| 10,000 - 100,000 | Good | Monitor, plan sharding |
|
||||
| 100,000 - 500,000 | Degraded | **Implement sharding** |
|
||||
| > 500,000 | Poor | **Migrate to object storage** |
|
||||
|
||||
---
|
||||
|
||||
## Database Performance
|
||||
|
||||
### Table Size Guidelines
|
||||
|
||||
| Rows | Query Time | Status |
|------|------------|--------|
| < 10,000 | < 1ms | Excellent |
| 10,000 - 100,000 | 1-10ms | Good |
| 100,000 - 1,000,000 | 10-50ms | **Add indexes, optimize queries** |
| 1,000,000 - 10,000,000 | 50-200ms | **Consider partitioning** |
| > 10,000,000 | Variable | **Sharding or dedicated DB** |
|
||||
|
||||
### Critical Indexes
|
||||
|
||||
Ensure these indexes exist at scale:
|
||||
|
||||
```sql
|
||||
-- Products
|
||||
CREATE INDEX idx_product_vendor_active ON products(vendor_id, is_active);
|
||||
CREATE INDEX idx_product_gtin ON products(gtin);
|
||||
CREATE INDEX idx_product_vendor_sku ON products(vendor_id, vendor_sku);
|
||||
|
||||
-- Orders
|
||||
CREATE INDEX idx_order_vendor_status ON orders(vendor_id, status);
|
||||
CREATE INDEX idx_order_created ON orders(created_at DESC);
|
||||
CREATE INDEX idx_order_customer ON orders(customer_id);
|
||||
|
||||
-- Inventory
|
||||
CREATE INDEX idx_inventory_product_location ON inventory(product_id, warehouse, bin_location);
|
||||
CREATE INDEX idx_inventory_vendor ON inventory(vendor_id);
|
||||
```
|
||||
|
||||
### Database Size Estimates
|
||||
|
||||
| Component | Size per 100K Products | Size per 1M Products |
|
||||
|-----------|------------------------|----------------------|
|
||||
| Products table | 100 MB | 1 GB |
|
||||
| Translations (4 langs) | 400 MB | 4 GB |
|
||||
| Orders (1 year) | 500 MB | 5 GB |
|
||||
| Order items | 200 MB | 2 GB |
|
||||
| Inventory | 50 MB | 500 MB |
|
||||
| Indexes | 300 MB | 3 GB |
|
||||
| **Total** | **~1.5 GB** | **~15 GB** |
|
||||
|
||||
---
|
||||
|
||||
## Bandwidth & Network
|
||||
|
||||
### Monthly Bandwidth Estimates (1000 clients)
|
||||
|
||||
| Traffic Type | Calculation | Monthly Volume |
|
||||
|--------------|-------------|----------------|
|
||||
| Image views | 500K products × 10 views × 500KB | **2.5 TB** |
|
||||
| API requests | 10K req/client/day × 1000 × 30 × 2KB | **600 GB** |
|
||||
| Static assets | CSS/JS cached, minimal | **50 GB** |
|
||||
| **Total Egress** | | **~3 TB/month** |
|
||||
|
||||
### Bandwidth Costs (Approximate)
|
||||
|
||||
| Provider | First 1TB | Additional per TB |
|
||||
|----------|-----------|-------------------|
|
||||
| Hetzner | Included | €1/TB |
|
||||
| AWS | $90 | $85/TB |
|
||||
| DigitalOcean | 1TB free | $10/TB |
|
||||
| Cloudflare | Unlimited (CDN) | Free |
|
||||
|
||||
**Recommendation:** Use Cloudflare for image CDN to eliminate egress costs.
|
||||
|
||||
---
|
||||
|
||||
## Scaling Triggers & Thresholds
|
||||
|
||||
### When to Scale Up
|
||||
|
||||
| Metric | Warning | Critical | Action |
|
||||
|--------|---------|----------|--------|
|
||||
| CPU Usage | > 70% avg | > 85% avg | Add app server |
|
||||
| Memory Usage | > 75% | > 90% | Upgrade RAM or add server |
|
||||
| Disk Usage | > 70% | > 85% | Expand storage |
|
||||
| DB Query Time (p95) | > 100ms | > 500ms | Optimize queries, add indexes |
|
||||
| API Response Time (p95) | > 500ms | > 2000ms | Scale horizontally |
|
||||
| DB Connections | > 80% max | > 95% max | Add connection pooling |
|
||||
| Error Rate | > 1% | > 5% | Investigate and fix |
|
||||
|
||||
### Architecture Transition Points
|
||||
|
||||
```
|
||||
STARTER → SMALL (50 clients)
|
||||
├── Trigger: SQLite becomes bottleneck
|
||||
├── Action: Migrate to PostgreSQL
|
||||
└── Cost increase: +€40-60/month
|
||||
|
||||
SMALL → MEDIUM (100 clients)
|
||||
├── Trigger: Single server at 70%+ CPU
|
||||
├── Action: Separate DB server
|
||||
└── Cost increase: +€50-80/month
|
||||
|
||||
MEDIUM → LARGE (300 clients)
|
||||
├── Trigger: Need for caching, higher availability
|
||||
├── Action: Add Redis, consider multiple app servers
|
||||
└── Cost increase: +€150-200/month
|
||||
|
||||
LARGE → SCALE (500 clients)
|
||||
├── Trigger: Storage >500GB, high traffic
|
||||
├── Action: Object storage + CDN, load balancing
|
||||
└── Cost increase: +€200-400/month
|
||||
|
||||
SCALE → ENTERPRISE (1000+ clients)
|
||||
├── Trigger: High availability requirements, SLA
|
||||
├── Action: Full redundancy, read replicas, auto-scaling
|
||||
└── Cost increase: +€600-1000/month
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Monitoring Requirements
|
||||
|
||||
### Essential Metrics
|
||||
|
||||
Track these metrics for capacity planning:
|
||||
|
||||
#### Infrastructure
|
||||
- CPU utilization (per server)
|
||||
- Memory utilization
|
||||
- Disk I/O and usage
|
||||
- Network throughput
|
||||
|
||||
#### Application
|
||||
- Request latency (p50, p95, p99)
|
||||
- Request rate (per endpoint)
|
||||
- Error rate by type
|
||||
- Active sessions
|
||||
|
||||
#### Database
|
||||
- Query execution time
|
||||
- Connection pool usage
|
||||
- Table sizes
|
||||
- Index usage
|
||||
|
||||
#### Business
|
||||
- Active clients
|
||||
- Products per client
|
||||
- Orders per day
|
||||
- API calls per client
|
||||
|
||||
### Monitoring Dashboard
|
||||
|
||||
The admin platform includes a **Capacity Monitoring** page at `/admin/platform-health` with:
|
||||
|
||||
1. **Current Usage** - Real-time resource utilization
|
||||
2. **Growth Trends** - Historical charts for planning
|
||||
3. **Threshold Alerts** - Warning and critical indicators
|
||||
4. **Scaling Recommendations** - Automated suggestions
|
||||
|
||||
See the [Platform Health Monitoring](#platform-health-monitoring) section below.
|
||||
|
||||
---
|
||||
|
||||
## Cost Analysis
|
||||
|
||||
### Infrastructure Cost per Client
|
||||
|
||||
| Scale | Clients | Monthly Infra | Cost/Client |
|
||||
|-------|---------|---------------|-------------|
|
||||
| Starter | 25 | €30 | €1.20 |
|
||||
| Small | 75 | €80 | €1.07 |
|
||||
| Medium | 200 | €150 | €0.75 |
|
||||
| Large | 400 | €350 | €0.88 |
|
||||
| Scale | 800 | €700 | €0.88 |
|
||||
| Enterprise | 1500 | €1,800 | €1.20 |
|
||||
|
||||
### Revenue vs Infrastructure Cost
|
||||
|
||||
At 1,000 Business tier clients (€149/month):
|
||||
|
||||
| Item | Monthly |
|
||||
|------|---------|
|
||||
| **Revenue** | €149,000 |
|
||||
| Infrastructure | €700-900 |
|
||||
| Support (est.) | €3,000 |
|
||||
| Development (est.) | €5,000 |
|
||||
| **Gross Margin** | **~94%** |
|
||||
|
||||
---
|
||||
|
||||
## Disaster Recovery
|
||||
|
||||
### Backup Strategy by Scale
|
||||
|
||||
| Scale | Database Backup | File Backup | RTO | RPO |
|
||||
|-------|----------------|-------------|-----|-----|
|
||||
| Starter | Daily SQLite copy | Daily rsync | 4h | 24h |
|
||||
| Small | Daily pg_dump | Daily sync | 2h | 12h |
|
||||
| Medium | Managed backups | S3 versioning | 1h | 6h |
|
||||
| Large | Point-in-time | S3 + cross-region | 30m | 1h |
|
||||
| Enterprise | Streaming replicas | Multi-region | 5m | 5m |
|
||||
|
||||
---
|
||||
|
||||
## Platform Health Monitoring
|
||||
|
||||
The admin dashboard includes a dedicated capacity monitoring page that tracks:
|
||||
|
||||
### Metrics Displayed
|
||||
|
||||
1. **Client Growth**
   - Total active clients
   - New clients this month
   - Churn rate

2. **Resource Usage**
   - Total products across all vendors
   - Total images stored
   - Database size
   - Storage usage

3. **Performance Indicators**
   - Average API response time
   - Database query latency
   - Error rate

4. **Threshold Status**
   - Current infrastructure tier
   - Distance to next threshold
   - Recommended actions
|
||||
|
||||
### Alert Configuration
|
||||
|
||||
Configure alerts for proactive scaling:
|
||||
|
||||
```python
|
||||
CAPACITY_THRESHOLDS = {
    "products_total": {
        "warning": 400_000,   # 80% of 500K
        "critical": 475_000,  # 95% of 500K
    },
    "storage_gb": {
        "warning": 800,   # 80% of 1TB
        "critical": 950,
    },
    "db_size_gb": {
        "warning": 20,
        "critical": 24,
    },
    "avg_response_ms": {
        "warning": 200,
        "critical": 500,
    },
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### TL;DR Sizing Guide
|
||||
|
||||
| Clients | Server | RAM | Storage | Database | Monthly Cost |
|
||||
|---------|--------|-----|---------|----------|--------------|
|
||||
| 1-50 | 2 vCPU | 4GB | 100GB | SQLite | €30 |
|
||||
| 50-100 | 4 vCPU | 8GB | 250GB | PostgreSQL | €80 |
|
||||
| 100-300 | 4 vCPU | 16GB | 500GB | PostgreSQL | €150 |
|
||||
| 300-500 | 8 vCPU | 32GB | 1TB | PostgreSQL + Redis | €350 |
|
||||
| 500-1000 | 16 vCPU | 64GB | 2TB + CDN | PostgreSQL + Redis | €700 |
|
||||
| 1000+ | 32+ vCPU | 128GB+ | 4TB+ + CDN | PostgreSQL cluster | €1,500+ |
|
||||
|
||||
### Key Formulas
|
||||
|
||||
```
|
||||
Storage (GB) = (Products × Images × 3 sizes × 200KB) / 1,000,000
|
||||
DB Size (GB) = Products × 0.00003 + Orders × 0.00002
|
||||
Bandwidth (TB/mo) = Products × Daily Views × 500KB × 30 / 1,000,000,000
|
||||
```
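
The formulas above translate directly into Python. This is a sketch under the same assumptions as the formulas (decimal units, 200KB average file size, 500KB per image view); the function and parameter names are illustrative.

```python
KB_PER_GB = 1_000_000        # decimal units, as in the formulas above
KB_PER_TB = 1_000_000_000


def storage_gb(products, images_per_product, sizes=3, avg_file_kb=200):
    """Image storage in GB."""
    return products * images_per_product * sizes * avg_file_kb / KB_PER_GB


def db_size_gb(products, orders):
    """Database size in GB, using the per-row coefficients above."""
    return products * 0.00003 + orders * 0.00002


def bandwidth_tb_per_month(products, daily_views, avg_view_kb=500, days=30):
    """Monthly image egress in TB."""
    return products * daily_views * avg_view_kb * days / KB_PER_TB


print(storage_gb(500_000, 5))  # 1500.0
```

The printed value matches the 1.5 TB image storage estimate for 500,000 products with 5 images each.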
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [Pricing Strategy](../marketing/pricing.md) - Tier definitions and limits
|
||||
- [Multi-Tenant Architecture](multi-tenant.md) - How client isolation works
|
||||
- [Background Tasks](background-tasks.md) - Task queue scaling
|
||||
- [Production Deployment](../deployment/production.md) - Deployment guidelines
|
||||
@@ -6,6 +6,8 @@
|
||||
|
||||
A focused Order Management System built specifically for Luxembourg e-commerce. Works alongside Letzshop, not instead of it. Provides the operational tools Letzshop lacks: real inventory, correct invoicing, customer ownership.
|
||||
|
||||
> **Infrastructure Planning:** See [Capacity Planning](../architecture/capacity-planning.md) for resource requirements, server sizing, and scaling guidelines per tier.
|
||||
|
||||
---
|
||||
|
||||
## Market Context
|
||||
|
||||
121
docs/operations/capacity-monitoring.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# Capacity Monitoring
|
||||
|
||||
Detailed guide for monitoring and managing platform capacity.
|
||||
|
||||
## Overview
|
||||
|
||||
The Capacity Monitoring page (`/admin/platform-health/capacity`) provides insights into resource consumption and helps plan infrastructure scaling.
|
||||
|
||||
## Key Metrics
|
||||
|
||||
### Client Metrics
|
||||
|
||||
| Metric | Description | Threshold Indicator |
|
||||
|--------|-------------|---------------------|
|
||||
| Active Clients | Vendors with activity in last 30 days | Scale planning |
|
||||
| Total Products | Sum across all vendors | Storage/DB sizing |
|
||||
| Products per Client | Average products per vendor | Tier compliance |
|
||||
| Monthly Orders | Order volume this month | Performance impact |
|
||||
|
||||
### Storage Metrics
|
||||
|
||||
| Metric | Description | Warning | Critical |
|
||||
|--------|-------------|---------|----------|
|
||||
| Image Files | Total files in storage | 80% of limit | 95% of limit |
|
||||
| Image Storage (GB) | Total size in gigabytes | 80% of disk | 95% of disk |
|
||||
| Database Size (GB) | PostgreSQL data size | 80% of allocation | 95% of allocation |
|
||||
| Backup Size (GB) | Latest backup size | Informational | N/A |
|
||||
|
||||
### Performance Metrics
|
||||
|
||||
| Metric | Good | Warning | Critical |
|
||||
|--------|------|---------|----------|
|
||||
| Avg Response Time | < 100ms | 100-300ms | > 300ms |
|
||||
| DB Query Time (p95) | < 50ms | 50-200ms | > 200ms |
|
||||
| Cache Hit Rate | > 90% | 70-90% | < 70% |
|
||||
| Connection Pool Usage | < 70% | 70-90% | > 90% |
|
||||
|
||||
## Scaling Recommendations
|
||||
|
||||
The system provides automatic scaling recommendations based on current usage:
|
||||
|
||||
### Example Recommendations
|
||||
|
||||
```
|
||||
Current Infrastructure: MEDIUM (100-300 clients)
|
||||
Current Usage: 85% of capacity
|
||||
|
||||
Recommendations:
|
||||
1. [WARNING] Approaching product limit (420K of 500K)
|
||||
→ Consider upgrading to LARGE tier
|
||||
|
||||
2. [INFO] Database size growing 5GB/month
|
||||
→ Plan storage expansion in 3 months
|
||||
|
||||
3. [OK] API response times within normal range
|
||||
→ No action needed
|
||||
```
|
||||
|
||||
## Threshold Configuration
|
||||
|
||||
Edit thresholds in the admin settings or via environment:
|
||||
|
||||
```python
|
||||
# Capacity thresholds (can be configured per deployment)
CAPACITY_THRESHOLDS = {
    # Products
    "products_total": {
        "warning": 400_000,
        "critical": 475_000,
        "limit": 500_000,
    },
    # Storage (GB)
    "storage_gb": {
        "warning": 800,
        "critical": 950,
        "limit": 1000,
    },
    # Database (GB)
    "db_size_gb": {
        "warning": 20,
        "critical": 24,
        "limit": 25,
    },
    # Monthly orders
    "monthly_orders": {
        "warning": 250_000,
        "critical": 280_000,
        "limit": 300_000,
    },
}
|
||||
```
|
||||
|
||||
## Historical Trends
|
||||
|
||||
View growth trends to plan ahead:
|
||||
|
||||
- **30-day growth rate**: Products, storage, clients
|
||||
- **Projected capacity date**: When limits will be reached
|
||||
- **Seasonal patterns**: Order volume fluctuations
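
A projected capacity date can be estimated with a simple linear extrapolation of the 30-day growth rate. The sketch below is one possible approach, not necessarily how the dashboard computes it; the function name and signature are hypothetical, and a linear model ignores seasonal patterns.

```python
from datetime import date, timedelta
from typing import Optional


def projected_capacity_date(current: float, limit: float,
                            growth_per_30d: float,
                            today: date) -> Optional[date]:
    """Linear projection of when `current` reaches `limit`.

    Returns `today` when the limit is already reached and None when
    usage is flat or shrinking.
    """
    if current >= limit:
        return today
    if growth_per_30d <= 0:
        return None
    days = (limit - current) / (growth_per_30d / 30)
    return today + timedelta(days=round(days))


# 15 GB used of 25 GB, growing 5 GB per 30 days: limit reached in 60 days.
print(projected_capacity_date(15, 25, 5, date(2025, 1, 1)))  # 2025-03-02
```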
|
||||
|
||||
## Alerts
|
||||
|
||||
Capacity alerts trigger when:
|
||||
|
||||
1. **Warning (Yellow)**: usage reaches 80% of a limit
2. **Critical (Red)**: usage reaches 95% of a limit
3. **Exceeded**: usage is at or above 100% of a limit (immediate action required)
|
||||
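
The alert levels can be expressed as a small classification function. This is a sketch that assumes the percentages are measured against the configured `limit` value; the function name is hypothetical.

```python
def alert_level(value: float, limit: float) -> str:
    """Classify usage against a limit using the 80% / 95% / 100% rules."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    ratio = value / limit
    if ratio >= 1.0:
        return "exceeded"
    if ratio >= 0.95:
        return "critical"
    if ratio >= 0.80:
        return "warning"
    return "ok"


print(alert_level(420_000, 500_000))  # warning
```

The example mirrors the "420K of 500K" product count from the sample recommendations, which sits at 84% of the limit.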
|
||||
## Export Reports
|
||||
|
||||
Generate capacity reports for planning:
|
||||
|
||||
- **Weekly summary**: PDF or CSV
|
||||
- **Monthly capacity report**: Detailed analysis
|
||||
- **Projection report**: 3/6/12 month forecasts
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Capacity Planning](../architecture/capacity-planning.md) - Full sizing guide
|
||||
- [Platform Health](platform-health.md) - Real-time health monitoring
|
||||
- [Image Storage](image-storage.md) - Image system details
|
||||
246
docs/operations/image-storage.md
Normal file
@@ -0,0 +1,246 @@
|
||||
# Image Storage System
|
||||
|
||||
Documentation for the platform's image storage and management system.
|
||||
|
||||
## Overview
|
||||
|
||||
The Wizamart platform uses a self-hosted image storage system with:
|
||||
|
||||
- **Sharded directory structure** for filesystem performance
|
||||
- **Automatic WebP conversion** for optimization
|
||||
- **Multiple size variants** for different use cases
|
||||
- **CDN-ready architecture** for scaling
|
||||
|
||||
## Storage Architecture
|
||||
|
||||
### Directory Structure
|
||||
|
||||
Images are stored in a sharded directory structure to prevent filesystem performance degradation:
|
||||
|
||||
```
|
||||
/static/uploads/
└── products/
    ├── 00/                          # First 2 chars of hash
    │   ├── 1a/                      # Next 2 chars
    │   │   ├── 001a2b3c_original.webp
    │   │   ├── 001a2b3c_800.webp
    │   │   └── 001a2b3c_200.webp
    │   └── 2b/
    │       └── ...
    ├── 01/
    │   └── ...
    └── ff/
        └── ...
|
||||
```
|
||||
|
||||
### Hash Generation
|
||||
|
||||
The file hash is generated from:
|
||||
```python
|
||||
import hashlib
file_hash = hashlib.md5(f"{vendor_id}:{product_id}:{timestamp}:{original_filename}".encode()).hexdigest()[:8]
|
||||
```
|
||||
|
||||
This ensures:
|
||||
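- Practically unique file paths (8 hex characters give about 4.3 billion possible IDs, so collisions are rare but still possible at scale)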
|
||||
- Even distribution across directories
|
||||
- Predictable file locations
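
To make the mapping from hash to file location concrete, here is a minimal sketch. The helper names, the `root` default, and the `<hash>_<variant>.webp` naming are taken from the directory listing above or assumed for illustration; the actual `ImageService` code may differ.

```python
import hashlib
from pathlib import PurePosixPath


def image_id(vendor_id: int, product_id: int, timestamp: float,
             original_filename: str) -> str:
    """First 8 hex characters of the MD5 digest of the identifying fields."""
    key = f"{vendor_id}:{product_id}:{timestamp}:{original_filename}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()[:8]


def sharded_path(file_hash: str, variant: str,
                 root: str = "/static/uploads/products") -> PurePosixPath:
    """Build <root>/<hash[0:2]>/<hash[2:4]>/<hash>_<variant>.webp."""
    return PurePosixPath(root, file_hash[:2], file_hash[2:4],
                         f"{file_hash}_{variant}.webp")


print(sharded_path("001a2b3c", "800"))
# /static/uploads/products/00/1a/001a2b3c_800.webp
```

Because the path depends only on the hash and the variant name, any service that stores the 8-character ID can rebuild all three URLs without a lookup.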
|
||||
|
||||
## Image Variants
|
||||
|
||||
Each uploaded image generates multiple variants:
|
||||
|
||||
| Variant | Max Dimensions | Format | Use Case |
|
||||
|---------|---------------|--------|----------|
|
||||
| `original` | As uploaded (max 2000px) | WebP | Detail view, zoom |
|
||||
| `800` | 800×800 | WebP | Product cards |
|
||||
| `200` | 200×200 | WebP | Thumbnails, grids |
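
The "max dimensions" in the table can be read as a bounding box. The sketch below computes the resulting pixel size of each variant, assuming the resize preserves aspect ratio and never upscales (the behaviour of Pillow's `Image.thumbnail`); the names are illustrative and the actual service may resize differently.

```python
from typing import Dict, Tuple

# Longest allowed edge per variant; `original` is capped at 2000px per the table.
VARIANT_MAX_EDGE: Dict[str, int] = {"original": 2000, "800": 800, "200": 200}


def fit_within(width: int, height: int, max_edge: int) -> Tuple[int, int]:
    """Scale (width, height) down to fit a max_edge square, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # never upscale
    scale = max_edge / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def variant_dimensions(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Pixel dimensions of every variant for an upload of the given size."""
    return {name: fit_within(width, height, edge)
            for name, edge in VARIANT_MAX_EDGE.items()}


print(variant_dimensions(1200, 900))
# {'original': (1200, 900), '800': (800, 600), '200': (200, 150)}
```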
|
||||
|
||||
### Size Estimates
|
||||
|
||||
| Original Size | After Processing | Storage per Image |
|
||||
|---------------|------------------|-------------------|
|
||||
| 2MB JPEG | ~200KB (original) + 80KB (800) + 15KB (200) | ~295KB |
|
||||
| 500KB JPEG | ~150KB (original) + 60KB (800) + 12KB (200) | ~222KB |
|
||||
| 100KB JPEG | ~80KB (original) + 40KB (800) + 10KB (200) | ~130KB |
|
||||
|
||||
**Average: ~200KB per image (all variants)**
|
||||
|
||||
## Upload Process
|
||||
|
||||
### API Endpoint
|
||||
|
||||
```http
|
||||
POST /api/v1/admin/images/upload
|
||||
Content-Type: multipart/form-data
|
||||
|
||||
file: <binary>
|
||||
vendor_id: 123
|
||||
product_id: 456 (optional, for product images)
|
||||
type: product|category|banner
|
||||
```
|
||||
|
||||
### Response
|
||||
|
||||
```json
|
||||
{
  "success": true,
  "image": {
    "id": "001a2b3c",
    "urls": {
      "original": "/uploads/products/00/1a/001a2b3c_original.webp",
      "medium": "/uploads/products/00/1a/001a2b3c_800.webp",
      "thumb": "/uploads/products/00/1a/001a2b3c_200.webp"
    },
    "size_bytes": 295000,
    "dimensions": {
      "width": 1200,
      "height": 1200
    }
  }
}
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```bash
|
||||
# Image storage configuration
|
||||
IMAGE_UPLOAD_DIR=/var/www/uploads
|
||||
IMAGE_MAX_SIZE_MB=10
|
||||
IMAGE_ALLOWED_TYPES=jpg,jpeg,png,gif,webp
|
||||
IMAGE_QUALITY=85
|
||||
IMAGE_MAX_DIMENSION=2000
|
||||
```
|
||||
|
||||
### Python Configuration
|
||||
|
||||
```python
|
||||
# app/core/config.py
class ImageSettings:
    UPLOAD_DIR: str = "/static/uploads"
    MAX_SIZE_MB: int = 10
    ALLOWED_TYPES: list = ["jpg", "jpeg", "png", "gif", "webp"]
    QUALITY: int = 85
    MAX_DIMENSION: int = 2000

    # Generated sizes
    SIZES: dict = {
        "original": None,  # No resize, just optimize
        "medium": 800,
        "thumb": 200,
    }
|
||||
```
|
||||
|
||||
## Performance Guidelines
|
||||
|
||||
### Filesystem Limits
|
||||
|
||||
| Files per Directory | Status | Action |
|
||||
|---------------------|--------|--------|
|
||||
| < 10,000 | OK | None needed |
|
||||
| 10,000 - 50,000 | Monitor | Plan migration |
|
||||
| 50,000 - 100,000 | Warning | Increase sharding depth |
|
||||
| > 100,000 | Critical | Migrate to object storage |
|
||||
|
||||
### Capacity Planning
|
||||
|
||||
| Products | Images (5/product) | Total Files (3 sizes) | Storage |
|
||||
|----------|--------------------|-----------------------|---------|
|
||||
| 10,000 | 50,000 | 150,000 | 30 GB |
|
||||
| 50,000 | 250,000 | 750,000 | 150 GB |
|
||||
| 100,000 | 500,000 | 1,500,000 | 300 GB |
|
||||
| 500,000 | 2,500,000 | 7,500,000 | 1.5 TB |
|
||||
|
||||
## CDN Integration
|
||||
|
||||
For production deployments, configure a CDN for image delivery:
|
||||
|
||||
### Cloudflare (Recommended)
|
||||
|
||||
1. Set up Cloudflare for your domain
|
||||
2. Configure page rules for `/uploads/*`:
   - Cache Level: Cache Everything
   - Edge Cache TTL: 1 month
   - Browser Cache TTL: 1 week
|
||||
|
||||
### nginx Configuration
|
||||
|
||||
```nginx
|
||||
location /uploads/ {
    alias /var/www/uploads/;
    expires 30d;
    add_header Cache-Control "public, immutable";
    add_header X-Content-Type-Options nosniff;

    # Serve WebP when the client supports it, falling back to the original for older browsers.
    # $webp_suffix is not a built-in variable; define it with a `map $http_accept` block in the http context.
    location ~ \.(jpg|jpeg|png)$ {
        try_files $uri$webp_suffix $uri =404;
    }
}
|
||||
```
|
||||
|
||||
## Maintenance
|
||||
|
||||
### Cleanup Orphaned Images
|
||||
|
||||
Remove images not referenced by any product:
|
||||
|
||||
```bash
|
||||
# Run via admin CLI
|
||||
python -m scripts.cleanup_orphaned_images --dry-run
|
||||
python -m scripts.cleanup_orphaned_images --execute
|
||||
```
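
The internals of `scripts.cleanup_orphaned_images` are not shown here. As a rough illustration of the idea behind a dry run, the sketch below lists stored files whose image ID is not in a set of referenced IDs. It assumes file names follow the `<id>_<variant>.webp` pattern; the function name and the way `referenced_ids` is obtained (for example, from a database query) are hypothetical.

```python
from pathlib import Path
from typing import List, Set


def find_orphaned_files(upload_root: Path, referenced_ids: Set[str]) -> List[Path]:
    """Return stored .webp files whose image ID is not in `referenced_ids`.

    File names are assumed to look like `<id>_<variant>.webp`.
    """
    orphans = []
    for path in sorted(upload_root.rglob("*.webp")):
        image_id = path.name.split("_", 1)[0]
        if image_id not in referenced_ids:
            orphans.append(path)
    return orphans
```

A dry run would only print the returned paths; an execute step would delete them after the list has been reviewed.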
|
||||
|
||||
### Regenerate Variants
|
||||
|
||||
If image quality settings change:
|
||||
|
||||
```bash
|
||||
# Regenerate all variants for a vendor
|
||||
python -m scripts.regenerate_images --vendor-id 123
|
||||
|
||||
# Regenerate all variants (use with caution)
|
||||
python -m scripts.regenerate_images --all
|
||||
```
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Metrics to Track
|
||||
|
||||
- Total file count
|
||||
- Storage used (GB)
|
||||
- Files per directory (max)
|
||||
- Upload success rate
|
||||
- Average processing time
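
The first three metrics can be collected with a single walk over the upload directory. This is a minimal sketch using only the standard library; the function name and returned keys are illustrative, and the actual storage stats endpoint may compute them differently.

```python
import os
from typing import Dict


def storage_stats(upload_root: str) -> Dict[str, float]:
    """Walk the upload tree and collect basic capacity metrics."""
    total_files = 0
    total_bytes = 0
    max_files_per_dir = 0
    for dirpath, _dirnames, filenames in os.walk(upload_root):
        max_files_per_dir = max(max_files_per_dir, len(filenames))
        for name in filenames:
            total_files += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return {
        "total_files": total_files,
        "total_size_gb": total_bytes / 1_000_000_000,  # decimal GB
        "max_files_per_dir": max_files_per_dir,
    }
```

On a tree with millions of files a full walk is slow, so in practice the result would likely be cached or computed by a background task.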
|
||||
|
||||
### Health Checks
|
||||
|
||||
The platform health page includes image storage metrics:
|
||||
|
||||
- Current file count
|
||||
- Storage usage
|
||||
- Directory distribution
|
||||
- Processing queue status
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**Upload fails with "File too large"**
|
||||
- Check `IMAGE_MAX_SIZE_MB` setting
|
||||
- Verify nginx `client_max_body_size`
|
||||
|
||||
**Images not displaying**
|
||||
- Check file permissions (should be readable by web server)
|
||||
- Verify URL paths match actual file locations
|
||||
|
||||
**Slow uploads**
|
||||
- Check disk I/O performance
|
||||
- Consider async processing queue
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Capacity Planning](../architecture/capacity-planning.md)
|
||||
- [Platform Health](platform-health.md)
|
||||
- [Capacity Monitoring](capacity-monitoring.md)
|
||||
92
docs/operations/platform-health.md
Normal file
@@ -0,0 +1,92 @@
|
||||
# Platform Health Monitoring
|
||||
|
||||
This guide covers the platform health monitoring features available in the admin dashboard.
|
||||
|
||||
## Overview
|
||||
|
||||
The Platform Health page (`/admin/platform-health`) provides real-time visibility into system performance, resource usage, and capacity thresholds.
|
||||
|
||||
## Accessing Platform Health
|
||||
|
||||
Navigate to **Admin > Platform Health** in the sidebar, or go directly to `/admin/platform-health`.
|
||||
|
||||
## Dashboard Sections
|
||||
|
||||
### 1. System Overview
|
||||
|
||||
Quick glance at overall platform status:
|
||||
|
||||
| Indicator | Green | Yellow | Red |
|
||||
|-----------|-------|--------|-----|
|
||||
| API Response Time | < 100ms | 100-500ms | > 500ms |
|
||||
| Error Rate | < 0.1% | 0.1-1% | > 1% |
|
||||
| Database Health | Connected | Slow queries | Disconnected |
|
||||
| Storage | < 70% | 70-85% | > 85% |
|
||||
|
||||
### 2. Resource Usage
|
||||
|
||||
Real-time metrics:
|
||||
|
||||
- **CPU Usage**: Current and 24h average
|
||||
- **Memory Usage**: Used vs available
|
||||
- **Disk Usage**: Storage consumption with trend
|
||||
- **Network**: Inbound/outbound throughput
|
||||
|
||||
### 3. Capacity Metrics
|
||||
|
||||
Track growth toward scaling thresholds:
|
||||
|
||||
- **Total Products**: Count across all vendors
|
||||
- **Total Images**: Files stored in image system
|
||||
- **Database Size**: Current size vs recommended max
|
||||
- **Active Clients**: Monthly active vendor accounts
|
||||
|
||||
### 4. Performance Trends
|
||||
|
||||
Historical charts (7-day, 30-day):
|
||||
|
||||
- API response times (p50, p95, p99)
|
||||
- Request volume by endpoint
|
||||
- Database query latency
|
||||
- Error rate over time
|
||||
|
||||
## Alert Configuration
|
||||
|
||||
### Threshold Alerts
|
||||
|
||||
Configure alerts for proactive monitoring:
|
||||
|
||||
```python
|
||||
# In app/core/config.py
HEALTH_THRESHOLDS = {
    "cpu_percent": {"warning": 70, "critical": 85},
    "memory_percent": {"warning": 75, "critical": 90},
    "disk_percent": {"warning": 70, "critical": 85},
    "response_time_ms": {"warning": 200, "critical": 500},
    "error_rate_percent": {"warning": 1.0, "critical": 5.0},
}
|
||||
```
|
||||
|
||||
### Notification Channels
|
||||
|
||||
Alerts can be sent via:
|
||||
- Email to admin users
|
||||
- Slack webhook (if configured)
|
||||
- Dashboard notifications
|
||||
|
||||
## Related Pages
|
||||
|
||||
- [Capacity Monitoring](capacity-monitoring.md) - Detailed capacity metrics
|
||||
- [Image Storage](image-storage.md) - Image system management
|
||||
- [Capacity Planning](../architecture/capacity-planning.md) - Infrastructure sizing guide
|
||||
|
||||
## API Endpoints
|
||||
|
||||
The platform health page uses these admin API endpoints:
|
||||
|
||||
| Endpoint | Description |
|
||||
|----------|-------------|
|
||||
| `GET /api/v1/admin/platform/health` | Overall health status |
|
||||
| `GET /api/v1/admin/platform/metrics` | Current metrics |
|
||||
| `GET /api/v1/admin/platform/metrics/history` | Historical data |
|
||||
| `GET /api/v1/admin/platform/capacity` | Capacity usage |
|
||||