orion/docs/operations/capacity-monitoring.md
Samir Boulahtit dc7fb5ca19 feat: add capacity planning docs, image upload system, and platform health monitoring
Documentation:
- Add comprehensive capacity planning guide (docs/architecture/capacity-planning.md)
- Add operations docs: platform-health, capacity-monitoring, image-storage
- Link pricing strategy to capacity planning documentation
- Update mkdocs.yml with new Operations section

Image Upload System:
- Add ImageService with WebP conversion and sharded directory structure
- Generate multiple size variants (original, 800px, 200px)
- Add storage stats endpoint for monitoring
- Add Pillow dependency for image processing

Platform Health Monitoring:
- Add /admin/platform-health page with real-time metrics
- Show CPU, memory, disk usage with progress bars
- Display capacity thresholds with status indicators
- Generate scaling recommendations automatically
- Determine infrastructure tier based on usage
- Add psutil dependency for system metrics

Admin UI:
- Add Capacity Monitor to Platform Health section in sidebar
- Create platform-health.html template with stats cards
- Create platform-health.js for Alpine.js state management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 17:17:09 +01:00


# Capacity Monitoring

Detailed guide for monitoring and managing platform capacity.

## Overview

The Capacity Monitoring page (`/admin/platform-health/capacity`) provides insights into resource consumption and helps plan infrastructure scaling.

## Key Metrics

### Client Metrics

| Metric | Description | Used For |
|--------|-------------|----------|
| Active Clients | Vendors with activity in the last 30 days | Scale planning |
| Total Products | Sum across all vendors | Storage/DB sizing |
| Products per Client | Average products per vendor | Tier compliance |
| Monthly Orders | Order volume this month | Performance impact |

### Storage Metrics

| Metric | Description | Warning | Critical |
|--------|-------------|---------|----------|
| Image Files | Total files in storage | 80% of limit | 95% of limit |
| Image Storage (GB) | Total size in gigabytes | 80% of disk | 95% of disk |
| Database Size (GB) | PostgreSQL data size | 80% of allocation | 95% of allocation |
| Backup Size (GB) | Latest backup size | Informational | N/A |
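
The 80% and 95% disk thresholds in the table can be checked with the standard library alone. The following is a minimal sketch, not the platform's implementation: the function names `classify_usage` and `disk_usage_status` are hypothetical, and the path to inspect is an assumption you would replace with the image storage mount point.

```python
import shutil

# Hypothetical defaults mirroring the Warning/Critical columns above.
WARNING_PCT = 80.0
CRITICAL_PCT = 95.0


def classify_usage(used: float, total: float,
                   warning_pct: float = WARNING_PCT,
                   critical_pct: float = CRITICAL_PCT) -> str:
    """Return 'ok', 'warning', or 'critical' for a used/total pair."""
    if total <= 0:
        raise ValueError("total must be positive")
    used_pct = used / total * 100
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warning_pct:
        return "warning"
    return "ok"


def disk_usage_status(path: str = "/") -> dict:
    """Measure the filesystem holding `path` and classify its usage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "path": path,
        "used_pct": round(usage.used / usage.total * 100, 1),
        "status": classify_usage(usage.used, usage.total),
    }


if __name__ == "__main__":
    print(disk_usage_status("."))
```

The classification is kept separate from the measurement so the same function can be reused for the file-count and database-allocation rows, which compare against a configured limit rather than a disk.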

### Performance Metrics

| Metric | Good | Warning | Critical |
|--------|------|---------|----------|
| Avg Response Time | < 100ms | 100-300ms | > 300ms |
| DB Query Time (p95) | < 50ms | 50-200ms | > 200ms |
| Cache Hit Rate | > 90% | 70-90% | < 70% |
| Connection Pool Usage | < 70% | 70-90% | > 90% |
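
The bands in this table can be expressed as data, which makes the one "higher is better" metric (cache hit rate) explicit. This is an illustrative sketch only: the metric keys, the `Band` class, and the `rate` function are hypothetical names, not identifiers from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    warning: float   # boundary between Good and Warning
    critical: float  # boundary between Warning and Critical
    higher_is_better: bool = False


# Hypothetical metric keys; values are taken from the table above.
PERFORMANCE_BANDS = {
    "avg_response_ms": Band(warning=100, critical=300),
    "db_query_p95_ms": Band(warning=50, critical=200),
    "cache_hit_rate_pct": Band(warning=90, critical=70, higher_is_better=True),
    "pool_usage_pct": Band(warning=70, critical=90),
}


def rate(metric: str, value: float) -> str:
    """Classify a measurement as 'good', 'warning', or 'critical'."""
    band = PERFORMANCE_BANDS[metric]
    if band.higher_is_better:
        if value < band.critical:
            return "critical"
        if value <= band.warning:
            return "warning"
        return "good"
    if value > band.critical:
        return "critical"
    if value >= band.warning:
        return "warning"
    return "good"


if __name__ == "__main__":
    print(rate("avg_response_ms", 120))     # falls in the 100-300ms band
    print(rate("cache_hit_rate_pct", 95))   # above the 90% boundary
```

Boundary values (for example exactly 100ms or exactly 90%) are treated as Warning here, matching the inclusive ranges in the Warning column.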

## Scaling Recommendations

The system provides automatic scaling recommendations based on current usage:

### Example Recommendations

```
Current Infrastructure: MEDIUM (100-300 clients)
Current Usage: 85% of capacity

Recommendations:
1. [WARNING] Approaching product limit (420K of 500K)
   → Consider upgrading to LARGE tier
2. [INFO] Database size growing 5GB/month
   → Plan storage expansion in 3 months
3. [OK] API response times within normal range
   → No action needed
```

## Threshold Configuration

Edit thresholds in the admin settings or via environment variables:

```python
# Capacity thresholds (can be configured per deployment)
CAPACITY_THRESHOLDS = {
    # Products
    "products_total": {
        "warning": 400_000,
        "critical": 475_000,
        "limit": 500_000,
    },
    # Storage (GB)
    "storage_gb": {
        "warning": 800,
        "critical": 950,
        "limit": 1000,
    },
    # Database (GB)
    "db_size_gb": {
        "warning": 20,
        "critical": 24,
        "limit": 25,
    },
    # Monthly orders
    "monthly_orders": {
        "warning": 250_000,
        "critical": 280_000,
        "limit": 300_000,
    },
}
```
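
To illustrate how a measured value might be compared against this configuration, here is a minimal sketch. The `evaluate` function and its return shape are assumptions made for illustration, and only two of the four entries are repeated so the block stays self-contained.

```python
# Subset of the thresholds above, repeated so this example runs on its own.
CAPACITY_THRESHOLDS = {
    "products_total": {"warning": 400_000, "critical": 475_000, "limit": 500_000},
    "db_size_gb": {"warning": 20, "critical": 24, "limit": 25},
}


def evaluate(metric: str, value: float, thresholds: dict = CAPACITY_THRESHOLDS) -> dict:
    """Hypothetical helper: map a measured value to an alert level."""
    levels = thresholds[metric]
    if value >= levels["limit"]:
        status = "exceeded"
    elif value >= levels["critical"]:
        status = "critical"
    elif value >= levels["warning"]:
        status = "warning"
    else:
        status = "ok"
    return {
        "metric": metric,
        "status": status,
        "percent_of_limit": round(value / levels["limit"] * 100, 1),
    }


if __name__ == "__main__":
    # 420K products against a 500K limit, as in the example recommendations.
    print(evaluate("products_total", 420_000))
```

Checking the limit first, then critical, then warning means each value maps to exactly one level, including the "Exceeded" level described under Alerts.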

## Historical Trends

View growth trends to plan ahead:

- **30-day growth rate**: Products, storage, clients
- **Projected capacity date**: When limits will be reached
- **Seasonal patterns**: Order volume fluctuations
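
A projected capacity date can be estimated from the 30-day growth rate. The sketch below assumes simple linear growth, which is an assumption of this example rather than a documented property of the platform, and `projected_capacity_date` is a hypothetical name. Seasonal order patterns would need a more careful model.

```python
import math
from datetime import date, timedelta
from typing import Optional


def projected_capacity_date(current: float, limit: float,
                            growth_per_30_days: float,
                            today: date) -> Optional[date]:
    """Estimate when `current` reaches `limit` under linear growth.

    Returns `today` if the limit is already reached, or None if usage
    is flat or shrinking (no projected date).
    """
    if current >= limit:
        return today
    if growth_per_30_days <= 0:
        return None
    days_left = (limit - current) / growth_per_30_days * 30
    return today + timedelta(days=math.ceil(days_left))


if __name__ == "__main__":
    # A 10 GB database growing 5 GB per 30 days against a 25 GB limit
    # has 15 GB of headroom, i.e. 90 days.
    print(projected_capacity_date(10, 25, 5, date(2025, 1, 1)))  # 2025-04-01
```

Passing `today` explicitly keeps the function deterministic and easy to test.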
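
## Alerts

Capacity alerts trigger at three levels:

1. **Warning (Yellow)**: usage reaches the metric's `warning` threshold (about 80% of its limit)
2. **Critical (Red)**: usage reaches the metric's `critical` threshold (about 95% of its limit)
3. **Exceeded**: usage is at or above 100% of the limit (immediate action required)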

## Export Reports

Generate capacity reports for planning:

- **Weekly summary**: PDF or CSV
- **Monthly capacity report**: Detailed analysis
- **Projection report**: 3/6/12 month forecasts
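
For readers who want to post-process capacity data themselves, a CSV summary is straightforward to produce with the standard library. The column names and the `capacity_report_csv` function below are hypothetical and do not describe the format of the platform's own exports.

```python
import csv
import io

# Hypothetical column layout for a capacity summary.
FIELDNAMES = ["metric", "value", "limit", "percent_of_limit"]


def capacity_report_csv(rows: list) -> str:
    """Render rows of {'metric', 'value', 'limit'} as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        percent = row["value"] / row["limit"] * 100
        writer.writerow({
            "metric": row["metric"],
            "value": row["value"],
            "limit": row["limit"],
            "percent_of_limit": f"{percent:.1f}",
        })
    return buffer.getvalue()


if __name__ == "__main__":
    print(capacity_report_csv([
        {"metric": "products_total", "value": 420_000, "limit": 500_000},
        {"metric": "db_size_gb", "value": 18, "limit": 25},
    ]))
```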

## Related Documentation

- [Capacity Planning](../architecture/capacity-planning.md) - Full sizing guide
- [Platform Health](platform-health.md) - Real-time health monitoring
- [Image Storage](image-storage.md) - Image system details