docs: update subscription billing and capacity monitoring documentation

- Expand subscription-billing.md with complete system documentation
  - Add background tasks section with scheduling examples
  - Add capacity forecasting with API examples
  - Document all new API endpoints (trends, recommendations, snapshot)
  - Add CapacitySnapshot model documentation
  - Include infrastructure scaling reference table

- Update capacity-monitoring.md with forecasting features
  - Add subscription capacity tracking section
  - Document growth trends API with example responses
  - Add scaling recommendations with severity levels
  - Include usage examples for capacity planning
  - Add historical data and export options

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 20:56:22 +01:00
parent c6e7f4087f
commit 25279a03d4
2 changed files with 583 additions and 26 deletions

@@ -1,10 +1,28 @@
# Capacity Monitoring
Detailed guide for monitoring and managing platform capacity.
Detailed guide for monitoring and managing platform capacity, including growth forecasting and scaling recommendations.
## Overview
The Capacity Monitoring page (`/admin/platform-health/capacity`) provides insights into resource consumption and helps plan infrastructure scaling.
The Capacity Monitoring system provides insights into resource consumption and helps plan infrastructure scaling. It includes:
- **Real-time metrics**: Current resource usage and health status
- **Subscription capacity**: Theoretical vs actual capacity based on vendor subscriptions
- **Growth forecasting**: Historical trends and future projections
- **Scaling recommendations**: Automated advice for infrastructure planning
## API Endpoints
All capacity endpoints are under `/api/v1/admin/platform-health`:
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Full platform health report |
| `/capacity` | GET | Capacity-focused metrics |
| `/subscription-capacity` | GET | Subscription-based capacity analysis |
| `/trends` | GET | Growth trends over specified period |
| `/recommendations` | GET | Prioritized scaling recommendations |
| `/snapshot` | POST | Manually capture capacity snapshot |
## Key Metrics
@@ -17,6 +35,41 @@ The Capacity Monitoring page (`/admin/platform-health/capacity`) provides insigh
| Products per Client | Average products per vendor | Tier compliance |
| Monthly Orders | Order volume this month | Performance impact |
### Subscription Capacity
Track theoretical vs actual capacity based on all vendor subscriptions:
```python
# GET /api/v1/admin/platform-health/subscription-capacity
{
  "total_subscriptions": 150,
  "tier_distribution": {
    "essential": 80,
    "professional": 50,
    "business": 18,
    "enterprise": 2
  },
  "products": {
    "actual": 125000,
    "theoretical_limit": 500000,
    "utilization_percent": 25.0,
    "headroom": 375000
  },
  "orders_monthly": {
    "actual": 45000,
    "theoretical_limit": 300000,
    "utilization_percent": 15.0,
    "headroom": 255000
  },
  "team_members": {
    "actual": 320,
    "theoretical_limit": 1500,
    "utilization_percent": 21.3,
    "headroom": 1180
  }
}
```
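The derived fields in this response follow from `actual` and `theoretical_limit`. The sketch below shows one way they could be computed; the helper name `capacity_summary` and the rounding to one decimal place are assumptions inferred from the sample values, not the service's confirmed implementation.

```python
def capacity_summary(actual: int, theoretical_limit: int) -> dict:
    """Derive utilization and headroom for a single resource (hypothetical helper)."""
    if theoretical_limit <= 0:
        # No limit to compare against: report zero utilization and zero headroom.
        utilization, headroom = 0.0, 0
    else:
        utilization = round(actual / theoretical_limit * 100, 1)
        headroom = max(theoretical_limit - actual, 0)
    return {
        "actual": actual,
        "theoretical_limit": theoretical_limit,
        "utilization_percent": utilization,
        "headroom": headroom,
    }


products = capacity_summary(125000, 500000)
print(products["utilization_percent"], products["headroom"])
```

With the product figures from the sample response, this yields 25.0% utilization and a headroom of 375000.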
### Storage Metrics
| Metric | Description | Warning | Critical |
@@ -35,26 +88,148 @@ The Capacity Monitoring page (`/admin/platform-health/capacity`) provides insigh
| Cache Hit Rate | > 90% | 70-90% | < 70% |
| Connection Pool Usage | < 70% | 70-90% | > 90% |
## Growth Forecasting
### Capacity Snapshots
Daily snapshots are captured automatically by the `capture_capacity_snapshot` background task:
```python
from datetime import datetime
from decimal import Decimal


# Captured daily at midnight
class CapacitySnapshot:
    snapshot_date: datetime
    # Vendor metrics
    total_vendors: int
    active_vendors: int
    trial_vendors: int
    # Subscription metrics
    total_subscriptions: int
    active_subscriptions: int
    # Resource metrics
    total_products: int
    total_orders_month: int
    total_team_members: int
    # Storage metrics
    storage_used_gb: Decimal
    db_size_mb: Decimal
    # Capacity metrics
    theoretical_products_limit: int
    theoretical_orders_limit: int
    theoretical_team_limit: int
    # Tier distribution
    tier_distribution: dict
### Growth Trends
Analyze growth over any period:
```python
# GET /api/v1/admin/platform-health/trends?days=30
{
  "period_days": 30,
  "snapshots_available": 30,
  "start_date": "2025-11-26",
  "end_date": "2025-12-26",
  "trends": {
    "vendors": {
      "start_value": 140,
      "current_value": 150,
      "change": 10,
      "growth_rate_percent": 7.14,
      "daily_growth_rate": 0.238,
      "monthly_projection": 161
    },
    "products": {
      "start_value": 115000,
      "current_value": 125000,
      "change": 10000,
      "growth_rate_percent": 8.7,
      "daily_growth_rate": 0.29,
      "monthly_projection": 136000
    },
    "orders": {
      "start_value": 40000,
      "current_value": 45000,
      "change": 5000,
      "growth_rate_percent": 12.5,
      "monthly_projection": 51000
    },
    "team_members": {...},
    "storage_gb": {
      "start_value": 150.5,
      "current_value": 165.2,
      "change": 14.7
    }
  }
}
```
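The trend fields can be read as simple ratios of the first and last snapshot in the period. The sketch below is an assumption about how they relate, not the service's confirmed formula: `growth_trend` is a hypothetical helper that uses a linear daily rate and a 30-day projection. It reproduces the vendor figures in the sample response; the product and order projections in the sample appear to be rounded more coarsely.

```python
def growth_trend(start_value: float, current_value: float, period_days: int) -> dict:
    """Compute growth figures from two snapshot values (hypothetical helper)."""
    change = current_value - start_value
    trend = {"start_value": start_value, "current_value": current_value, "change": change}
    if start_value <= 0 or period_days <= 0:
        # A rate cannot be derived without a positive baseline and period.
        return trend
    growth_percent = change / start_value * 100
    daily_percent = growth_percent / period_days
    trend["growth_rate_percent"] = round(growth_percent, 2)
    trend["daily_growth_rate"] = round(daily_percent, 3)
    # Assumption: project 30 days ahead at the same linear daily rate.
    trend["monthly_projection"] = round(current_value * (1 + daily_percent * 30 / 100))
    return trend


print(growth_trend(140, 150, 30))
```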
### Days Until Threshold
Calculate when a metric will reach a specific threshold:
```python
# Service method
days = capacity_forecast_service.get_days_until_threshold(
    db,
    metric="total_products",
    threshold=500000
)
# Returns the estimated number of days until products reach 500K (e.g. 120);
# a falsy result means insufficient data or no growth detected
```
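To illustrate the idea behind such a forecast, here is a minimal sketch that assumes linear growth at a fixed daily increase. The function below is hypothetical and standalone; how `get_days_until_threshold` actually extrapolates is not shown in this document.

```python
import math
from typing import Optional


def days_until_threshold(current: float, daily_increase: float, threshold: float) -> Optional[int]:
    """Linear extrapolation sketch (assumed model, not the service's method)."""
    if current >= threshold:
        return 0  # Threshold already reached.
    if daily_increase <= 0:
        return None  # No growth, so the threshold is never reached.
    return math.ceil((threshold - current) / daily_increase)


print(days_until_threshold(current=125000, daily_increase=250, threshold=500000))
```

Note that a result of `0` is falsy in Python, so callers that need to distinguish "already reached" from "no forecast" should compare against `None` explicitly.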
## Scaling Recommendations
The system provides automatic scaling recommendations based on current usage:
### Example Recommendations
The system generates automated recommendations based on current capacity and growth:
```python
# GET /api/v1/admin/platform-health/recommendations
[
  {
    "category": "capacity",
    "severity": "warning",
    "title": "Product capacity approaching limit",
    "description": "Currently at 85% of theoretical product capacity",
    "action": "Consider upgrading vendor tiers or adding capacity"
  },
  {
    "category": "infrastructure",
    "severity": "info",
    "title": "Current tier: Medium",
    "description": "Next upgrade trigger: 300 vendors",
    "action": "Monitor growth and plan for infrastructure scaling"
  },
  {
    "category": "growth",
    "severity": "info",
    "title": "High vendor growth rate",
    "description": "Vendor base growing at 15.2% over last 30 days",
    "action": "Ensure infrastructure can scale to meet demand"
  },
  {
    "category": "storage",
    "severity": "warning",
    "title": "Storage usage high",
    "description": "Image storage at 850 GB",
    "action": "Plan for storage expansion or implement cleanup policies"
  }
]
```
```
Current Infrastructure: MEDIUM (100-300 clients)
Current Usage: 85% of capacity
Recommendations:
1. [WARNING] Approaching product limit (420K of 500K)
   → Consider upgrading to LARGE tier
2. [INFO] Database size growing 5GB/month
   → Plan storage expansion in 3 months
3. [OK] API response times within normal range
   → No action needed
```
### Severity Levels
| Severity | Description | Action Required |
|----------|-------------|-----------------|
| `critical` | Immediate action needed | Within 24 hours |
| `warning` | Plan action soon | Within 1-2 weeks |
| `info` | Informational | Monitor and plan |
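A consumer of the `/recommendations` response may want the most urgent items first. The sketch below orders a list of recommendation dictionaries using the three severity levels from the table; `SEVERITY_RANK` and `sort_by_severity` are illustrative names, and whether the API already returns items in this order is not stated here.

```python
# Lower rank means more urgent; the levels come from the severity table.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def sort_by_severity(recommendations: list[dict]) -> list[dict]:
    """Return recommendations ordered from most to least urgent (stable sort)."""
    return sorted(
        recommendations,
        # Unknown or missing severities sort after all known levels.
        key=lambda rec: SEVERITY_RANK.get(rec.get("severity"), len(SEVERITY_RANK)),
    )


sample = [
    {"severity": "info", "title": "Current tier: Medium"},
    {"severity": "warning", "title": "Storage usage high"},
]
for rec in sort_by_severity(sample):
    print(f"[{rec['severity']}] {rec['title']}")
```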
## Threshold Configuration
@@ -90,13 +265,63 @@ CAPACITY_THRESHOLDS = {
}
```
## Historical Trends
View growth trends to plan ahead:
- **30-day growth rate**: Products, storage, clients
- **Projected capacity date**: When limits will be reached
- **Seasonal patterns**: Order volume fluctuations
## Infrastructure Scaling Reference
| Clients | vCPU | RAM | Storage | Database | Monthly Cost |
|---------|------|-----|---------|----------|--------------|
| 1-50 | 2 | 4GB | 100GB | SQLite | €30 |
| 50-100 | 4 | 8GB | 250GB | PostgreSQL | €80 |
| 100-300 | 4 | 16GB | 500GB | PostgreSQL | €150 |
| 300-500 | 8 | 32GB | 1TB | PostgreSQL + Redis | €350 |
| 500-1000 | 16 | 64GB | 2TB | PostgreSQL + Redis | €700 |
| 1000+ | 32+ | 128GB+ | 4TB+ | PostgreSQL cluster | €1,500+ |
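The Infrastructure Scaling Reference table can be expressed as a lookup by client count. The sketch below copies the rows from the table and assumes that each range's lower bound is inclusive (so 300 clients maps to the 300-500 row, consistent with the "Next upgrade trigger: 300 vendors" recommendation); the table itself leaves the shared boundaries ambiguous, and `tier_for` is a hypothetical helper.

```python
from bisect import bisect_right

# (min clients, vCPU, RAM, storage, database, monthly cost), copied from the table.
SCALING_TIERS = [
    (1, "2", "4GB", "100GB", "SQLite", "€30"),
    (50, "4", "8GB", "250GB", "PostgreSQL", "€80"),
    (100, "4", "16GB", "500GB", "PostgreSQL", "€150"),
    (300, "8", "32GB", "1TB", "PostgreSQL + Redis", "€350"),
    (500, "16", "64GB", "2TB", "PostgreSQL + Redis", "€700"),
    (1000, "32+", "128GB+", "4TB+", "PostgreSQL cluster", "€1,500+"),
]
_LOWER_BOUNDS = [tier[0] for tier in SCALING_TIERS]


def tier_for(clients: int) -> tuple:
    """Return the table row whose range contains the given client count."""
    if clients < 1:
        raise ValueError("clients must be at least 1")
    # Assumption: lower bounds are inclusive, upper bounds exclusive.
    return SCALING_TIERS[bisect_right(_LOWER_BOUNDS, clients) - 1]


print(tier_for(150))
```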
## Background Tasks
### Capacity Snapshot Task
```python
# app/tasks/subscription_tasks.py
async def capture_capacity_snapshot():
    """
    Capture a daily snapshot of platform capacity metrics.
    Should run daily at midnight.
    """
    from app.services.capacity_forecast_service import capacity_forecast_service
    db = SessionLocal()
    try:
        snapshot = capacity_forecast_service.capture_daily_snapshot(db)
        db.commit()
        return {
            "snapshot_id": snapshot.id,
            "snapshot_date": snapshot.snapshot_date.isoformat(),
            "total_vendors": snapshot.total_vendors,
            "total_products": snapshot.total_products,
        }
    finally:
        db.close()
```
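How the task is scheduled is not shown in this section. As one possible approach, the sketch below runs an async task once per day at local midnight using only the standard library; `seconds_until_midnight` and `run_daily_at_midnight` are hypothetical helpers, and a real deployment may rely on a dedicated scheduler instead.

```python
import asyncio
from datetime import datetime, timedelta


def seconds_until_midnight(now: datetime) -> float:
    """Seconds from `now` until the next midnight (naive local time, ignores DST shifts)."""
    next_midnight = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    return (next_midnight - now).total_seconds()


async def run_daily_at_midnight(task) -> None:
    """Sleep until each midnight, then await the given coroutine function."""
    while True:
        await asyncio.sleep(seconds_until_midnight(datetime.now()))
        await task()


print(seconds_until_midnight(datetime(2025, 12, 26, 20, 56, 22)))
```

Under these assumptions, the loop would be started with something like `asyncio.run(run_daily_at_midnight(capture_capacity_snapshot))`.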
### Manual Snapshot
Capture a snapshot on demand:
```bash
# Via API ($BASE_URL is a placeholder for your deployment's origin)
curl -X POST "$BASE_URL/api/v1/admin/platform-health/snapshot" \
  -H "Authorization: Bearer $TOKEN"
# Response
{
  "id": 42,
  "snapshot_date": "2025-12-26T00:00:00Z",
  "total_vendors": 150,
  "total_products": 125000,
  "message": "Snapshot captured successfully"
}
```
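The same request can be prepared from Python. This is a minimal sketch using only the standard library; `build_snapshot_request` is a hypothetical helper, and the base URL and token are placeholders to be supplied by the caller.

```python
from urllib.request import Request


def build_snapshot_request(base_url: str, token: str) -> Request:
    """Build (but do not send) the POST request for a manual snapshot."""
    url = base_url.rstrip("/") + "/api/v1/admin/platform-health/snapshot"
    return Request(url, method="POST", headers={"Authorization": f"Bearer {token}"})


request = build_snapshot_request("https://example.invalid", "TOKEN")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request.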
## Alerts
@@ -106,6 +331,29 @@ Capacity alerts trigger when:
2. **Critical (Red)**: 95% of any threshold
3. **Exceeded**: 100%+ of threshold (immediate action)
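The alert levels can be sketched as a classification of usage relative to a threshold. The 95% and 100% cut-offs come from the list above; the warning ratio is not visible in this excerpt, so the hypothetical `alert_level` helper takes it as an explicit argument rather than assuming a value.

```python
def alert_level(usage: float, threshold: float, warning_ratio: float) -> str:
    """Classify usage against a threshold (illustrative helper, not the service's code)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ratio = usage / threshold
    if ratio >= 1.0:
        return "exceeded"  # 100%+ of threshold: immediate action
    if ratio >= 0.95:
        return "critical"  # 95% of threshold
    if ratio >= warning_ratio:
        return "warning"  # Caller-supplied warning level
    return "ok"
```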
## Historical Data
### Viewing Historical Trends
Use the `/trends` endpoint with different day ranges:
```bash
# Last 7 days
GET /api/v1/admin/platform-health/trends?days=7
# Last 30 days (default)
GET /api/v1/admin/platform-health/trends?days=30
# Last 90 days
GET /api/v1/admin/platform-health/trends?days=90
```
### Data Retention
- Snapshots are stored indefinitely by default
- Consider implementing cleanup for snapshots older than 2 years
- At minimum, keep monthly aggregates for long-term trending
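One way to implement such a retention policy is sketched below. It assumes a 730-day cut-off for "2 years" and keeps the earliest snapshot of each older month as a stand-in for a monthly aggregate; `snapshots_to_prune` is a hypothetical helper that only selects dates and does not touch the database.

```python
from datetime import date, timedelta


def snapshots_to_prune(snapshot_dates: list[date], today: date, keep_days: int = 730) -> list[date]:
    """Return dates of old snapshots that can be deleted, keeping one per month."""
    cutoff = today - timedelta(days=keep_days)
    kept_months: set[tuple[int, int]] = set()
    prune: list[date] = []
    for snapshot_date in sorted(snapshot_dates):
        if snapshot_date >= cutoff:
            continue  # Recent snapshots are always kept.
        month = (snapshot_date.year, snapshot_date.month)
        if month in kept_months:
            prune.append(snapshot_date)  # This month already has a representative.
        else:
            kept_months.add(month)  # Keep the earliest snapshot of the month.
    return prune
```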
## Export Reports
Generate capacity reports for planning:
@@ -114,8 +362,47 @@ Generate capacity reports for planning:
- **Monthly capacity report**: Detailed analysis
- **Projection report**: 3/6/12 month forecasts
## Usage Examples
### Check Current Capacity
```python
from app.services.platform_health_service import platform_health_service
from app.services.capacity_forecast_service import capacity_forecast_service
# Get subscription capacity
capacity = platform_health_service.get_subscription_capacity(db)
print(f"Products: {capacity['products']['actual']} / {capacity['products']['theoretical_limit']}")
print(f"Utilization: {capacity['products']['utilization_percent']}%")
# Get growth trends
trends = capacity_forecast_service.get_growth_trends(db, days=30)
print(f"Vendor growth: {trends['trends']['vendors']['growth_rate_percent']}%")
# Get recommendations
recommendations = capacity_forecast_service.get_scaling_recommendations(db)
for rec in recommendations:
    print(f"[{rec['severity']}] {rec['title']}: {rec['action']}")
```
### Project Future Capacity
```python
# Calculate days until product limit
days = capacity_forecast_service.get_days_until_threshold(
    db,
    metric="total_products",
    threshold=500000
)
if days:
    print(f"Products will reach 500K in approximately {days} days")
else:
    print("Insufficient data or no growth detected")
```
## Related Documentation
- [Subscription & Billing](../features/subscription-billing.md) - Complete billing system
- [Capacity Planning](../architecture/capacity-planning.md) - Full sizing guide
- [Platform Health](platform-health.md) - Real-time health monitoring
- [Image Storage](image-storage.md) - Image system details