feat: add automated backups and Prometheus/Grafana monitoring stack (Steps 17-18)

Backups: pg_dump scripts with daily/weekly rotation and Cloudflare R2 offsite sync.
Monitoring: Prometheus, Grafana, node-exporter, cAdvisor in docker-compose; /metrics
endpoint activated via prometheus_client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:40:08 +01:00
parent 488d5a6f0e
commit ef7187b508
15 changed files with 809 additions and 20 deletions
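
The commit message mentions daily/weekly rotation for the pg_dump backups, but the rotation script itself is not shown in this part of the diff. As a rough illustration only, a retention policy of that kind could be sketched as follows. The function name `select_backups_to_keep`, the retention windows (7 daily, 4 weekly), and the choice of Sunday as the weekly snapshot are all assumptions, not values taken from the commit.

```python
from datetime import date, timedelta


def select_backups_to_keep(backup_dates, today, daily_keep=7, weekly_keep=4):
    """Return the set of backup dates retained under a daily/weekly policy.

    Assumed policy (hypothetical, not from the commit): keep every backup
    taken in the last `daily_keep` days, plus Sunday backups taken in the
    last `weekly_keep` weeks.
    """
    keep = set()
    for d in backup_dates:
        age = (today - d).days
        if age < 0:
            continue  # ignore dates in the future
        if age < daily_keep:
            keep.add(d)  # inside the daily window
        elif d.weekday() == 6 and age < weekly_keep * 7:
            keep.add(d)  # Sunday backup inside the weekly window
    return keep


# Example: one backup per day for the last 40 days.
today = date(2026, 2, 14)
backups = [today - timedelta(days=n) for n in range(40)]
kept = select_backups_to_keep(backups, today)
expired = sorted(set(backups) - kept)
print(f"keep {len(kept)} backups, delete {len(expired)}")
```

Computing the keep set separately from the deletion step makes the policy easy to test before any file is removed or synced offsite.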


@@ -0,0 +1,17 @@
# File-based dashboard provider
# Import dashboards via Grafana UI; they'll be saved to the SQLite backend.
# Pre-built JSON dashboards can be placed in the json/ subdirectory.
# Docs: https://grafana.com/docs/grafana/latest/administration/provisioning/#dashboards
apiVersion: 1
providers:
  - name: default
    orgId: 1
    folder: ""
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: false


@@ -0,0 +1,12 @@
# Auto-provision Prometheus as the default datasource
# Docs: https://grafana.com/docs/grafana/latest/administration/provisioning/#datasources
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true

monitoring/prometheus.yml Normal file

@@ -0,0 +1,36 @@
# Prometheus configuration for Orion platform
# Docs: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  # Orion API — /metrics endpoint (prometheus_client)
  - job_name: "orion-api"
    metrics_path: /metrics
    static_configs:
      - targets: ["api:8000"]
        labels:
          service: "orion-api"
  # Node Exporter — host-level CPU, RAM, disk metrics
  - job_name: "node-exporter"
    static_configs:
      - targets: ["node-exporter:9100"]
        labels:
          service: "node-exporter"
  # cAdvisor — per-container resource metrics
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]
        labels:
          service: "cadvisor"
  # Prometheus self-monitoring
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          service: "prometheus"
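
For readers unfamiliar with how the scrape jobs above resolve to URLs: Prometheus defaults to the `http` scheme and the `/metrics` path when a job does not set `scheme` or `metrics_path`, so only the `orion-api` job states its path explicitly. The sketch below mirrors the four jobs as plain Python data and derives the URL each one scrapes. The helper `scrape_urls` and the `SCRAPE_JOBS` structure are illustrative names, not part of the commit.

```python
# Job definitions transcribed from monitoring/prometheus.yml as plain data.
SCRAPE_JOBS = [
    {"job_name": "orion-api", "metrics_path": "/metrics", "targets": ["api:8000"]},
    {"job_name": "node-exporter", "targets": ["node-exporter:9100"]},
    {"job_name": "cadvisor", "targets": ["cadvisor:8080"]},
    {"job_name": "prometheus", "targets": ["localhost:9090"]},
]


def scrape_urls(jobs):
    """Map each job name to the URLs Prometheus would request for it."""
    urls = {}
    for job in jobs:
        scheme = job.get("scheme", "http")          # Prometheus default scheme
        path = job.get("metrics_path", "/metrics")  # Prometheus default path
        urls[job["job_name"]] = [f"{scheme}://{t}{path}" for t in job["targets"]]
    return urls


for name, targets in scrape_urls(SCRAPE_JOBS).items():
    print(name, targets)
```

The hostnames `api`, `node-exporter`, and `cadvisor` only resolve inside the docker-compose network, so these URLs are reachable from the Prometheus container rather than from the host.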