Yesterday's deploy debugging surfaced a SendGrid API key pasted into the
tracked monitoring/alertmanager/alertmanager.yml on prod. The in-repo
file even captioned the field "TODO: Paste your SG.xxx API key here",
actively encouraging the anti-pattern. Forensic follow-up (bash history
lines 290-357) confirmed it was a user-driven nano edit that was never
committed, just left as a long-running local modification. Three
problems converged in this finding:
1. A real SMTP credential lived in a tracked git file on prod.
2. The SendGrid → mail1.myservices.hosting SMTP migration never
touched alertmanager; it still pointed at smtp.sendgrid.net.
3. The alertmanager container has been Up 13 days with the
pre-paste empty smtp_auth_password loaded from disk, so prod's
email alerting has been silently failing.
Resolution shipped here:
- `git rm --cached monitoring/alertmanager/alertmanager.yml` so the
prod-edited file on each host is no longer tracked and the
credential can't accidentally reach git again.
- Add `monitoring/alertmanager/alertmanager.yml` to .gitignore.
- Ship `monitoring/alertmanager/alertmanager.yml.example` as the
template, pre-filled with the post-migration non-secret routing
(`mail1.myservices.hosting:587`, `support@wizard.lu` auth,
`alerts@wizard.lu` From for inbox clarity), with only
`smtp_auth_password` left as `CHANGEME`. It includes inline guidance
on the From-vs-auth rule that some SMTP relays enforce.
Per-host steps (Hetzner): back up the prod-edited file → revert the
local change → pull → copy the template over the old file → fill in
the password → SIGHUP alertmanager. A doc reference will follow in the
next commit (the Hetzner deploy doc still needs an "alertmanager.yml
lives outside git" footnote).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Redis had no maxmemory set, causing the Prometheus alert expression
(used/max) to evaluate to +Inf. Set maxmemory to 100mb with allkeys-lru
eviction policy, and guard the alert expression against division by zero.
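To illustrate why the unguarded expression fired, here is a small Python
model of the arithmetic. It is a sketch, not the actual alert rule:
`promql_div` emulates PromQL's IEEE 754 float division (Python itself
would raise on division by zero), and the 0.8 threshold and the exact
form of the guard are assumptions.

```python
import math

THRESHOLD = 0.8  # assumed alert threshold, not taken from the rule file


def promql_div(a, b):
    """Emulate PromQL float division: x/0 is +/-Inf, 0/0 is NaN."""
    if b == 0:
        return math.nan if a == 0 else math.copysign(math.inf, a)
    return a / b


def should_alert_unguarded(used_bytes, max_bytes):
    # With maxmemory unset, max_bytes is 0 and the ratio is +Inf,
    # which is greater than any threshold.
    return promql_div(used_bytes, max_bytes) > THRESHOLD


def should_alert_guarded(used_bytes, max_bytes):
    # Guard: only evaluate the ratio when a limit is actually configured.
    return max_bytes > 0 and promql_div(used_bytes, max_bytes) > THRESHOLD
```

With `max_bytes == 0` the unguarded check always alerts, while the
guarded one stays quiet until a real limit such as 100mb is set.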
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SendGrid handles both transactional emails and marketing campaigns
under one account. Updated the alertmanager SMTP placeholders, the
Hetzner setup guide (Step 19.5), and the environment reference to
recommend SendGrid as the primary email provider.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backups: pg_dump scripts with daily/weekly rotation and Cloudflare R2 offsite sync.
Monitoring: Prometheus, Grafana, node-exporter, cAdvisor in docker-compose; /metrics
endpoint activated via prometheus_client.
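For reference, exposing metrics with prometheus_client can look roughly
like the sketch below. The metric name `app_requests_total` and the
`metrics_response` helper are hypothetical; how the app actually wires
the /metrics route is not shown in this commit message.

```python
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    Counter,
    generate_latest,
)

# A private registry keeps this sketch independent of global state.
registry = CollectorRegistry()

# Hypothetical example metric; real metric names live in the app code.
REQUESTS = Counter(
    "app_requests_total",
    "Total requests handled (illustrative only).",
    registry=registry,
)


def metrics_response():
    """Return (body, content_type) for a /metrics handler to send back."""
    return generate_latest(registry), CONTENT_TYPE_LATEST


REQUESTS.inc()
body, content_type = metrics_response()
```

A web framework handler would return `body` with the `Content-Type`
header set to `content_type` so Prometheus can scrape it.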
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>