diff --git a/docs/deployment/hetzner-server-setup.md b/docs/deployment/hetzner-server-setup.md
index 40cdbd71..66d426ac 100644
--- a/docs/deployment/hetzner-server-setup.md
+++ b/docs/deployment/hetzner-server-setup.md
@@ -1176,14 +1176,79 @@ The `docker-compose.yml` includes:
 - `prometheus` volumes: mounts `alert.rules.yml` as read-only
 - `prometheus.yml`: `alerting:` section pointing to alertmanager:9093, `rule_files:` for alert rules, new scrape job for alertmanager
 
-### 19.5 Deploy
+### 19.5 Alertmanager SMTP Setup (Mailgun)
+
+Alertmanager needs an SMTP server to send email notifications. Mailgun's free tier (1,000 emails/month) is enough for low-volume alerting.
+
+**1. Create a Mailgun account:**
+
+1. Sign up at [mailgun.com](https://www.mailgun.com/) (free Flex plan)
+2. Add and verify your sending domain (e.g. `mg.wizard.lu`); Mailgun provides the DNS records to add
+3. Go to **Sending** > **Domain settings** > **SMTP credentials**
+4. Note the SMTP server, port, username, and password
+
+**2. Update the Alertmanager config on the server:**
+
+```bash
+nano ~/apps/orion/monitoring/alertmanager/alertmanager.yml
+```
+
+Replace the SMTP placeholders:
+
+```yaml
+global:
+  smtp_smarthost: 'smtp.mailgun.org:587'
+  smtp_from: 'alerts@mg.wizard.lu'
+  smtp_auth_username: 'postmaster@mg.wizard.lu'
+  smtp_auth_password: 'your-mailgun-smtp-password'
+  smtp_require_tls: true
+```
+
+Update the `to:` addresses in both receivers to your actual email address.
+
+**3. Restart alertmanager:**
+
+```bash
+cd ~/apps/orion
+docker compose --profile full restart alertmanager
+curl -s http://localhost:9093/-/healthy  # Should return OK
+```
+
+**4. Send a test alert (optional):**
+
+```bash
+# Send a test alert to alertmanager (v2 API; the v1 API was removed in Alertmanager 0.27)
+curl -X POST http://localhost:9093/api/v2/alerts \
+  -H "Content-Type: application/json" \
+  -d '[{
+    "labels": {"alertname": "TestAlert", "severity": "warning"},
+    "annotations": {"summary": "Test alert, please ignore"},
+    "startsAt": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'",
+    "endsAt": "'$(date -u -d '+5 minutes' +%Y-%m-%dT%H:%M:%SZ)'"
+  }]'
+```
+
+Check your inbox; the email is sent once the route's `group_wait` elapses (30 seconds by default). Then list the alerts Alertmanager currently holds (the test alert drops out after its `endsAt` time):
+
+```bash
+curl -s http://localhost:9093/api/v2/alerts | python3 -m json.tool
+```
+
+!!! tip "Alternative SMTP providers"
+    Any SMTP service works. Common alternatives:
+
+    - **SendGrid**: `smtp.sendgrid.net:587`, username `apikey`, password is your API key
+    - **Amazon SES**: `email-smtp.eu-west-1.amazonaws.com:587`
+    - **Gmail**: `smtp.gmail.com:587` with an App Password (less reliable, not recommended for production)
+
+### 19.6 Deploy
 
 ```bash
 cd ~/apps/orion
 docker compose --profile full up -d
 ```
 
-### 19.6 Verification
+### 19.7 Verification
 
 ```bash
 # Alertmanager healthy
@@ -1230,11 +1295,12 @@ docker network ls | grep orion
 
 ### 20.2 fail2ban Configuration
 
-fail2ban is already installed (Step 3) but needs jail configuration.
+fail2ban is already installed (Step 3) but needs jail configuration. The commands below can be pasted as-is.
 
-**SSH jail** — create `/etc/fail2ban/jail.local`:
+**SSH jail** (bans an IP for 24 hours after 3 failed SSH attempts within 10 minutes):
 
-```ini
+```bash
+sudo tee /etc/fail2ban/jail.local << 'EOF'
 [sshd]
 enabled = true
 port = ssh
@@ -1243,19 +1309,56 @@ logpath = /var/log/auth.log
 maxretry = 3
 bantime = 86400
 findtime = 600
+EOF
 ```
 
-**Caddy auth filter** — create `/etc/fail2ban/filter.d/caddy-auth.conf`:
+**Caddy access logging** (fail2ban needs a log file to watch). Open your Caddyfile to add a global `log` block:
 
-```ini
+```bash
+sudo nano /etc/caddy/Caddyfile
+```
+
+Add this block at the **very top** of the Caddyfile, before any site blocks. It reconfigures Caddy's default logger; requests are only logged for site blocks that also contain a `log` directive:
+
+```caddy
+{
+    log {
+        output file /var/log/caddy/access.log {
+            roll_size 100MiB
+            roll_keep 5
+        }
+        format json
+    }
+}
+```
+
+Create the log directory and restart Caddy:
+
+```bash
+sudo mkdir -p /var/log/caddy
+sudo chown caddy:caddy /var/log/caddy
+sudo systemctl restart caddy
+sudo systemctl status caddy
+
+# Verify logging works (make a request, then check)
+curl -s https://wizard.lu > /dev/null
+sudo tail -1 /var/log/caddy/access.log | python3 -m json.tool | head -5
+```
+
+**Caddy auth filter** (matches 401/403 responses in Caddy's JSON logs):
+
+```bash
+sudo tee /etc/fail2ban/filter.d/caddy-auth.conf << 'EOF'
 [Definition]
 failregex = ^.*"remote_ip":"<HOST>".*"status":(401|403).*$
 ignoreregex =
+EOF
 ```
 
-**Caddy jail** — create `/etc/fail2ban/jail.d/caddy.conf`:
+**Caddy jail** (bans an IP for 1 hour after 10 failed auth attempts within 10 minutes):
 
-```ini
+```bash
+sudo tee /etc/fail2ban/jail.d/caddy.conf << 'EOF'
 [caddy-auth]
 enabled = true
 port = http,https
@@ -1264,17 +1367,22 @@ logpath = /var/log/caddy/access.log
 maxretry = 10
 bantime = 3600
 findtime = 600
+EOF
 ```
 
-!!! note "Caddy access logging"
-    For the Caddy jail to work, enable access logging in your Caddyfile by adding `log` directives that write to `/var/log/caddy/access.log` in JSON format. See [Caddy logging docs](https://caddyserver.com/docs/caddyfile/directives/log).
-
-Restart fail2ban:
+**Restart and verify:**
 
 ```bash
 sudo systemctl restart fail2ban
+
+# Both jails should be listed
 sudo fail2ban-client status
+
+# SSH jail details
 sudo fail2ban-client status sshd
+
+# Caddy jail details (will show 0 bans initially)
+sudo fail2ban-client status caddy-auth
 ```
 
 ### 20.3 Unattended Security Upgrades
@@ -1300,17 +1408,35 @@ APT::Periodic::Update-Package-Lists "1";
 APT::Periodic::Unattended-Upgrade "1";
 ```
 
-### 20.4 Verification
+### 20.4 Clean Up Legacy Docker Network
+
+After deploying with network segmentation, the old default network may remain:
 
 ```bash
-# fail2ban jails active
+# Check if orion_default still exists
+docker network ls | grep orion_default
+
+# Remove it (safe: no containers should be using it)
+docker network rm orion_default 2>/dev/null || echo "Not removed (already gone or still in use)"
+```
+
+### 20.5 Verification
+
+```bash
+# fail2ban jails active (should show sshd and caddy-auth)
+sudo fail2ban-client status
+
+# SSH jail details
 sudo fail2ban-client status sshd
 
-# Docker networks exist
+# Docker networks (should show 3: frontend, backend, monitoring)
 docker network ls | grep orion
 
 # Unattended upgrades configured
 sudo unattended-upgrades --dry-run 2>&1 | head
+
+# Caddy access log being written
+sudo tail -1 /var/log/caddy/access.log
 ```
 
 ---