docs(deployment): expand server-side setup for Steps 19-20

- Add Mailgun SMTP setup instructions for Alertmanager with test alert
- Expand fail2ban to fully copy-pasteable sudo tee commands
- Add Caddy access logging config (required for fail2ban Caddy jail)
- Add orion_default network cleanup step
- Expand verification checklist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:17:15 +01:00
parent 4bce16fb73
commit 8c715cfde3


@@ -1176,14 +1176,79 @@ The `docker-compose.yml` includes:
- `prometheus` volumes: mounts `alert.rules.yml` as read-only
- `prometheus.yml`: `alerting:` section pointing to alertmanager:9093, `rule_files:` for alert rules, new scrape job for alertmanager
### 19.5 Deploy
### 19.5 Alertmanager SMTP Setup (Mailgun)
Alertmanager needs SMTP to send email notifications. Mailgun's free tier (1,000 emails/month) is ideal for low-volume alerting.
**1. Create Mailgun account:**
1. Sign up at [mailgun.com](https://www.mailgun.com/) (free Flex plan)
2. Add and verify your sending domain (e.g. `mg.wizard.lu`) — Mailgun provides DNS records to add
3. Go to **Sending** > **Domain settings** > **SMTP credentials**
4. Note the SMTP server, port, username, and password
**2. Update alertmanager config on the server:**
```bash
nano ~/apps/orion/monitoring/alertmanager/alertmanager.yml
```
Replace the SMTP placeholders:
```yaml
global:
  smtp_smarthost: 'smtp.mailgun.org:587'
  smtp_from: 'alerts@mg.wizard.lu'
  smtp_auth_username: 'postmaster@mg.wizard.lu'
  smtp_auth_password: 'your-mailgun-smtp-password'
  smtp_require_tls: true
```
Update the `to:` addresses in both receivers to your actual email.
**3. Restart alertmanager:**
```bash
cd ~/apps/orion
docker compose --profile full restart alertmanager
curl -s http://localhost:9093/-/healthy # Should return OK
```
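Right after a restart the container may need a few seconds before the health endpoint answers, so a single `curl` can fail spuriously. A minimal retry sketch follows; `wait_for_alertmanager` is a hypothetical helper name, and the retry count and delay are assumed defaults rather than values from this guide.

```bash
# Sketch: poll Alertmanager's health endpoint until it answers or we give up.
# wait_for_alertmanager is a hypothetical helper, not part of the Orion repo.
wait_for_alertmanager() {
  local url="${1:-http://localhost:9093/-/healthy}"
  local attempts="${2:-10}"   # assumed default: 10 tries
  local delay="${3:-2}"       # assumed default: 2 seconds between tries
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors, -s silences progress output
    if curl -fs "$url" > /dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_alertmanager` with no arguments after the restart polls the default URL shown above.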
**4. Test by triggering a test alert (optional):**
```bash
# Send a test alert to alertmanager
curl -X POST http://localhost:9093/api/v2/alerts \
-H "Content-Type: application/json" \
-d '[{
"labels": {"alertname": "TestAlert", "severity": "warning"},
"annotations": {"summary": "Test alert — please ignore"},
"startsAt": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'",
"endsAt": "'$(date -u -d '+5 minutes' +%Y-%m-%dT%H:%M:%SZ)'"
}]'
```
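Check your inbox within about 30 seconds. Then list the active alerts; the test alert disappears from the list once its `endsAt` time (5 minutes after sending) has passed: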
```bash
curl -s http://localhost:9093/api/v2/alerts | python3 -m json.tool
```
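The inline quoting in the test-alert command above is easy to break when editing. One possible alternative is to build the payload in a small function and validate it locally before sending. This is only a sketch: `build_test_alert` is a hypothetical helper, and the `date -d` syntax assumes GNU date, as found on a typical Linux server.

```bash
# Hypothetical helper: print a one-alert JSON payload for Alertmanager.
# Arguments: start and end timestamps in RFC 3339 format.
build_test_alert() {
  local starts_at="$1" ends_at="$2"
  printf '[{"labels":{"alertname":"TestAlert","severity":"warning"},'
  printf '"annotations":{"summary":"Test alert, please ignore"},'
  printf '"startsAt":"%s","endsAt":"%s"}]\n' "$starts_at" "$ends_at"
}

# GNU date syntax (assumption: Linux server with coreutils).
now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
later=$(date -u -d '+5 minutes' +%Y-%m-%dT%H:%M:%SZ)

# Validate the JSON locally before posting it anywhere.
build_test_alert "$now" "$later" | python3 -m json.tool > /dev/null \
  && echo "payload is valid JSON"
```

To send it, pipe the function's output into the same `curl -X POST` command and pass `-d @-` in place of the inline JSON, so curl reads the body from standard input.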
!!! tip "Alternative SMTP providers"
Any SMTP service works. Common alternatives:
- **SendGrid**: `smtp.sendgrid.net:587`, username `apikey`, password is your API key
- **Amazon SES**: `email-smtp.eu-west-1.amazonaws.com:587`
- **Gmail**: `smtp.gmail.com:587` with an App Password (less reliable, not recommended for production)
### 19.6 Deploy
```bash
cd ~/apps/orion
docker compose --profile full up -d
```
### 19.6 Verification
### 19.7 Verification
```bash
# Alertmanager healthy
@@ -1230,11 +1295,12 @@ docker network ls | grep orion
### 20.2 fail2ban Configuration
fail2ban is already installed (Step 3) but needs jail configuration.
fail2ban is already installed (Step 3) but needs jail configuration. All commands below are copy-pasteable.
**SSH jail** — create `/etc/fail2ban/jail.local`:
**SSH jail** (bans an IP for 24 hours after 3 failed SSH attempts within 10 minutes):
```ini
```bash
sudo tee /etc/fail2ban/jail.local << 'EOF'
[sshd]
enabled = true
port = ssh
@@ -1243,19 +1309,56 @@ logpath = /var/log/auth.log
maxretry = 3
bantime = 86400
findtime = 600
EOF
```
**Caddy auth filter** — create `/etc/fail2ban/filter.d/caddy-auth.conf`:
**Caddy access logging** — fail2ban needs a log file to watch. Add a global `log` directive to your Caddyfile:
```ini
```bash
sudo nano /etc/caddy/Caddyfile
```
Add this block at the **very top** of the Caddyfile, before any site blocks:
```caddy
{
    log {
        output file /var/log/caddy/access.log {
            roll_size 100MiB
            roll_keep 5
        }
        format json
    }
}
```
Create the log directory and restart Caddy:
```bash
sudo mkdir -p /var/log/caddy
sudo chown caddy:caddy /var/log/caddy
sudo systemctl restart caddy
sudo systemctl status caddy
# Verify logging works (make a request, then check)
curl -s https://wizard.lu > /dev/null
sudo tail -1 /var/log/caddy/access.log | python3 -m json.tool | head -5
```
**Caddy auth filter** — matches 401/403 responses in Caddy's JSON logs:
```bash
sudo tee /etc/fail2ban/filter.d/caddy-auth.conf << 'EOF'
[Definition]
failregex = ^.*"remote_ip":"<HOST>".*"status":(401|403).*$
ignoreregex =
EOF
```
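Before relying on the filter, it can help to check the pattern against sample log lines. The sketch below uses `grep -E` with a rough equivalent of the `failregex`; the two log lines are simplified, hypothetical examples (real Caddy entries carry many more fields), and `<HOST>` is narrowed to an IPv4 pattern for illustration.

```bash
# Simplified, hypothetical Caddy JSON log lines.
denied='{"level":"info","request":{"remote_ip":"203.0.113.7","method":"POST","uri":"/login"},"status":401}'
allowed='{"level":"info","request":{"remote_ip":"203.0.113.7","method":"GET","uri":"/"},"status":200}'

# Rough grep -E equivalent of the failregex, with <HOST> narrowed to IPv4.
# fail2ban's real <HOST> also matches IPv6 addresses and hostnames.
matches_filter() {
  printf '%s\n' "$1" \
    | grep -Eq '"remote_ip":"([0-9]{1,3}\.){3}[0-9]{1,3}".*"status":(401|403)'
}

matches_filter "$denied"  && echo "401 line: would count as a failure"
matches_filter "$allowed" || echo "200 line: ignored"
```

On the server itself, `sudo fail2ban-regex /var/log/caddy/access.log /etc/fail2ban/filter.d/caddy-auth.conf` runs the actual filter against the real log and reports how many lines matched.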
**Caddy jail** — create `/etc/fail2ban/jail.d/caddy.conf`:
**Caddy jail** (bans an IP for 1 hour after 10 failed auth attempts within 10 minutes):
```ini
```bash
sudo tee /etc/fail2ban/jail.d/caddy.conf << 'EOF'
[caddy-auth]
enabled = true
port = http,https
@@ -1264,17 +1367,22 @@ logpath = /var/log/caddy/access.log
maxretry = 10
bantime = 3600
findtime = 600
EOF
```
!!! note "Caddy access logging"
For the Caddy jail to work, enable access logging in your Caddyfile by adding `log` directives that write to `/var/log/caddy/access.log` in JSON format. See [Caddy logging docs](https://caddyserver.com/docs/caddyfile/directives/log).
Restart fail2ban:
**Restart and verify:**
```bash
sudo systemctl restart fail2ban
# Both jails should be listed
sudo fail2ban-client status
# SSH jail details
sudo fail2ban-client status sshd
# Caddy jail details (will show 0 bans initially)
sudo fail2ban-client status caddy-auth
```
### 20.3 Unattended Security Upgrades
@@ -1300,17 +1408,35 @@ APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```
### 20.4 Verification
### 20.4 Clean Up Legacy Docker Network
After deploying with network segmentation, the old default network may remain:
```bash
# fail2ban jails active
# Check if orion_default still exists
docker network ls | grep orion_default
# Remove it (Docker refuses if any container is still attached)
docker network rm orion_default 2>/dev/null || echo "Not removed (already gone or still in use)"
```
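The removal can also be guarded explicitly, so that a network with attached containers is reported rather than silently skipped. A minimal sketch, assuming the Docker CLI is on the path; `remove_network_if_unused` is a hypothetical helper name.

```bash
# Hypothetical helper: remove a Docker network only when nothing is attached.
remove_network_if_unused() {
  local net="$1" attached
  # "docker network inspect" fails when the network does not exist.
  if ! attached=$(docker network inspect "$net" \
        --format '{{len .Containers}}' 2>/dev/null); then
    echo "$net: not found, nothing to do"
    return 0
  fi
  if [ "$attached" -gt 0 ]; then
    echo "$net: still used by $attached container(s), leaving it in place" >&2
    return 1
  fi
  docker network rm "$net" > /dev/null && echo "$net: removed"
}
```

Running `remove_network_if_unused orion_default` prints which of the three cases applied.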
### 20.5 Verification
```bash
# fail2ban jails active (should show sshd and caddy-auth)
sudo fail2ban-client status
# SSH jail details
sudo fail2ban-client status sshd
# Docker networks exist
# Docker networks (should show 3: frontend, backend, monitoring)
docker network ls | grep orion
# Unattended upgrades configured
sudo unattended-upgrades --dry-run 2>&1 | head
# Caddy access log being written
sudo tail -1 /var/log/caddy/access.log
```
---