docs: update deployment docs for CI timeouts, build info, and prod safety

- hetzner-server-setup: runner timeout 3h, shutdown_timeout 300s,
  deploy.sh now writes .build-info and uses explicit -f flag
- gitea: document unit-only CI tests and xdist incompatibility
- docker: add build info section, document volume mount approach

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 14:00:35 +01:00
parent 3015a490f9
commit 661547f6cf
3 changed files with 38 additions and 10 deletions


@@ -331,6 +331,25 @@ docker compose -f docker-compose.prod.yml exec api python scripts/seed/init_prod
 ---
+## Build Info
+The deploy script writes a `.build-info` JSON file (commit SHA + deploy timestamp) before rebuilding containers. This file is mounted as a read-only volume into the API container:
+```yaml
+# In docker-compose.yml
+volumes:
+- ./.build-info:/app/.build-info:ro
+```
+The app reads it via `app/core/build_info.py` and exposes it in:
+- **`/health` endpoint** — `commit` and `deployed_at` fields
+- **Admin sidebar** — version, commit, and deploy timestamp
+In local development (where `.build-info` doesn't exist), the app falls back to `git rev-parse` for the commit SHA.
+---
 ## Daily Operations
 ### View Logs
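
For orientation, the read-with-fallback behaviour described in the Build Info section above could be sketched as follows. This is not the contents of `app/core/build_info.py`, which this diff does not show. The function name `load_build_info`, the JSON keys `commit` and `deployed_at` (borrowed from the `/health` field names), and the exact `git rev-parse HEAD` arguments are all assumptions.

```python
import json
import subprocess
from pathlib import Path


def load_build_info(path=".build-info"):
    """Return build metadata, preferring the file written at deploy time.

    Hypothetical helper: the key names and the git arguments are assumed.
    """
    info_file = Path(path)
    if info_file.is_file():
        try:
            data = json.loads(info_file.read_text(encoding="utf-8"))
            if isinstance(data, dict):
                return {
                    "commit": data.get("commit", "unknown"),
                    "deployed_at": data.get("deployed_at"),
                }
        except (OSError, json.JSONDecodeError):
            pass  # unreadable or malformed file: fall through to git

    # Local development: no usable .build-info, so ask git for the commit.
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
            timeout=5,
        )
        commit = result.stdout.strip() or "unknown"
    except (OSError, subprocess.SubprocessError):
        commit = "unknown"  # git missing, not a repository, or timed out
    return {"commit": commit, "deployed_at": None}
```

Because the file is mounted read-only, a reader like this never needs write access inside the container.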


@@ -252,9 +252,13 @@ The `scripts/deploy.sh` script handles the full deploy lifecycle:
 1. Stash local changes (preserves `.env` and other server-side edits)
 2. Pull latest code (`--ff-only`)
 3. Pop stash to restore local changes
-4. Rebuild and restart Docker containers (`docker compose --profile full up -d --build`)
-5. Run database migrations (`alembic upgrade heads`)
-6. Health check `http://localhost:8001/health` with retries
+4. Write `.build-info` (commit SHA + deploy timestamp)
+5. Rebuild and restart Docker containers (`docker compose -f docker-compose.yml --profile full up -d --build`)
+6. Run database migrations (`alembic upgrade heads`)
+7. Health check `http://localhost:8001/health` with retries
+!!! note "CI test configuration"
+    Only unit tests run in CI (`-m "unit"` with `timeout-minutes: 150`). Integration tests are run locally via `make test`. The CAX11 runner (2 vCPU ARM, 4GB) takes ~2.5h for 2,484 unit tests. `pytest-xdist` parallel execution is not compatible with the shared database session test fixtures.
 See [Hetzner Server Setup — Step 16](hetzner-server-setup.md#step-16-continuous-deployment) for the full setup guide including SSH key generation and Gitea secrets configuration.
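
Step 4 of the lifecycle above produces a small JSON file. `scripts/deploy.sh` is a shell script and its body is not part of this diff, so the Python below only illustrates the shape such a file might take. The helper name `write_build_info`, the key names, and the ISO 8601 timestamp format are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_build_info(commit, path=".build-info", now=None):
    """Write commit SHA and deploy timestamp as JSON (hypothetical layout)."""
    # Allow the caller to inject a timestamp; default to the current UTC time.
    now = now or datetime.now(timezone.utc)
    payload = {"commit": commit, "deployed_at": now.isoformat()}
    Path(path).write_text(json.dumps(payload) + "\n", encoding="utf-8")
    return payload
```

Writing the file before the rebuild step means the containers started in step 5 already see the new values through the volume mount.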


@@ -1081,7 +1081,8 @@ Generate a config file to override defaults (notably the 3h job timeout which ca
 ```bash
 cd ~/gitea-runner
 ./act_runner generate-config > config.yaml
-sed -i 's/timeout: 3h/timeout: 1h/' config.yaml
+sed -i 's/timeout: 3h/timeout: 3h/' config.yaml
+sed -i 's/shutdown_timeout: 0s/shutdown_timeout: 300s/' config.yaml
 sudo systemctl restart gitea-runner
 ```
@@ -1089,12 +1090,12 @@ Key settings in `config.yaml`:
 | Setting | Default | Recommended | Why |
 |---|---|---|---|
-| `runner.timeout` | 3h | 1h | Prevents silent failures — tests take ~25min, so 1h is generous |
-| `runner.shutdown_timeout` | 0s | 0s | OK as-is |
+| `runner.timeout` | 3h | 3h | 2,484 unit tests take ~2.5h on the CAX11 (2 vCPU ARM). Keep the default |
+| `runner.shutdown_timeout` | 0s | 300s | Wait for running jobs to finish on restart — `0s` kills jobs immediately |
 | `runner.fetch_timeout` | 5s | 5s | OK as-is |
 !!! tip "CI also has per-job and per-test timeouts"
-    The `.gitea/workflows/ci.yml` sets `timeout-minutes: 45` on the pytest job and `--timeout=120` per individual test. These work together with the runner timeout to catch different failure modes.
+    The `.gitea/workflows/ci.yml` sets `timeout-minutes: 150` on the pytest job and `--timeout=120` per individual test. These work together with the runner timeout to catch different failure modes.
 ### 15.2 Swap for CI Stability
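
The three limits named in the table and tip above nest inside one another. A quick arithmetic check, using only figures quoted in this diff (the variable names are illustrative):

```python
from datetime import timedelta

per_test_timeout = timedelta(seconds=120)  # pytest --timeout=120
job_timeout = timedelta(minutes=150)       # timeout-minutes: 150 in ci.yml
runner_timeout = timedelta(hours=3)        # runner.timeout in config.yaml
observed_run = timedelta(hours=2.5)        # "~2.5h" quoted for the unit suite
unit_tests = 2484

# Each layer must be strictly tighter than the one that contains it.
assert per_test_timeout < job_timeout < runner_timeout

avg_per_test = observed_run / unit_tests          # about 3.6 s per test
job_headroom = job_timeout - observed_run         # margin inside the job limit
runner_headroom = runner_timeout - job_timeout    # margin above the job limit

print(f"average per test: {avg_per_test.total_seconds():.2f}s")
print(f"job headroom: {job_headroom}, runner headroom: {runner_headroom}")
```

With these figures the job limit equals the quoted ~2.5h run time, so the real margin at the job level depends on how approximate that figure is. The runner limit adds another 30 minutes on top.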
@@ -1160,9 +1161,13 @@ The deploy script lives at `scripts/deploy.sh` in the repository. It:
 1. Stashes local changes (preserves `.env`)
 2. Pulls latest code (`--ff-only`)
 3. Pops stash to restore local changes
-4. Rebuilds and restarts Docker containers (`docker compose --profile full up -d --build`)
-5. Runs database migrations (`alembic upgrade heads`)
-6. Health checks `http://localhost:8001/health` with 12 retries (60s total)
+4. Writes `.build-info` (commit SHA + deploy timestamp)
+5. Rebuilds and restarts Docker containers (`docker compose -f docker-compose.yml --profile full up -d --build`)
+6. Runs database migrations (`alembic upgrade heads`)
+7. Health checks `http://localhost:8001/health` with 12 retries (60s total)
+!!! warning "Always use `-f docker-compose.yml` on the production server"
+    The explicit `-f` flag prevents `docker-compose.override.yml` (which exposes db/redis ports for local dev) from being loaded. This flag must never be removed from `deploy.sh`, and any manual `docker compose` commands on the server must also include it. See [Docker Deployment — Dev vs Prod](docker.md#dev-vs-prod-compose-architecture) for details.
 Exit codes: `0` success, `1` git pull failed, `2` docker compose failed, `3` migration failed, `4` health check failed.
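
The final health-check step and its exit code can be pictured as a bounded retry loop. The sketch below is a Python illustration, not the shell code in `scripts/deploy.sh`. The 5-second interval is inferred from "12 retries (60s total)", and the names `is_healthy` and `wait_for_health` are hypothetical.

```python
import time
import urllib.request

HEALTH_URL = "http://localhost:8001/health"


def is_healthy(url=HEALTH_URL, timeout=3.0):
    """Return True when the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


def wait_for_health(probe=is_healthy, retries=12, delay=5.0, sleep=time.sleep):
    """Probe up to `retries` times, sleeping `delay` seconds after each failure."""
    for _ in range(retries):
        if probe():
            return True
        sleep(delay)  # assumed interval: 12 attempts x 5 s = 60 s total
    return False
```

A caller mirroring the exit codes listed above would exit with status `0` when `wait_for_health()` returns `True` and with status `4` otherwise.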