docs: update deployment docs for CI timeouts, build info, and prod safety
- hetzner-server-setup: runner timeout 3h, shutdown_timeout 300s, deploy.sh now writes .build-info and uses explicit -f flag
- gitea: document unit-only CI tests and xdist incompatibility
- docker: add build info section, document volume mount approach

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -331,6 +331,25 @@ docker compose -f docker-compose.prod.yml exec api python scripts/seed/init_prod
 
 ---
 
+## Build Info
+
+The deploy script writes a `.build-info` JSON file (commit SHA + deploy timestamp) before rebuilding containers. This file is mounted as a read-only volume into the API container:
+
+```yaml
+# In docker-compose.yml
+volumes:
+- ./.build-info:/app/.build-info:ro
+```
+
+The app reads it via `app/core/build_info.py` and exposes it in:
+
+- **`/health` endpoint** — `commit` and `deployed_at` fields
+- **Admin sidebar** — version, commit, and deploy timestamp
+
+In local development (where `.build-info` doesn't exist), the app falls back to `git rev-parse` for the commit SHA.
+
+---
+
 ## Daily Operations
 
 ### View Logs
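The Build Info section added in the hunk above describes a file-first lookup with a git fallback. The following is a minimal sketch of that behaviour, not the contents of `app/core/build_info.py`, which this diff does not show. The function name `load_build_info`, the JSON key names (borrowed from the `/health` field names), and the `"unknown"` placeholder are all assumptions.

```python
import json
import subprocess
from pathlib import Path


def load_build_info(path: Path = Path("/app/.build-info")) -> dict:
    """Return build metadata, preferring the mounted file over git.

    Hypothetical helper: key names and defaults are assumptions.
    """
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            return {
                "commit": data.get("commit", "unknown"),
                "deployed_at": data.get("deployed_at"),
            }
    except (OSError, ValueError):
        # File missing, unreadable, or not valid JSON: fall through to git.
        pass

    # Local development: no .build-info, so ask git for the commit SHA.
    try:
        sha = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
            timeout=5,
        ).stdout.strip()
    except (OSError, subprocess.SubprocessError):
        sha = "unknown"
    return {"commit": sha, "deployed_at": None}
```

Because the volume is mounted read-only, a reader like this never needs write access to the file; a deploy that skips the write step would simply surface as the git fallback (or `"unknown"` inside a container without git).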
@@ -252,9 +252,13 @@ The `scripts/deploy.sh` script handles the full deploy lifecycle:
 1. Stash local changes (preserves `.env` and other server-side edits)
 2. Pull latest code (`--ff-only`)
 3. Pop stash to restore local changes
-4. Rebuild and restart Docker containers (`docker compose --profile full up -d --build`)
-5. Run database migrations (`alembic upgrade heads`)
-6. Health check `http://localhost:8001/health` with retries
+4. Write `.build-info` (commit SHA + deploy timestamp)
+5. Rebuild and restart Docker containers (`docker compose -f docker-compose.yml --profile full up -d --build`)
+6. Run database migrations (`alembic upgrade heads`)
+7. Health check `http://localhost:8001/health` with retries
+
+!!! note "CI test configuration"
+    Only unit tests run in CI (`-m "unit"` with `timeout-minutes: 150`). Integration tests are run locally via `make test`. The CAX11 runner (2 vCPU ARM, 4GB) takes ~2.5h for 2,484 unit tests. `pytest-xdist` parallel execution is not compatible with the shared database session test fixtures.
 
 See [Hetzner Server Setup — Step 16](hetzner-server-setup.md#step-16-continuous-deployment) for the full setup guide including SSH key generation and Gitea secrets configuration.
 
@@ -1081,7 +1081,8 @@ Generate a config file to override defaults (notably the 3h job timeout which ca
 ```bash
 cd ~/gitea-runner
 ./act_runner generate-config > config.yaml
-sed -i 's/timeout: 3h/timeout: 1h/' config.yaml
+sed -i 's/timeout: 3h/timeout: 3h/' config.yaml
+sed -i 's/shutdown_timeout: 0s/shutdown_timeout: 300s/' config.yaml
 sudo systemctl restart gitea-runner
 ```
 
@@ -1089,12 +1090,12 @@ Key settings in `config.yaml`:
 
 | Setting | Default | Recommended | Why |
 |---|---|---|---|
-| `runner.timeout` | 3h | 1h | Prevents silent failures — tests take ~25min, so 1h is generous |
-| `runner.shutdown_timeout` | 0s | 0s | OK as-is |
+| `runner.timeout` | 3h | 3h | 2,484 unit tests take ~2.5h on the CAX11 (2 vCPU ARM). Keep the default |
+| `runner.shutdown_timeout` | 0s | 300s | Wait for running jobs to finish on restart — `0s` kills jobs immediately |
 | `runner.fetch_timeout` | 5s | 5s | OK as-is |
 
 !!! tip "CI also has per-job and per-test timeouts"
-    The `.gitea/workflows/ci.yml` sets `timeout-minutes: 45` on the pytest job and `--timeout=120` per individual test. These work together with the runner timeout to catch different failure modes.
+    The `.gitea/workflows/ci.yml` sets `timeout-minutes: 150` on the pytest job and `--timeout=120` per individual test. These work together with the runner timeout to catch different failure modes.
 
 ### 15.2 Swap for CI Stability
 
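The three timeouts in the hunk above only catch different failure modes if each layer is strictly shorter than the one enclosing it. The sketch below is a worked check of that arithmetic using the figures quoted in the diff. The `is_layered` helper and the variable names are hypothetical, and the ~2.5h run time is taken at face value from the docs.

```python
from datetime import timedelta

# Figures quoted in the updated docs.
per_test = timedelta(seconds=120)      # pytest --timeout=120
job = timedelta(minutes=150)           # timeout-minutes: 150 in ci.yml
runner = timedelta(hours=3)            # runner.timeout in config.yaml
observed_run = timedelta(hours=2.5)    # "~2.5h for 2,484 unit tests"


def is_layered(*limits: timedelta) -> bool:
    """True when each limit is strictly shorter than the next outer one."""
    return all(inner < outer for inner, outer in zip(limits, limits[1:]))


def minutes(delta: timedelta) -> float:
    return delta.total_seconds() / 60


layered = is_layered(per_test, job, runner)
runner_margin = minutes(runner - job)      # gap between job and runner limits
job_margin = minutes(job - observed_run)   # gap between a typical run and the job limit

print(f"layered={layered} runner_margin={runner_margin} job_margin={job_margin}")
```

With these inputs the layers nest correctly and the runner limit sits 30 minutes above the job limit, while the job limit equals the quoted run time (150 minutes), so a run that is slightly slower than usual would hit `timeout-minutes` first.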
@@ -1160,9 +1161,13 @@ The deploy script lives at `scripts/deploy.sh` in the repository. It:
 1. Stashes local changes (preserves `.env`)
 2. Pulls latest code (`--ff-only`)
 3. Pops stash to restore local changes
-4. Rebuilds and restarts Docker containers (`docker compose --profile full up -d --build`)
-5. Runs database migrations (`alembic upgrade heads`)
-6. Health checks `http://localhost:8001/health` with 12 retries (60s total)
+4. Writes `.build-info` (commit SHA + deploy timestamp)
+5. Rebuilds and restarts Docker containers (`docker compose -f docker-compose.yml --profile full up -d --build`)
+6. Runs database migrations (`alembic upgrade heads`)
+7. Health checks `http://localhost:8001/health` with 12 retries (60s total)
+
+!!! warning "Always use `-f docker-compose.yml` on the production server"
+    The explicit `-f` flag prevents `docker-compose.override.yml` (which exposes db/redis ports for local dev) from being loaded. This flag must never be removed from `deploy.sh`, and any manual `docker compose` commands on the server must also include it. See [Docker Deployment — Dev vs Prod](docker.md#dev-vs-prod-compose-architecture) for details.
 
 Exit codes: `0` success, `1` git pull failed, `2` docker compose failed, `3` migration failed, `4` health check failed.
 
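The final deploy step and its exit code can be illustrated with a small retry loop. This is a sketch in Python rather than the shell used by `scripts/deploy.sh`, whose source is not part of this diff. The 5-second delay is an assumption inferred from "12 retries (60s total)", and `probe` and `wait_for_health` are hypothetical names.

```python
import http.client
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8001/health"
EXIT_OK = 0
EXIT_HEALTH_CHECK_FAILED = 4  # matches the documented exit code


def probe(url: str = HEALTH_URL, timeout: float = 3.0) -> bool:
    """Return True when the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply.
        return False


def wait_for_health(
    check: Callable[[], bool] = probe,
    retries: int = 12,
    delay: float = 5.0,  # assumed: 12 attempts x 5s = 60s
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until healthy; sleep after every failed attempt."""
    for _ in range(retries):
        if check():
            return EXIT_OK
        sleep(delay)
    return EXIT_HEALTH_CHECK_FAILED


if __name__ == "__main__":
    raise SystemExit(wait_for_health())
```

Passing `check` and `sleep` as parameters keeps the loop testable without a running API container or real waiting.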