From 223650a52bef7b9ecd669967e7215ae55a945613 Mon Sep 17 00:00:00 2001
From: Samir Boulahtit
Date: Sat, 6 Jun 2026 21:01:54 +0200
Subject: [PATCH] docs(ops): record 2026-06-06 Gitea+CI migration execution + runbook lessons
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add the "Executed: 2026-06-06" record to the 2c runbook (new box
gitea-ci-fsn1-1, Falkenstein CX22, IPs, outcome) and fold the real-world
lessons into the steps: pin the Gitea image version (not latest),
ON_ERROR_STOP + count check on DB restore, the old-runner-survives-in-
migrated-DB gotcha (delete from action_runner + stop prod service),
generate runner token as the git user, expected volume-already-exists
warning, and the root-vs-sudo note.

Held local (not pushed) — pushing stacks a 2nd ~3h CI run behind the
in-flight one.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/deployment/hetzner-server-setup.md | 60 ++++++++++++++++++++++++-
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/docs/deployment/hetzner-server-setup.md b/docs/deployment/hetzner-server-setup.md
index 04763fb4..887ffa64 100644
--- a/docs/deployment/hetzner-server-setup.md
+++ b/docs/deployment/hetzner-server-setup.md
@@ -3314,6 +3314,11 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
    / `POSTGRES_PASSWORD` — copy them from the current file; do not regenerate, or
    the restored DB won't authenticate). Keep `ROOT_URL`/`DOMAIN`/`SSH_DOMAIN` as
    `git.wizard.lu`.
+   **Pin the Gitea image to the running version, not `latest`** — check it first
+   with `docker exec gitea gitea --version` and set e.g. `image: gitea/gitea:1.25.4`.
+   If the new box pulls a newer `latest`, Gitea runs unexpected DB migrations on
+   first start against your freshly-restored data. (`postgres:15` is already
+   pinned on the major, fine.)
 2. **Announce downtime / stop writes** on the old Gitea.
 
 3.
**Dump the data on the old box:**
@@ -3327,11 +3332,16 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
 4. **Transfer** `/tmp/gitea-db.sql` + `/tmp/gitea-data.tgz` to the new box
    (`scp`/`rsync`).
 
-5. **Restore the DB** on the new box:
+5. **Restore the DB** on the new box (the `postgres:15` container auto-creates an
+   empty `gitea` DB; restore into it. Add `-v ON_ERROR_STOP=1` so a bad restore
+   fails loudly instead of silently):
 
    ```bash
    docker compose up -d gitea-db # wait until healthy
-   cat gitea-db.sql | docker exec -i gitea-db psql -U gitea -d gitea
+   docker exec -i gitea-db psql -U gitea -d gitea -v ON_ERROR_STOP=1 < gitea-db.sql
+   # sanity-check counts match the source:
+   docker exec gitea-db psql -U gitea -d gitea -t \
+     -c "SELECT 'repos',count(*) FROM repository UNION ALL SELECT 'secrets',count(*) FROM secret;"
    ```
 
 6. **Restore the data volume** on the new box:
@@ -3360,6 +3370,23 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
 11. **No remote/runner URL changes needed** — the hostname `git.wizard.lu` stays
     the same (only the IP moved), so your `gitea` git remote and the runner's
     `--instance https://git.wizard.lu` keep working after DNS flips.
+    Install the new runner per [2a](#offloading-ci-to-a-separate-server-2a-recommended).
+    ⚠️ **Critical gotcha — the OLD runner registration travels in the migrated
+    DB.** Because the DB is copied wholesale, the old prod runner still exists in
+    `action_runner` and — since `git.wizard.lu` now resolves to the new box — it
+    can re-authenticate and grab jobs.
You must BOTH (a) remove its registration
+    from the migrated DB and (b) stop its process on prod, or CI may still run on
+    the old box:
+
+    ```bash
+    # (a) on the NEW box — drop the stale runner registration:
+    docker exec gitea-db psql -U gitea -d gitea \
+      -c "DELETE FROM action_runner WHERE name='';"
+    # (b) on PROD — stop the orphaned runner process:
+    sudo systemctl disable --now gitea-runner.service
+    ```
+    (Generate the new runner's token with `docker exec -u git gitea gitea actions
+    generate-runner-token` — Gitea refuses to run that as root.)
 
 12. **Decommission Gitea on prod** (keep volumes + backups for a rollback window):
 
@@ -3374,6 +3401,35 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
     works, a push triggers CI, and repos/actions history are intact. (See the
     "Backup coverage & rollback" callout above if anything needs reverting.)
 
+#### Executed: 2026-06-06 (production run)
+
+This migration was carried out on **2026-06-06**, moving Gitea + the CI runner
+off the prod box (`91.99.65.229`, Nuremberg) — which had been suffering CPU
+floods from CI running on it — to a dedicated box.
+
+- **New box:** `gitea-ci-fsn1-1`, Falkenstein (`fsn1`), CX22 (2 vCPU / 4 GB x86,
+  Ubuntu 24.04, Hetzner backups on). IPv4 `167.233.28.95`, IPv6
+  `2a01:4f8:c015:b6cb::1`. ~5.29 EUR/mo.
+- **Outcome:** Gitea `1.25.4` + runner `gitea-ci-fsn1-1` (act_runner v0.2.13) now
+  run on the new box; `git.wizard.lu` serves from it with a fresh Let's Encrypt
+  cert; CI runs off-prod (prod CPU stayed at its ~1.4 baseline during a CI run,
+  no burst). DB restore counts matched source exactly (1 repo, 2 users, 4
+  secrets). The git-SSH host key travelled in `gitea-data` → no host-key-changed
+  warnings on push.
+- **Real-world notes / deviations from the generic steps above (now folded in):**
+  - Pinned `gitea/gitea:1.25.4` (step 1) — prod was on 1.25.4; avoid `latest`.
+  - Restore = `pg_dump` (plain SQL) + `gitea-data` volume tar; a `gitea dump`
+    archive was taken first as the one-shot safety net and pulled to the laptop.
+  - Had to delete the **old runner** from the migrated DB + stop its prod
+    service (step 11 gotcha) — otherwise it kept eligibility for jobs.
+  - On the new box, the `samir` user's sudo needs a password (not NOPASSWD), so
+    automated/admin commands were run as `root` over key-only SSH;
+    `PermitRootLogin prohibit-password` was kept during the migration (tighten
+    to `no` + give `samir` a sudo password afterward if desired).
+  - The `docker compose` warning *"volume gitea_gitea-data already exists but
+    was not created by Docker Compose"* is expected — the volume is pre-created
+    when you restore into it before first `up`. Harmless.
+
 ### View logs
 
 ```bash