docs(ops): record 2026-06-06 Gitea+CI migration execution + runbook lessons
Add the "Executed: 2026-06-06" record to the 2c runbook (new box gitea-ci-fsn1-1,
Falkenstein CX22, IPs, outcome) and fold the real-world lessons into the steps:
pin the Gitea image version (not latest), ON_ERROR_STOP + count check on DB
restore, the old-runner-survives-in-migrated-DB gotcha (delete from
action_runner + stop prod service), generate runner token as the git user,
expected volume-already-exists warning, and the root-vs-sudo note.

Held local (not pushed): pushing stacks a 2nd ~3h CI run behind the in-flight one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -3314,6 +3314,11 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
/ `POSTGRES_PASSWORD`: copy them from the current file; do not regenerate, or
the restored DB won't authenticate). Keep `ROOT_URL`/`DOMAIN`/`SSH_DOMAIN`
as `git.wizard.lu`.
**Pin the Gitea image to the running version, not `latest`.** Check it first
with `docker exec gitea gitea --version` and set e.g. `image: gitea/gitea:1.25.4`.
If the new box pulls a newer `latest`, Gitea runs unexpected DB migrations on
first start against your freshly restored data. (`postgres:15` is already
pinned to its major version, which is fine.)
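
The pinning step can be sketched as a small helper that turns the version line into an image reference. This is only an illustration: `gitea_image_tag` is a hypothetical name, and it assumes the output of `gitea --version` begins with `Gitea version X.Y.Z`, so check the real output on your box before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive a pinned image reference from a version line.
# Assumption: the line starts with "Gitea version X.Y.Z".
gitea_image_tag() {
  local version_line="$1"
  local version
  version="$(printf '%s\n' "$version_line" \
    | sed -nE 's/^Gitea version ([0-9]+\.[0-9]+\.[0-9]+).*/\1/p')"
  if [ -z "$version" ]; then
    echo "could not parse a version from: $version_line" >&2
    return 1
  fi
  printf 'gitea/gitea:%s\n' "$version"
}

# On the old box (not run here):
#   gitea_image_tag "$(docker exec gitea gitea --version)"
```

The printed value is what would go into the `image:` key of the `gitea` service in the compose file on the new box.
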
2. **Announce downtime / stop writes** on the old Gitea.
3. **Dump the data on the old box:**
@@ -3327,11 +3332,16 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:

4. **Transfer** `/tmp/gitea-db.sql` + `/tmp/gitea-data.tgz` to the new box
(`scp`/`rsync`).
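
One optional way to confirm the two files arrived intact is a checksum manifest written on the old box and verified on the new one. The sketch below is an assumption layered on the transfer step, not part of the recorded run: the function names and the manifest name `gitea-migration.sha256` are hypothetical, and it relies on GNU `sha256sum` being installed.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the manifest file name is an assumption.

# Run on the OLD box, in the directory holding the dump and the tarball.
write_checksums() {
  local dir="$1"
  ( cd "$dir" && sha256sum gitea-db.sql gitea-data.tgz > gitea-migration.sha256 )
}

# Run on the NEW box after copying the two files plus the manifest.
# Returns non-zero if any file differs from what the manifest recorded.
verify_checksums() {
  local dir="$1"
  ( cd "$dir" && sha256sum --check --quiet gitea-migration.sha256 )
}

# Usage (not run here):
#   write_checksums /tmp      # old box, before scp/rsync
#   verify_checksums /tmp     # new box, after scp/rsync
```
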
5. **Restore the DB** on the new box. The `postgres:15` container auto-creates an
empty `gitea` DB, so restore into it. Add `-v ON_ERROR_STOP=1` so that a bad
restore fails loudly instead of silently:

```bash
docker compose up -d gitea-db   # wait until healthy
# Superseded form (no ON_ERROR_STOP, so restore errors pass unnoticed):
#   cat gitea-db.sql | docker exec -i gitea-db psql -U gitea -d gitea
docker exec -i gitea-db psql -U gitea -d gitea -v ON_ERROR_STOP=1 < gitea-db.sql
# sanity-check counts match the source:
docker exec gitea-db psql -U gitea -d gitea -t \
  -c "SELECT 'repos',count(*) FROM repository UNION ALL SELECT 'secrets',count(*) FROM secret;"
```

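
The count check is easier to trust when the comparison is done mechanically rather than by eye. A minimal sketch, assuming the count query's `-t` output has been saved to one file on each box: `normalize_counts` and `counts_match` are hypothetical names, and the normalization strips psql's column padding so only labels and numbers are compared.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for comparing two saved outputs of the count query.

# Remove spaces/tabs and empty lines, then sort, so padding and row order
# do not cause false mismatches.
normalize_counts() {
  tr -d ' \t' | sed '/^$/d' | sort
}

# Succeeds only if both files contain the same label|count rows.
counts_match() {
  local source_file="$1" target_file="$2"
  local src dst
  src="$(normalize_counts < "$source_file")"
  dst="$(normalize_counts < "$target_file")"
  if [ "$src" != "$dst" ]; then
    echo "row counts differ between source and target" >&2
    return 1
  fi
}

# Usage (not run here): save the query output on each box, then
#   counts_match source-counts.txt target-counts.txt
```
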
6. **Restore the data volume** on the new box:
@@ -3360,6 +3370,23 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
11. **No remote/runner URL changes needed.** The hostname `git.wizard.lu`
stays the same (only the IP moved), so your `gitea` git remote and the
runner's `--instance https://git.wizard.lu` keep working after DNS flips.
Install the new runner per [2a](#offloading-ci-to-a-separate-server-2a-recommended).
⚠️ **Critical gotcha: the OLD runner registration travels in the migrated
DB.** Because the DB is copied wholesale, the old prod runner still exists in
`action_runner` and, since `git.wizard.lu` now resolves to the new box, it
can re-authenticate and grab jobs. You must BOTH (a) remove its registration
from the migrated DB and (b) stop its process on prod, or CI may still run on
the old box:

```bash
# (a) on the NEW box: drop the stale runner registration
docker exec gitea-db psql -U gitea -d gitea \
  -c "DELETE FROM action_runner WHERE name='<old-runner-name>';"
# (b) on PROD: stop the orphaned runner process
sudo systemctl disable --now gitea-runner.service
```
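
Before deleting, it helps to list the rows to confirm the exact runner name, and to build the statement so that a name containing a single quote cannot break the SQL. A minimal sketch: `runner_delete_sql` is a hypothetical helper, and the `id` and `name` columns in the listing query are assumed from the `DELETE` statement above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the DELETE statement for a given runner name.
runner_delete_sql() {
  local name="$1"
  local escaped
  # Double any single quotes so the name stays a valid SQL string literal.
  escaped="$(printf '%s' "$name" | sed "s/'/''/g")"
  printf "DELETE FROM action_runner WHERE name='%s';\n" "$escaped"
}

# Usage on the NEW box (not run here):
#   docker exec gitea-db psql -U gitea -d gitea -c "SELECT id, name FROM action_runner;"
#   docker exec gitea-db psql -U gitea -d gitea -c "$(runner_delete_sql '<old-runner-name>')"
```

The statement is built in the shell because `psql -c` does not interpolate `-v` variables into the command string.
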
(Generate the new runner's token with
`docker exec -u git gitea gitea actions generate-runner-token`; Gitea refuses
to run that as root.)
12. **Decommission Gitea on prod** (keep volumes + backups for a rollback
window):
@@ -3374,6 +3401,35 @@ host `2222`) and `gitea-db` (`postgres:15`). Data lives in two named volumes:
works, a push triggers CI, and repos/actions history are intact. (See the
"Backup coverage & rollback" callout above if anything needs reverting.)

#### Executed: 2026-06-06 (production run)

This migration was carried out on **2026-06-06**, moving Gitea + the CI runner
off the prod box (`91.99.65.229`, Nuremberg), which had been suffering CPU
floods from CI running on it, to a dedicated box.

- **New box:** `gitea-ci-fsn1-1`, Falkenstein (`fsn1`), CX22 (2 vCPU / 4 GB x86,
  Ubuntu 24.04, Hetzner backups on). IPv4 `167.233.28.95`, IPv6
  `2a01:4f8:c015:b6cb::1`. ~5.29 EUR/mo.
- **Outcome:** Gitea `1.25.4` + runner `gitea-ci-fsn1-1` (act_runner v0.2.13) now
  run on the new box; `git.wizard.lu` serves from it with a fresh Let's Encrypt
  cert; CI runs off-prod (prod CPU stayed at its ~1.4 baseline during a CI run,
  no burst). DB restore counts matched source exactly (1 repo, 2 users, 4
  secrets). The git-SSH host key travelled in `gitea-data`, so there were no
  host-key-changed warnings on push.
- **Real-world notes / deviations from the generic steps above (now folded in):**
  - Pinned `gitea/gitea:1.25.4` (step 1) because prod was on 1.25.4; avoid `latest`.
  - The restore used `pg_dump` (plain SQL) plus a tar of the `gitea-data` volume;
    a `gitea dump` archive was taken first as the one-shot safety net and pulled
    to the laptop.
  - Had to delete the **old runner** from the migrated DB + stop its prod
    service (step 11 gotcha); otherwise it remained eligible to pick up jobs.
  - On the new box, the `samir` user's sudo needs a password (not NOPASSWD), so
    automated/admin commands were run as `root` over key-only SSH;
    `PermitRootLogin prohibit-password` was kept during the migration (tighten
    to `no` + give `samir` a sudo password afterward if desired).
  - The `docker compose` warning *"volume gitea_gitea-data already exists but
    was not created by Docker Compose"* is expected: the volume is pre-created
    when you restore into it before the first `up`. Harmless.

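
If root login is tightened after the migration, the effective setting can be read back from the SSH daemon rather than from the config file. A minimal sketch: `root_login_mode` is a hypothetical helper that parses `sshd -T` style output (one `key value` pair per line). Some OpenSSH versions may print the `prohibit-password` mode under a different spelling, so treat the printed value as something to inspect, not to string-match blindly.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the PermitRootLogin value from `sshd -T` output
# supplied on stdin.
root_login_mode() {
  awk 'tolower($1) == "permitrootlogin" { print $2; exit }'
}

# Usage on the new box (needs root; not run here):
#   sshd -T | root_login_mode
```
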
### View logs
```bash