Document CI runner Docker pruning setup in deploy/README.md

Gitea Actions runners accumulate buildkit cache, stale containers,
and unused images. Without periodic cleanup the disk fills and builds
stall during image push (observed: build-omos hung at 'pushing layers'
for 1.5h on a 77%-full disk).

Add a 'CI runner maintenance' section to deploy/README.md with two
cleanup layers: a daily cron job (prunes anything >72h old) and
Docker daemon builder GC (caps buildkit cache at 10 GB).
2026-04-25 19:35:33 +02:00
parent b648d83928
commit 4de0bc9993
@@ -238,6 +238,42 @@ This means:
- To restore the baked defaults any time: `cp /etc/skel-devbox/.bash_aliases ~/` (or delete the file and recreate the container).
- To diff your current config against what the image ships: `diff ~/.bash_aliases /etc/skel-devbox/.bash_aliases`.
### CI runner maintenance: automatic Docker pruning
Gitea Actions runners accumulate Docker build cache, stale buildkit containers, and unused images over time. Without periodic cleanup, the runner's disk fills up and builds stall during the image-push phase (symptom: `#61 exporting to image` / `pushing layers` hangs indefinitely while buildkit repeatedly re-authenticates with Docker Hub).
Set up two layers of automatic cleanup on the runner host:
**1. Daily cron job** — prunes images, containers, and build cache older than 72 hours:
```bash
sudo tee /etc/cron.daily/docker-prune <<'EOF'
#!/bin/sh
docker system prune -af --filter "until=72h" > /var/log/docker-prune.log 2>&1
docker builder prune -af --filter "until=72h" >> /var/log/docker-prune.log 2>&1
EOF
sudo chmod +x /etc/cron.daily/docker-prune
```
**2. Docker daemon builder GC** — caps buildkit cache at 10 GB (Docker 23.0+):
Add to `/etc/docker/daemon.json` (create if absent):
```json
{
  "builder": {
    "gc": {
      "enabled": true,
      "defaultKeepStorage": "10GB"
    }
  }
}
```
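Then `sudo systemctl restart docker`. Restarting the daemon stops running containers unless `live-restore` is enabled in `daemon.json`; containers with a restart policy come back up afterwards.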
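The daemon refuses to start if `daemon.json` is malformed, so it can be worth syntax-checking the file before the restart. The following is a minimal sketch, not part of the original setup: `validate_json_file` is a hypothetical helper name, and it assumes `python3` is installed on the runner host.

```bash
#!/bin/sh
# Hypothetical helper: succeed only if the given file parses as JSON.
# Assumes python3 is available; json.tool is part of its standard library.
validate_json_file() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}
```

One possible use is `validate_json_file /etc/docker/daemon.json && sudo systemctl restart docker`, which skips the restart when the file does not parse. This checks JSON syntax only, not whether the keys are ones the daemon recognises.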
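Both cleanup layers are safe to run on a machine that also hosts long-running containers (like opencode-devbox): `docker system prune` only removes *unused* images and *stopped* containers, never running ones. A container that is left stopped does become eligible for removal once it was created more than 72 hours ago.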
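To confirm that the cleanup keeps the disk below the level at which pushes stalled, a small usage check can be run on the runner host. This is a sketch under assumptions: the helper names `disk_usage_pct` and `disk_over_threshold` are hypothetical, and `/var/lib/docker` is Docker's default data root, which may differ on your host.

```bash
#!/bin/sh
# Print the used-space percentage (integer, no % sign) of the
# filesystem that holds the given path, using POSIX df output.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if usage of the filesystem holding $1 is at or above $2 percent,
# 1 if it is below, and 2 if the usage could not be determined.
disk_over_threshold() {
    pct=$(disk_usage_pct "$1")
    [ -n "$pct" ] || return 2
    [ "$pct" -ge "$2" ]
}
```

For example, `disk_over_threshold /var/lib/docker 70 && echo "runner disk is filling up"` prints a warning once usage reaches 70%. That threshold is illustrative only. `docker system df` then shows how much of the space is held by images, containers, and build cache.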
### Troubleshooting: SSH hangs or "banner exchange" timeouts
If SSH to the VM intermittently fails with `Connection timed out during banner exchange` or pure TCP connect timeouts — especially after the first few successful connects in a short window — the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.