From 4de0bc9993a685d57c5b575835ec8dca3a2c8b57 Mon Sep 17 00:00:00 2001
From: Joakim Persson
Date: Sat, 25 Apr 2026 19:35:33 +0200
Subject: [PATCH] Document CI runner Docker pruning setup in deploy/README.md

Gitea Actions runners accumulate buildkit cache, stale containers, and
unused images. Without periodic cleanup the disk fills and builds stall
during image push (observed: build-omos hung at 'pushing layers' for
1.5h on a 77%-full disk).

Add a 'CI runner maintenance' section to deploy/README.md with two
cleanup layers: a daily cron job (prunes anything >72h old) and Docker
daemon builder GC (caps buildkit cache at 10 GB).
---
 deploy/README.md | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/deploy/README.md b/deploy/README.md
index 74ca340..4639a68 100644
--- a/deploy/README.md
+++ b/deploy/README.md
@@ -238,6 +238,42 @@ This means:
 - To restore the baked defaults any time: `cp /etc/skel-devbox/.bash_aliases ~/` (or delete the file and recreate the container).
 - To diff your current config against what the image ships: `diff ~/.bash_aliases /etc/skel-devbox/.bash_aliases`.
 
+### CI runner maintenance: automatic Docker pruning
+
+Gitea Actions runners accumulate Docker build cache, stale buildkit containers, and unused images over time. Without periodic cleanup, the runner's disk fills up and builds stall during the image-push phase (symptom: `#61 exporting to image` / `pushing layers` hangs indefinitely while buildkit repeatedly re-authenticates with Docker Hub).
+
+Set up two layers of automatic cleanup on the runner host:
+
+**1. Daily cron job** — prunes images, containers, and build cache older than 72 hours:
+
+```bash
+sudo tee /etc/cron.daily/docker-prune <<'EOF'
+#!/bin/sh
+docker system prune -af --filter "until=72h" > /var/log/docker-prune.log 2>&1
+docker builder prune -af --filter "until=72h" >> /var/log/docker-prune.log 2>&1
+EOF
+sudo chmod +x /etc/cron.daily/docker-prune
+```
+
+**2. Docker daemon builder GC** — caps buildkit cache at 10 GB (Docker 23.0+):
+
+Add to `/etc/docker/daemon.json` (create if absent):
+
+```json
+{
+  "builder": {
+    "gc": {
+      "enabled": true,
+      "defaultKeepStorage": "10GB"
+    }
+  }
+}
+```
+
+Then `sudo systemctl restart docker`. The restart stops running containers unless `live-restore` is enabled in `daemon.json`, so pick a quiet moment for it.
+
+Both are safe to run on a machine that also hosts long-running containers (like opencode-devbox) — `docker system prune` only removes *unused* images and *stopped* containers, never running ones.
+
 ### Troubleshooting: SSH hangs or "banner exchange" timeouts
 
 If SSH to the VM intermittently fails with `Connection timed out during banner exchange` or pure TCP connect timeouts — especially after the first few successful connects in a short window — the cause is almost certainly your ISP's CGNAT (Carrier-Grade NAT), not the VM.
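
Review note on the 'CI runner maintenance' section: the cron job prunes on a fixed schedule and logs only Docker's own output. As a rough sketch, a small POSIX shell helper could also record how full the disk is after pruning, which might make it easier to spot a runner drifting back toward the level where the stall was observed. The function names, the `THRESHOLD_PCT` variable, and its default of 70 are hypothetical and not part of the patch; the sketch assumes only that `df -P` and `awk` are available on the runner host.

```bash
#!/bin/sh
# Sketch only: THRESHOLD_PCT and all function names below are hypothetical
# illustrations, not part of the patch or of any Docker tooling.
THRESHOLD_PCT="${THRESHOLD_PCT:-70}"

# Print the used-space percentage (digits only) of the filesystem holding $1.
# `df -P` keeps each filesystem on a single line, so line 2 is the data row
# and field 5 is the capacity column (for example "77%").
disk_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when the usage in $1 has reached the threshold in $2.
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Print one log line for path $1: WARN at or above the threshold, OK below it.
report_disk() {
    pct=$(disk_use_pct "$1")
    # Refuse to compare anything that is not a plain non-negative integer.
    case "$pct" in
        ''|*[!0-9]*)
            echo "ERROR: could not read disk usage for $1" >&2
            return 1
            ;;
    esac
    if over_threshold "$pct" "$THRESHOLD_PCT"; then
        echo "WARN: $1 at ${pct}% (threshold ${THRESHOLD_PCT}%)"
    else
        echo "OK: $1 at ${pct}% (threshold ${THRESHOLD_PCT}%)"
    fi
}
```

One possible use, under the same assumptions, is to paste these functions into `/etc/cron.daily/docker-prune` and call `report_disk /var/lib/docker >> /var/log/docker-prune.log 2>&1` after the two prune commands, so the log ends with the post-prune disk level. `/var/lib/docker` is Docker's default data directory; the path would need adjusting if the runner uses a different `data-root`.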