diff --git a/.gitea/README.md b/.gitea/README.md index 85a5000..4c45c95 100644 --- a/.gitea/README.md +++ b/.gitea/README.md @@ -13,7 +13,7 @@ the build pipeline is shaped the way it is, you're in the right place. ## Why the split-base pipeline exists -opencode-devbox publishes **four image variants** (`base`, `omos`, `with-pi`, `omos-with-pi`) × **two architectures** (amd64, arm64) = **eight image tags per release**. Today's runners are 2 self-hosted gitea Actions runners. arm64 builds are emulated under QEMU, which is the dominant cost (~3–5x slower than native). +opencode-devbox publishes **five image variants** (`base`, `omos`, `with-pi`, `omos-with-pi`, `pi-only`) × **two architectures** (amd64, arm64) = **ten image tags per release**. Today's runners are 2 self-hosted gitea Actions runners. arm64 builds are emulated under QEMU, which is the dominant cost (~3–5x slower than native). The five variants share ~95% of their layers (Debian + apt + Node + AWS CLI + mempalace + dev tools + entrypoints). The original `Dockerfile` was a single multi-stage build with `INSTALL_*` build-args gating variant-specific RUNs. BuildKit's per-layer cache key is content-addressed, but as soon as a build-arg-gated `RUN` produces a different layer hash for variant A vs variant B, every subsequent layer also has a different parent → identical commands re-execute per variant. Result: minimal cross-variant cache reuse on a fresh build. diff --git a/AGENTS.md b/AGENTS.md index f766324..e39b14b 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -99,7 +99,7 @@ cd /tmp && npm pack @earendil-works/pi-coding-agent@0.75.5 && tar -xzf earendil- - `update-description` job runs only when both builds succeed (`needs: [build-base, build-omos]`). - Tags must be pushed to trigger the publish workflow. The validate workflow runs on push to main and PRs. - Smoke tests run on amd64 only (single-arch load into the local daemon). The multi-arch push happens after smoke passes. -- **Gitea Actions runner has ~40 GB disk, often 70%+ used at job start.** All eight `load: true` jobs (`validate-base`, `validate-omos`, `validate-with-pi`, `validate-omos-with-pi`, `smoke-base`, `smoke-omos`, `smoke-with-pi`, `smoke-omos-with-pi`) include a `Reclaim runner disk` step that strips catthehacker-resident toolchains and prunes stale docker state before `setup-buildx-action`. Build jobs use a lighter version (push-by-digest doesn't need `docker system prune`). Don't remove these steps without testing on a fresh runner. +- **Gitea Actions runner has ~40 GB disk, often 70%+ used at job start.** All ten `load: true` jobs (`validate-base`, `validate-omos`, `validate-with-pi`, `validate-omos-with-pi`, `validate-pi-only`, `smoke-base`, `smoke-omos`, `smoke-with-pi`, `smoke-omos-with-pi`, `smoke-pi-only`) include a `Reclaim runner disk` step that strips catthehacker-resident toolchains and prunes stale docker state before `setup-buildx-action`. Build jobs use a lighter version (push-by-digest doesn't need `docker system prune`). Don't remove these steps without testing on a fresh runner. - **`docker/build-push-action@v7` with `platforms: linux/amd64,linux/arm64` handles multi-arch push natively in a single job** — produces a proper manifest list, no matrix or merge step needed. An earlier revision split into per-arch matrix jobs with digest artifacts, but that pattern requires `actions/{upload,download}-artifact@v4+` which Gitea Actions doesn't support (see below). - **`actions/upload-artifact` and `actions/download-artifact` must stay at @v3 on Gitea.** v4+ uses a GitHub-Enterprise-specific Artifact API; runs fail with `GHESNotSupportedError`. If you need artifacts for a new reason (build logs, SBOMs, etc.), pin @v3 explicitly. - **Step scripts run under `/bin/sh` (dash), not bash.** Avoid bash-isms like `${VAR//a/b}` parameter-pattern substitution; use POSIX alternatives (`tr`, `sed`) or declare `shell: bash` on the step. diff --git a/docs/manual-host-publish.md b/docs/manual-host-publish.md index 15a7c7f..a791fd7 100644 --- a/docs/manual-host-publish.md +++ b/docs/manual-host-publish.md @@ -101,8 +101,8 @@ After the constants are set, the script runs a 5-step procedure. No editing need 1. **Preflight** — buildx present, tag exists, `HEAD == tag`, multi-arch builder created if missing. 2. **Base build (conditional)** — probe `${IMAGE}:base-${BASE_HASH}` on the Hub; if missing, build it multi-arch and push. **No `--cache-from` / `--cache-to`.** That's the whole point of this escape. If the base push itself fails the same way CI did, stop — the regression has spread to image push and you need a different host or account, not this runbook. 3. **Promote `base-latest`** — `docker buildx imagetools create` re-tags by manifest reference. No rebuild. -4. **Variants × 4** — sequential (not parallel; one host's egress can't saturate four multi-arch pushes safely). Each variant is `Dockerfile.variant` `FROM ${IMAGE}:base-${BASE_HASH}` plus the appropriate `INSTALL_OMOS` / `INSTALL_PI` build-args, tagged `${RELEASE_TAG}${suffix}` and `latest${suffix}`. -5. **Verify** — prints the digest of all 10 expected tags (8 variant + base-hash + base-latest). Spot-check that each `vX.Y.Z*` and its `latest*` alias share a digest. +4. **Variants × 5** — sequential (not parallel; one host's egress can't saturate five multi-arch pushes safely). Each variant is `Dockerfile.variant` `FROM ${IMAGE}:base-${BASE_HASH}` plus the appropriate `INSTALL_OPENCODE` / `INSTALL_OMOS` / `INSTALL_PI` build-args, tagged `${RELEASE_TAG}${suffix}` and `latest${suffix}`. +5. **Verify** — prints the digest of all 12 expected tags (10 variant + base-hash + base-latest). Spot-check that each `vX.Y.Z*` and its `latest*` alias share a digest. Expected wall time on a recent Mac: ~25-40 min (base ~3 min if rebuilt, each variant ~3-7 min mostly QEMU arm64 emulation).