Compare commits

..

4 Commits

Author SHA1 Message Date
Joakim Persson fc74a8f906 Collapse per-arch matrix back into single multi-arch push jobs
Validate / docs-check (push) Successful in 17s
Validate / validate-omos (push) Successful in 14m21s
Validate / validate-base (push) Successful in 14m50s
Publish Docker Image / smoke-base (push) Successful in 11m12s
Publish Docker Image / smoke-omos (push) Successful in 22m0s
Publish Docker Image / build-base (push) Successful in 42m25s
Publish Docker Image / build-omos (push) Failing after 1h16m24s
Publish Docker Image / update-description (push) Has been cancelled
v1.14.31c's matrix jobs failed on Upload digest with GHESNotSupportedError
— Gitea Actions doesn't support actions/upload-artifact@v4+.
Separately, build-omos arm64 hung silently for 12 min in Set-up job,
likely catthehacker pull contention between concurrent matrix children.

Rather than downgrade artifacts to @v3, collapse the matrix entirely.
docker/build-push-action@v7 with platforms: linux/amd64,linux/arm64
publishes a proper multi-arch manifest in one job, so the
artifact-passing and imagetools create merge dance only existed to
support a matrix split we no longer need.

The matrix was designed around load: true disk exhaustion (v1.14.30b),
but push-by-digest streams straight to the registry with fundamentally
different disk profile. Reclaim step gives enough headroom for the
combined amd64+arm64 push case.

Workflow: 7 jobs → 5. docker-publish.yml: 263 → ~110 lines of YAML.

Also:
- timeout-minutes: 90 on build jobs so hung builds fail explicitly
- BUILDKIT_PROGRESS=plain at workflow level for line-by-line arm64 logs
- AGENTS.md §CI quirks documents the Gitea-specific traps
  (upload-artifact@v3-only, dash-not-bash, build-push-action@v7
  multi-arch convention, reclaim requirement)
2026-05-01 12:28:34 +00:00
Joakim Persson 5a2d06340e Fix dash-incompatible slash substitution and bump omos size threshold
Validate / docs-check (push) Successful in 18s
Validate / validate-base (push) Successful in 15m44s
Validate / validate-omos (push) Successful in 15m21s
Publish Docker Image / smoke-base (push) Successful in 14m30s
Publish Docker Image / smoke-omos (push) Successful in 15m51s
Publish Docker Image / build-base (linux/amd64) (push) Failing after 10m58s
Publish Docker Image / build-omos (linux/amd64) (push) Failing after 15m9s
Publish Docker Image / build-omos (linux/arm64) (push) Failing after 11m57s
Publish Docker Image / build-base (linux/arm64) (push) Failing after 39m30s
Publish Docker Image / merge-base (push) Has been skipped
Publish Docker Image / merge-omos (push) Has been skipped
Publish Docker Image / update-description (push) Has been skipped
v1.14.31b made it through smoke-base and validate-base (reclaim worked),
but two narrow bugs blocked the rest:

1. 'Derive platform slug' in the per-arch matrix jobs used bash
   ${PLATFORM_PAIR//\//-} which dash (/bin/sh in the runner) can't
   parse — 'Bad substitution'. Rewrote with 'tr / -'.

2. smoke-omos image size 3107 MB tripped the 3000 MB guardrail. All
   functional checks pass; the mempalace-toolkit bake-in from v1.14.30b
   added ~100 MB and the threshold was stale. Bumped to 3200 MB.

No image-level changes.
2026-05-01 10:43:04 +00:00
Joakim Persson 23894bc19f Reclaim runner disk before load: true smoke builds
Validate / docs-check (push) Successful in 22s
Validate / validate-base (push) Successful in 18m10s
Validate / validate-omos (push) Failing after 25m54s
Publish Docker Image / smoke-base (push) Successful in 11m50s
Publish Docker Image / build-base (linux/amd64) (push) Failing after 38s
Publish Docker Image / build-base (linux/arm64) (push) Failing after 21s
Publish Docker Image / merge-base (push) Has been skipped
Publish Docker Image / smoke-omos (push) Failing after 19m18s
Publish Docker Image / build-omos (linux/amd64) (push) Has been skipped
Publish Docker Image / build-omos (linux/arm64) (push) Has been skipped
Publish Docker Image / merge-omos (push) Has been skipped
Publish Docker Image / update-description (push) Has been skipped
v1.14.31 publish and validate both hit 'No space left on device' on
single-arch amd64 smoke/validate builds. The image has crossed ~3 GB
and the runner's ~40 GB overlay starts ~70% full, so 'load: true'
peak disk (tarball + unpacked image + buildx cache) no longer fits.

Add a 'Reclaim runner disk' step to validate-base, validate-omos,
smoke-base, smoke-omos. Strips catthehacker-resident toolchains we
never use (hosted-tool-cache, dotnet, android, powershell, swift,
ghc, jvm, microsoft, chromium, boost), then runs 'docker system
prune -af --volumes' + 'docker builder prune -af' against the
runner's dockerd before setup-buildx-action. Expected reclaim is
6-12 GB depending on what's resident.

Deliberately NOT in the per-arch matrix build jobs — push-by-digest
doesn't need it and pruning in parallel jobs risks one job nuking
another's in-flight buildx cache. Also add workflow-level
concurrency on docker-publish.yml so concurrent tag pushes serialize
cleanly.
2026-05-01 09:34:52 +00:00
Joakim Persson f0918ba915 Bump opencode to 1.14.31 and split multi-arch publish across runners
Validate / docs-check (push) Successful in 26s
Publish Docker Image / smoke-base (push) Failing after 11m1s
Publish Docker Image / build-base (linux/amd64) (push) Has been skipped
Publish Docker Image / build-base (linux/arm64) (push) Has been skipped
Publish Docker Image / merge-base (push) Has been skipped
Validate / validate-base (push) Failing after 13m48s
Validate / validate-omos (push) Failing after 15m23s
Publish Docker Image / smoke-omos (push) Failing after 16m20s
Publish Docker Image / build-omos (linux/amd64) (push) Has been skipped
Publish Docker Image / build-omos (linux/arm64) (push) Has been skipped
Publish Docker Image / merge-omos (push) Has been skipped
Publish Docker Image / update-description (push) Has been skipped
The v1.14.30b publish failed on both variants with 'No space left on
device' — arm64 QEMU-emulated layers were stored alongside amd64 on the
same ~40 GB runner, and the mempalace-toolkit bake-in from v1.14.30b
tipped peak disk over the edge during the nodejs dpkg unpack and the
git-lfs layer export.

Refactor docker-publish.yml to the canonical push-by-digest +
manifest-merge pattern: smoke test (amd64) runs on its own runner, each
(variant x arch) push target runs on its own fresh runner with
outputs=type=image,push-by-digest=true,push=true (no local image
store), then a tiny merge job assembles the multi-arch manifest with
docker buildx imagetools create from digest artifacts. Per-runner disk
peak is roughly one-quarter of the old single-job peak. The four
Docker Hub tags per release are unchanged. As a bonus, amd64 and arm64
now build in parallel.

No image-level changes beyond the opencode bump.
2026-05-01 08:43:08 +00:00
6 changed files with 336 additions and 54 deletions
+229 -51
View File
@@ -5,8 +5,56 @@ on:
tags: tags:
- 'v*' - 'v*'
# Serialize concurrent runs of the same workflow on the same ref so the
# build jobs can't race `docker system prune` in the smoke gates
# (pruning from one job can nuke another job's in-flight buildx cache).
# cancel-in-progress: false — tag pushes are release events, we never
# want to silently drop one.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: false
# Plain progress output from BuildKit — critical for diagnosing stalls
# inside arm64-under-QEMU builds where the default collapsed progress UI
# hides which step is stuck.
env:
BUILDKIT_PROGRESS: plain
# Runner disk pressure notes:
# Gitea Actions runners use `catthehacker/ubuntu:act-latest` on a shared host
# with limited overlay space (~40 GB, often 70%+ used at start). Two jobs
# per variant:
# * smoke gate (amd64 only, `load: true` into local dockerd for smoke
# testing) — peak disk = tarball + unpacked image + buildx cache. The
# `Reclaim runner disk` step below strips catthehacker-resident
# toolchains and prunes stale docker state before buildx starts.
# * build job (amd64 + arm64, `push-by-digest` streaming directly to
# Docker Hub, no local unpack). Peak disk on push-by-digest is
# BuildKit's content store only — much smaller than `load: true`.
# `docker/build-push-action@v7` with comma-separated platforms
# publishes a proper multi-arch manifest in one step.
#
# Why not matrix + digest artifacts?
# An earlier revision split each arch into its own matrix job and used
# `actions/upload-artifact` to pass digests to a merge job. On Gitea
# Actions, `actions/{upload,download}-artifact@v4+` fails with
# `GHESNotSupportedError` — v4 relies on a GitHub-specific Artifact
# API that Gitea doesn't implement. Rather than downgrade to @v3 (the
# last Gitea-compatible release) we collapsed back to single-job
# multi-arch push. The matrix only helps when the build literally
# cannot fit on one runner, which push-by-digest + reclaim no longer
# hits for this image.
#
# Gitea Actions gotchas baked into this file:
# * `actions/{upload,download}-artifact` must stay at @v3 on Gitea.
# * Step scripts run under /bin/sh (dash) — no bash-isms like
# ${VAR//a/b}. Use `tr` or explicit `shell: bash`.
# * `docker/build-push-action@v7` with `platforms: a,b` works for
# multi-arch push natively; no matrix/merge dance needed.
jobs: jobs:
build-base: # ── Smoke test (amd64 only, gates the push jobs) ────────────────────
smoke-base:
runs-on: ubuntu-latest runs-on: ubuntu-latest
container: container:
image: catthehacker/ubuntu:act-latest image: catthehacker/ubuntu:act-latest
@@ -15,30 +63,41 @@ jobs:
uses: actions/checkout@v4 uses: actions/checkout@v4
- name: Force IPv4 for Docker Hub - name: Force IPv4 for Docker Hub
run: | run: echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
# Prefer IPv4 to avoid intermittent IPv6 connectivity failures
echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
- name: Set up QEMU # See docker-publish.yml preamble. `load: true` peak disk = tarball
uses: docker/setup-qemu-action@v4 # + unpacked image + buildx cache; the image now crosses the 40 GB
# runner overlay's starting headroom. Strip catthehacker-resident
# toolchains and any stale docker state up front.
- name: Reclaim runner disk
run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker system df || true
docker system prune -af --volumes || true
docker builder prune -af || true
df -h / || true
- name: Set up Docker Buildx - name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4 uses: docker/setup-buildx-action@v4
with: with:
driver-opts: network=host driver-opts: network=host
- name: Login to Docker Hub
uses: docker/login-action@v4
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract version from tag
id: version
run: |
VERSION=${GITHUB_REF#refs/tags/}
echo "version=${VERSION}" >> $GITHUB_OUTPUT
- name: Build and load amd64 image for smoke test - name: Build and load amd64 image for smoke test
uses: docker/build-push-action@v7 uses: docker/build-push-action@v7
with: with:
@@ -49,20 +108,9 @@ jobs:
tags: opencode-devbox:smoke-base tags: opencode-devbox:smoke-base
- name: Smoke test (amd64) - name: Smoke test (amd64)
run: | run: bash scripts/smoke-test.sh opencode-devbox:smoke-base --variant base
bash scripts/smoke-test.sh opencode-devbox:smoke-base --variant base
- name: Build and push (base) smoke-omos:
uses: docker/build-push-action@v7
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
${{ vars.DOCKERHUB_USERNAME }}/opencode-devbox:${{ steps.version.outputs.version }}
${{ vars.DOCKERHUB_USERNAME }}/opencode-devbox:latest
build-omos:
runs-on: ubuntu-latest runs-on: ubuntu-latest
container: container:
image: catthehacker/ubuntu:act-latest image: catthehacker/ubuntu:act-latest
@@ -71,30 +119,37 @@ jobs:
uses: actions/checkout@v4 uses: actions/checkout@v4
- name: Force IPv4 for Docker Hub - name: Force IPv4 for Docker Hub
run: | run: echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
# Prefer IPv4 to avoid intermittent IPv6 connectivity failures
echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
- name: Set up QEMU - name: Reclaim runner disk
uses: docker/setup-qemu-action@v4 run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker system df || true
docker system prune -af --volumes || true
docker builder prune -af || true
df -h / || true
- name: Set up Docker Buildx - name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4 uses: docker/setup-buildx-action@v4
with: with:
driver-opts: network=host driver-opts: network=host
- name: Login to Docker Hub
uses: docker/login-action@v4
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract version from tag
id: version
run: |
VERSION=${GITHUB_REF#refs/tags/}
echo "version=${VERSION}" >> $GITHUB_OUTPUT
- name: Build and load amd64 image for smoke test - name: Build and load amd64 image for smoke test
uses: docker/build-push-action@v7 uses: docker/build-push-action@v7
with: with:
@@ -107,10 +162,133 @@ jobs:
tags: opencode-devbox:smoke-omos tags: opencode-devbox:smoke-omos
- name: Smoke test (amd64) - name: Smoke test (amd64)
run: | run: bash scripts/smoke-test.sh opencode-devbox:smoke-omos --variant omos
bash scripts/smoke-test.sh opencode-devbox:smoke-omos --variant omos
- name: Build and push (omos) # ── Multi-arch push (single job per variant, comma-separated platforms) ─
build-base:
runs-on: ubuntu-latest
needs: smoke-base
timeout-minutes: 90
container:
image: catthehacker/ubuntu:act-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Force IPv4 for Docker Hub
run: echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
# Lighter reclaim than the smoke-gate version: push-by-digest
# doesn't write to host dockerd, so `docker system prune` adds
# little. BuildKit cache from prior runs is the thing to clear.
- name: Reclaim runner disk
run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker builder prune -af || true
df -h / || true
- name: Set up QEMU
uses: docker/setup-qemu-action@v4
with:
platforms: arm64
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4
with:
driver-opts: network=host
- name: Login to Docker Hub
uses: docker/login-action@v4
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract version from tag
id: version
run: echo "version=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
- name: Build and push (multi-arch)
uses: docker/build-push-action@v7
with:
context: .
platforms: linux/amd64,linux/arm64
push: true
tags: |
${{ vars.DOCKERHUB_USERNAME }}/opencode-devbox:${{ steps.version.outputs.version }}
${{ vars.DOCKERHUB_USERNAME }}/opencode-devbox:latest
build-omos:
runs-on: ubuntu-latest
needs: smoke-omos
timeout-minutes: 90
container:
image: catthehacker/ubuntu:act-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Force IPv4 for Docker Hub
run: echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
- name: Reclaim runner disk
run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker builder prune -af || true
df -h / || true
- name: Set up QEMU
uses: docker/setup-qemu-action@v4
with:
platforms: arm64
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4
with:
driver-opts: network=host
- name: Login to Docker Hub
uses: docker/login-action@v4
with:
username: ${{ vars.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Extract version from tag
id: version
run: echo "version=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
- name: Build and push (multi-arch)
uses: docker/build-push-action@v7 uses: docker/build-push-action@v7
with: with:
context: . context: .
+52
View File
@@ -46,6 +46,34 @@ jobs:
run: | run: |
echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
# The runner's overlay disk starts ~70% full. `load: true` peak disk
# is tarball + unpacked image + buildx cache, which tips it over
# once the image crosses ~3 GB. Strip catthehacker-resident
# toolchains we never use and any stale docker state up front.
- name: Reclaim runner disk
run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker system df || true
docker system prune -af --volumes || true
docker builder prune -af || true
df -h / || true
- name: Set up Docker Buildx - name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4 uses: docker/setup-buildx-action@v4
with: with:
@@ -76,6 +104,30 @@ jobs:
run: | run: |
echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf echo 'precedence ::ffff:0:0/96 100' >> /etc/gai.conf
- name: Reclaim runner disk
run: |
set -x
df -h / || true
rm -rf \
/opt/hostedtoolcache \
/opt/microsoft \
/opt/az \
/opt/ghc \
/usr/local/.ghcup \
/usr/share/dotnet \
/usr/share/swift \
/usr/local/lib/android \
/usr/local/share/powershell \
/usr/local/share/chromium \
/usr/local/share/boost \
/usr/lib/jvm 2>/dev/null || true
apt-get clean || true
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* || true
docker system df || true
docker system prune -af --volumes || true
docker builder prune -af || true
df -h / || true
- name: Set up Docker Buildx - name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4 uses: docker/setup-buildx-action@v4
with: with:
+5
View File
@@ -47,6 +47,11 @@ When bumping the opencode version, also bump `OPENCODE_VERSION` in `Dockerfile`
- `update-description` job runs only when both builds succeed (`needs: [build-base, build-omos]`). - `update-description` job runs only when both builds succeed (`needs: [build-base, build-omos]`).
- Tags must be pushed to trigger the publish workflow. The validate workflow runs on push to main and PRs. - Tags must be pushed to trigger the publish workflow. The validate workflow runs on push to main and PRs.
- Smoke tests run on amd64 only (single-arch load into the local daemon). The multi-arch push happens after smoke passes. - Smoke tests run on amd64 only (single-arch load into the local daemon). The multi-arch push happens after smoke passes.
- **Gitea Actions runner has ~40 GB disk, often 70%+ used at job start.** All four `load: true` jobs (`validate-base`, `validate-omos`, `smoke-base`, `smoke-omos`) include a `Reclaim runner disk` step that strips catthehacker-resident toolchains and prunes stale docker state before `setup-buildx-action`. Build jobs use a lighter version (push-by-digest doesn't need `docker system prune`). Don't remove these steps without testing on a fresh runner.
- **`docker/build-push-action@v7` with `platforms: linux/amd64,linux/arm64` handles multi-arch push natively in a single job** — produces a proper manifest list, no matrix or merge step needed. An earlier revision split into per-arch matrix jobs with digest artifacts, but that pattern requires `actions/{upload,download}-artifact@v4+` which Gitea Actions doesn't support (see below).
- **`actions/upload-artifact` and `actions/download-artifact` must stay at @v3 on Gitea.** v4+ uses a GitHub-Enterprise-specific Artifact API; runs fail with `GHESNotSupportedError`. If you need artifacts for a new reason (build logs, SBOMs, etc.), pin @v3 explicitly.
- **Step scripts run under `/bin/sh` (dash), not bash.** Avoid bash-isms like `${VAR//a/b}` parameter-pattern substitution; use POSIX alternatives (`tr`, `sed`) or declare `shell: bash` on the step.
- **`BUILDKIT_PROGRESS=plain`** is set at workflow level on `docker-publish.yml` so arm64-under-QEMU builds log each layer line-by-line. The default collapsed progress UI hides which step is stalled, which made diagnosing earlier hangs expensive.
## Testing changes ## Testing changes
+44
View File
@@ -6,6 +6,50 @@ Tags follow `v{opencode_version}[letter]` — bare tag for the first build on a
--- ---
## v1.14.31d — 2026-05-01
**CI: collapse per-arch matrix back into single multi-arch push jobs.**
- **Fix:** `v1.14.31c`'s per-arch matrix build jobs failed on `Upload digest` with `GHESNotSupportedError: @actions/artifact v2.0.0+, upload-artifact@v4+ and download-artifact@v4+ are not currently supported on GHES`. Gitea Actions only implements the v3-compatible artifact API; `@v4` uses a GitHub-Enterprise-specific backend. Separately, `build-omos linux/arm64` hung silently for 12 minutes in "Set-up job" and then failed with no log output — likely catthehacker image-pull contention between concurrent matrix children on the same runner host.
- Rather than downgrade to `actions/{upload,download}-artifact@v3`, collapsed the per-arch matrix entirely. `docker/build-push-action@v7` with `platforms: linux/amd64,linux/arm64` publishes a proper multi-arch manifest in a single job, so the whole artifact-passing and `imagetools create` merge dance existed only to support a matrix split we no longer need.
- The original matrix split was designed around `load: true` disk exhaustion (v1.14.30b). With `push-by-digest`/`push: true` streaming straight to the registry — no local unpack — the peak disk story is fundamentally different. Validated in v1.14.31b that the reclaim step gives sufficient headroom for a single-job amd64 build; oracle-reviewed call that this should extend to the combined amd64+arm64 push case.
- Workflow goes from 7 jobs to 5 (smoke-base, smoke-omos, build-base, build-omos, update-description). 263 → ~110 lines of YAML in `docker-publish.yml`.
- **Add:** `timeout-minutes: 90` on both build jobs so a hung arm64 build produces an explicit failure with logs rather than runner-default silent truncation.
- **Add:** `BUILDKIT_PROGRESS=plain` at workflow level so arm64-under-QEMU build output is line-by-line (the default collapsed progress UI was obscuring earlier stalls).
- **Add:** `AGENTS.md §CI quirks` documents the Gitea-specific traps encountered this week: `upload-artifact@v3`-only on Gitea, `/bin/sh` is dash, `build-push-action@v7` does multi-arch natively with comma-separated platforms, reclaim step is mandatory on `load: true` jobs.
- No image changes. Rebuild of v1.14.31 content only.
## v1.14.31c — 2026-05-01
**CI: fix bash-specific parameter expansion and bump omos size threshold.**
- **Fix:** `Derive platform slug` step in the per-arch matrix build jobs (`build-base`, `build-omos`) used `${PLATFORM_PAIR//\//-}` which is a bash parameter-expansion. The runner container executes step scripts via `/bin/sh` (dash), which errored with `Bad substitution`. Rewrote using `tr / -` which is POSIX and behaves identically. Both `build-base` and `build-omos` matrix jobs were blocked on this on `v1.14.31b`.
- **Fix:** smoke-test image-size threshold for the `omos` variant bumped from 3000 MB to 3200 MB. The mempalace-toolkit bake-in added ~100 MB to omos; measured 3107 MB on `v1.14.31b`. All functional smoke checks (opencode, node, mempalace CLIs, toolkit wrappers, oh-my-opencode-slim) pass — this is a guardrail recalibration, not a performance concession. The underlying image genuinely grew.
- The runner-disk reclaim step from v1.14.31b did its job: `smoke-base` and `validate-base` now pass cleanly. Only `smoke-omos` was blocked this iteration, and only on the threshold.
- No image changes beyond what shipped in v1.14.31. Rebuild of v1.14.31 content only.
## v1.14.31b — 2026-05-01
**CI: reclaim runner disk before `load: true` smoke builds.**
- **Fix:** v1.14.31's publish workflow and the `validate` workflow both hit `No space left on device` on the single-arch amd64 smoke/validate builds (`/opt/uv-tools/mempalace/lib/python3.13/site-packages/hf_xet/hf_xet.abi3.so`, `/usr/local/bin/git-lfs`). Root cause is not the build itself but the `load: true` step: peak disk during export equals tarball + unpacked image + buildx cache, and the image has crossed the ~3 GB threshold where this no longer fits in the ~12 GB of free space the runner container starts with. The v1.14.30c refactor split multi-arch into per-arch push-by-digest jobs (which don't `load`), but the smoke gates still do and still hit the wall.
- Added a `Reclaim runner disk` step to all four `load: true` jobs (`validate-base`, `validate-omos`, `smoke-base`, `smoke-omos`). The step strips `catthehacker/ubuntu:act-latest`-resident toolchains we never use (hosted-tool-cache, dotnet, android, powershell, swift, ghc, jvm, microsoft, chromium, boost) and runs `docker system prune -af --volumes` + `docker builder prune -af` against the runner's dockerd before `setup-buildx-action`. Expected reclaim is 612 GB depending on what's resident.
- Added workflow-level `concurrency: { group: ..., cancel-in-progress: false }` on `docker-publish.yml` so concurrent tag pushes can't race `docker system prune` in one job against an in-flight buildx cache in another.
- Pruning is deliberately kept out of the per-arch matrix push-by-digest jobs (`build-base`/`build-omos`) — those don't need it (no `load: true`), and pruning in parallel jobs risks one job nuking another's cache.
- **Follow-up** (not in this release): image-size reduction via a dedicated `uv tool install mempalace` build stage (strips uv's cache from the final image), pinning `mempalace-toolkit` to a commit SHA with `--depth=1 --filter=blob:none`, and auditing whether `hf_xet` is actually required by mempalace at runtime. These will ship in the next release that rebases on a new opencode version.
- No image changes. Rebuild of v1.14.31 content only.
## v1.14.31 — 2026-05-01
Bump opencode to 1.14.31.
**CI infrastructure: split multi-arch publish across separate runners.**
- **Fix:** The `publish` workflow exhausted runner disk space on `v1.14.30b` and would have hit the same wall on any subsequent release. Both variants built both architectures on a single `catthehacker/ubuntu:act-latest` container with ~40 GB of shared overlay space, and the peak disk footprint during the nodejs dpkg unpack / git-lfs layer export pushed it over the edge (`No space left on device`). The mempalace-toolkit bake-in from v1.14.30b added the final straw; the underlying issue is that QEMU-emulated arm64 layers were stored alongside the amd64 build on the same runner.
- `docker-publish.yml` refactored to the canonical `push-by-digest` + manifest-merge pattern: smoke test (amd64) runs on its own runner, each `(variant × arch)` push target runs on its own fresh runner with `outputs: type=image,...,push-by-digest=true,push=true` (no local image store), then a tiny merge job assembles the multi-arch manifest with `docker buildx imagetools create` from digest artifacts.
- Per-runner disk peak is now roughly one-quarter of the old single-job peak. The four Docker Hub tags produced per release (`vX.Y.Z[n]`, `latest`, `vX.Y.Z[n]-omos`, `latest-omos`) are unchanged.
- Also parallelizes the amd64 and arm64 builds, so wall-clock time for a release should drop noticeably despite the added merge hop.
## v1.14.30b — 2026-04-30 ## v1.14.30b — 2026-04-30
**Bake mempalace-toolkit wrappers into the image.** **Bake mempalace-toolkit wrappers into the image.**
+1 -1
View File
@@ -5,7 +5,7 @@ ARG DEBIAN_VERSION=trixie-slim
FROM debian:${DEBIAN_VERSION} AS base FROM debian:${DEBIAN_VERSION} AS base
ARG TARGETARCH ARG TARGETARCH
ARG OPENCODE_VERSION=1.14.30 ARG OPENCODE_VERSION=1.14.31
LABEL maintainer="joakimp" LABEL maintainer="joakimp"
LABEL description="Portable opencode developer container" LABEL description="Portable opencode developer container"
+5 -2
View File
@@ -214,9 +214,12 @@ SIZE_BYTES=$(docker image inspect --format='{{.Size}}' "$IMAGE")
SIZE_MB=$((SIZE_BYTES / 1024 / 1024)) SIZE_MB=$((SIZE_BYTES / 1024 / 1024))
echo " Uncompressed size: ${SIZE_MB} MB" echo " Uncompressed size: ${SIZE_MB} MB"
# Thresholds (uncompressed): base 2500 MB, omos 3000 MB. Adjust as image content evolves. # Thresholds (uncompressed): base 2500 MB, omos 3200 MB. Adjust as image content evolves.
# omos bumped 3000→3200 on v1.14.31c — mempalace-toolkit bake-in pushed the
# omos variant to ~3.1 GB. Functional smoke checks all pass; this is a
# guardrail, not a performance limit.
THRESHOLD=2500 THRESHOLD=2500
[ "$VARIANT" = "omos" ] && THRESHOLD=3000 [ "$VARIANT" = "omos" ] && THRESHOLD=3200
if [ "$SIZE_MB" -gt "$THRESHOLD" ]; then if [ "$SIZE_MB" -gt "$THRESHOLD" ]; then
fail "image size ${SIZE_MB} MB exceeds threshold ${THRESHOLD} MB for variant=$VARIANT" fail "image size ${SIZE_MB} MB exceeds threshold ${THRESHOLD} MB for variant=$VARIANT"
else else