Compare commits
20 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| cb50e6ea60 | |||
| 1fe5b5df91 | |||
| 6cc2670a93 | |||
| 51ec4a88cf | |||
| be2a16834c | |||
| a16da2f041 | |||
| 608304c3de | |||
| 668592da0d | |||
| 3cbcb44cf5 | |||
| 73a7f96056 | |||
| f7c34091b1 | |||
| 4cce39d167 | |||
| 72d2c99885 | |||
| 80e57d732b | |||
| 19f8c043bd | |||
| 90e5a1f5d0 | |||
| b6e4d89a2c | |||
| 8f2c9f5112 | |||
| 60eb49469e | |||
| 18b9c9c549 |
+40
-8
@@ -30,10 +30,10 @@ The split-base architecture is what the `docker-publish-split.yml` workflow exer
|
||||
┌──────────────────┐
|
||||
│ base-decide │ compute base-<hash>;
|
||||
│ │ probe Docker Hub.
|
||||
│ hash inputs: │
|
||||
│ Dockerfile.base│
|
||||
│ rootfs/ │
|
||||
│ entrypoint*.sh │
|
||||
│ hash inputs: │ (resolve-versions
|
||||
│ Dockerfile.base│ runs in parallel:
|
||||
│ rootfs/ │ npm view pi/omos
|
||||
│ entrypoint*.sh │ → concrete versions)
|
||||
└────────┬─────────┘
|
||||
│
|
||||
┌─────────────┴─────────────┐
|
||||
@@ -73,19 +73,28 @@ The split-base architecture is what the `docker-publish-split.yml` workflow exer
|
||||
└──────────────────────────┘
|
||||
```
|
||||
|
||||
### Step 1: `base-decide`
|
||||
### Step 1: `base-decide` (and `resolve-versions` in parallel)
|
||||
|
||||
Compute a SHA-256 hash over the inputs that determine the base image's
|
||||
content:
|
||||
**`base-decide`** computes a SHA-256 hash over the inputs that determine
|
||||
the base image's content:
|
||||
|
||||
```sh
|
||||
{
|
||||
cat Dockerfile.base
|
||||
find rootfs -type f -print0 | sort -z | xargs -0 cat
|
||||
find rootfs -type f \
|
||||
! -path '*/__pycache__/*' \
|
||||
! -name '*.pyc' \
|
||||
! -name '.DS_Store' \
|
||||
! -name '._*' \
|
||||
-print0 | sort -z | xargs -0 cat
|
||||
cat entrypoint.sh entrypoint-user.sh
|
||||
} | sha256sum | cut -c1-12
|
||||
```
|
||||
|
||||
Junk filters keep the local recompute reproducible against CI's clean
|
||||
checkout — `__pycache__/*.pyc` and macOS metadata files (`.DS_Store`,
|
||||
`._AppleDouble`) are gitignored but still walked by `find -type f`.
|
||||
|
||||
The 12-character truncated hash becomes `base-<hash>`. Probe Docker Hub
|
||||
for this tag via `docker manifest inspect`:
|
||||
|
||||
@@ -97,6 +106,29 @@ This is the core cache-reuse mechanism. Version-bump-only releases
|
||||
that change anything in the base — apt packages, AWS CLI, Node version,
|
||||
locale list, entrypoint scripts — pay the full base-build cost once.
|
||||
|
||||
**`resolve-versions`** runs alongside `base-decide` (no `needs:`
|
||||
dependency between them) and resolves the floating npm packages whose
|
||||
`*_VERSION` build-args default to `latest`:
|
||||
|
||||
```sh
|
||||
PI_VERSION=$(npm view @earendil-works/pi-coding-agent version)
|
||||
OMOS_VERSION=$(npm view oh-my-opencode-slim version)
|
||||
```
|
||||
|
||||
The outputs (`pi_version`, `omos_version`) are consumed by every variant
|
||||
smoke and build job that installs pi or omos. **Why this exists:** without
|
||||
it, the `npm install -g` RUN layer in `Dockerfile.variant` hashes
|
||||
identically across builds (same ARG default, same command string), so
|
||||
the registry buildcache silently reuses the layer from whatever upstream
|
||||
version was current when the cache was first populated. This is the
|
||||
cache-hit silent-regression class of bug that shipped pi-devbox v0.74.0
|
||||
through v0.75.5 with identical image bytes (fixed in pi-devbox v0.75.5b
|
||||
2026-05-23). Currently masked here by `OPENCODE_VERSION` bumping every
|
||||
release (parent-chain cache-key invalidation), but masking would fail on
|
||||
a `vN.N.Nb` opencode-version-unchanged release that only bumps pi or
|
||||
omos. Smoke jobs additionally assert `EXPECTED_PI_VERSION` /
|
||||
`EXPECTED_OMOS_VERSION` against the resolved values.
|
||||
|
||||
### Step 2: `build-base` (conditional)
|
||||
|
||||
Only runs when `need_build=true`. Multi-arch (amd64 + arm64) build of
|
||||
|
||||
@@ -63,10 +63,19 @@ jobs:
|
||||
run: |
|
||||
# Hash inputs that determine the base image's contents.
|
||||
# Order is fixed via `find -print0 | sort -z` for reproducibility.
|
||||
# Junk filters: __pycache__/*.pyc and macOS metadata (.DS_Store,
|
||||
# ._AppleDouble) are gitignored locally but still picked up by
|
||||
# `find rootfs -type f`, which would diverge the local hash from
|
||||
# CI's clean checkout. Exclude them defensively here.
|
||||
HASH=$(
|
||||
{
|
||||
cat Dockerfile.base
|
||||
find rootfs -type f -print0 2>/dev/null | sort -z | xargs -0 cat 2>/dev/null
|
||||
find rootfs -type f \
|
||||
! -path '*/__pycache__/*' \
|
||||
! -name '*.pyc' \
|
||||
! -name '.DS_Store' \
|
||||
! -name '._*' \
|
||||
-print0 2>/dev/null | sort -z | xargs -0 cat 2>/dev/null
|
||||
cat entrypoint.sh entrypoint-user.sh
|
||||
} | sha256sum | cut -c1-12
|
||||
)
|
||||
@@ -93,6 +102,42 @@ jobs:
|
||||
echo "Base tag ${IMAGE}:${{ steps.compute.outputs.base_tag }} missing — will build."
|
||||
fi
|
||||
|
||||
# ── Phase 1b: resolve floating npm versions (pi, omos) to concrete
|
||||
# versions so the variant build-args carry a different value when an
|
||||
# upstream package bumps. Without this, when PI_VERSION / OMOS_VERSION
|
||||
# default to 'latest', the docker/build-push-action build-arg string
|
||||
# is byte-identical across builds, so the resulting layer-hash is
|
||||
# identical, so the registry buildcache silently reuses the layer
|
||||
# from whatever pi/omos version was current when the cache was first
|
||||
# populated. Same class of bug as pi-devbox v0.74.0..v0.75.5 (fixed in
|
||||
# v0.75.5b 2026-05-23). Currently masked here because OPENCODE_VERSION
|
||||
# is hard-coded in Dockerfile.variant and bumps every release —
|
||||
# invalidating the parent-chain cache key for the pi/omos layers — but
|
||||
# that masking would fail the moment we cut a vN.N.Nb opencode-version-
|
||||
# unchanged release that only bumps pi or omos. Fix is preventative.
|
||||
resolve-versions:
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
outputs:
|
||||
pi_version: ${{ steps.resolve.outputs.pi_version }}
|
||||
omos_version: ${{ steps.resolve.outputs.omos_version }}
|
||||
steps:
|
||||
- name: Resolve pi + omos versions from npm registry
|
||||
id: resolve
|
||||
run: |
|
||||
set -eu
|
||||
# Query the npm registry directly via curl+jq rather than `npm view`.
|
||||
# catthehacker/ubuntu:act-latest ships Node/npm under /opt/acttoolcache/
|
||||
# and adds it to PATH only via /etc/environment — which act_runner never
|
||||
# sources (it reads the Docker image's ENV instructions, not /etc/environment).
|
||||
# curl and jq are both guaranteed present in every job in this workflow.
|
||||
PI_VERSION=$(curl -sf "https://registry.npmjs.org/@earendil-works%2Fpi-coding-agent/latest" | jq -r '.version')
|
||||
OMOS_VERSION=$(curl -sf "https://registry.npmjs.org/oh-my-opencode-slim/latest" | jq -r '.version')
|
||||
echo "pi_version=${PI_VERSION}" >> "$GITHUB_OUTPUT"
|
||||
echo "omos_version=${OMOS_VERSION}" >> "$GITHUB_OUTPUT"
|
||||
echo "Resolved PI_VERSION=${PI_VERSION}, OMOS_VERSION=${OMOS_VERSION}"
|
||||
|
||||
# ── Phase 2: build & push base (multi-arch), only when needed ──────
|
||||
build-base:
|
||||
needs: [base-decide]
|
||||
@@ -139,17 +184,44 @@ jobs:
|
||||
username: ${{ vars.DOCKERHUB_USERNAME }}
|
||||
password: ${{ secrets.DOCKERHUB_TOKEN }}
|
||||
|
||||
- name: Build and push base (multi-arch)
|
||||
uses: docker/build-push-action@v7
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile.base
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
tags: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
# Registry cache for faster repeat base rebuilds (e.g. Node bump).
|
||||
cache-from: type=registry,ref=${{ env.IMAGE }}:base-buildcache
|
||||
cache-to: type=registry,ref=${{ env.IMAGE }}:base-buildcache,mode=max
|
||||
- name: Build and push base (multi-arch) — with retry
|
||||
shell: bash
|
||||
env:
|
||||
BASE_TAG_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
# 3-attempt retry around `docker buildx build --push` for transient
|
||||
# registry-1.docker.io blips. Does NOT mask deterministic failures:
|
||||
# a true regression (e.g. cache-export 400 hit 2026-05-23..28) will
|
||||
# fail all 3 attempts identically and the job still fails — by
|
||||
# design.
|
||||
# Registry cache disabled: buildkit's cache-export (mode=max) hits a
|
||||
# reproducible HTTP 400 from registry-1.docker.io on the resumable-
|
||||
# upload PUT (state-token format mismatch on Hub CDN, suspected to
|
||||
# have started ~2026-05-23). Image push itself works fine. We pay
|
||||
# the full base build on every Dockerfile.base change, but the base
|
||||
# tag itself is content-addressed (base-<hash>) so unchanged bases
|
||||
# short-circuit at the probe step and never re-build anyway. Re-
|
||||
# enable when upstream resolves; tracked in CHANGELOG v1.15.12.
|
||||
for attempt in 1 2 3; do
|
||||
echo "==> Build+push attempt ${attempt}/3"
|
||||
if docker buildx build \
|
||||
--platform linux/amd64,linux/arm64 \
|
||||
--file Dockerfile.base \
|
||||
--push \
|
||||
--tag "${BASE_TAG_FULL}" \
|
||||
.; then
|
||||
echo "==> Attempt ${attempt} succeeded"
|
||||
exit 0
|
||||
fi
|
||||
if [[ "${attempt}" -lt 3 ]]; then
|
||||
backoff=$(( attempt * 15 ))
|
||||
echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
|
||||
sleep "${backoff}"
|
||||
fi
|
||||
done
|
||||
echo "==> All 3 build+push attempts failed"
|
||||
exit 1
|
||||
|
||||
# ── Phase 3: amd64 smoke per variant (gates the multi-arch publish) ─
|
||||
# Each smoke job builds amd64-only against the base tag and runs
|
||||
@@ -202,10 +274,11 @@ jobs:
|
||||
run: bash scripts/smoke-test.sh opencode-devbox:smoke-base --variant base
|
||||
|
||||
smoke-omos:
|
||||
needs: [base-decide, build-base]
|
||||
needs: [base-decide, build-base, resolve-versions]
|
||||
if: |
|
||||
always() &&
|
||||
needs.base-decide.result == 'success' &&
|
||||
needs.resolve-versions.result == 'success' &&
|
||||
(needs.build-base.result == 'success' || needs.build-base.result == 'skipped')
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
@@ -240,13 +313,17 @@ jobs:
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=true
|
||||
INSTALL_PI=false
|
||||
- run: bash scripts/smoke-test.sh opencode-devbox:smoke-omos --variant omos
|
||||
OMOS_VERSION=${{ needs.resolve-versions.outputs.omos_version }}
|
||||
- env:
|
||||
EXPECTED_OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
|
||||
run: bash scripts/smoke-test.sh opencode-devbox:smoke-omos --variant omos
|
||||
|
||||
smoke-with-pi:
|
||||
needs: [base-decide, build-base]
|
||||
needs: [base-decide, build-base, resolve-versions]
|
||||
if: |
|
||||
always() &&
|
||||
needs.base-decide.result == 'success' &&
|
||||
needs.resolve-versions.result == 'success' &&
|
||||
(needs.build-base.result == 'success' || needs.build-base.result == 'skipped')
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
@@ -281,13 +358,17 @@ jobs:
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=false
|
||||
INSTALL_PI=true
|
||||
- run: bash scripts/smoke-test.sh opencode-devbox:smoke-with-pi --variant with-pi
|
||||
PI_VERSION=${{ needs.resolve-versions.outputs.pi_version }}
|
||||
- env:
|
||||
EXPECTED_PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
|
||||
run: bash scripts/smoke-test.sh opencode-devbox:smoke-with-pi --variant with-pi
|
||||
|
||||
smoke-omos-with-pi:
|
||||
needs: [base-decide, build-base]
|
||||
needs: [base-decide, build-base, resolve-versions]
|
||||
if: |
|
||||
always() &&
|
||||
needs.base-decide.result == 'success' &&
|
||||
needs.resolve-versions.result == 'success' &&
|
||||
(needs.build-base.result == 'success' || needs.build-base.result == 'skipped')
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
@@ -322,7 +403,12 @@ jobs:
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=true
|
||||
INSTALL_PI=true
|
||||
- run: bash scripts/smoke-test.sh opencode-devbox:smoke-omos-with-pi --variant omos-with-pi
|
||||
PI_VERSION=${{ needs.resolve-versions.outputs.pi_version }}
|
||||
OMOS_VERSION=${{ needs.resolve-versions.outputs.omos_version }}
|
||||
- env:
|
||||
EXPECTED_PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
|
||||
EXPECTED_OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
|
||||
run: bash scripts/smoke-test.sh opencode-devbox:smoke-omos-with-pi --variant omos-with-pi
|
||||
|
||||
# ── Phase 4: multi-arch publish per variant ────────────────────────
|
||||
|
||||
@@ -361,21 +447,43 @@ jobs:
|
||||
fi
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
- uses: docker/build-push-action@v7
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile.variant
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
build-args: |
|
||||
BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=false
|
||||
INSTALL_PI=false
|
||||
tags: ${{ steps.tags.outputs.tags }}
|
||||
- name: Build and push variant (with retry)
|
||||
shell: bash
|
||||
env:
|
||||
TAGS: ${{ steps.tags.outputs.tags }}
|
||||
BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
TAG_FLAGS=()
|
||||
while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
|
||||
# 3-attempt retry around `docker buildx build --push` (see build-base
|
||||
# step for full rationale). Variant: base (opencode only).
|
||||
for attempt in 1 2 3; do
|
||||
echo "==> Build+push attempt ${attempt}/3"
|
||||
if docker buildx build \
|
||||
--platform linux/amd64,linux/arm64 \
|
||||
--file Dockerfile.variant \
|
||||
--push \
|
||||
--build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
|
||||
--build-arg "INSTALL_OPENCODE=true" \
|
||||
--build-arg "INSTALL_OMOS=false" \
|
||||
--build-arg "INSTALL_PI=false" \
|
||||
"${TAG_FLAGS[@]}" \
|
||||
.; then
|
||||
echo "==> Attempt ${attempt} succeeded"
|
||||
exit 0
|
||||
fi
|
||||
if [[ "${attempt}" -lt 3 ]]; then
|
||||
backoff=$(( attempt * 15 ))
|
||||
echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
|
||||
sleep "${backoff}"
|
||||
fi
|
||||
done
|
||||
echo "==> All 3 build+push attempts failed"
|
||||
exit 1
|
||||
|
||||
build-variant-omos:
|
||||
needs: [base-decide, smoke-omos]
|
||||
needs: [base-decide, smoke-omos, resolve-versions]
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
@@ -409,21 +517,44 @@ jobs:
|
||||
fi
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
- uses: docker/build-push-action@v7
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile.variant
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
build-args: |
|
||||
BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=true
|
||||
INSTALL_PI=false
|
||||
tags: ${{ steps.tags.outputs.tags }}
|
||||
- name: Build and push variant (with retry)
|
||||
shell: bash
|
||||
env:
|
||||
TAGS: ${{ steps.tags.outputs.tags }}
|
||||
BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
TAG_FLAGS=()
|
||||
while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
|
||||
# 3-attempt retry (see build-base step for rationale). Variant: omos.
|
||||
for attempt in 1 2 3; do
|
||||
echo "==> Build+push attempt ${attempt}/3"
|
||||
if docker buildx build \
|
||||
--platform linux/amd64,linux/arm64 \
|
||||
--file Dockerfile.variant \
|
||||
--push \
|
||||
--build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
|
||||
--build-arg "INSTALL_OPENCODE=true" \
|
||||
--build-arg "INSTALL_OMOS=true" \
|
||||
--build-arg "INSTALL_PI=false" \
|
||||
--build-arg "OMOS_VERSION=${OMOS_VERSION}" \
|
||||
"${TAG_FLAGS[@]}" \
|
||||
.; then
|
||||
echo "==> Attempt ${attempt} succeeded"
|
||||
exit 0
|
||||
fi
|
||||
if [[ "${attempt}" -lt 3 ]]; then
|
||||
backoff=$(( attempt * 15 ))
|
||||
echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
|
||||
sleep "${backoff}"
|
||||
fi
|
||||
done
|
||||
echo "==> All 3 build+push attempts failed"
|
||||
exit 1
|
||||
|
||||
build-variant-with-pi:
|
||||
needs: [base-decide, smoke-with-pi]
|
||||
needs: [base-decide, smoke-with-pi, resolve-versions]
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
@@ -457,21 +588,44 @@ jobs:
|
||||
fi
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
- uses: docker/build-push-action@v7
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile.variant
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
build-args: |
|
||||
BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=false
|
||||
INSTALL_PI=true
|
||||
tags: ${{ steps.tags.outputs.tags }}
|
||||
- name: Build and push variant (with retry)
|
||||
shell: bash
|
||||
env:
|
||||
TAGS: ${{ steps.tags.outputs.tags }}
|
||||
BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
TAG_FLAGS=()
|
||||
while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
|
||||
# 3-attempt retry (see build-base step for rationale). Variant: with-pi.
|
||||
for attempt in 1 2 3; do
|
||||
echo "==> Build+push attempt ${attempt}/3"
|
||||
if docker buildx build \
|
||||
--platform linux/amd64,linux/arm64 \
|
||||
--file Dockerfile.variant \
|
||||
--push \
|
||||
--build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
|
||||
--build-arg "INSTALL_OPENCODE=true" \
|
||||
--build-arg "INSTALL_OMOS=false" \
|
||||
--build-arg "INSTALL_PI=true" \
|
||||
--build-arg "PI_VERSION=${PI_VERSION}" \
|
||||
"${TAG_FLAGS[@]}" \
|
||||
.; then
|
||||
echo "==> Attempt ${attempt} succeeded"
|
||||
exit 0
|
||||
fi
|
||||
if [[ "${attempt}" -lt 3 ]]; then
|
||||
backoff=$(( attempt * 15 ))
|
||||
echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
|
||||
sleep "${backoff}"
|
||||
fi
|
||||
done
|
||||
echo "==> All 3 build+push attempts failed"
|
||||
exit 1
|
||||
|
||||
build-variant-omos-with-pi:
|
||||
needs: [base-decide, smoke-omos-with-pi]
|
||||
needs: [base-decide, smoke-omos-with-pi, resolve-versions]
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
@@ -505,18 +659,43 @@ jobs:
|
||||
fi
|
||||
echo "EOF"
|
||||
} >> "$GITHUB_OUTPUT"
|
||||
- uses: docker/build-push-action@v7
|
||||
with:
|
||||
context: .
|
||||
file: Dockerfile.variant
|
||||
platforms: linux/amd64,linux/arm64
|
||||
push: true
|
||||
build-args: |
|
||||
BASE_IMAGE=${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
INSTALL_OPENCODE=true
|
||||
INSTALL_OMOS=true
|
||||
INSTALL_PI=true
|
||||
tags: ${{ steps.tags.outputs.tags }}
|
||||
- name: Build and push variant (with retry)
|
||||
shell: bash
|
||||
env:
|
||||
TAGS: ${{ steps.tags.outputs.tags }}
|
||||
BASE_IMAGE_FULL: ${{ env.IMAGE }}:${{ needs.base-decide.outputs.base_tag }}
|
||||
PI_VERSION: ${{ needs.resolve-versions.outputs.pi_version }}
|
||||
OMOS_VERSION: ${{ needs.resolve-versions.outputs.omos_version }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
TAG_FLAGS=()
|
||||
while IFS= read -r t; do [[ -n "$t" ]] && TAG_FLAGS+=( -t "$t" ); done <<< "${TAGS}"
|
||||
# 3-attempt retry (see build-base step for rationale). Variant: omos-with-pi.
|
||||
for attempt in 1 2 3; do
|
||||
echo "==> Build+push attempt ${attempt}/3"
|
||||
if docker buildx build \
|
||||
--platform linux/amd64,linux/arm64 \
|
||||
--file Dockerfile.variant \
|
||||
--push \
|
||||
--build-arg "BASE_IMAGE=${BASE_IMAGE_FULL}" \
|
||||
--build-arg "INSTALL_OPENCODE=true" \
|
||||
--build-arg "INSTALL_OMOS=true" \
|
||||
--build-arg "INSTALL_PI=true" \
|
||||
--build-arg "PI_VERSION=${PI_VERSION}" \
|
||||
--build-arg "OMOS_VERSION=${OMOS_VERSION}" \
|
||||
"${TAG_FLAGS[@]}" \
|
||||
.; then
|
||||
echo "==> Attempt ${attempt} succeeded"
|
||||
exit 0
|
||||
fi
|
||||
if [[ "${attempt}" -lt 3 ]]; then
|
||||
backoff=$(( attempt * 15 ))
|
||||
echo "==> Attempt ${attempt} failed, sleeping ${backoff}s before retry"
|
||||
sleep "${backoff}"
|
||||
fi
|
||||
done
|
||||
echo "==> All 3 build+push attempts failed"
|
||||
exit 1
|
||||
|
||||
# ── Phase 5: promote base-<hash> → base-latest (manifest copy only) ─
|
||||
promote-base-latest:
|
||||
@@ -526,12 +705,40 @@ jobs:
|
||||
- build-variant-omos
|
||||
- build-variant-with-pi
|
||||
- build-variant-omos-with-pi
|
||||
if: ${{ github.ref_type == 'tag' || inputs.promote_latest == 'true' }}
|
||||
# Skip on cache-hit base builds: when need_build=false, base-latest
|
||||
# already points at the same digest as base-<hash>, so the retag is
|
||||
# a tautology and any transient failure of it is purely cosmetic.
|
||||
# Manual workflow_dispatch with promote_latest=true overrides this
|
||||
# gate as an escape hatch (e.g., if base-latest got hand-deleted).
|
||||
#
|
||||
# `always()` wrapper + explicit base-variant success check protects
|
||||
# against the gitea-Actions default of "skipped need => skip dependent":
|
||||
# a partial-publish run (e.g., omos-with-pi smoke fails) shouldn't
|
||||
# prevent the base-latest alias from advancing on a real base rebuild.
|
||||
if: |
|
||||
always() &&
|
||||
needs.build-variant-base.result == 'success' &&
|
||||
(inputs.promote_latest == 'true' ||
|
||||
(github.ref_type == 'tag' && needs.base-decide.outputs.need_build == 'true'))
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
steps:
|
||||
- uses: imjasonh/setup-crane@v0.4
|
||||
# Direct pinned install instead of imjasonh/setup-crane@v0.4. The
|
||||
# action's bootstrap script calls api.github.com/.../releases/latest
|
||||
# to discover the crane version, which periodically rate-limits and
|
||||
# produces tag=null → download from .../download/null/... → 404 →
|
||||
# 'gzip: unexpected end of file' → exit 2. Pinning removes the
|
||||
# runtime dependency on GitHub API entirely. Bump CRANE_VERSION
|
||||
# deliberately when you want updates.
|
||||
- name: Install crane (pinned)
|
||||
env:
|
||||
CRANE_VERSION: v0.21.6
|
||||
run: |
|
||||
set -eux
|
||||
curl -fsSL "https://github.com/google/go-containerregistry/releases/download/${CRANE_VERSION}/go-containerregistry_Linux_x86_64.tar.gz" \
|
||||
| tar -xz -C /usr/local/bin crane
|
||||
crane version
|
||||
- name: Login (crane)
|
||||
run: |
|
||||
crane auth login docker.io \
|
||||
@@ -550,7 +757,17 @@ jobs:
|
||||
- build-variant-omos
|
||||
- build-variant-with-pi
|
||||
- build-variant-omos-with-pi
|
||||
if: ${{ github.ref_type == 'tag' || inputs.promote_latest == 'true' }}
|
||||
# Run when at least the base variant published — don't let a single
|
||||
# variant failure (e.g., omos-with-pi smoke threshold) prevent Hub
|
||||
# description refresh for the other variants that did publish.
|
||||
# Without this `always()` wrapper, gitea Actions' default behavior
|
||||
# of "skipped need => skip dependent" cascades from any failed/
|
||||
# skipped build-variant-* into update-description, and the Hub
|
||||
# description goes stale on partial-publish releases.
|
||||
if: |
|
||||
always() &&
|
||||
needs.build-variant-base.result == 'success' &&
|
||||
(github.ref_type == 'tag' || inputs.promote_latest == 'true')
|
||||
runs-on: ubuntu-latest
|
||||
container:
|
||||
image: catthehacker/ubuntu:act-latest
|
||||
|
||||
@@ -27,11 +27,43 @@ Tags follow `v{opencode_version}[letter]` — e.g. `v1.14.20` for the first buil
|
||||
- **No letter suffix** on the first build of a new opencode version — the bare `v{opencode_version}` tag is the canonical release.
|
||||
- **Letter suffix is the build ordinal**, starting at `b` for the second build. The letter `a` is **never used** — think of the suffix as counting rebuilds: `b = 2nd, c = 3rd, d = 4th, …`. For opencode version `1.14.20`: first build `v1.14.20`, second `v1.14.20b`, third `v1.14.20c`, and so on.
|
||||
- A letter suffix is only used for container-level rebuilds — tooling changes, CVE fixes, doc-driven rebuilds, entrypoint bugfixes — that don't change the underlying opencode version.
|
||||
- **Pre-flight check before cutting any non-letter-suffixed tag** — verify the bump is real:
|
||||
```bash
|
||||
npm view opencode-ai version # must equal the X.Y.Z in your tag
|
||||
```
|
||||
If the npm version equals the *previous* release's `X.Y.Z`, you're cutting a letter-suffix rebuild (`vX.Y.Zc`, `vX.Y.Zd`, …), not a new minor. **A bare `vX.Y.Z` tag is a claim that opencode upstream just released `X.Y.Z`** — if that claim is wrong, future opencode releases will collide with your tag namespace and the version-tracking story breaks.
|
||||
|
||||
Cautionary example: 2026-05-28 morning, `v1.15.12` was cut while opencode-ai was still at `1.15.11`. The commit message itself acknowledged "OPENCODE_VERSION stays at 1.15.11" but tagged `v1.15.12` anyway. Re-cut as `v1.15.11c` the same afternoon (see CHANGELOG). The `v1.15.12` git tag and Hub images stayed as historical artifacts; the slip cost a CI cycle and a CHANGELOG-rewrite. **Run the npm view check at the top of every release-day cut.**
|
||||
|
||||
CI produces eight Docker Hub tags per release: `vX.Y.Z[n]`, `latest`, `vX.Y.Z[n]-omos`, `latest-omos`, `vX.Y.Z[n]-with-pi`, `latest-with-pi`, `vX.Y.Z[n]-omos-with-pi`, `latest-omos-with-pi` — one tag pair (versioned + floating alias) per build variant.
|
||||
|
||||
When bumping the opencode version, bump `OPENCODE_VERSION` in `Dockerfile.variant` and update the comment in `.env.example` if it names a specific model/version for context.
|
||||
|
||||
## Upstream sources — where to look up release notes
|
||||
|
||||
When drafting a release CHANGELOG entry, pull notes from the **canonical upstream repo for each tracked package**. Getting this wrong leads to thin or wrong release notes; the image bytes are unaffected but the documentation suffers.
|
||||
|
||||
| Package | Canonical upstream | What you'll find there |
|
||||
|---|---|---|
|
||||
| `opencode-ai` (npm) | <https://github.com/anomalyco/opencode/releases> | Per-version release notes with Core / TUI / Desktop / SDK sections, contributor attributions. Some versions have empty bodies (internal/no-user-visible); most do not. |
|
||||
| `@earendil-works/pi-coding-agent` (npm) | The `CHANGELOG.md` shipped inside the npm tarball: `npm pack @earendil-works/pi-coding-agent@<version>` then extract `package/CHANGELOG.md`. | Rich changelog with New Features / Added / Changed / Fixed sections per version. |
|
||||
| Other floated tools (gosu, fzf, bat, eza, zoxide, uv, nvim, gitea-mcp, Go, oh-my-opencode-slim) | Each project's own GitHub releases page | Usually less material per release; quote selectively. |
|
||||
|
||||
**Trap to avoid:** there is a `github.com/sst/opencode` repo that some search results surface; that's a fork (and probably the historical name people associate with opencode given the upstream lineage). It does NOT track the same release timeline. Use `anomalyco/opencode` for opencode release notes.
|
||||
|
||||
Fetch pattern (saved here for muscle memory):
|
||||
|
||||
```bash
|
||||
# Latest stable opencode-ai versions on npm
|
||||
npm view opencode-ai time --json | python3 -c 'import sys,json,re; d=json.load(sys.stdin); print(*sorted([(v,t) for v,t in d.items() if re.fullmatch(r"\d+\.\d+\.\d+",v)], key=lambda x:x[1], reverse=True)[:6], sep="\n")'
|
||||
|
||||
# Release notes for a specific version
|
||||
curl -s https://api.github.com/repos/anomalyco/opencode/releases/tags/v1.15.10 | python3 -c 'import sys,json; print(json.load(sys.stdin).get("body","(empty)"))'
|
||||
|
||||
# pi changelog
|
||||
cd /tmp && npm pack @earendil-works/pi-coding-agent@0.75.5 && tar -xzf earendil-works-pi-coding-agent-0.75.5.tgz package/CHANGELOG.md && head -40 package/CHANGELOG.md
|
||||
```
|
||||
|
||||
## Critical conventions
|
||||
|
||||
- **entrypoint.sh volume ownership loop** — when adding a new named volume mount point, add it to the `for dir in ...` loop in `entrypoint.sh` so root-owned volumes get chowned on startup. The loop writes a `.devbox-owner` sentinel after a successful chown so subsequent starts skip the recursive walk. Users should not touch these files.
|
||||
@@ -43,8 +75,13 @@ When bumping the opencode version, bump `OPENCODE_VERSION` in `Dockerfile.varian
|
||||
- `.env.example` must be hand-updated to match Dockerfile/entrypoint behavior — it is not auto-generated.
|
||||
|
||||
Release-day checklist: README → (regenerate DOCKER_HUB.md only if HUB_TEMPLATE changed) → promote CHANGELOG Unreleased → grep AGENTS.md for stale counts → commit → tag → push tag.
|
||||
- **GitHub/Gitea-sourced binaries float by default** — gosu, fzf, git-lfs, nvim, bat, eza, zoxide, uv, gitea-mcp, Go, oh-my-opencode-slim all default to `latest`. Each build-time install step reads the `/releases/latest` Location redirect (or the go.dev JSON feed for Go) and derives the concrete version. Use the same `ARCH` case-switch pattern for multi-arch support (amd64/arm64). Intentional pins: `OPENCODE_VERSION` (drives the image tag), `NODE_VERSION=22` (major pin), `DEBIAN_VERSION=trixie-slim` (OS base). Adding a new upstream tool: follow the existing floated-version pattern, don't hardcode a specific tag.
|
||||
|
||||
**Between releases the same coupling applies.** Doc drift is not just a release-day concern — a workflow tweak, entrypoint change, or `generate-config.py` refactor can leave any of these four files lying. Before committing a non-release change, grep the docs for references to what you touched: `git diff --name-only HEAD | xargs -I{} grep -l 'thing-you-changed' README.md AGENTS.md DOCKER_HUB.md .gitea/README.md .env.example`. If a doc says "four variants" / "two phases" / "runs on amd64 only" and your change made that no longer true, fix it in the same commit.
|
||||
- **GitHub/Gitea-sourced binaries float by default** — gosu, fzf, git-lfs, gitleaks, nvim, bat, eza, zoxide, uv, gitea-mcp, Go, oh-my-opencode-slim all default to `latest`. Each build-time install step reads the `/releases/latest` Location redirect (or the go.dev JSON feed for Go) and derives the concrete version. Use the same `ARCH` case-switch pattern for multi-arch support (amd64/arm64) — mind project-specific arch-name deviations (gitleaks uses `x64`, bat/eza/zoxide use `x86_64`/`aarch64`, gosu uses `amd64`/`arm64`). Intentional pins: `OPENCODE_VERSION` (drives the image tag), `NODE_VERSION=22` (major pin), `DEBIAN_VERSION=trixie-slim` (OS base). Adding a new upstream tool: follow the existing floated-version pattern, don't hardcode a specific tag.
|
||||
- **Resolved versions are logged by the smoke test** — `scripts/smoke-test.sh` prints a "Resolved component versions" table as its first step. CI logs always capture what got baked into a given image even when ARGs default to `latest`.
|
||||
- **`PI_VERSION` and `OMOS_VERSION` MUST be passed by CI as concrete versions**, not left at the `latest` default. The npm install steps in `Dockerfile.variant` (`npm install -g @earendil-works/pi-coding-agent` / `oh-my-opencode-slim@${OMOS_VERSION}`) produce identical layer-hashes when the ARG values are byte-identical across builds; combined with the registry buildcache (`base-buildcache`) the layer gets reused even when `latest` would have resolved to a newer upstream. This is the same class of bug that bit pi-devbox v0.74.0 → v0.75.5 (silent same-bytes-across-releases regression discovered 2026-05-23, fixed in pi-devbox v0.75.5b). It is currently *masked* in opencode-devbox by `OPENCODE_VERSION` being a hard-coded ARG that bumps every release — that bump invalidates the parent-chain cache key for the downstream pi/omos layers — but the masking would fail the moment a `vN.N.Nb` opencode-version-unchanged release ships that only bumps pi or omos. Preventative fix: `.gitea/workflows/docker-publish-split.yml` has a `resolve-versions` job that runs `npm view @earendil-works/pi-coding-agent version` and `npm view oh-my-opencode-slim version`, exposing concrete values as outputs that every variant smoke + build job consumes via build-args. Smoke tests assert via `EXPECTED_PI_VERSION` / `EXPECTED_OMOS_VERSION` env vars — would catch the regression on the next release rather than four releases later. **If you change the variant build-args list, the resolve-versions job, or the smoke EXPECTED_*_VERSION wiring, audit all affected jobs in lockstep.**
|
||||
- **Registry buildkit cache-export is currently disabled** — do NOT re-add `cache-from`/`cache-to` to the `build-base` step in `.gitea/workflows/docker-publish-split.yml` without first verifying that buildkit's `mode=max` cache-export to `registry-1.docker.io` no longer returns HTTP 400 from the Hub CDN edge. The regression surfaced ~2026-05-23 and broke five consecutive opencode-devbox publish attempts (runs #332/333/334/336 + a rerun); root-caused on 2026-05-28 by a manual host-side publish that reproduced the same 400 only on `--cache-to` while image push worked fine. Failure shape is stable (`Offset:0` in the `_state` token, HTML response body = CDN-tier rejection, not registry backend), repo-specific (we're the only repo writing `:base-buildcache` mode=max), and explains why pinning `setup-buildx-action@v4.0.0` didn't help (action pin doesn't change the bundled buildkit version on the catthehacker runner image). Trade-off: dockerfile.base changes pay a full ~3 min rebuild instead of pulling cached layers; unchanged bases short-circuit at the Hub-probe step in `base-decide` and never re-build anyway. Variants don't use registry cache so they're unaffected. Re-enable condition: upstream moby/buildkit fix lands AND a low-risk test run succeeds without 400s. See CHANGELOG v1.15.12 `Unreleased` block for the full diagnostic chain. Manual escape-hatch publish procedure: `docs/manual-host-publish.md`.
|
||||
- **Push steps wrap `docker buildx build --push` in a 3-attempt retry loop** (15s, 30s backoff) for transient `registry-1.docker.io` blips — rate limits, brief 5xx, CDN flap. Implemented as inline `shell: bash` steps with `docker buildx build` raw rather than `docker/build-push-action@v7` so the loop is visible and tweakable. Affects the 1 base + 4 variant push steps in `.gitea/workflows/docker-publish-split.yml`; smoke-test builds (`load: true`, no push) are untouched. **This does NOT mask deterministic failures** — a true regression (like the cache-export 400 of 2026-05-23..28) fails all 3 attempts identically and the job still fails. Orthogonal to the cache-export disablement above: cache-export was about a deterministic protocol mismatch, retry is about absorbing genuine transients. Both are belt-and-braces with the `ci-release-watcher` skill's transient-rerun heuristic. If you change the matrix of push steps, keep the retry wrapper consistent across them — the pattern is duplicated rather than factored out because Gitea Actions doesn't support reusable composite shell steps cleanly.
|
||||
- **Shell scripts use `set -euo pipefail`** — both entrypoints are strict. Errors in volume chown or SSH permission operations are intentionally suppressed with `|| true`.
|
||||
- **MemPalace install path** — installed via `uv tool install` into `/opt/uv-tools/mempalace/`. Both the `mempalace` CLI and the `mempalace-mcp` MCP server binary are shipped as entry points by the mempalace package itself and placed on PATH by uv as shims whose shebangs point at the venv's Python. No hand-rolled wrapper is needed. Do not use `pip install --break-system-packages` — that was the previous approach and has been removed. Do not use `["python3", "-m", "mempalace.mcp_server"]` in `opencode.jsonc` — system Python can't import from the uv venv.
|
||||
- **generate-config.py idempotency** — the script MUST never overwrite an existing `opencode.jsonc` or legacy `opencode.json`. Config persists in the `devbox-opencode-config` named volume; accidentally clobbering that file would destroy hand-edits. The smoke test asserts this.
|
||||
|
||||
+229
@@ -8,6 +8,235 @@ Tags follow `v{opencode_version}[letter]` — bare tag for the first build on a
|
||||
|
||||
## Unreleased
|
||||
|
||||
_(no changes since v1.15.11c)_
|
||||
|
||||
## v1.15.11c — 2026-05-28
|
||||
|
||||
**Re-cut of v1.15.12 to fix a versioning-scheme violation.** The morning's v1.15.12 release was tagged in error: `opencode-ai` stayed at `1.15.11` upstream (no 1.15.12 exists on npm), so per the project's `v{opencode_version}[letter]` scheme this should have been the third container build on opencode 1.15.11 — `v1.15.11c` — not a new minor version bump. The `v1.15.12` git tag and the eight `v1.15.12*` / `latest*` Docker Hub images remain as historical artifacts but are superseded by this release. Future builds on opencode 1.15.11 continue the letter sequence as `v1.15.11d`, `v1.15.11e`, … — v1.15.12 will only be reused if and when opencode upstream actually releases 1.15.12.
|
||||
|
||||
Content inherited from v1.15.12 (see that block below for the full diagnostic chain on the v4.0.0 pin disproof and the manual host-side publish):
|
||||
|
||||
- pi `0.75.5` → `0.76.0`.
|
||||
- `setup-buildx-action` pin reverted from `@v4.0.0` back to `@v4` (the v1.15.11b regression hypothesis was disproven).
|
||||
- Inheritance from base: SSH ControlMaster on a writable socket path, gitleaks, git-crypt.
|
||||
- Cache-hit silent same-bytes regression fix carried forward from v0.75.5b's pattern.
|
||||
|
||||
Additional changes since v1.15.12 (afternoon 2026-05-28 followup work):
|
||||
|
||||
### Hub-push regression — root cause identified, CI fixed
|
||||
|
||||
The `400 Bad request` from `registry-1.docker.io` that broke CI publishing across runs #332/333/334/336 (and forced v1.15.12 to ship via manual host-side push) is **buildkit's registry cache-export with `mode=max`**, not the image push itself.
|
||||
|
||||
**Diagnostic that nailed it:** the manual v1.15.12 publish from an Orbstack host reproduced the exact same 400 — but only on the cache-export step. Image layers pushed cleanly (911s for the base, all variants succeeded). Dropping `--cache-to` from the manual script let the publish complete. Running the same buildx version against the same Hub account from the same network, the only differential was cache export vs. image export.
|
||||
|
||||
This explains every observation:
|
||||
|
||||
- Failure shape stable across attempts (`Offset:0`, HTML body, CDN-tier rejection): cache-export protocol-level mismatch, not transient network or per-blob corruption.
|
||||
- Repo-specific (`joakimp/opencode-devbox` only): we're the only Hub repo currently writing a `:base-buildcache` tag with `mode=max`.
|
||||
- Started ~2026-05-23: lines up with buildx 0.34.x rolling out and bundling moby/buildkit v0.30.0, which changed the `_state` token format on resumable cache uploads.
|
||||
- Image push works fine: cache-export is a separate codepath using a different manifest/layer scheme.
|
||||
- Action-pin to `setup-buildx-action@v4.0.0` didn't help: that pin pulls older actions-toolkit, but the bundled buildkit was still 0.34.x via Buildx CLI on the runner image. Pin was correctly disproven by run #336.
|
||||
|
||||
### Workflow change
|
||||
|
||||
- **`.gitea/workflows/docker-publish-split.yml`** — registry cache (`cache-from`/`cache-to`) removed from the `build-base` step. Comment in place documenting the regression and the re-enable condition. Variants don't use registry cache so they're untouched. The base tag is content-addressed (`base-<hash>` derived from Dockerfile.base + rootfs/* + entrypoint*.sh) so unchanged bases short-circuit at the Hub-probe step in `base-decide` and never re-build anyway — the lost cache only affects the rare case of a Dockerfile.base change, where we now pay the full ~3 min build instead of pulling cached layers. Acceptable trade-off vs. broken publishes.
|
||||
|
||||
Next tag push (e.g. v1.15.13) is expected to publish cleanly via Gitea CI again. validate.yml on this main push will be the first real-time test of the smoke side; full publish path will be tested on the next opencode bump or by a deliberate letter-suffix re-tag.
|
||||
|
||||
### Status of earlier suspects
|
||||
|
||||
- ~~`setup-buildx-action@v4.1.0`~~ — disproven by v1.15.11b CI run #336 with v4.0.0 pin failing identically. Pin reverted in v1.15.12. Not the regressor.
|
||||
- ~~`@docker/actions-toolkit 0.79.0 → 0.90.0`~~ — rolled back via the action pin; same failure. Not the regressor.
|
||||
- ~~Account / repo / Hub-CDN globally~~ — local pushes from developer host succeed. Always was healthy.
|
||||
- ~~`catthehacker/ubuntu:act-latest`~~ / ~~act-runner egress~~ — manual publish from host reproduced the same 400, ruling out runner-side network. Not the cause.
|
||||
- **Confirmed:** buildkit cache-export protocol (mode=max) hitting Hub-CDN edge rejection. Workaround: don't export cache to registry. Long-term: track moby/buildkit upstream for protocol fix or switch to GHA cache (not portable to Gitea Actions).
|
||||
|
||||
### Docs: manual host-publish runbook + script archive
|
||||
|
||||
- `docs/manual-host-publish.sh` — the literal script that shipped v1.15.12 from a developer Mac via Orbstack, preserved as-is.
|
||||
- `docs/manual-host-publish.md` — runbook explaining when to reach for the escape hatch, the four constants to edit (`RELEASE_TAG`, `BASE_HASH`, `PI_VERSION`, `OMOS_VERSION`), three sources for `BASE_HASH` (CI's `base-decide` log = canonical, Hub `base-latest` probe, local recompute matching CI's exact recipe including `__pycache__`/`.DS_Store`/`._*` junk filters), and adaptations for pi-devbox / letter-suffix rebuilds / partial-failure single-variant recovery.
|
||||
- `AGENTS.md` — new Critical conventions bullet documenting that `cache-from`/`cache-to` is currently disabled, why, and the re-enable condition.
|
||||
|
||||
### CI: workflow-level retry around `docker buildx build --push`
|
||||
|
||||
All five push steps in `.gitea/workflows/docker-publish-split.yml` (1 base + 4 variants) are now wrapped in a 3-attempt retry loop with backoff (15s, 30s) as belt-and-braces against transient `registry-1.docker.io` blips. Replaces the `docker/build-push-action@v7` invocations with `shell: bash` steps that run `docker buildx build --push` directly so the loop is visible and tweakable. Smoke-test build steps (`load: true`, no push) are unchanged — they don't suffer from registry-side flakiness.
|
||||
|
||||
Does **not** mask deterministic failures: a true regression (e.g. the cache-export 400 documented above) will fail all 3 attempts identically and the job still fails by design. Belt-and-braces with the workflow-level retry-on-failure rerun heuristic in the `ci-release-watcher` skill, which catches transient-shaped runner-side failures separately. No image-side change.
|
||||
|
||||
### AGENTS.md addition: pre-flight scheme check
|
||||
|
||||
New "Versioning scheme" subsection documenting the **mandatory `npm view opencode-ai version` pre-flight check** before cutting any non-letter-suffixed tag, with this slip cited as the cautionary example.
|
||||
|
||||
---
|
||||
|
||||
## v1.15.12 — 2026-05-28
|
||||
|
||||
> **Note (2026-05-28 PM):** this tag violates the project's `v{opencode_version}[letter]` versioning scheme — there is no `opencode-ai@1.15.12` on npm; OPENCODE_VERSION stayed at 1.15.11 across this build. Re-cut as `v1.15.11c` at HEAD per the scheme. The git tag and Hub images for `v1.15.12*` remain as historical artifacts but are superseded by `v1.15.11c`. See the `v1.15.11c` block above for the corrected release notes.
|
||||
|
||||
Manual-published release. Reverts the `setup-buildx-action@v4.0.0` pin from v1.15.11b (hypothesis was disproven — see below) and bumps the bundled `pi-coding-agent` to 0.76.0 via the floating `PI_VERSION=latest` resolution.
|
||||
|
||||
### Why "manual-published"
|
||||
|
||||
v1.15.11b reproduced the exact same Hub `400 Bad request` regression as v1.15.11 (CI run #336, build-base failed twice including a Gitea auto-rerun), confirming `setup-buildx-action@v4.1.0` is **not** the regressor. After four consecutive identical CI failures across two days, the SSH-CM and gitleaks fixes were shipped by hand from a developer host's Orbstack/Docker-Desktop — a path we already knew worked in ~25s for the same multi-arch build to the same Hub account.
|
||||
|
||||
This release ships the same content the runner-side build would have shipped; it just bypasses the broken runner-network → Hub-CDN combo. CI auto-publishing remains broken pending separate runner-side investigation (see [AGENTS.md — known issues](AGENTS.md)).
|
||||
|
||||
### Workflow change
|
||||
|
||||
- **`.gitea/workflows/docker-publish-split.yml`** — all nine `setup-buildx-action@v4.0.0` pins reverted to `@v4`. The pin added no value (failure reproduced) and was holding us off action improvements.
|
||||
|
||||
### Bumped: pi-coding-agent (latest → 0.76.0)
|
||||
|
||||
`PI_VERSION=latest` in `Dockerfile.variant` resolves at build time. 0.76.0 was published 2026-05-27 20:03 UTC. No Dockerfile edit needed; floating-`latest` is intentional so each opencode-devbox release pulls the freshest pi without a manual bump.
|
||||
|
||||
### Hub-push regression — ruled out / still suspect
|
||||
|
||||
**Ruled out:**
|
||||
- `setup-buildx-action@v4.1.0` — v4.0.0 reproduces the failure identically.
|
||||
- `@docker/actions-toolkit 0.79.0 → 0.90.0` — rolled back via the action pin; same failure.
|
||||
- Account / repo / Hub-CDN globally — local pushes from a developer host succeed.
|
||||
- Multi-arch as such — pi-devbox v0.75.5b pushed multi-arch on 2026-05-23.
|
||||
|
||||
**Still suspect:**
|
||||
- `catthehacker/ubuntu:act-latest` runner image (floating, not pinned in workflows).
|
||||
- act-runner host network egress from `runner-2` (sustained CDN-edge rejection from this specific source IP).
|
||||
- buildx 0.34.x's signed `_state` token format hitting a Hub-edge WAF/length rule that didn't apply to 0.33.x.
|
||||
- Hub-side per-repo state for `joakimp/opencode-devbox` specifically (other Hub repos from the same account work).
|
||||
|
||||
Four failing runs share the exact failure shape: HTTP 400 with HTML body (CDN-tier, not registry backend) on the very first PUT (`Offset:0`) of the resumable layer-blob upload. UUIDs and `_state` signatures differ across attempts — only the failure pattern is stable.
|
||||
|
||||
---
|
||||
|
||||
## v1.15.11b — 2026-05-27
|
||||
|
||||
Container-level rebuild of v1.15.11. The original v1.15.11 release-day publish failed three times in a row (CI runs #332/333/334) with identical `400 Bad request` responses from `registry-1.docker.io` on the buildx layer-blob PUT. Build itself succeeded 30/30 each time; only the multi-arch push failed. Triaged on 2026-05-27 evening:
|
||||
|
||||
- **Local multi-arch buildx push from a developer host succeeds in ~25s** — same Hub account, same multi-arch path. Account, repo, and Hub-CDN are all healthy.
|
||||
- **Last known-good Gitea Actions Hub push: 2026-05-23 ~20:26 UTC** (`pi-devbox v0.75.5b`). All Gitea-runner-driven pushes since 2026-05-24 have failed identically.
|
||||
- **Smoking gun candidate:** `docker/setup-buildx-action@v4` floats to `v4.1.0` (published 2026-05-22 16:00 UTC). Action-resolver caches on the runner appear to have rolled forward to v4.1.0 sometime between the May 23 success and the first May 24 failure. v4.1.0 ships a newer bundled buildx/buildkit which may be using a different push protocol that trips Hub's CDN URI-length cap (the failing `_state` query string is ~1.4 KB).
|
||||
|
||||
### Workflow change
|
||||
|
||||
- **`.gitea/workflows/docker-publish-split.yml`** — all nine `docker/setup-buildx-action@v4` uses pinned to `@v4.0.0`. `setup-qemu-action@v3` left floating since QEMU wasn't in the suspected blast radius and was working on May 23. If v4.0.0 publishes cleanly we keep the pin and file an upstream buildkit/buildx issue documenting the regression.
|
||||
|
||||
No other source changes — same `OPENCODE_VERSION=1.15.11`, same `Dockerfile.base` and `Dockerfile.variant`, same SSH-CM bake, same gitleaks. v1.15.11 (the original tag) is preserved in the repo as a historical marker of the first publish attempt; v1.15.11b is the canonical release.
|
||||
|
||||
### v1.15.11
|
||||
|
||||
First release on opencode 1.15.11. Also bakes in four devbox-side fixes accumulated since v1.15.10 (SSH ControlMaster on a writable path, gitleaks added to base, CI resolve-versions hardening, CI cache-hit regression fix). Downstream pi-devbox inherits all of these on its next build against `base-latest`.
|
||||
|
||||
### Bumped: opencode 1.15.10 → 1.15.11
|
||||
|
||||
`OPENCODE_VERSION` ARG bumped in `Dockerfile.variant`. Highlights from the upstream release (full notes: <https://github.com/anomalyco/opencode/releases/tag/v1.15.11>):
|
||||
|
||||
- **Core / Improvements** — new `headerTimeout` config for provider requests (10s default for default OpenAI setups); experimental background agents now push updates without polling; remote-backed projects resolve a stable project identity; `modalities.input` / `modalities.output` can be set independently.
|
||||
- **Core / Bugfixes** — dynamically added MCP servers now disconnect cleanly on removal; Google tool calling fixed after upstream tool-ID regression; resumed sessions no longer continue orphaned interrupted tools; OpenAI reasoning summaries render as separate blocks; the `shell` tool now advertises its configured timeout to the model; config loading falls back cleanly when user info is unavailable.
|
||||
- **TUI** — prompt resizes with terminal width (new prompt-size config); accelerated diff-viewer scrolling; external editors open from the worktree directory when available.
|
||||
- **Desktop** — refined v2 home screen, prompt, status popover, and session controls; fixed V2 titlebar errors when a session sync cache was deleted; web deployments no longer run desktop health checks; duplicate server connections are merged.
|
||||
- **Extensions** — new `dispose` hook for plugins; Codex plugin now sends the expected session-ID header.
|
||||
|
||||
No `opencode-devbox`-side changes were required to consume 1.15.11 — pure version bump.
|
||||
|
||||
### Base: SSH ControlMaster default on a writable socket path
|
||||
|
||||
Devboxes typically mount `~/.ssh` from the host as **read-only** (security: keys remain readable but agents can't tamper with config / known_hosts / authorized_keys / plant a malicious ProxyCommand). OpenSSH's default `ControlPath` lands inside `~/.ssh/cm/`, which is unwritable on such mounts — so any attempt to use `ControlMaster auto` (or anything that wants to multiplex) fails with:
|
||||
|
||||
```
|
||||
unix_listener: cannot bind to path /home/.../.ssh/cm/...: Read-only file system
|
||||
kex_exchange_identification: Connection closed by remote host
|
||||
```
|
||||
|
||||
The second line is downstream: when ControlMaster fails the ssh client falls back to a fresh TCP connection, and on residential CGNAT (most European ISPs) the per-(src,dst) concurrent-flow cap (~4) silently drops further SYNs once exceeded — manifesting as banner-exchange timeouts that look like a remote problem.
|
||||
|
||||
- **`Dockerfile.base`** — new section right after the apt block bakes `/etc/ssh/ssh_config.d/00-devbox-controlmaster.conf` with `Host *` defaults: `ControlMaster auto`, `ControlPath /tmp/sshcm/%r@%h:%p`, `ControlPersist 10m`, plus `ServerAliveInterval 30` / `ServerAliveCountMax 6` for resilience to mid-stream NAT timeouts. `/tmp` is per-container and always writable, so the read-only `~/.ssh` mount is left untouched. Debian's stock `/etc/ssh/ssh_config` includes `ssh_config.d/*.conf` *before* its own `Host *` block, so user `~/.ssh/config` overrides still win.
|
||||
- **`entrypoint-user.sh`** — creates `/tmp/sshcm` mode 700 on every container start. `/tmp` is per-container so the dir doesn't survive recreation; baking it into a Dockerfile layer would be wrong. Mode 700 is required — OpenSSH refuses to use a `ControlPath` directory others can write to.
|
||||
- **`scripts/smoke-test.sh`** — two new assertions: (a) the conf file exists at the expected path; (b) `ssh -G example.invalid` reports a `controlpath` rooted at `/tmp/sshcm/`. The second catches the silent regression where something later in the SSH config chain shadows the bake-in.
|
||||
- **No size/threshold impact:** the conf file is ~250 bytes.
|
||||
|
||||
Downstream pi-devbox and any other variant inherits this on its next build against `base-latest`. Discovered while running a recon-shell from inside pi-devbox to a Proxmox node — fresh ssh hit banner timeout, debug output pointed at the read-only socket dir.
|
||||
|
||||
_(Originally landed on `main` 2026-05-24 as commit `668592d`; first ships in v1.15.11.)_
|
||||
|
||||
### Base: gitleaks added; git-crypt confirmed already installed
|
||||
|
||||
`gitleaks` is now baked into `Dockerfile.base` (Go-compiled binary fetched from GitHub releases, same `/releases/latest` redirect-resolution pattern as gosu/fzf/git-lfs/etc.). It pairs with `git-crypt`, which has been installed via apt all along but wasn't asserted by smoke or called out in user-facing docs. Several of the user's repos use both as part of their secret-management setup (gitleaks pre-commit hook + git-crypt for selectively-encrypted canonical config); having them in the devbox means `pi install`-style hooks fire correctly inside the container instead of warning that gitleaks is missing.
|
||||
|
||||
- **`Dockerfile.base`** — new `GITLEAKS_VERSION=latest` ARG + install RUN block right after `git-lfs`. Arch suffix is `x64` (not `x86_64` or `amd64`) on this project; comment in the Dockerfile flags the deviation. Adds ~21 MB to the base layer.
|
||||
- **`scripts/smoke-test.sh`** — adds `git-crypt` and `gitleaks` to the "Resolved component versions" table and to the "Core binaries" assertion list. Now fails fast if either binary disappears from the base.
|
||||
- **`README.md`** — "What's in the image" tree updated to name `gitleaks` alongside `git-crypt` in the dev-tools line.
|
||||
- **No threshold bumps:** 21 MB on a 2500–3700 MB envelope is noise; existing variant thresholds keep their headroom.
|
||||
|
||||
This is a base-layer change — `base-decide` will compute a fresh `base-<hash>`, `build-base` will run on the next release (no cache hit), and all four variants will rebuild against the new base. **Downstream pi-devbox** picks up gitleaks automatically on its next release that resolves `joakimp/opencode-devbox:base-latest` to the new digest — no Dockerfile change needed there.
|
||||
|
||||
### CI: preventative fix for PI_VERSION/OMOS_VERSION cache-hit silent regression
|
||||
|
||||
Mirrors the pi-devbox v0.75.5b fix (2026-05-23) onto the four-variant pipeline here. The `with-pi`, `omos`, and `omos-with-pi` variants all install upstream npm packages (`@earendil-works/pi-coding-agent`, `oh-my-opencode-slim`) whose `*_VERSION` build-args defaulted to `latest`. When the build-arg string is byte-identical across builds, the resulting layer-hash is identical, and the registry buildcache (`base-buildcache` / variant cache-from chain) silently reuses the layer from whatever upstream version was current when the cache was first populated — the same mechanism that caused pi-devbox v0.74.0 through v0.75.5 to ship the same image bytes.
|
||||
|
||||
Currently masked here because `OPENCODE_VERSION` is a hard-coded ARG that bumps every release — changing a parent layer invalidates the downstream cache key for the pi/omos install layers. Masking would fail the moment we cut a `vN.N.Nb` opencode-version-unchanged release that only bumps pi or omos. Filed as a parked followup that bedtime; fixing it preventatively now.
|
||||
|
||||
- **`.gitea/workflows/docker-publish-split.yml`** — new `resolve-versions` job runs `npm view @earendil-works/pi-coding-agent version` and `npm view oh-my-opencode-slim version`, exposing concrete strings as job outputs. All six affected jobs (`smoke-omos`, `smoke-with-pi`, `smoke-omos-with-pi`, `build-variant-omos`, `build-variant-with-pi`, `build-variant-omos-with-pi`) now `needs:` it and pass the concrete versions as `PI_VERSION` / `OMOS_VERSION` build-args. `smoke-base` and `build-variant-base` are unaffected (no pi or omos).
|
||||
- **`scripts/smoke-test.sh`** — new `run_expect` helper asserts an expected substring in command output. The pi-version check uses `EXPECTED_PI_VERSION` when set; the omos check uses `EXPECTED_OMOS_VERSION` against `npm ls -g`. Both env vars are wired from `resolve-versions` outputs in the smoke jobs. Catches the regression on the next release rather than four releases later.
|
||||
- **`Dockerfile.variant`** — comment block above each affected `ARG` (`OPENCODE_VERSION`, `PI_VERSION`, `OMOS_VERSION`) documenting the cache-hit footgun + which ones are CI-resolved vs source-pinned.
|
||||
- **`AGENTS.md`** — new convention bullet explaining the cache-hit class of bug and naming the resolve-versions job + EXPECTED_*_VERSION wiring as the contract to keep in lockstep.
|
||||
|
||||
No image-content change expected on the next release vs what `latest` would have resolved to anyway — this is purely about making sure the cache invalidates correctly going forward.
|
||||
|
||||
## v1.15.10 — 2026-05-23
|
||||
|
||||
opencode 1.15.6 → 1.15.10 bump (four upstream patch releases over two days). Plus implicit pi 0.75.4 → 0.75.5 in the `with-pi` and `omos-with-pi` variants since `PI_VERSION=latest` resolves at build time.
|
||||
|
||||
No image-content changes beyond the version bumps; cache hit expected on `base-35ee5fe7861a` (no `Dockerfile.base` or `rootfs/` edits since v1.14.50b).
|
||||
|
||||
### Notable upstream opencode changes
|
||||
|
||||
Sourced from <https://github.com/anomalyco/opencode/releases> (the upstream this devbox tracks).
|
||||
|
||||
**v1.15.7** — Grok OAuth (SuperGrok) sign-in including device-code login (@Jaaneek). V2 session APIs gain safe error responses with reference IDs (UnknownError, SessionNotFoundError, ServiceUnavailableError) so generic 500s no longer leak config details. Codex OAuth refreshes deduped to avoid repeated refresh failures (@cooper-oai). Native OpenAI OAuth requests restored. Tool schema failures now surface as friendly tool errors. PDF attachment support for Grok. Restored OpenAI reasoning streams. TUI: clearer collapsed-thinking punctuation, new sessions default to local project, single-select question checkmarks no longer collide with labels. Desktop: pinch zoom, new home view + session entry flow + titlebar, log export.
|
||||
|
||||
**v1.15.8** — Upstream release body empty; assumed internal/no user-visible changes.
|
||||
|
||||
**v1.15.9** — Redesigned diff viewer with file tree, **enabled by default**. MCP OAuth configs can set callback port and include configured scopes in client metadata (@sebin). Vertex Anthropic provider uses working `.rep.googleapis.com` endpoints for US/EU multi-region (@JPFrancoia). Many "show clearer error" improvements (default model invalid, missing PTY session, skill invocation failure, installation upgrade failure, project not found via HTTP API, MCP server not found, session busy). Native reasoning continuation metadata preserved across turns. TUI: copy worktree path from command palette, refined diff viewer shortcuts, spinner color aligned with active agent (@OpeOginni). Desktop: tab navigation in titlebar, session status in titlebar, multi-colon callback URL fix (@OpeOginni), debounced VCS refreshes.
|
||||
|
||||
**v1.15.10** — Single fix: restored the legacy production desktop flows for opening projects and starting sessions.
|
||||
|
||||
### Devbox-side notes
|
||||
|
||||
- **Bump:** opencode 1.15.6 → 1.15.10 (`OPENCODE_VERSION` in `Dockerfile.variant`).
|
||||
- **Implicit pi bump:** `with-pi` and `omos-with-pi` variants pick up pi 0.75.5 (one patch release with cleaner read-tool cards, async file tools, more reliable package updates, Bedrock token cap fix, etc.). See [pi-devbox v0.75.5 CHANGELOG](https://gitea.jordbo.se/joakimp/pi-devbox/src/branch/main/CHANGELOG.md) for the full list.
|
||||
- **Smoke threshold check:** `omos-with-pi` threshold remains at 3700 MB (set v1.15.4b 2026-05-18). Four opencode patches plus one pi patch typically add only a few MB across both; not expected to trip. If it does, recovery is the well-worn letter-suffix pattern (v1.15.10b with threshold bump).
|
||||
- Built on the same CI path as v1.15.6 (pinned-crane install on real-base-rebuild, skip-promote-on-cache-hit, update-description-always-on-base-success) — all expected to remain quiet on this cache-hit run.
|
||||
|
||||
### Note on this CHANGELOG vs the v1.15.10 tag snapshot
|
||||
|
||||
The v1.15.10 tag itself was pushed before the upstream release notes were located (originally I checked `sst/opencode` which is a fork; the canonical upstream is `anomalyco/opencode`). The image content under the tag is correct, but the CHANGELOG snapshot at the tag was thinner. This expanded version is on `main` going forward; the tag's snapshot will not be retroactively rewritten.
|
||||
|
||||
## v1.15.6 — 2026-05-21
|
||||
|
||||
opencode 1.15.4 → 1.15.6 bump (two upstream patch releases) plus two workflow improvements that landed on `main` between v1.15.4b and now. No image-content changes beyond the version bump; cache hit expected on `base-35ee5fe7861a` (no `Dockerfile.base` or `rootfs/` edits).
|
||||
|
||||
- **Bump:** opencode 1.15.4 → 1.15.6 (`OPENCODE_VERSION` in `Dockerfile.variant`). The `with-pi` and `omos-with-pi` variants will also implicitly pick up pi 0.75.3 → 0.75.4 since `PI_VERSION=latest` resolves at build time.
|
||||
- **CI: defensive `__pycache__` and macOS-metadata filter in `base-decide` hash compute.** `find rootfs -type f` previously included gitignored junk like `rootfs/__pycache__/*.pyc`, `.DS_Store`, and `._AppleDouble` files — which CI's clean checkout never sees. This bit us during v1.15.4 debugging when a stale `generate-config.cpython-314.pyc` on the local rootfs/ produced `base-3605aa6b6ab1` while CI computed `base-35ee5fe7861a`. The filter is a no-op on a clean tree (verified to still produce `35ee5fe7861a` post-filter), but defends against future stale-pyc / Finder-touched-rootfs hash mismatches. `.gitea/README.md` updated in lockstep. (commit `b6e4d89`)
|
||||
- **AGENTS.md: documentation drift sweep as explicit pre-commit workflow step.** Codifies the rule that non-release commits must also grep docs for stale claims about behaviour they change, with concrete repo-specific drift hotspots. Companion clause added across the wider repo set (cloud-init, ansible, pi-devbox, pi-extensions, pi-toolkit, cli_utils, proxmox) the same day. (commit `90e5a1f`)
|
||||
- **First release that exercises both the pinned-crane install (T14, v1.15.3) and the skip-promote-on-cache-hit guard (T15, v1.15.4) on this CI run path** — still cache-hit on base, so `promote-base-latest` should remain skipped via T15 and the pinned crane install will only fire when a real base rebuild happens.
|
||||
|
||||
## v1.15.4b — 2026-05-18
|
||||
|
||||
Recovery release for v1.15.4 — the `omos-with-pi` variant landed at >3500 MB and tripped the smoke threshold, so `smoke-omos-with-pi` and `build-variant-omos-with-pi` were skipped. The other three variants (base, omos, with-pi) published cleanly. Plus a latent workflow bug fix exposed by the partial publish.
|
||||
|
||||
- **Smoke threshold bump:** `omos-with-pi` 3500 → 3700 MB. Compounded growth: opencode 1.15.0 → 1.15.4 (4 patch versions) plus pi 0.74.0 → 0.75.3 (minor + 3 patches) both added a few MB each, and they sum in the omos-with-pi variant. Same pattern as previous threshold bumps (v1.14.31c, v1.15.0b); restores ~150 MB headroom.
|
||||
- **Workflow fix — `update-description` no longer skips on partial publish.** Pre-existing latent bug: `update-description.needs` includes all four `build-variant-*` jobs, and gitea Actions' default behavior is "skipped need ⇒ skip dependent". When `build-variant-omos-with-pi` got skipped (because its smoke failed), `update-description` cascaded into a skip even though the job's `if:` condition (`tag pushed`) was true. Result: Hub description wasn't refreshed on v1.15.4 despite three variants publishing. Fix: wrap the `if:` in `always() && needs.build-variant-base.result == 'success' && ...` so the job runs as long as the base variant published, regardless of what other variants did.
|
||||
- **Same fix applied to `promote-base-latest`** — had the identical latent bug. Currently masked by the cache-hit skip, but would have surfaced on a real-base-rebuild release with a single failed variant.
|
||||
- No image-side changes from v1.15.4. Cache hit on the same base hash (`base-35ee5fe7861a`).
|
||||
|
||||
## v1.15.4 — 2026-05-18
|
||||
|
||||
opencode 1.15.3 → 1.15.4 bump (one upstream patch release), bundled with the CI hardening that landed on main between v1.15.3 and now.
|
||||
|
||||
- **Bump:** opencode 1.15.3 → 1.15.4 (`OPENCODE_VERSION` in `Dockerfile.variant`).
|
||||
- **CI: pinned crane install in `promote-base-latest`.** Replaced `imjasonh/setup-crane@v0.4` with a direct `curl + tar` install pinned to crane v0.21.6. The action's bootstrap script calls `api.github.com/.../releases/latest` to discover what crane version to install. That call periodically rate-limits and produces `tag=null` → the action downloads `releases/download/null/...` → 404 → `gzip: unexpected end of file` → exit 2. We hit this on v1.15.3 (cosmetic failure since base-latest was already correct from cache hit). Pinned install removes the runtime GitHub API dependency entirely. Bump `CRANE_VERSION` deliberately when wanting updates, same pattern as the other GitHub-sourced binaries in the Dockerfile layer.
|
||||
- **CI: skip `promote-base-latest` on cache-hit base builds.** When the base layer hash hasn't changed (cache-hit on the existing `base-<hash>` from a previous run), `base-latest` already points at the correct digest, so the retag is a tautology. Job now skipped entirely when `needs.base-decide.outputs.need_build == 'false'`. Manual `workflow_dispatch` with `promote_latest: true` overrides the gate as an escape hatch for hand-recovery scenarios.
|
||||
- No image-side changes from the v1.15.3 baseline beyond the opencode npm version. Smoke thresholds unchanged.
|
||||
|
||||
## v1.15.3 — 2026-05-16
|
||||
|
||||
opencode 1.15.0 → 1.15.3 bump (three upstream patch releases).
|
||||
|
||||
@@ -71,6 +71,44 @@ RUN apt-get update && \
|
||||
&& apt-get clean \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
# ── SSH client defaults: ControlMaster on a writable socket path ──────
|
||||
# Why this exists: the devbox typically mounts ~/.ssh from the host as
|
||||
# read-only (security: keys are readable, but agents can't tamper with
|
||||
# config / known_hosts / authorized_keys / plant a malicious ProxyCommand).
|
||||
# OpenSSH's default ControlPath is ~/.ssh/cm/... which is unwritable on
|
||||
# such mounts, so any attempt to use ControlMaster fails. Symptoms:
|
||||
# unix_listener: cannot bind to path /home/.../.ssh/cm/...: Read-only file system
|
||||
# kex_exchange_identification: Connection closed by remote host
|
||||
# The latter manifests downstream of CGNAT per-destination flow caps
|
||||
# (~4 concurrent flows on most European residential ISPs) which silently
|
||||
# drop further SYNs once exceeded — making fresh ssh attempts fail with
|
||||
# banner-exchange timeouts that look like a remote problem.
|
||||
#
|
||||
# Fix: set a system-wide default ControlPath in /tmp (per-container,
|
||||
# tmpfs-friendly, always writable) so multiplexing Just Works without
|
||||
# touching the read-only ~/.ssh mount. Per-host overrides in user's
|
||||
# ~/.ssh/config still win — Debian's default /etc/ssh/ssh_config has
|
||||
# `Include /etc/ssh/ssh_config.d/*.conf` *before* the `Host *` block,
|
||||
# so user config can override these defaults if desired.
|
||||
#
|
||||
# ControlPersist=10m means the master socket sticks around 10 min after
|
||||
# the last session closes, so consecutive ssh calls in a workflow reuse
|
||||
# the same TCP flow. Companion entrypoint-user.sh creates /tmp/sshcm
|
||||
# (mode 700) on each container start.
|
||||
RUN mkdir -p /etc/ssh/ssh_config.d && \
|
||||
printf '%s\n' \
|
||||
'# Devbox-baked default. See Dockerfile.base "SSH client defaults".' \
|
||||
'# Override per-host in ~/.ssh/config if the master socket location' \
|
||||
'# needs to differ.' \
|
||||
'Host *' \
|
||||
' ControlMaster auto' \
|
||||
' ControlPath /tmp/sshcm/%r@%h:%p' \
|
||||
' ControlPersist 10m' \
|
||||
' ServerAliveInterval 30' \
|
||||
' ServerAliveCountMax 6' \
|
||||
> /etc/ssh/ssh_config.d/00-devbox-controlmaster.conf && \
|
||||
chmod 644 /etc/ssh/ssh_config.d/00-devbox-controlmaster.conf
|
||||
|
||||
# ── Go-compiled tools (install from GitHub to avoid CVEs in Debian's old Go builds)
|
||||
#
|
||||
# Version policy for the binaries below:
|
||||
@@ -126,6 +164,24 @@ RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "amd64" ;; arm64) echo "arm64" ;;
|
||||
git lfs install --system && \
|
||||
git-lfs --version
|
||||
|
||||
# gitleaks — secret scanner (used as a pre-commit hook in several of the
|
||||
# repos this devbox is meant to operate on; pairs with git-crypt below).
|
||||
# Distributed as a Go-compiled tarball; arch suffix is `x64` (not `x86_64`
|
||||
# or `amd64`) on this project — mind the deviation from the surrounding
|
||||
# tools' naming.
|
||||
ARG GITLEAKS_VERSION=latest
|
||||
RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "x64" ;; arm64) echo "arm64" ;; *) echo "x64" ;; esac) && \
|
||||
V="${GITLEAKS_VERSION}" && \
|
||||
if [ "$V" = "latest" ]; then \
|
||||
V=$(curl -sI --retry 5 --retry-delay 5 --retry-all-errors "https://github.com/gitleaks/gitleaks/releases/latest" | awk 'tolower($1)=="location:" { sub(/\r$/,"",$2); n=split($2,a,"/"); print a[n] }'); \
|
||||
fi && \
|
||||
V="${V#v}" && \
|
||||
[ -n "$V" ] && \
|
||||
echo "Installing gitleaks ${V}" && \
|
||||
curl -fsSL --retry 5 --retry-delay 5 --retry-all-errors "https://github.com/gitleaks/gitleaks/releases/download/v${V}/gitleaks_${V}_linux_${ARCH}.tar.gz" | tar -xz -C /usr/local/bin gitleaks && \
|
||||
chmod +x /usr/local/bin/gitleaks && \
|
||||
gitleaks version
|
||||
|
||||
# neovim — modern text editor
|
||||
ARG NVIM_VERSION=latest
|
||||
RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "x86_64" ;; arm64) echo "arm64" ;; *) echo "x86_64" ;; esac) && \
|
||||
|
||||
+21
-1
@@ -31,8 +31,12 @@ ARG TARGETARCH
|
||||
ARG USER_NAME=developer
|
||||
|
||||
# ── Install opencode via npm ─────────────────────────────────────────
|
||||
# OPENCODE_VERSION is intentionally pinned in this Dockerfile (not
|
||||
# 'latest'). It drives the release tag and gets bumped via a source
|
||||
# edit, so the cache-hit class of bug that bit pi-devbox v0.74.0..
|
||||
# v0.75.5 cannot apply here.
|
||||
ARG INSTALL_OPENCODE=true
|
||||
ARG OPENCODE_VERSION=1.15.3
|
||||
ARG OPENCODE_VERSION=1.15.11
|
||||
RUN if [ "${INSTALL_OPENCODE}" = "true" ]; then \
|
||||
NPM_CONFIG_PREFIX=/usr npm install -g opencode-ai@${OPENCODE_VERSION} && \
|
||||
opencode --version ; \
|
||||
@@ -42,6 +46,18 @@ RUN if [ "${INSTALL_OPENCODE}" = "true" ]; then \
|
||||
# pi-toolkit and pi-extensions are cloned into /opt/. entrypoint-user.sh
|
||||
# runs each repo's install.sh on container start so symlinks land under
|
||||
# ~/.pi/agent/ on the named volume.
|
||||
# PI_VERSION should be passed explicitly by CI as a concrete version
|
||||
# (resolved from `npm view @earendil-works/pi-coding-agent version`,
|
||||
# see .gitea/workflows/docker-publish-split.yml § resolve-versions).
|
||||
# The default `latest` is for local dev convenience only — it has a
|
||||
# known cache-hit footgun when used in registry-cached CI builds: the
|
||||
# resulting build-arg string is byte-identical across builds, the
|
||||
# layer-hash is identical, and the registry buildcache silently reuses
|
||||
# the layer from whatever pi version was current when the cache was
|
||||
# first populated. Currently masked here because OPENCODE_VERSION (a
|
||||
# parent layer) bumps every release; will manifest the moment a
|
||||
# vN.N.Nb opencode-version-unchanged release ships. See pi-devbox
|
||||
# v0.75.5b 2026-05-23 for the discovery + canonical fix.
|
||||
ARG INSTALL_PI=false
|
||||
ARG PI_VERSION=latest
|
||||
ARG PI_TOOLKIT_REF=main
|
||||
@@ -89,6 +105,10 @@ RUN if [ "${INSTALL_GO}" = "true" ]; then \
|
||||
|
||||
# ── Optional: oh-my-opencode-slim (multi-agent orchestration) ────────
|
||||
# Installs Bun runtime and the oh-my-opencode-slim npm package.
|
||||
# OMOS_VERSION shares the same cache-hit footgun as PI_VERSION when
|
||||
# left at the `latest` default in registry-cached CI builds. CI
|
||||
# resolves it via `npm view oh-my-opencode-slim version` and passes
|
||||
# the concrete value as a build-arg. See PI_VERSION block above.
|
||||
ARG INSTALL_OMOS=false
|
||||
ARG OMOS_VERSION=latest
|
||||
RUN if [ "${INSTALL_OMOS}" = "true" ]; then \
|
||||
|
||||
@@ -762,7 +762,7 @@ Container (Debian trixie)
|
||||
├── oh-my-opencode-slim (optional — multi-agent orchestration plugin, includes Bun)
|
||||
├── AWS CLI v2 (SSO + Bedrock auth)
|
||||
├── neovim 0.12, tmux, htop, bat, eza, zoxide, uv, rustup, make, gcc, g++, rsync
|
||||
├── git, git-crypt, age, ssh, ripgrep, fd, fzf, jq, curl, tree
|
||||
├── git, git-crypt, age, gitleaks, ssh, ripgrep, fd, fzf, jq, curl, tree
|
||||
├── Node.js (for MCP servers)
|
||||
├── Bun (optional — included with oh-my-opencode-slim)
|
||||
├── entrypoint.sh (UID adjustment, git config, provider setup)
|
||||
|
||||
@@ -0,0 +1,127 @@
|
||||
# Manual host-side publish — escape hatch when CI is broken
|
||||
|
||||
This runbook is the procedure for publishing an opencode-devbox release **directly from a developer host** when the Gitea Actions → Docker Hub path is broken. Used in anger on 2026-05-28 to ship `v1.15.12` after five consecutive CI publish failures (runs #332/333/334/336 + a rerun) and as a parallel diagnostic that pinpointed the root cause (buildkit `cache-export mode=max` returning HTTP 400 from the Hub CDN).
|
||||
|
||||
The procedure is also a **diagnostic probe**. If the host-side publish succeeds where CI fails, the failure is somewhere in the runner → Hub path (cache-export, runner egress, runner-image, action versions). If host-side fails the same way, the failure is in your local buildx + Hub combination and you need a different escape (different network, different account, file an upstream).
|
||||
|
||||
## When to reach for this
|
||||
|
||||
- Tag pushed, CI keeps failing on `docker buildx build --push`, the failure shape is stable across reruns.
|
||||
- Failure body looks like a registry-tier rejection (HTTP 4xx, HTML response body, repeats on every retry) — i.e. not a transient.
|
||||
- You've already disproved the obvious suspects (action pin, runner image, network) per the [`ci-release-watcher` skill](../../../.agents/skills/ci-release-watcher/SKILL.md) playbook.
|
||||
- You need the release **shipped today** and don't want to wait for a CI fix to land + re-trigger.
|
||||
|
||||
If CI is broken because **a workflow change you just made is bad**, fix the workflow and re-tag with a letter suffix. This runbook is for when the workflow looks correct but the publish path itself is broken.
|
||||
|
||||
## Prerequisites on the host
|
||||
|
||||
- Docker (or Orbstack on macOS) with `docker buildx` available — multi-arch publish needs `setup-qemu` equivalent. Orbstack ships QEMU emulators for both archs by default; on Linux install `qemu-user-static` and run `docker run --privileged --rm tonistiigi/binfmt --install all` once per host.
|
||||
- `docker login` credentials for `joakimp` on Docker Hub (PAT or password). Confirm with `docker info | grep Username`.
|
||||
- A clone of `opencode-devbox` checked out at the **exact tag** you want to publish. `git status` clean. `git describe --tags --exact-match HEAD` should print the tag.
|
||||
- Network connectivity to `registry-1.docker.io` from the host. Verify with `curl -sI https://registry-1.docker.io/v2/ | head -1` (expects `401 Unauthorized` — that's the v2 API saying "auth required", which means you can reach it).
|
||||
|
||||
## How to use this runbook
|
||||
|
||||
A working reference script lives next to this doc: **[`docs/manual-host-publish.sh`](manual-host-publish.sh)**. It is the literal script that shipped opencode-devbox v1.15.12 on 2026-05-28 from a developer Mac via Orbstack, with the BASE_HASH and version pins of that release. To publish a different release, **copy it to a new file, edit four constants at the top, and run it**:
|
||||
|
||||
```bash
|
||||
cp docs/manual-host-publish.sh /tmp/manual-publish-vX.Y.Z.sh
|
||||
# Edit at top of file:
|
||||
# RELEASE_TAG="vX.Y.Z"
|
||||
# BASE_HASH="<12-char hash from CI's base-decide step>"
|
||||
# PI_VERSION="<from npm registry, see step 2 below>"
|
||||
# OMOS_VERSION="<from npm registry, see step 2 below>"
|
||||
bash /tmp/manual-publish-vX.Y.Z.sh
|
||||
```
|
||||
|
||||
Keep the historical script in `docs/` as-is — it's an archive of the v1.15.12 publish, useful as a reference if a future debug needs to compare exact arg sets across releases. Don't edit it in place.
|
||||
|
||||
The sections below explain what the script does and what you need to know to edit those four constants safely.
|
||||
|
||||
## 1. Pin RELEASE_TAG
|
||||
|
||||
The git tag you're publishing. Must match a tag in the local clone:
|
||||
|
||||
```bash
|
||||
git fetch && git checkout v1.15.13 # whatever you're publishing
|
||||
git describe --tags --exact-match HEAD
|
||||
```
|
||||
|
||||
The script asserts `HEAD == ${RELEASE_TAG}^{commit}` before doing anything destructive. If you've drifted, fix it with `git checkout` before running.
|
||||
|
||||
## 2. Pin PI_VERSION and OMOS_VERSION
|
||||
|
||||
Gitea CI's `resolve-versions` job queries the npm registry at workflow time and threads concrete versions through every variant build, mitigating the silent same-bytes-across-releases regression class documented in `AGENTS.md`. Do the same by hand:
|
||||
|
||||
```bash
|
||||
curl -sf https://registry.npmjs.org/@earendil-works%2Fpi-coding-agent/latest | jq -r .version
|
||||
curl -sf https://registry.npmjs.org/oh-my-opencode-slim/latest | jq -r .version
|
||||
```
|
||||
|
||||
Paste the two version strings into the script's `PI_VERSION` / `OMOS_VERSION` constants. Don't leave the script defaulting to `latest` — the registry buildcache will silently reuse a stale layer if the build-arg byte-equals a previous build.
|
||||
|
||||
## 3. Pin BASE_HASH
|
||||
|
||||
This is the 12-char hash that CI's `base-decide` job computes from `Dockerfile.base` + `rootfs/**` + `entrypoint*.sh`. Three ways to get it, in order of preference:
|
||||
|
||||
**A. From a prior CI run on the same commit** (cheapest — if the Gitea Actions run that triggered on this tag got far enough to log `base-decide`'s output, just read it):
|
||||
|
||||
```
|
||||
Gitea Actions → the run for vX.Y.Z → base-decide job → "Compute base tag" step → last line:
|
||||
Computed base tag: base-XXXXXXXXXXXX
|
||||
```
|
||||
|
||||
This is the canonical source. The whole reason for the manual escape is that *something later in CI broke* — `base-decide` itself is fast, deterministic, and almost always succeeds.
|
||||
|
||||
**B. From an existing image on the Hub** if a recent release already published a `base-<hash>` tag and the inputs haven't changed, you can copy that hash. Confirm with `docker manifest inspect joakimp/opencode-devbox:base-latest` and read the digest — if it matches a `base-<hash>` you already see on the Hub, that hash is yours.
|
||||
|
||||
**C. Compute it locally**, replicating CI's exact recipe (the script in `.gitea/workflows/docker-publish-split.yml` `base-decide.compute`):
|
||||
|
||||
```bash
|
||||
{
|
||||
cat Dockerfile.base
|
||||
find rootfs -type f \
|
||||
! -path '*/__pycache__/*' \
|
||||
! -name '*.pyc' \
|
||||
! -name '.DS_Store' \
|
||||
! -name '._*' \
|
||||
-print0 2>/dev/null | sort -z | xargs -0 cat 2>/dev/null
|
||||
cat entrypoint.sh entrypoint-user.sh
|
||||
} | sha256sum | cut -c1-12
|
||||
```
|
||||
|
||||
The junk-file filters (`__pycache__`, `.DS_Store`, `._*` AppleDouble) matter — they are gitignored but `find -type f` picks them up locally and would diverge your hash from CI's clean checkout. Don't skip them.
|
||||
|
||||
If method C disagrees with method A, **trust A** and find out why your local tree differs. The hash in CI is what's on the Hub; that's what variants must FROM.
|
||||
|
||||
## What the script does (high level)
|
||||
|
||||
After the constants are set, the script runs a 5-step procedure. No editing needed inside the body; the whole flow is parameterised by the four constants above plus `IMAGE` (which is fixed to `joakimp/opencode-devbox`).
|
||||
|
||||
1. **Preflight** — buildx present, tag exists, `HEAD == tag`, multi-arch builder created if missing.
|
||||
2. **Base build (conditional)** — probe `${IMAGE}:base-${BASE_HASH}` on the Hub; if missing, build it multi-arch and push. **No `--cache-from` / `--cache-to`.** That's the whole point of this escape. If the base push itself fails the same way CI did, stop — the regression has spread to image push and you need a different host or account, not this runbook.
|
||||
3. **Promote `base-latest`** — `docker buildx imagetools create` re-tags by manifest reference. No rebuild.
|
||||
4. **Variants × 4** — sequential (not parallel; one host's egress can't saturate four multi-arch pushes safely). Each variant is `Dockerfile.variant` `FROM ${IMAGE}:base-${BASE_HASH}` plus the appropriate `INSTALL_OMOS` / `INSTALL_PI` build-args, tagged `${RELEASE_TAG}${suffix}` and `latest${suffix}`.
|
||||
5. **Verify** — prints the digest of all 10 expected tags (8 variant + base-hash + base-latest). Spot-check that each `vX.Y.Z*` and its `latest*` alias share a digest.
|
||||
|
||||
Expected wall time on a recent Mac: ~25-40 min (base ~3 min if rebuilt, each variant ~3-7 min mostly QEMU arm64 emulation).
|
||||
|
||||
## Optional: update DOCKER_HUB.md description
|
||||
|
||||
CI's `update-description` job posts the rendered Hub description via the Hub API. The manual script does **not** do this — the release works fine without it. If you want parity, copy the curl invocation from the `update-description` job in `.gitea/workflows/docker-publish-split.yml` and run it from the host with a Hub PAT loaded into `HUB_PAT`. Cosmetic; can wait until CI is healthy and the next release pushes a fresh description automatically.
|
||||
|
||||
## After: capture diagnostic value
|
||||
|
||||
The whole point of running this manually is the diagnostic. Three things to record before moving on:
|
||||
|
||||
1. **Did the host publish succeed?** If yes and CI was failing on the same exact code, you've localised the failure to the runner side (cache-export, network, runner image). If no, the failure is in your local buildx + Hub combination and CI is a victim, not a cause.
|
||||
2. **What was different from CI?** Document at minimum: `docker buildx version`, the host's `buildx ls` output (driver name + version), whether you used `--cache-to` or not, and which network you were on.
|
||||
3. **File the upstream.** If the diagnostic narrowed the failure to a specific buildkit/buildx behaviour, file at `moby/buildkit` or `docker/buildx` with: stable failure shape, the exact request URL fragment (`Offset:0` / `_state=...` / digest if visible), the timeline boundary when failures started, and what worked vs what failed in your repro. The 2026-05-28 cache-export-mode=max regression is a worked example.
|
||||
|
||||
Restore CI as the primary publish path as soon as the underlying regression is fixed or worked around at workflow level. This runbook should be exercised rarely.
|
||||
|
||||
## Variants of this runbook
|
||||
|
||||
- **pi-devbox** — same idea, simpler: only one image (`joakimp/pi-devbox`), one tag pair (`vX.Y.Z` + `latest`), no split base. Adapt the script: drop the `BASE_HASH` constant + steps 2-3 + the variant function; replace with a single `docker buildx build --file Dockerfile --build-arg PI_VERSION=... --tag joakimp/pi-devbox:${RELEASE_TAG} --tag joakimp/pi-devbox:latest --push .`.
|
||||
- **opencode-devbox letter-suffix rebuild** (e.g. `v1.15.12b`) — same procedure end-to-end. The `BASE_HASH` will probably be unchanged from the prior release if no rootfs/entrypoint/Dockerfile.base changes shipped, so the base-build step skips itself automatically via the Hub probe.
|
||||
- **Single-variant publish** for partial-failure recovery (e.g. CI succeeded for base + 3 variants but the 4th failed) — comment out the three completed `build_variant` calls in your copy of the script. Keep `imagetools create` for `base-latest` only if it didn't already promote. Then re-run.
|
||||
Executable
+117
@@ -0,0 +1,117 @@
|
||||
#!/usr/bin/env bash
|
||||
# Manual publish of opencode-devbox v1.15.12 — bypasses broken Gitea-runner
|
||||
# Hub push by building & pushing from a developer host (Orbstack/Docker Desktop).
|
||||
#
|
||||
# Mirrors what .gitea/workflows/docker-publish-split.yml would do:
|
||||
# 1. Build & push Dockerfile.base → joakimp/opencode-devbox:base-<hash>
|
||||
# 2. Promote → joakimp/opencode-devbox:base-latest
|
||||
# 3. Build & push 4 variants on top of base-<hash>:
|
||||
# :v1.15.12 :latest (INSTALL_OPENCODE only)
|
||||
# :v1.15.12-omos :latest-omos (+ OMOS)
|
||||
# :v1.15.12-with-pi :latest-with-pi (+ pi)
|
||||
# :v1.15.12-omos-with-pi :latest-omos-with-pi (+ both)
|
||||
#
|
||||
# Usage on your host:
|
||||
# 1. Make sure Orbstack/Docker Desktop is running with multi-arch enabled
|
||||
# (docker buildx ls should show linux/amd64,linux/arm64).
|
||||
# 2. docker login docker.io (joakimp account)
|
||||
# 3. cd ~/path/to/opencode-devbox && git fetch && git checkout v1.15.12
|
||||
# 4. bash /path/to/this/script.sh
|
||||
#
|
||||
# Total expected time: ~25-40 min on a recent Mac (4 multi-arch builds, base
|
||||
# layers cache after the first variant).
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
IMAGE="joakimp/opencode-devbox"
|
||||
RELEASE_TAG="v1.15.12"
|
||||
BASE_HASH="8d72a9e44796" # sha256 of Dockerfile.base + rootfs/* + entrypoints (computed by CI logic)
|
||||
BASE_TAG="base-${BASE_HASH}"
|
||||
PI_VERSION="0.76.0" # resolved from npm @earendil-works/pi-coding-agent latest (2026-05-28)
|
||||
OMOS_VERSION="1.1.1" # resolved from npm oh-my-opencode-slim latest (2026-05-28)
|
||||
PLATFORMS="linux/amd64,linux/arm64"
|
||||
|
||||
# -------- preflight --------
|
||||
echo "==> Preflight"
|
||||
docker buildx version >/dev/null || { echo "buildx not available"; exit 1; }
|
||||
git rev-parse --verify "$RELEASE_TAG" >/dev/null 2>&1 || {
|
||||
echo "Tag $RELEASE_TAG not found locally. git fetch && git checkout $RELEASE_TAG first."; exit 1; }
|
||||
[[ "$(git rev-parse HEAD)" == "$(git rev-parse "${RELEASE_TAG}^{commit}")" ]] || {
|
||||
echo "HEAD is not at $RELEASE_TAG. git checkout $RELEASE_TAG first."; exit 1; }
|
||||
docker buildx inspect default >/dev/null 2>&1 || docker buildx create --use --name multi --driver docker-container
|
||||
|
||||
# Probe whether base-<hash> already exists on Hub (CI does this; saves 10 min if yes)
|
||||
if docker manifest inspect "${IMAGE}:${BASE_TAG}" >/dev/null 2>&1; then
|
||||
echo "==> Base tag ${IMAGE}:${BASE_TAG} already exists on Hub — skipping base rebuild"
|
||||
SKIP_BASE=1
|
||||
else
|
||||
echo "==> Base tag ${IMAGE}:${BASE_TAG} missing — will build"
|
||||
SKIP_BASE=0
|
||||
fi
|
||||
|
||||
# -------- 1. base (if needed) --------
|
||||
if [[ "$SKIP_BASE" == "0" ]]; then
|
||||
echo "==> [1/5] Build & push Dockerfile.base → ${IMAGE}:${BASE_TAG}"
|
||||
docker buildx build \
|
||||
--platform "$PLATFORMS" \
|
||||
-f Dockerfile.base \
|
||||
-t "${IMAGE}:${BASE_TAG}" \
|
||||
--push \
|
||||
.
|
||||
fi
|
||||
|
||||
# -------- 2. promote base-latest --------
|
||||
echo "==> [2/5] Promote ${IMAGE}:${BASE_TAG} → ${IMAGE}:base-latest"
|
||||
docker buildx imagetools create -t "${IMAGE}:base-latest" "${IMAGE}:${BASE_TAG}"
|
||||
|
||||
# -------- 3-5. variants --------
|
||||
build_variant() {
|
||||
local suffix="$1" # "" | "-omos" | "-with-pi" | "-omos-with-pi"
|
||||
local install_omos="$2"
|
||||
local install_pi="$3"
|
||||
local extra_args=()
|
||||
[[ "$install_pi" == "true" ]] && extra_args+=(--build-arg "PI_VERSION=${PI_VERSION}")
|
||||
[[ "$install_omos" == "true" ]] && extra_args+=(--build-arg "OMOS_VERSION=${OMOS_VERSION}")
|
||||
|
||||
local versioned="${IMAGE}:${RELEASE_TAG}${suffix}"
|
||||
local floating="${IMAGE}:latest${suffix}"
|
||||
|
||||
echo "==> Build & push variant${suffix:-(default)} → ${versioned} + ${floating}"
|
||||
docker buildx build \
|
||||
--platform "$PLATFORMS" \
|
||||
-f Dockerfile.variant \
|
||||
--build-arg "BASE_IMAGE=${IMAGE}:${BASE_TAG}" \
|
||||
--build-arg "INSTALL_OPENCODE=true" \
|
||||
--build-arg "INSTALL_OMOS=${install_omos}" \
|
||||
--build-arg "INSTALL_PI=${install_pi}" \
|
||||
${extra_args[@]+"${extra_args[@]}"} \
|
||||
-t "${versioned}" \
|
||||
-t "${floating}" \
|
||||
--push \
|
||||
.
|
||||
}
|
||||
|
||||
echo "==> [3/5] Variant: base (opencode only)"
|
||||
build_variant "" false false
|
||||
|
||||
echo "==> [4/5] Variant: omos"
|
||||
build_variant "-omos" true false
|
||||
|
||||
echo "==> [4/5] Variant: with-pi"
|
||||
build_variant "-with-pi" false true
|
||||
|
||||
echo "==> [5/5] Variant: omos-with-pi"
|
||||
build_variant "-omos-with-pi" true true
|
||||
|
||||
echo
|
||||
echo "==> Done. Verifying tags on Hub:"
|
||||
for t in \
|
||||
"${RELEASE_TAG}" "latest" \
|
||||
"${RELEASE_TAG}-omos" "latest-omos" \
|
||||
"${RELEASE_TAG}-with-pi" "latest-with-pi" \
|
||||
"${RELEASE_TAG}-omos-with-pi" "latest-omos-with-pi" \
|
||||
"${BASE_TAG}" "base-latest"
|
||||
do
|
||||
d=$(docker manifest inspect "${IMAGE}:${t}" 2>/dev/null | python3 -c "import json,sys,hashlib; m=json.load(sys.stdin); print(m.get('digest','-'))" 2>/dev/null || echo "MISSING")
|
||||
printf " %-32s %s\n" "$t" "$d"
|
||||
done
|
||||
@@ -1,6 +1,17 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# ── SSH ControlMaster socket dir ────────────────────────────────
|
||||
# Companion to /etc/ssh/ssh_config.d/00-devbox-controlmaster.conf in the
|
||||
# base image — that file declares ControlPath=/tmp/sshcm/%r@%h:%p; this
|
||||
# creates the directory with the right permissions on every container
|
||||
# start. /tmp is per-container so the dir doesn't survive recreation;
|
||||
# baking it into a Dockerfile layer would be wrong.
|
||||
# Mode 700 is required — OpenSSH refuses to use a ControlPath dir that
|
||||
# others can write to.
|
||||
mkdir -p /tmp/sshcm
|
||||
chmod 700 /tmp/sshcm
|
||||
|
||||
# ── Shell defaults: copy baked files from /etc/skel-devbox/ if absent
|
||||
# Respects host bind-mounts and user customizations — existing files
|
||||
# are never overwritten. To restore defaults: rm ~/.bash_aliases (or
|
||||
|
||||
+44
-1
@@ -43,6 +43,25 @@ run() {
|
||||
fi
|
||||
}
|
||||
|
||||
# Stricter version of `run` that also asserts an expected substring in
|
||||
# the command's stdout. Used to catch the "image bytes silently identical
|
||||
# to previous release" class of regression — Docker layer-cache hit on
|
||||
# a bare `npm install -g <pkg>` (or @latest) because the build-arg
|
||||
# string is identical across builds, even when 'latest' would have
|
||||
# resolved differently. Discovered in pi-devbox 2026-05-23 (every
|
||||
# release v0.74.0..v0.75.5 shipped the same image bytes); preventatively
|
||||
# applied here for PI_VERSION + OMOS_VERSION.
|
||||
run_expect() {
|
||||
local label="$1"; local cmd="$2"; local expect="$3"
|
||||
local out
|
||||
out=$(docker run --rm --entrypoint="" "$IMAGE" sh -c "$cmd" 2>&1) || true
|
||||
if echo "$out" | grep -Fq "$expect"; then
|
||||
pass "$label (got $expect)"
|
||||
else
|
||||
fail "$label — expected substring '$expect', got: $out"
|
||||
fi
|
||||
}
|
||||
|
||||
echo "=== Smoke test: $IMAGE (variant: $VARIANT) ==="
|
||||
echo
|
||||
echo "-- Resolved component versions --"
|
||||
@@ -68,6 +87,8 @@ docker run --rm --entrypoint="" "$IMAGE" sh -c '
|
||||
printf " %-15s %s\n" "rg" "$(rg --version | head -1)"
|
||||
printf " %-15s %s\n" "gosu" "$(gosu --version)"
|
||||
printf " %-15s %s\n" "git-lfs" "$(git-lfs --version)"
|
||||
printf " %-15s %s\n" "git-crypt" "$(git-crypt --version 2>&1 | head -1)"
|
||||
printf " %-15s %s\n" "gitleaks" "$(gitleaks version 2>&1 | head -1)"
|
||||
printf " %-15s %s\n" "gitea-mcp" "$(gitea-mcp --version 2>&1 | head -1)"
|
||||
printf " %-15s %s\n" "aws" "$(aws --version 2>&1)"
|
||||
if command -v bun >/dev/null 2>&1; then
|
||||
@@ -103,11 +124,20 @@ run "fzf" "fzf --version"
|
||||
run "fd" "fd --version"
|
||||
run "rg" "rg --version | head -1"
|
||||
run "jq" "jq --version"
|
||||
run "git-crypt" "git-crypt --version | head -1"
|
||||
run "gitleaks" "gitleaks version"
|
||||
run "aws" "aws --version"
|
||||
run "gitea-mcp" "gitea-mcp --version"
|
||||
run "gosu" "gosu --version"
|
||||
run "tmux" "tmux -V"
|
||||
|
||||
# SSH ControlMaster baked defaults: the config file must exist (image-level)
|
||||
# and ssh -G must report ControlPath rooted at /tmp/sshcm/ for an arbitrary
|
||||
# host. Catches both regressions: someone removing the conf file, OR something
|
||||
# else later in the config chain shadowing the ControlPath setting.
|
||||
run "ssh-config-cm-file" "test -f /etc/ssh/ssh_config.d/00-devbox-controlmaster.conf"
|
||||
run_expect "ssh-config-cm-path" "ssh -G example.invalid 2>/dev/null | grep -i ^controlpath" "/tmp/sshcm/"
|
||||
|
||||
echo
|
||||
echo "-- Optional / variant-gated --"
|
||||
# mempalace: present unless built with INSTALL_MEMPALACE=false
|
||||
@@ -134,7 +164,11 @@ fi
|
||||
# entrypoint-user.sh on first start, so we test by running the entry
|
||||
# point chain (not just `docker run --entrypoint=""`).
|
||||
if docker run --rm --entrypoint="" "$IMAGE" sh -c "command -v pi" >/dev/null 2>&1; then
|
||||
if [ -n "${EXPECTED_PI_VERSION:-}" ]; then
|
||||
run_expect "pi version matches build-arg" "pi --version" "$EXPECTED_PI_VERSION"
|
||||
else
|
||||
run "pi" "pi --version"
|
||||
fi
|
||||
run "pi-toolkit clone" "test -d /opt/pi-toolkit && git -C /opt/pi-toolkit rev-parse --short HEAD"
|
||||
run "pi-extensions clone" "test -d /opt/pi-extensions && git -C /opt/pi-extensions rev-parse --short HEAD"
|
||||
|
||||
@@ -192,6 +226,11 @@ if [ "$VARIANT" = "omos" ] || [ "$VARIANT" = "omos-with-pi" ]; then
|
||||
# queries the user prefix and would miss the baked binaries even though
|
||||
# they're correctly on PATH at /usr/bin.
|
||||
run "oh-my-opencode-slim" "NPM_CONFIG_PREFIX=/usr npm ls -g --depth=0 2>/dev/null | grep oh-my-opencode-slim"
|
||||
if [ -n "${EXPECTED_OMOS_VERSION:-}" ]; then
|
||||
run_expect "omos version matches build-arg" \
|
||||
"NPM_CONFIG_PREFIX=/usr npm ls -g --depth=0 2>/dev/null | grep oh-my-opencode-slim" \
|
||||
"$EXPECTED_OMOS_VERSION"
|
||||
fi
|
||||
else
|
||||
if docker run --rm --entrypoint="" "$IMAGE" sh -c "command -v bun" >/dev/null 2>&1; then
|
||||
fail "bun should NOT be in base image but was found"
|
||||
@@ -293,12 +332,16 @@ echo " Uncompressed size: ${SIZE_MB} MB"
|
||||
# omos bumped 3000→3200 on v1.14.31c — mempalace-toolkit bake-in pushed the
|
||||
# baseline; bumped 3200→3300 on v1.15.0 — opencode 1.15.0 came in at
|
||||
# 3206 MB, leaving zero headroom for routine apt-get upgrade drift.
|
||||
# omos-with-pi bumped 3400→3500 on v1.15.0 alongside the omos bump.
|
||||
# omos-with-pi bumped 3500→3700 on v1.15.4b — omos+pi compounded as both
|
||||
# upstream packages grew (opencode 1.15.0→1.15.4, pi 0.74.0→0.75.3) and
|
||||
# the variant landed just over 3500 in v1.15.4's smoke.
|
||||
# omos variant to ~3.1 GB. Functional smoke checks all pass; this is a
|
||||
# guardrail, not a performance limit.
|
||||
THRESHOLD=2500
|
||||
[ "$VARIANT" = "omos" ] && THRESHOLD=3300
|
||||
[ "$VARIANT" = "with-pi" ] && THRESHOLD=2700
|
||||
[ "$VARIANT" = "omos-with-pi" ] && THRESHOLD=3500
|
||||
[ "$VARIANT" = "omos-with-pi" ] && THRESHOLD=3700
|
||||
if [ "$SIZE_MB" -gt "$THRESHOLD" ]; then
|
||||
fail "image size ${SIZE_MB} MB exceeds threshold ${THRESHOLD} MB for variant=$VARIANT"
|
||||
else
|
||||
|
||||
Reference in New Issue
Block a user