CI: preventative fix for PI_VERSION/OMOS_VERSION cache-hit silent regression

Mirrors the pi-devbox v0.75.5b fix (2026-05-23) onto the four-variant
pipeline here.

The with-pi, omos, and omos-with-pi variants install upstream npm
packages whose *_VERSION build-args defaulted to 'latest'. When the
build-arg string is byte-identical across builds, the layer hash is
identical and the registry buildcache silently reuses the layer from
whatever upstream version was current when the cache was first
populated. This is the same mechanism that shipped pi-devbox
v0.74.0..v0.75.5 with identical image bytes.

Currently masked here because OPENCODE_VERSION is a hard-coded ARG that
bumps every release; parent-chain cache invalidation flushes the
downstream pi/omos layers. Masking would fail on any vN.N.Nb
opencode-version-unchanged release that only bumps pi or omos.

Filed last night as parked followup; fixing preventatively now that #5
(AWS SSO inside tor-ms22 container) cleared.

CHANGES

.gitea/workflows/docker-publish-split.yml: new resolve-versions job
running 'npm view @earendil-works/pi-coding-agent version' and
'npm view oh-my-opencode-slim version', exposing concrete strings as
job outputs. All six affected jobs (smoke-omos, smoke-with-pi,
smoke-omos-with-pi, build-variant-omos, build-variant-with-pi,
build-variant-omos-with-pi) now consume them as PI_VERSION /
OMOS_VERSION build-args. smoke-base / build-variant-base unaffected.

scripts/smoke-test.sh: new run_expect helper asserting an expected
substring in command output. The pi check uses EXPECTED_PI_VERSION; the
omos check uses EXPECTED_OMOS_VERSION against npm ls -g. Both env vars
are wired from resolve-versions outputs in the smoke jobs. Catches this
regression class on the next release, not four releases later.

Dockerfile.variant: comment blocks above OPENCODE_VERSION
(source-pinned, not subject to the bug), PI_VERSION (CI-resolved), and
OMOS_VERSION (CI-resolved) explaining the cache-hit footgun.

AGENTS.md: new convention bullet under 'Critical conventions' naming
the resolve-versions job + EXPECTED_*_VERSION wiring as the contract to
keep in lockstep when modifying variant build-args.

.gitea/README.md: Step 1 expanded to cover the parallel
resolve-versions job alongside base-decide; pipeline diagram updated.

CHANGELOG.md: Unreleased entry describing the fix, masking mechanism,
and audit footprint.

No image-content change expected on the next release vs what 'latest'
would have resolved to anyway. Purely makes the cache invalidate
correctly going forward.
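The cache-hit mechanism described in this message can be illustrated with a simplified model. The sketch below is not the builder's real cache-key algorithm; it only assumes that a layer key behaves like a hash of the parent key plus the instruction text. The `layer_key` helper, the parent key strings, and the version numbers are all hypothetical.

```shell
#!/bin/sh
# Simplified model of a layer cache key: sha256(parent key + instruction text).
# Assumption: this only approximates how a builder keys a RUN layer.
layer_key() {
  parent="$1"; instruction="$2"
  printf '%s\n%s\n' "$parent" "$instruction" | sha256sum | cut -d' ' -f1
}

pkg='@earendil-works/pi-coding-agent'
parent='opencode-layer-A'   # hypothetical parent key, unchanged between releases

# Two releases built with the floating default: identical text, identical key.
k1=$(layer_key "$parent" "RUN npm install -g ${pkg}@latest")
k2=$(layer_key "$parent" "RUN npm install -g ${pkg}@latest")

# Two releases built with CI-resolved versions (made-up numbers): keys differ.
k3=$(layer_key "$parent" "RUN npm install -g ${pkg}@1.0.0")
k4=$(layer_key "$parent" "RUN npm install -g ${pkg}@1.0.1")

# Same floating default, but the parent changed (the OPENCODE_VERSION masking).
k5=$(layer_key 'opencode-layer-B' "RUN npm install -g ${pkg}@latest")

[ "$k1" = "$k2" ] && echo "latest, same parent: cache hit (stale layer reused)"
[ "$k3" != "$k4" ] && echo "resolved versions: cache miss (layer rebuilt)"
[ "$k1" != "$k5" ] && echo "latest, new parent: cache miss (bug is masked)"
```

In this model the third case is why the regression has not surfaced here yet: a changed parent forces a rebuild even though the instruction text still says `latest`.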
+30
-7
@@ -30,10 +30,10 @@ The split-base architecture is what the `docker-publish-split.yml` workflow exer
      ┌──────────────────┐
      │ base-decide      │  compute base-<hash>;
      │                  │  probe Docker Hub.
-     │ hash inputs:     │
-     │   Dockerfile.base│
-     │   rootfs/        │
-     │   entrypoint*.sh │
+     │ hash inputs:     │  (resolve-versions
+     │   Dockerfile.base│   runs in parallel:
+     │   rootfs/        │   npm view pi/omos
+     │   entrypoint*.sh │   → concrete versions)
      └────────┬─────────┘
               │
 ┌─────────────┴─────────────┐
@@ -73,10 +73,10 @@ The split-base architecture is what the `docker-publish-split.yml` workflow exer
 └──────────────────────────┘
 ```
 
-### Step 1: `base-decide`
+### Step 1: `base-decide` (and `resolve-versions` in parallel)
 
-Compute a SHA-256 hash over the inputs that determine the base image's
-content:
+**`base-decide`** computes a SHA-256 hash over the inputs that determine
+the base image's content:
 
 ```sh
 {
@@ -106,6 +106,29 @@ This is the core cache-reuse mechanism. Version-bump-only releases
that change anything in the base — apt packages, AWS CLI, Node version,
locale list, entrypoint scripts — pay the full base-build cost once.
 
+**`resolve-versions`** runs alongside `base-decide` (no `needs:`
+dependency between them) and resolves the floating npm packages whose
+`*_VERSION` build-args default to `latest`:
+
+```sh
+PI_VERSION=$(npm view @earendil-works/pi-coding-agent version)
+OMOS_VERSION=$(npm view oh-my-opencode-slim version)
+```
+
+The outputs (`pi_version`, `omos_version`) are consumed by every variant
+smoke and build job that installs pi or omos. **Why this exists:** without
+it, the `npm install -g` RUN layer in `Dockerfile.variant` hashes
+identically across builds (same ARG default, same command string), so
+the registry buildcache silently reuses the layer from whatever upstream
+version was current when the cache was first populated. This is the
+cache-hit silent-regression class of bug that shipped pi-devbox v0.74.0
+through v0.75.5 with identical image bytes (fixed in pi-devbox v0.75.5b
+2026-05-23). Currently masked here by `OPENCODE_VERSION` bumping every
+release (parent-chain cache-key invalidation), but masking would fail on
+a `vN.N.Nb` opencode-version-unchanged release that only bumps pi or
+omos. Smoke jobs additionally assert `EXPECTED_PI_VERSION` /
+`EXPECTED_OMOS_VERSION` against the resolved values.
+
 ### Step 2: `build-base` (conditional)
 
 Only runs when `need_build=true`. Multi-arch (amd64 + arm64) build of
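The smoke-test assertion named in the commit message can be sketched as follows. The real `run_expect` in `scripts/smoke-test.sh` is not part of this diff, so the signature, the output format, and the empty-expectation guard below are assumptions; `fake_pi` is a hypothetical stand-in for the installed binary so the sketch runs anywhere.

```shell
#!/bin/sh
# Hypothetical run_expect: run a command and require a substring in its output.
# Usage: run_expect <label> <expected-substring> <command> [args...]
run_expect() {
  label="$1"; expected="$2"; shift 2
  # An empty expectation would match any output, so treat it as a failure.
  if [ -z "$expected" ]; then
    echo "FAIL: $label (empty expectation)" >&2
    return 1
  fi
  output=$("$@" 2>&1) || true   # keep the output even if the command fails
  case "$output" in
    *"$expected"*) echo "ok: $label"; return 0 ;;
    *) echo "FAIL: $label (expected '$expected', got: $output)" >&2; return 1 ;;
  esac
}

# In CI this would be wired from the resolve-versions job output; the
# fallback is a placeholder so the sketch also runs locally.
EXPECTED_PI_VERSION="${EXPECTED_PI_VERSION:-0.0.0}"

# Stand-in for the real binary (assumption: it prints its version string).
fake_pi() { echo "pi 0.0.0"; }

run_expect "pi version" "$EXPECTED_PI_VERSION" fake_pi
```

The guard against an empty expectation matters for this bug class: if the wiring from `resolve-versions` ever broke and the variable arrived empty, a plain substring match would pass on any output.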