34cae2a1d2
ALL FOUR releases v0.74.0 -> v0.75.5 had been shipping the same image bytes due to a Docker layer-cache hit on the bare 'npm install -g @earendil-works/pi-coding-agent' command (when PI_VERSION=latest). The command string is identical across builds, so the layer-hash is identical, so registry buildcache (cache-from/cache-to) silently reuses the layer from whatever pi version was current when the cache was first populated.

Verification: docker manifest inspect joakimp/pi-devbox:vX.Y.Z showed identical SHA256 digests on both linux/amd64 and linux/arm64 for v0.74.0, v0.75.3, v0.75.4, v0.75.5. Users on :latest were getting whatever pi version was baked into the v0.74.0 build.

DISCOVERED 2026-05-23 by user trying to update pi-devbox on MBP-M1 and seeing pi 0.74.0 reported despite pulling v0.75.5.

CHANGES

.gitea/workflows/docker-publish.yml: both smoke and publish jobs get a new 'Resolve PI_VERSION from tag' step that strips the leading 'v' and any trailing letter suffix from github.ref_name. Result is passed as a build-arg to docker/build-push-action so the npm install layer's hash includes the concrete version, forcing cache miss when pi bumps.

scripts/smoke-test.sh: new run_expect helper that asserts pi --version contains the EXPECTED_PI_VERSION env var. Smoke job sets this from the resolve step output. Would have caught this regression on v0.75.3.

Dockerfile: comment block above ARG PI_VERSION=latest documenting the cache-hit footgun. The 'if latest' branch in the install RUN is preserved for local dev convenience but never fires in CI now.

AGENTS.md: new convention bullet explaining the cache-hit class of bug and noting the latent same-bug in opencode-devbox's with-pi variants (currently masked by OPENCODE_VERSION bumps; will manifest when cutting a vN.N.Nb-style opencode-version-unchanged release that only bumps pi).

CHANGELOG.md: full entry under v0.75.5b describing the recovery, the silent-failure mechanism, and the verification steps.

NO IMAGE-CONTENT CHANGES vs v0.75.5 INTENT. This build produces the actual pi 0.75.5 image content that v0.75.5 was supposed to ship.

NEXT FOLLOWUP (parked, not in this commit)

opencode-devbox should get the same workflow change for its build-variant-with-pi and build-variant-omos-with-pi jobs. Currently masked because every release also bumps OPENCODE_VERSION which invalidates the cache, but that masking would fail on a pi-only bump release.
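
The tag-to-version resolution described above can be sketched in plain Bash. This is a minimal illustration, not the actual workflow step: the function name resolve_pi_version is hypothetical, and it assumes tags shaped like vN.N.N with an optional trailing letter suffix (for example v0.75.5b). Other inputs such as "latest" are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the repository): derive the concrete
# pi version from a release tag by stripping a leading "v" and any trailing
# letter suffix. Assumes tags look like vN.N.N or vN.N.Nb.
resolve_pi_version() {
    local tag="$1"
    local version="${tag#v}"            # drop one leading "v", if present
    # Drop trailing letters one at a time (the "b" in 0.75.5b).
    while [[ "$version" == *[[:alpha:]] ]]; do
        version="${version%?}"
    done
    printf '%s\n' "$version"
}

resolve_pi_version "v0.75.5b"   # prints 0.75.5
resolve_pi_version "v0.74.0"    # prints 0.74.0
```

Passing the result as a build argument (for instance docker build --build-arg PI_VERSION="$(resolve_pi_version "$TAG")", where TAG is an assumed variable holding the release tag) puts the concrete version into the install command, so the layer hash changes whenever pi is bumped.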
118 lines
5.0 KiB
Bash
Executable File
#!/usr/bin/env bash
# smoke-test.sh: basic sanity checks for the pi-devbox image
#
# Usage: ./scripts/smoke-test.sh <image>
#        EXPECTED_PI_VERSION=<version> ./scripts/smoke-test.sh <image>
#
# Verifies:
#   - pi binary present and returns a version (and, when EXPECTED_PI_VERSION
#     is set, that the reported version contains it)
#   - pi-toolkit cloned at /opt/pi-toolkit
#   - pi-extensions cloned at /opt/pi-extensions
#   - entrypoint deploys pi-toolkit keybindings symlink
#   - entrypoint deploys ≥4 extensions
#   - mempalace bridge symlink present
#   - settings.json bootstrapped
#   - image size within threshold

set -euo pipefail

IMAGE="${1:?usage: $0 <image>}"
PASS=0; FAIL=0
SIZE_THRESHOLD_MB=2200

run() {
    local label="$1"; local cmd="$2"
    if docker run --rm --entrypoint="" "$IMAGE" sh -c "$cmd" >/dev/null 2>&1; then
        printf " ✅ %s\n" "$label"; PASS=$((PASS+1))
    else
        printf " ❌ %s\n" "$label"; FAIL=$((FAIL+1))
    fi
}

# Stricter version of `run` that also asserts an expected substring in the
# command's output (stdout and stderr combined).
# Used for catching the "image bytes silently identical to previous release"
# class of regression (Docker layer cache hit on `npm install -g <pkg>` because
# the bare command string is identical across builds, even when `latest` would
# resolve differently). Discovered 2026-05-23: every pi-devbox release v0.74.0
# through v0.75.5 had been shipping the same image bytes.
run_expect() {
    local label="$1"; local cmd="$2"; local expect="$3"
    local out
    # `|| true` keeps `set -e` from aborting if the command itself fails;
    # the substring check below then reports the failure.
    out=$(docker run --rm --entrypoint="" "$IMAGE" sh -c "$cmd" 2>&1) || true
    # Here-string rather than `echo | grep -q`: under pipefail, grep -q exiting
    # early can SIGPIPE echo and turn a match into a spurious failure.
    if grep -Fq -- "$expect" <<<"$out"; then
        printf " ✅ %s (got %s)\n" "$label" "$expect"; PASS=$((PASS+1))
    else
        printf " ❌ %s: expected substring %q, got: %s\n" "$label" "$expect" "$out"; FAIL=$((FAIL+1))
    fi
}

echo "=== pi-devbox smoke test: $IMAGE ==="
echo ""

# ── Basic binary checks ───────────────────────────────────────────────
echo "── Binaries ──"
if [ -n "${EXPECTED_PI_VERSION:-}" ]; then
    run_expect "pi version matches build arg" "pi --version" "$EXPECTED_PI_VERSION"
else
    run "pi" "pi --version"
fi
run "node" "node --version"
run "git" "git --version"
run "aws" "aws --version"
run "uv" "uv --version"
run "nvim" "nvim --version"
run "mempalace-mcp" "mempalace-mcp --help"

# ── Repo clones ───────────────────────────────────────────────────────
echo ""
echo "── Repo clones ──"
run "pi-toolkit clone" "test -d /opt/pi-toolkit && git -C /opt/pi-toolkit rev-parse --short HEAD"
run "pi-extensions clone" "test -d /opt/pi-extensions && git -C /opt/pi-extensions rev-parse --short HEAD"

# ── Runtime deployment (needs entrypoint to run) ──────────────────────
echo ""
echo "── Runtime deployment ──"
# Spin up a long-running container WITHOUT overriding the entrypoint, so
# the baked entrypoint chain (entrypoint.sh → entrypoint-user.sh) runs and
# deploys pi-toolkit + pi-extensions to ~/.pi/agent/. Override CMD to
# tail -f /dev/null so the container stays alive while we docker-exec.
CID=$(docker run -d --rm "$IMAGE" tail -f /dev/null)
cleanup() { docker rm -f "$CID" >/dev/null 2>&1 || true; }
trap cleanup EXIT

# Wait for entrypoint-user.sh to finish deploying pi-toolkit + extensions
for i in $(seq 1 30); do
    if docker exec "$CID" test -L /home/developer/.pi/agent/keybindings.json 2>/dev/null; then
        break
    fi
    sleep 1
done

exec_test() {
    local label="$1"; local cmd="$2"
    if docker exec -u developer "$CID" sh -c "$cmd" >/dev/null 2>&1; then
        printf " ✅ %s\n" "$label"; PASS=$((PASS+1))
    else
        printf " ❌ %s\n" "$label"; FAIL=$((FAIL+1))
    fi
}

exec_test "keybindings.json (pi-toolkit)" 'test -L $HOME/.pi/agent/keybindings.json && echo ok'
exec_test "extensions ≥ 4 (pi-extensions)" 'count=$(ls -1 $HOME/.pi/agent/extensions/*.ts 2>/dev/null | wc -l); [ $count -ge 4 ] && echo "$count extensions"'
exec_test "mempalace.ts bridge" 'test -L $HOME/.pi/agent/extensions/mempalace.ts && echo ok'
exec_test "settings.json bootstrapped" 'test -f $HOME/.pi/agent/settings.json && echo ok'

# ── Image size ────────────────────────────────────────────────────────
echo ""
echo "── Image size ──"
SIZE_MB=$(docker image inspect "$IMAGE" --format='{{.Size}}' | awk '{printf "%d", $1/1048576}')
if [ "$SIZE_MB" -le "$SIZE_THRESHOLD_MB" ]; then
    printf " ✅ size: %d MB (threshold %d MB)\n" "$SIZE_MB" "$SIZE_THRESHOLD_MB"; PASS=$((PASS+1))
else
    printf " ❌ size: %d MB exceeds threshold %d MB\n" "$SIZE_MB" "$SIZE_THRESHOLD_MB"; FAIL=$((FAIL+1))
fi

# ── Summary ───────────────────────────────────────────────────────────
echo ""
echo "=== Results: ${PASS} passed, ${FAIL} failed ==="
[ "$FAIL" -eq 0 ]