opencode-devbox

Author	SHA1	Message	Date
pi	1e98b53113	feat: publish pi-only build into the pi-devbox repo, not opencode-devbox (Option B) The pi-only variant was published as opencode-devbox:latest-pi-only — an 'opencode-devbox' tag containing no opencode, which confused users. - build-variant-pi-only now pushes joakimp/pi-devbox:base-pi-only[-vX.Y.Z] instead of opencode-devbox:*-pi-only. New PI_IMAGE workflow env. - Still built from the same Dockerfile.variant (single source of truth), still smoke-tested by smoke-pi-only / validate-pi-only before publish. - De-advertised pi-only from README, DOCKER_HUB (HUB_TEMPLATE), AGENTS, .gitea/README. opencode-devbox now publishes 8 tags + base-latest. - Documented in CHANGELOG (Unreleased) and the plan doc. Note: old opencode-devbox:{latest,vX.Y.Z}-pi-only tags from v1.15.13b are superseded and should be deleted from Docker Hub.	2026-06-03 17:04:21 +02:00
pi	237588253f	docs: fix stale variant/job counts missed in pi-only sweep Validate / base-change-warning (push) Successful in 6s Details Validate / docs-check (push) Successful in 17s Details Validate / validate-base (push) Successful in 3m40s Details Validate / validate-with-pi (push) Failing after 4m43s Details Validate / validate-omos (push) Successful in 7m7s Details Validate / validate-pi-only (push) Failing after 3m44s Details Validate / validate-omos-with-pi (push) Failing after 18m16s Details - AGENTS.md: 'eight load:true jobs' -> ten (add validate-pi-only, smoke-pi-only) - .gitea/README.md: 'four variants / eight tags' -> five / ten - docs/manual-host-publish.md: 'Variants x4 / 10 tags' -> x5 / 12 tags These are living operational facts; the remaining 'four/eight' hits are illustrative meta-instructions or dated historical CHANGELOG entries (correct as-is).	2026-06-03 16:34:36 +02:00
pi	fc034ceade	feat: add pi-only variant (pi without opencode) as basis for pi-devbox Validate / docs-check (push) Successful in 10s Details Validate / base-change-warning (push) Successful in 23s Details Validate / validate-omos (push) Successful in 4m36s Details Validate / validate-omos-with-pi (push) Failing after 5m40s Details Validate / validate-with-pi (push) Failing after 7m35s Details Validate / validate-pi-only (push) Failing after 3m45s Details Validate / validate-base (push) Failing after 16m12s Details All opencode-devbox variants set INSTALL_OPENCODE=true, so pointing pi-devbox at with-pi dragged opencode along and made it ~a re-tag of latest-with-pi. Add a 5th variant pi-only (INSTALL_OPENCODE=false, INSTALL_PI=true): pi + companions (toolkit, extensions, fork, recall) + base tooling, no opencode (~145 MB lighter than with-pi). - Dockerfile.variant: document pi-only in the variant table. - CI docker-publish-split.yml: new smoke-pi-only + build-variant-pi-only jobs (tags :VERSION-pi-only / :latest-pi-only, multi-arch); wired into promote-base-latest and update-description needs. - validate.yml: new validate-pi-only main-branch gate job. - smoke-test.sh: accept --variant pi-only; threshold 2750 MB; opencode-absent path already handled. - Docs: HUB_TEMPLATE (regenerated DOCKER_HUB.md), README, AGENTS (variant/tag counts 4->5, 8->10 tags), .gitea/README, manual-host-publish.sh (5 variants), plan doc implementation note. This is the single source of truth for joakimp/pi-devbox, which now FROMs latest-pi-only. Versions unchanged (opencode 1.15.13, pi 0.78.0).	2026-06-03 16:13:44 +02:00
pi	f09a4f382a	feat: host-agnostic LAN access (base) + fork/recall in pi variants Validate / base-change-warning (push) Successful in 22s Details Validate / docs-check (push) Successful in 44s Details Validate / validate-base (push) Successful in 3m27s Details Validate / validate-omos (push) Successful in 7m3s Details Validate / validate-with-pi (push) Failing after 4m33s Details Validate / validate-omos-with-pi (push) Failing after 8m29s Details Item A - LAN access (base image): - New rootfs/usr/local/lib/opencode-devbox/setup-lan-access.sh, invoked non-fatally from entrypoint-user.sh. On VM-backed hosts (macOS OrbStack / Docker Desktop, detected via host.docker.internal) it generates a writable ~/.ssh-local/config that uses the host as an SSH jump to reach LAN peers; no-op on native Linux. Ships the mechanism (generic 'host' jump alias), not policy (targets stay in the user's bind-mounted ~/.ssh/config). - New env knobs: DEVBOX_LAN_ACCESS (auto|jump|off), HOST_SSH_USER, DEVBOX_HOST_ALIAS. dssh/dscp aliases in .bash_aliases (guarded). Item B - pi-fork (fork) + pi-observational-memory (recall) in pi variants: - Dockerfile.variant clones both elpapi42 repos to /opt and runs npm install there at build time (local-path 'pi install' does not npm-install, so deps must be present to load). New args PI_FORK_REPO/REF, PI_OBSMEM_REPO/REF. - entrypoint-user.sh registers them at runtime via 'pi install /opt/<pkg>' (instant, in-place, idempotent; tools bind on next pi start). - CI resolve-versions resolves each repo's master HEAD to a commit SHA and passes PI_FORK_REF/PI_OBSMEM_REF, the same cache-hit guard as PI_VERSION. - smoke-test asserts /opt clones + node_modules + settings.json registration; size thresholds bumped (with-pi 2700->2900, omos-with-pi 3700->3900). Versions unchanged (opencode 1.15.13, pi 0.78.0; both still latest). Docs: README LAN section + env table, .env.example, AGENTS.md, CHANGELOG. Plan recorded in docs/plan-lan-access-and-pi-extensions.md.	2026-06-03 15:45:45 +02:00
pi	1fe5b5df91	ci: workflow-level 3-attempt retry around buildx build --push Validate / docs-check (push) Successful in 7s Details Validate / base-change-warning (push) Successful in 6s Details Validate / validate-with-pi (push) Successful in 4m11s Details Validate / validate-omos (push) Successful in 4m31s Details Validate / validate-base (push) Successful in 5m19s Details Validate / validate-omos-with-pi (push) Successful in 11m38s Details Belt-and-braces against transient registry-1.docker.io blips (rate limits, brief 5xx, CDN flap). Replaces all five push docker/build-push-action@v7 invocations (1 base + 4 variants) with shell: bash steps that run docker buildx build --push in a for-loop with backoff (15s, 30s). Smoke build steps (load: true, no push) are untouched. Does NOT mask deterministic failures: a true regression (e.g. the cache-export 400 we hit 2026-05-23..28) fails all 3 attempts identically and the job still fails by design. Orthogonal layer to both cache-export disablement and the ci-release-watcher skill's transient-rerun heuristic. - AGENTS.md: new Critical conventions bullet documenting the retry pattern, the consistency rule across push steps, and why it's duplicated rather than factored (Gitea Actions doesn't support reusable composite shell steps cleanly). - CHANGELOG.md: Unreleased section addendum. No image-side change.	2026-05-28 16:32:41 +02:00
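The retry pattern described in the commit above can be sketched as a small shell function. This is an illustration only, not the workflow's actual step: the function name `retry_push` is hypothetical, and it is written in POSIX sh rather than the `shell: bash` form the commit mentions. It runs a command up to three times and sleeps 15s, then 30s, between attempts.

```shell
# Hypothetical helper (not taken from the workflow): run a command up to
# 3 times, sleeping 15s before attempt 2 and 30s before attempt 3.
retry_push() {
  attempt=1
  for delay in 15 30 0; do
    # Run the command exactly as given; stop at the first success.
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still coming.
    if [ "$attempt" -lt 3 ]; then
      echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "all 3 attempts failed" >&2
  return 1
}

# Example call in a workflow step (image name and tag are placeholders):
# retry_push docker buildx build --push -t example/image:tag .
```

A deterministic failure still fails all three attempts and the function returns non-zero, which matches the commit's point that the retry does not mask real regressions.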
joakimp	51ec4a88cf	CI: drop registry cache-export from build-base (Hub 400 root cause) Validate / base-change-warning (push) Successful in 6s Details Validate / docs-check (push) Successful in 13s Details Validate / validate-with-pi (push) Successful in 4m9s Details Validate / validate-omos (push) Successful in 4m31s Details Validate / validate-base (push) Successful in 5m40s Details Validate / validate-omos-with-pi (push) Successful in 12m49s Details Diagnosed during manual v1.15.12 publish: buildkit's mode=max cache export to registry-1.docker.io reproducibly returns HTTP 400 with HTML body on the resumable-upload PUT. Image push (layers + manifest) works fine in parallel; only --cache-to fails. Removing cache-from/cache-to lets the publish complete. This explains all four prior CI failures (runs 332/333/334/336) which shared the exact same failure shape. Action-pin hypothesis (setup-buildx-action v4.1.0) was correctly disproven by run 336 with v4.0.0 pinned. Trade-off: every Dockerfile.base change now pays the full ~3 min multi-arch build. Unchanged bases short-circuit at the content-addressed probe step in base-decide and never re-build, so day-to-day cost is zero. Re-enable when moby/buildkit upstream resolves the cache-export protocol mismatch with Hub CDN, or when we can switch to a non-registry cache backend. CHANGELOG.md: full root-cause writeup in Unreleased section, including status update on every prior suspect (all ruled out).	2026-05-28 10:40:08 +00:00
joakimp	be2a16834c	Cut v1.15.12 — revert v4.0.0 pin (busted), bump pi to 0.76.0 Validate / docs-check (push) Successful in 8s Details Validate / base-change-warning (push) Successful in 52s Details Validate / validate-base (push) Failing after 3m34s Details Publish Docker Image / base-decide (push) Successful in 10s Details Publish Docker Image / resolve-versions (push) Successful in 4s Details Validate / validate-with-pi (push) Failing after 4m0s Details Validate / validate-omos (push) Failing after 6m50s Details Validate / validate-omos-with-pi (push) Failing after 12m15s Details Publish Docker Image / build-base (push) Failing after 30m40s Details Publish Docker Image / smoke-base (push) Has been skipped Details Publish Docker Image / smoke-with-pi (push) Has been skipped Details Publish Docker Image / build-variant-base (push) Has been skipped Details Publish Docker Image / build-variant-with-pi (push) Has been skipped Details Publish Docker Image / smoke-omos (push) Has been skipped Details Publish Docker Image / build-variant-omos-with-pi (push) Has been skipped Details Publish Docker Image / build-variant-omos (push) Has been skipped Details Publish Docker Image / smoke-omos-with-pi (push) Has been skipped Details Publish Docker Image / promote-base-latest (push) Has been skipped Details Publish Docker Image / update-description (push) Has been skipped Details The v1.15.11b experiment confirmed setup-buildx-action@v4.1.0 is NOT the regressor: pinning all 9 references to @v4.0.0 reproduced the exact same '400 Bad request' from registry-1.docker.io on the first layer-blob PUT. CI run #336 failed twice (original + Gitea auto-rerun), both with HTML 400 bodies (CDN-tier rejection) at Offset:0. UUIDs and _state signatures differ across attempts; only the failure pattern is stable. Reverting all 9 pins back to @v4 — keeping a wrong pin holds us off action improvements with no benefit. 
Real suspects now narrow to: runner-image (catthehacker:act-latest, floating), runner-2 host network egress, buildx 0.34.x signed _state token format, or per-repo Hub-side state. Investigation deferred; this release ships via manual docker buildx build --push from a developer Orbstack to bypass the broken runner-network → Hub-CDN combo (we know that path works in ~25s for the same multi-arch build to the same Hub account). PI_VERSION=latest resolves to pi-coding-agent 0.76.0 (published 2026-05-27 20:03 UTC). OPENCODE_VERSION stays at 1.15.11 (no upstream bump since 1.15.11 was published 2026-05-27 03:59 UTC). Files: - .gitea/workflows/docker-publish-split.yml: 9 setup-buildx-action references reverted from @v4.0.0 to @v4 - CHANGELOG.md: v1.15.12 entry with regression triage status (ruled-out vs still-suspect)	2026-05-28 08:11:00 +00:00
joakimp	a16da2f041	Cut v1.15.11b — pin setup-buildx-action@v4.0.0 Validate / docs-check (push) Successful in 6s Details Validate / base-change-warning (push) Successful in 6s Details Validate / validate-with-pi (push) Failing after 4m1s Details Publish Docker Image / base-decide (push) Successful in 8s Details Publish Docker Image / resolve-versions (push) Successful in 5s Details Validate / validate-omos-with-pi (push) Failing after 4m52s Details Validate / validate-omos (push) Failing after 6m41s Details Validate / validate-base (push) Failing after 8m55s Details Publish Docker Image / promote-base-latest (push) Has been skipped Details Publish Docker Image / update-description (push) Has been skipped Details Publish Docker Image / build-base (push) Failing after 37m43s Details Publish Docker Image / smoke-base (push) Has been skipped Details Publish Docker Image / smoke-omos (push) Has been skipped Details Publish Docker Image / smoke-with-pi (push) Has been skipped Details Publish Docker Image / build-variant-omos (push) Has been skipped Details Publish Docker Image / build-variant-with-pi (push) Has been skipped Details Publish Docker Image / smoke-omos-with-pi (push) Has been skipped Details Publish Docker Image / build-variant-base (push) Has been skipped Details Publish Docker Image / build-variant-omos-with-pi (push) Has been skipped Details The v1.15.11 publish failed three times in a row (runs #332/333/334) with identical '400 Bad request' from registry-1.docker.io on the multi-arch buildx layer-blob PUT. 
Triage on 2026-05-27 confirmed: - Multi-arch buildx push from a developer host: succeeds in 25s (same Hub account, same multi-arch path) - Account / repo / Hub-CDN: all healthy - Last known-good Gitea-runner Hub push: 2026-05-23 ~20:26 UTC (pi-devbox v0.75.5b) — predates docker/setup-buildx-action v4.1.0 by <24h - docker/setup-buildx-action@v4 floats to v4.1.0 (published 2026-05-22 16:00 UTC), bundling a newer buildx/buildkit whose push protocol may trip Hub's CDN URI-length cap on the ~1.4 KB _state query string in resumable-upload PUT URLs. Pinning all nine setup-buildx-action references to @v4.0.0 to test the hypothesis. setup-qemu-action@v3 left floating since QEMU wasn't in the suspected blast radius. If v4.0.0 publishes cleanly we keep the pin and file an upstream buildkit/buildx issue. No source changes — same OPENCODE_VERSION=1.15.11, same Dockerfile.base and Dockerfile.variant. v1.15.11 (original tag) is preserved as a historical marker of the first publish attempt; v1.15.11b becomes the canonical release.	2026-05-27 21:05:17 +00:00
joakimp	3cbcb44cf5	CI: fix resolve-versions to use curl+jq instead of npm view catthehacker/ubuntu:act-latest ships Node/npm under /opt/acttoolcache/ with PATH updated only in /etc/environment. act_runner (nektos/act) does not source /etc/environment — it reads the Docker image's ENV instructions (inspectResult.Config.Env) which only contain DEBIAN_FRONTEND=noninteractive. So npm is NOT on PATH and 'npm view ...' would have CI-failed on first run. Fix: query the npm registry HTTP API directly with curl+jq, both of which are already used extensively by this workflow (curl for Hub auth/manifest inspect, jq for token parsing). The endpoint https://registry.npmjs.org/<pkg>/latest returns JSON with a 'version' field — equivalent to 'npm view <pkg> version' but with no toolchain dependency. Verified locally: both URLs resolve correctly to 0.75.5 (pi) and 1.1.1 (omos). Evidence: nektos/act pkg/container/docker_run.go reads imageEnv from inspectResult.Config.Env, not /etc/environment. DefaultPathVariable() in linux_container_environment_extensions.go returns a hardcoded path with no /opt/acttoolcache in it.	2026-05-24 15:59:53 +00:00
joakimp	f7c34091b1	CI: preventative fix for PI_VERSION/OMOS_VERSION cache-hit silent regression Mirrors the pi-devbox v0.75.5b fix (2026-05-23) onto the four-variant pipeline here. The with-pi, omos, and omos-with-pi variants install upstream npm packages whose *_VERSION build-args defaulted to 'latest'. When the build-arg string is byte-identical across builds, the layer hash is identical and the registry buildcache silently reuses the layer from whatever upstream version was current when the cache was first populated, the same mechanism that shipped pi-devbox v0.74.0..v0.75.5 with identical image bytes. Currently masked here because OPENCODE_VERSION is a hard-coded ARG that bumps every release; parent-chain cache invalidation flushes the downstream pi/omos layers. Masking would fail on any vN.N.Nb opencode-version-unchanged release that only bumps pi or omos. Filed last night as parked followup; fixing preventatively now that #5 (AWS SSO inside tor-ms22 container) cleared. Changes: - .gitea/workflows/docker-publish-split.yml: new resolve-versions job running 'npm view @earendil-works/pi-coding-agent version' and 'npm view oh-my-opencode-slim version', exposing concrete strings as job outputs. All six affected jobs (smoke-omos, smoke-with-pi, smoke-omos-with-pi, build-variant-omos, build-variant-with-pi, build-variant-omos-with-pi) now consume them as PI_VERSION / OMOS_VERSION build-args. smoke-base / build-variant-base unaffected. - scripts/smoke-test.sh: new run_expect helper asserting an expected substring in command output. The pi check uses EXPECTED_PI_VERSION; the omos check uses EXPECTED_OMOS_VERSION against npm ls -g. Both env vars are wired from resolve-versions outputs in the smoke jobs. Catches this regression class on the next release, not four releases later. - Dockerfile.variant: comment blocks above OPENCODE_VERSION (source-pinned, not subject to the bug), PI_VERSION (CI-resolved), and OMOS_VERSION (CI-resolved) explaining the cache-hit footgun. 
- AGENTS.md: new convention bullet under 'Critical conventions' naming the resolve-versions job + EXPECTED_*_VERSION wiring as the contract to keep in lockstep when modifying variant build-args. - .gitea/README.md: Step 1 expanded to cover the parallel resolve-versions job alongside base-decide; pipeline diagram updated. - CHANGELOG.md: Unreleased entry describing the fix, masking mechanism, and audit footprint. No image-content change expected on the next release vs what 'latest' would have resolved to anyway. Purely makes the cache invalidate correctly going forward.	2026-05-24 15:38:36 +00:00
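A helper like the run_expect described in the commit above might look roughly like this. It is a guess at the shape, not the code in scripts/smoke-test.sh: the argument order, the messages, and the empty-string guard are all assumptions.

```shell
# Hypothetical reconstruction; the real run_expect may differ.
# Usage: run_expect <label> <expected-substring> <command> [args...]
run_expect() {
  label=$1
  expected=$2
  shift 2
  # An empty expectation would match any output, so reject it outright.
  if [ -z "$expected" ]; then
    echo "FAIL: $label (no expected value supplied)" >&2
    return 1
  fi
  # Capture stdout and stderr; the command's exit status is ignored on purpose,
  # because only the substring match decides pass or fail.
  output=$("$@" 2>&1) || true
  case $output in
    *"$expected"*)
      echo "PASS: $label"
      ;;
    *)
      echo "FAIL: $label (expected '$expected' in: $output)" >&2
      return 1
      ;;
  esac
}

# Example, following the omos check described above:
# run_expect "omos version" "$EXPECTED_OMOS_VERSION" npm ls -g
```

Rejecting an empty expected value matters here: if the resolve-versions output were ever unset, an unguarded substring match would pass silently, which is the class of regression the commit is trying to catch.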
joakimp	b6e4d89a2c	ci: filter __pycache__ and macOS metadata from base hash compute Validate / docs-check (push) Successful in 14s Details Validate / base-change-warning (push) Successful in 18s Details Validate / validate-omos (push) Successful in 4m34s Details Validate / validate-omos-with-pi (push) Successful in 4m57s Details Validate / validate-with-pi (push) Successful in 6m9s Details Validate / validate-base (push) Successful in 14m48s Details Defensive against local-vs-CI hash divergence. `find rootfs -type f` includes gitignored junk like rootfs/__pycache__/*.pyc and macOS .DS_Store/._AppleDouble files, which CI's clean checkout never sees. This bit us during v1.15.4 debugging when a stale generate-config.cpython-314.pyc on the local rootfs/ produced base-3605aa6b6ab1 while CI computed base-35ee5fe7861a. Took meaningful time to track down because git status doesn't surface gitignored files. Verified: same filter applied to current clean tree still produces 35ee5fe7861a (the published v1.15.4b base digest).	2026-05-20 22:45:27 +02:00
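The filtering idea in the commit above can be sketched as follows. This is a simplified stand-in, not the base-decide step itself: the function name `base_input_hash` is hypothetical, the real hash also covers inputs beyond one directory, and the 12-character truncation is only inferred from the `base-35ee5fe7861a` style of digest quoted in the message.

```shell
# Hypothetical helper: hash every regular file under a directory while
# skipping junk that a clean CI checkout never contains.
base_input_hash() {
  dir=$1
  find "$dir" -type f \
    ! -path '*/__pycache__/*' \
    ! -name '*.pyc' \
    ! -name '.DS_Store' \
    ! -name '._*' \
    | LC_ALL=C sort \
    | while IFS= read -r f; do sha256sum "$f"; done \
    | sha256sum \
    | cut -c1-12
}

# Example:
# echo "base-$(base_input_hash rootfs)"
```

Sorting with `LC_ALL=C` keeps the file order identical across machines with different locales, so the same tree yields the same hash locally and in CI.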
joakimp	8f2c9f5112	v1.15.4b: omos-with-pi threshold bump + update-description partial-publish fix Validate / docs-check (push) Successful in 7s Details Validate / base-change-warning (push) Successful in 20s Details Validate / validate-base (push) Successful in 3m36s Details Publish Docker Image / base-decide (push) Successful in 13s Details Publish Docker Image / build-base (push) Has been skipped Details Validate / validate-with-pi (push) Successful in 4m14s Details Validate / validate-omos (push) Successful in 7m1s Details Publish Docker Image / smoke-base (push) Successful in 3m37s Details Publish Docker Image / smoke-omos (push) Successful in 4m39s Details Publish Docker Image / smoke-omos-with-pi (push) Successful in 5m7s Details Publish Docker Image / smoke-with-pi (push) Successful in 6m24s Details Validate / validate-omos-with-pi (push) Successful in 15m59s Details Publish Docker Image / build-variant-base (push) Successful in 14m12s Details Publish Docker Image / build-variant-omos (push) Successful in 19m29s Details Publish Docker Image / build-variant-with-pi (push) Successful in 23m7s Details Publish Docker Image / build-variant-omos-with-pi (push) Successful in 26m16s Details Publish Docker Image / promote-base-latest (push) Has been skipped Details Publish Docker Image / update-description (push) Successful in 8s Details Recovery for v1.15.4's partial publish (omos-with-pi exceeded 3500 MB smoke threshold; other 3 variants published cleanly). Two changes: 1. omos-with-pi threshold 3500 -> 3700 MB. Compounded growth from opencode 1.15.0 -> 1.15.4 (4 patch versions) plus pi 0.74.0 -> 0.75.3 (minor + 3 patches) summed in the omos-with-pi variant, just over the existing limit. Same pattern as prior threshold bumps (v1.14.31c, v1.15.0b). Restores ~150 MB headroom for routine apt-upgrade drift. 2. update-description workflow bug fix. 
Pre-existing latent bug exposed by v1.15.4's partial publish: update-description.needs includes all 4 build-variant-* jobs, and Gitea Actions' default behavior is 'skipped need => skip dependent', even when the job's own if: condition is satisfied. So when build-variant-omos-with-pi was skipped (because its smoke failed), update-description cascaded into a skip too, and Hub description didn't refresh on v1.15.4 despite 3 variants publishing. Fix: wrap if: in always() + explicit success check on the base variant. Same fix applied to promote-base-latest preemptively (it has the same latent bug, currently masked by the cache-hit gate). No image-side changes: cache hit on base-35ee5fe7861a.	2026-05-18 22:30:59 +02:00
joakimp	18b9c9c549	CI: harden promote-base-latest (pinned crane + skip on cache-hit) Validate / docs-check (push) Successful in 10s Details Validate / base-change-warning (push) Successful in 16s Details Validate / validate-with-pi (push) Successful in 4m10s Details Validate / validate-omos (push) Successful in 4m34s Details Validate / validate-base (push) Has been cancelled Details Validate / validate-omos-with-pi (push) Has been cancelled Details Two workflow-only changes for promote-base-latest, no image-side impact: T14: replace imjasonh/setup-crane@v0.4 with direct pinned crane install. The action's bootstrap script calls api.github.com/.../releases/latest at every run to discover the crane version. That call periodically rate-limits and returns JSON without .tag_name, jq emits 'null', the action then downloads .../releases/download/null/... → 404 → 'gzip: unexpected end of file' → exit 2. We hit this on the v1.15.3 release (2026-05-16) where it was cosmetic only (base-latest was already correct from cache hit), but the red-X is annoying. Replaced with curl + tar pinned to crane v0.21.6 (latest at time of change). Same pattern as other GitHub-sourced binaries in the Dockerfile layer (gosu, fzf, eza etc.); operator bumps CRANE_VERSION deliberately when wanting updates. T15: gate promote-base-latest on need_build == 'true'. When the base layer's content hash hasn't changed (cache hit on existing base-<hash> from a prior run), base-latest already points at the correct digest. The retag is a tautology, and any transient failure of it produces a red-X for an operation that didn't need to happen. Skipping the job entirely on cache-hit is correct and removes a whole class of cosmetic failure. Manual workflow_dispatch with promote_latest=true still bypasses the gate as an escape hatch (e.g., if base-latest got hand-deleted and needs regeneration without rebuilding the base). 
This will not trigger a CI publish run (main-branch commit, no tag).	2026-05-18 21:45:10 +02:00
joakimp	034830710c	workflow: use github.ref_type directly in promote/update-description if-conditions Validate / docs-check (push) Successful in 8s Details Validate / base-change-warning (push) Successful in 10s Details Validate / validate-with-pi (push) Successful in 4m23s Details Validate / validate-omos-with-pi (push) Successful in 5m10s Details Validate / validate-omos (push) Successful in 7m5s Details Validate / validate-base (push) Successful in 10m5s Details Gitea Actions evaluates 'env.PROMOTE_LATEST' as empty in YAML 'if:' contexts even though the same env var substitutes correctly in shell run: blocks. Result: on v1.15.0/v1.15.0b tag pushes, the build-variant-* jobs correctly pushed latest-* aliases (shell context), but promote-base-latest and update-description got skipped (YAML context), so the Hub README description wasn't refreshed. Switch to evaluating github.ref_type directly in the if-conditions — matches the production-trigger semantics and avoids the env-var indirection that gitea evaluates inconsistently.	2026-05-15 13:50:46 +02:00
joakimp	dba05da7d1	validate.yml: use Hub base-latest as variant parent + warn on base-input changes Validate / docs-check (push) Successful in 9s Details Validate / base-change-warning (push) Successful in 11s Details Validate / validate-base (push) Failing after 21s Details Validate / validate-omos (push) Failing after 1m49s Details Validate / validate-with-pi (push) Failing after 1m46s Details Validate / validate-omos-with-pi (push) Failing after 13m9s Details The previous two-step approach (build Dockerfile.base, then Dockerfile.variant FROM the local image) doesn't work: each docker/build-push-action@v7 invocation runs in its own buildx container context, and an image loaded into the host docker daemon by step N is not visible to step N+1's buildx invocation. Variant builds in validate.yml now FROM joakimp/opencode-devbox:base-latest on Docker Hub, matching the production smokes' parent. Trade-off: PRs/pushes that change Dockerfile.base, rootfs/, or entrypoint*.sh are not exercised here; only release tags rebuild the base via docker-publish-split.yml. The new base-change-warning job surfaces a runtime warning when a commit modifies any base-image input, telling the author to run a workflow_dispatch test if they want full validation before merging.	2026-05-14 20:53:19 +02:00
joakimp	a438c67f06	fix: update validate.yml for split-base Dockerfiles Validate / validate-omos (push) Failing after 20s Details Validate / docs-check (push) Successful in 22s Details Validate / validate-with-pi (push) Failing after 3m8s Details Validate / validate-omos-with-pi (push) Failing after 3m6s Details Validate / validate-base (push) Failing after 14m57s Details Replace single-Dockerfile build with two-step: build Dockerfile.base first (loads as opencode-devbox:validate-base), then build Dockerfile.variant with BASE_IMAGE pointing at the local base image. All four validate jobs updated.	2026-05-14 19:48:46 +02:00
joakimp	07e07ec611	Bump opencode 1.14.44 -> 1.14.50; cut over to split-base pipeline Validate / validate-omos-with-pi (push) Waiting to run Details Validate / docs-check (push) Successful in 1m7s Details Validate / validate-with-pi (push) Failing after 3m16s Details Validate / validate-omos (push) Failing after 3m15s Details Validate / validate-base (push) Failing after 6m31s Details Publish Docker Image / base-decide (push) Failing after 11m59s Details Publish Docker Image / build-base (push) Has been cancelled Details Publish Docker Image / smoke-base (push) Has been cancelled Details Publish Docker Image / smoke-omos (push) Has been cancelled Details Publish Docker Image / smoke-with-pi (push) Has been cancelled Details Publish Docker Image / smoke-omos-with-pi (push) Has been cancelled Details Publish Docker Image / build-variant-base (push) Has been cancelled Details Publish Docker Image / build-variant-omos (push) Has been cancelled Details Publish Docker Image / build-variant-with-pi (push) Has been cancelled Details Publish Docker Image / build-variant-omos-with-pi (push) Has been cancelled Details Publish Docker Image / promote-base-latest (push) Has been cancelled Details Publish Docker Image / update-description (push) Has been cancelled Details - Bump OPENCODE_VERSION 1.14.44 -> 1.14.50 in Dockerfile.variant - Cut over: docker-publish-split.yml now triggers on push: tags: v* (was workflow_dispatch only). RELEASE_TAG and PROMOTE_LATEST derived from github.ref_type/ref_name for tag-push; inputs still available for manual workflow_dispatch runs. - Delete docker-publish.yml (retired, replaced by split-base pipeline) - Delete Dockerfile (retired, replaced by Dockerfile.base + Dockerfile.variant) - Update CHANGELOG: promote Unreleased -> v1.14.50 - Update AGENTS.md, .gitea/README.md, validate.yml: remove all references to the old single-Dockerfile pipeline and WIP migration plan	2026-05-14 19:39:45 +02:00
joakimp	7dc836ab66	fix: replace echo -e heredoc with brace-block in build-variant tags steps Validate / docs-check (push) Successful in 13s Details Validate / validate-base (push) Successful in 12m21s Details Validate / validate-omos (push) Successful in 18m38s Details Validate / validate-with-pi (push) Successful in 13m23s Details Validate / validate-omos-with-pi (push) Successful in 16m34s Details echo -e doesn't interpret \n in /bin/sh (dash), which is the default shell in catthehacker/ubuntu:act-latest. This caused steps.tags.outputs.tags to be empty, resulting in 'tag is needed when pushing to registry' from buildx. Also fixes a secondary bug: TAGS='${TAGS}\n...' stored a literal backslash-n rather than a real newline, which would have broken multi-tag output when promote_latest=true. Fix: replace with a brace block using plain echo, which produces actual newlines and works in both sh and bash.	2026-05-10 11:59:04 +02:00
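The brace-block fix described in the commit above might look like the sketch below. The function name `write_tags_output` and its arguments are hypothetical, and the `tags<<EOF` form assumes the runner accepts the GitHub Actions multi-line output syntax; the commit itself only says a brace block with plain echo replaced the `echo -e` construction.

```shell
# Hypothetical reconstruction of a tags step: emit one tag per line into
# the step-output file using plain echo, which behaves the same in dash and bash.
write_tags_output() {
  image=$1
  version=$2
  promote_latest=$3
  {
    echo "tags<<EOF"
    echo "${image}:${version}"
    # The latest alias is only added for promoted releases.
    if [ "$promote_latest" = "true" ]; then
      echo "${image}:latest"
    fi
    echo "EOF"
  } >> "$GITHUB_OUTPUT"
}

# Example (placeholder image name and version):
# write_tags_output example/image v1.2.3 true
```

Each echo writes a real newline, so there is no reliance on `\n` being interpreted, which is the part that broke under /bin/sh.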
joakimp	6fde27c212	Document the build pipeline architecture in .gitea/README.md Validate / docs-check (push) Successful in 16s Details Validate / validate-base (push) Successful in 12m9s Details Validate / validate-omos (push) Successful in 16m45s Details Validate / validate-with-pi (push) Successful in 13m30s Details Validate / validate-omos-with-pi (push) Successful in 15m15s Details The split-base build architecture, the NPM_CONFIG_PREFIX gotcha, the hash-driven base cache reuse mechanism, and the cutover plan from docker-publish.yml to docker-publish-split.yml were previously scattered across: - inline Dockerfile.base / Dockerfile.variant comments - CHANGELOG Unreleased entries - AGENTS.md mentions - docker-publish-split.yml header comment - my own session notes Consolidate into .gitea/README.md as the canonical architectural doc. Gitea (like GitHub) auto-renders this when navigating to .gitea/ in the web UI, so anyone investigating 'why is CI shaped this way?' finds it on the first click. Cross-referenced from AGENTS.md as the first thing to read when touching CI. Covers: - The two release pipelines and why both exist - Why split-base: cross-variant cache misses on layer-hash-divergence - The 6 phases of the split-base pipeline with an ASCII diagram - base-decide hash inputs and Docker Hub probe logic - NPM_CONFIG_PREFIX variant-override pattern (the volume-shadow trap) - Registry cache strategy (mode=max for cross-arch reuse) - Wall-clock estimates: version-bump vs base-touching releases - Validate workflow role - Runner expectations: catthehacker image, disk reclaim, concurrency, Gitea Actions @v4 artifact incompatibility - 4-step migration plan from docker-publish.yml to .split.yml - Cross-refs to related docs Does not duplicate AGENTS.md content; links to it for domain facts and release-day checklist.	2026-05-09 19:28:03 +02:00
joakimp	4c27e6fd8a	feat: split-base build pipeline (parallel, manual-trigger only) Validate / docs-check (push) Successful in 15s Details Validate / validate-base (push) Successful in 12m13s Details Validate / validate-omos (push) Failing after 15m48s Details Validate / validate-with-pi (push) Successful in 13m43s Details Validate / validate-omos-with-pi (push) Has been cancelled Details Two-Dockerfile split-base build alongside the existing single-Dockerfile pipeline. Goal: cut CI wall clock from ~165-180min to ~30-40min on typical version-bump-only releases by reusing a base image across the four variants. Files added: - Dockerfile.base: variant-independent layers (apt, locales, AWS CLI, Node.js, mempalace, gitea-mcp, user setup, chromadb prewarm, ENVs, entrypoints). - Dockerfile.variant: FROMs ${BASE_IMAGE} and adds opencode / pi / omos / Go installs gated by INSTALL_* args. Each npm install -g uses NPM_CONFIG_PREFIX=/usr per-RUN to keep baked binaries off the volume-shadowed ~/.pi/npm-global path inherited from base. - .gitea/workflows/docker-publish-split.yml: workflow_dispatch-only pipeline: base-decide -> build-base (conditional) -> smoke-* (4 parallel) -> build-variant-* (4 parallel) -> promote-base-latest -> update-description. Hash-driven base reuse: if base-<sha> already exists on Docker Hub, the build is skipped entirely. Inputs: release_tag (test tag suffix, default v0.0.0-split-test) and promote_latest (default false; gates latest-* aliases and Hub description update). Files unchanged: - Dockerfile, docker-publish.yml, validate.yml all left in place so the production tag-push pipeline keeps working untouched. Migration plan (in CHANGELOG Unreleased): 1. workflow_dispatch test run with promote_latest=false; verify the four variant images smoke-pass and have plausible sizes. 2. Compare manifest digests against the same-version output from the production pipeline (independent test run on the same commit). 3. 
Once verified across 1-2 release cycles, swap docker-publish-split.yml to on: push: tags: v* and retire docker-publish.yml. AGENTS.md and CHANGELOG.md updated with file roles and the migration plan. Production pipeline behavior is bit-for-bit unchanged on this branch.	2026-05-09 16:16:25 +02:00
joakimp	f46c4ed017	CI matrix: add with-pi and omos-with-pi build variants Validate / docs-check (push) Successful in 39s Details Validate / validate-base (push) Successful in 13m40s Details Validate / validate-omos (push) Successful in 19m15s Details Validate / validate-with-pi (push) Successful in 13m53s Details Validate / validate-omos-with-pi (push) Successful in 18m26s Details Publish Docker Image / smoke-base (push) Successful in 12m21s Details Publish Docker Image / smoke-with-pi (push) Successful in 14m17s Details Publish Docker Image / smoke-omos (push) Successful in 16m55s Details Publish Docker Image / smoke-omos-with-pi (push) Successful in 16m22s Details Publish Docker Image / build-base (push) Successful in 40m52s Details Publish Docker Image / build-with-pi (push) Successful in 47m32s Details Publish Docker Image / build-omos (push) Successful in 51m41s Details Publish Docker Image / build-omos-with-pi (push) Successful in 56m44s Details Publish Docker Image / update-description (push) Successful in 15s Details .gitea/workflows/validate.yml: Adds validate-with-pi (INSTALL_PI=true) and validate-omos-with-pi (INSTALL_OMOS=true + INSTALL_PI=true). amd64 single-arch with smoke test, no push. .gitea/workflows/docker-publish.yml: Adds smoke-with-pi → build-with-pi and smoke-omos-with-pi → build-omos-with-pi job pairs. Each push-by-digest multi-arch (amd64+arm64) to Docker Hub with two tags: ${VERSION}-with-pi + latest-with-pi ${VERSION}-omos-with-pi + latest-omos-with-pi update-description.needs[] extended to wait on both new build jobs. scripts/smoke-test.sh: bun-presence check now treats omos and omos-with-pi as the bun variants. Pi state assertions wait up to 30s for entrypoint-user.sh to finish deploying pi-toolkit + extensions (omos-with-pi has more setup work than the base+pi path; the previous sleep-1 was too short and caused empty-error assertion failures on cold starts). 
Local verification (arm64 via OrbStack): base → 1871 MB, all checks PASS omos → 2813 MB, all checks PASS with-pi → 2277 MB, all checks PASS omos-with-pi → 3030 MB, all checks PASS CI now produces 8 Docker Hub tags per release: vX.Y.Z[n], latest vX.Y.Z[n]-omos, latest-omos vX.Y.Z[n]-with-pi, latest-with-pi vX.Y.Z[n]-omos-with-pi, latest-omos-with-pi	2026-05-08 13:53:08 +02:00
Joakim Persson	fc74a8f906	Collapse per-arch matrix back into single multi-arch push jobs Validate / docs-check (push) Successful in 17s Details Validate / validate-omos (push) Successful in 14m21s Details Validate / validate-base (push) Successful in 14m50s Details Publish Docker Image / smoke-base (push) Successful in 11m12s Details Publish Docker Image / smoke-omos (push) Successful in 22m0s Details Publish Docker Image / build-base (push) Successful in 42m25s Details Publish Docker Image / build-omos (push) Failing after 1h16m24s Details Publish Docker Image / update-description (push) Has been cancelled Details v1.14.31c's matrix jobs failed on Upload digest with GHESNotSupportedError — Gitea Actions doesn't support actions/upload-artifact@v4+. Separately, build-omos arm64 hung silently for 12 min in Set-up job, likely catthehacker pull contention between concurrent matrix children. Rather than downgrade artifacts to @v3, collapse the matrix entirely. docker/build-push-action@v7 with platforms: linux/amd64,linux/arm64 publishes a proper multi-arch manifest in one job, so the artifact-passing and imagetools create merge dance only existed to support a matrix split we no longer need. The matrix was designed around load: true disk exhaustion (v1.14.30b), but push-by-digest streams straight to the registry with fundamentally different disk profile. Reclaim step gives enough headroom for the combined amd64+arm64 push case. Workflow: 7 jobs → 5. docker-publish.yml: 263 → ~110 lines of YAML. Also: - timeout-minutes: 90 on build jobs so hung builds fail explicitly - BUILDKIT_PROGRESS=plain at workflow level for line-by-line arm64 logs - AGENTS.md §CI quirks documents the Gitea-specific traps (upload-artifact@v3-only, dash-not-bash, build-push-action@v7 multi-arch convention, reclaim requirement)	2026-05-01 12:28:34 +00:00
Joakim Persson	5a2d06340e	Fix dash-incompatible slash substitution and bump omos size threshold Validate / docs-check (push) Successful in 18s Details Validate / validate-base (push) Successful in 15m44s Details Validate / validate-omos (push) Successful in 15m21s Details Publish Docker Image / smoke-base (push) Successful in 14m30s Details Publish Docker Image / smoke-omos (push) Successful in 15m51s Details Publish Docker Image / build-base (linux/amd64) (push) Failing after 10m58s Details Publish Docker Image / build-omos (linux/amd64) (push) Failing after 15m9s Details Publish Docker Image / build-omos (linux/arm64) (push) Failing after 11m57s Details Publish Docker Image / build-base (linux/arm64) (push) Failing after 39m30s Details Publish Docker Image / merge-base (push) Has been skipped Details Publish Docker Image / merge-omos (push) Has been skipped Details Publish Docker Image / update-description (push) Has been skipped Details v1.14.31b made it through smoke-base and validate-base (reclaim worked), but two narrow bugs blocked the rest: 1. 'Derive platform slug' in the per-arch matrix jobs used bash ${PLATFORM_PAIR//\//-} which dash (/bin/sh in the runner) can't parse — 'Bad substitution'. Rewrote with 'tr / -'. 2. smoke-omos image size 3107 MB tripped the 3000 MB guardrail. All functional checks pass; the mempalace-toolkit bake-in from v1.14.30b added ~100 MB and the threshold was stale. Bumped to 3200 MB. No image-level changes.	2026-05-01 10:43:04 +00:00
Joakim Persson	23894bc19f	Reclaim runner disk before load: true smoke builds Validate / docs-check (push) Successful in 22s Details Validate / validate-base (push) Successful in 18m10s Details Validate / validate-omos (push) Failing after 25m54s Details Publish Docker Image / smoke-base (push) Successful in 11m50s Details Publish Docker Image / build-base (linux/amd64) (push) Failing after 38s Details Publish Docker Image / build-base (linux/arm64) (push) Failing after 21s Details Publish Docker Image / merge-base (push) Has been skipped Details Publish Docker Image / smoke-omos (push) Failing after 19m18s Details Publish Docker Image / build-omos (linux/amd64) (push) Has been skipped Details Publish Docker Image / build-omos (linux/arm64) (push) Has been skipped Details Publish Docker Image / merge-omos (push) Has been skipped Details Publish Docker Image / update-description (push) Has been skipped Details v1.14.31 publish and validate both hit 'No space left on device' on single-arch amd64 smoke/validate builds. The image has crossed ~3 GB and the runner's ~40 GB overlay starts ~70% full, so 'load: true' peak disk (tarball + unpacked image + buildx cache) no longer fits. Add a 'Reclaim runner disk' step to validate-base, validate-omos, smoke-base, smoke-omos. Strips catthehacker-resident toolchains we never use (hosted-tool-cache, dotnet, android, powershell, swift, ghc, jvm, microsoft, chromium, boost), then runs 'docker system prune -af --volumes' + 'docker builder prune -af' against the runner's dockerd before setup-buildx-action. Expected reclaim is 6-12 GB depending on what's resident. Deliberately NOT in the per-arch matrix build jobs — push-by-digest doesn't need it and pruning in parallel jobs risks one job nuking another's in-flight buildx cache. Also add workflow-level concurrency on docker-publish.yml so concurrent tag pushes serialize cleanly.	2026-05-01 09:34:52 +00:00
Joakim Persson	f0918ba915	Bump opencode to 1.14.31 and split multi-arch publish across runners Validate / docs-check (push) Successful in 26s Details Publish Docker Image / smoke-base (push) Failing after 11m1s Details Publish Docker Image / build-base (linux/amd64) (push) Has been skipped Details Publish Docker Image / build-base (linux/arm64) (push) Has been skipped Details Publish Docker Image / merge-base (push) Has been skipped Details Validate / validate-base (push) Failing after 13m48s Details Validate / validate-omos (push) Failing after 15m23s Details Publish Docker Image / smoke-omos (push) Failing after 16m20s Details Publish Docker Image / build-omos (linux/amd64) (push) Has been skipped Details Publish Docker Image / build-omos (linux/arm64) (push) Has been skipped Details Publish Docker Image / merge-omos (push) Has been skipped Details Publish Docker Image / update-description (push) Has been skipped Details The v1.14.30b publish failed on both variants with 'No space left on device' — arm64 QEMU-emulated layers were stored alongside amd64 on the same ~40 GB runner, and the mempalace-toolkit bake-in from v1.14.30b tipped peak disk over the edge during the nodejs dpkg unpack and the git-lfs layer export. Refactor docker-publish.yml to the canonical push-by-digest + manifest-merge pattern: smoke test (amd64) runs on its own runner, each (variant x arch) push target runs on its own fresh runner with outputs=type=image,push-by-digest=true,push=true (no local image store), then a tiny merge job assembles the multi-arch manifest with docker buildx imagetools create from digest artifacts. Per-runner disk peak is roughly one-quarter of the old single-job peak. The four Docker Hub tags per release are unchanged. As a bonus, amd64 and arm64 now build in parallel. No image-level changes beyond the opencode bump.	2026-05-01 08:43:08 +00:00
joakimp	113c9f0bb0	Infrastructure pass: CI smoke tests, floating versions, chown sentinel, generate-config script Main changes: - Extract opencode.json generation from entrypoint-user.sh into a standalone Python script (rootfs/usr/local/lib/opencode-devbox/generate-config.py). Preserves the never-overwrite-existing-config guarantee. Cuts entrypoint-user.sh from 176 to 97 lines. - Install MemPalace via 'uv tool install' into an isolated venv at /opt/uv-tools/mempalace/ with a /usr/local/bin/mempalace-mcp-server wrapper, replacing the 'pip install --break-system-packages' escape hatch. The wrapper is what generate-config.py references in the auto-generated opencode.json. Also fix 'mempalace init' in entrypoint-user.sh to use --yes so first-start initialization isn't interactive (this used to hang or print prompts into the user's terminal). Gated by INSTALL_MEMPALACE build arg (default true) so users who don't need AI memory can shave ~300 MB. - Sentinel-file pattern in entrypoint.sh volume-ownership loop: write .devbox-owner after a successful chown -R, skip the recursive walk on subsequent starts when the sentinel matches FINAL_UID:FINAL_GID. Cuts multi-second startup costs to milliseconds on large volumes (nvim plugins, palace data). UID changes still trigger a full chown. - Float all GitHub/Gitea-hosted binary versions: gosu, fzf, git-lfs, neovim, bat, eza, zoxide, uv, gitea-mcp now default to 'latest' and resolve the newest upstream release at build time via the /releases/latest redirect. Go (go.dev JSON feed) and oh-my-opencode-slim (npm @latest) likewise. Intentional pins still in place: OPENCODE_VERSION, NODE_VERSION=22, DEBIAN_VERSION=trixie-slim. Each *_VERSION ARG accepts an explicit value to lock a specific version when needed. - New scripts/smoke-test.sh verifies binary presence, opencode startup, entrypoint user drop, generate-config idempotency, bun's presence-per-variant, and image size against thresholds (2500 MB base, 3000 MB OMOS). Prints resolved component versions as its first step so CI logs always record what got baked into a given image. - New .gitea/workflows/validate.yml runs on push to main and PRs: single-arch amd64 build, smoke test, DOCKER_HUB.md sync check. Tag-triggered docker-publish.yml now smoke-tests each variant on amd64 before the full multi-arch push. - scripts/generate-dockerhub-md.py auto-generates DOCKER_HUB.md from README.md using explicit SECTION_RULES. --check mode fails CI when the committed file is out of sync. Enforces the 25 kB Docker Hub limit. Adding a new README section forces an explicit keep/drop/replace decision. - Remove dead INSTALL_PYTHON build arg (was a no-op since mempalace added python3 unconditionally).	2026-04-28 23:28:43 +02:00
joakimp	de659fbc54	Switch to new Docker Hub /v2/auth/token API for description updates The old /v2/users/login endpoint is deprecated and returns tokens with insufficient permissions. Use /v2/auth/token with Bearer auth instead.	2026-04-12 19:10:55 +02:00
joakimp	d651a084de	Fix Docker Hub short description: trim to 100-byte limit	2026-04-12 19:00:34 +02:00
joakimp	18b4df23e5	Fix IPv6 connectivity failures: force IPv4 preference in CI builds Publish Docker Image / build-base (push) Successful in 30m22s Details Publish Docker Image / build-omos (push) Successful in 34m37s Details Publish Docker Image / update-description (push) Failing after 15s Details	2026-04-12 17:39:25 +02:00
joakimp	017f7f1343	Fix Docker Hub description update: use --rawfile and capture error response	2026-04-11 23:52:50 +02:00
joakimp	56f98da914	Add error handling to Docker Hub description update step Publish Docker Image / build-base (push) Successful in 31m21s Details Publish Docker Image / build-omos (push) Successful in 32m22s Details Publish Docker Image / update-description (push) Failing after 15s Details	2026-04-11 22:16:09 +02:00
joakimp	078c095116	Parallelize base and omos image builds into separate CI jobs Publish Docker Image / build-base (push) Successful in 30m52s Details Publish Docker Image / build-omos (push) Successful in 35m17s Details Publish Docker Image / update-description (push) Successful in 14s Details	2026-04-11 18:20:33 +02:00
joakimp	c0b887791f	Add Docker Hub description update step to CI workflow	2026-04-10 08:47:27 +02:00
joakimp	4314a3fb88	Fix CI: use vars for username, secrets for token Publish Docker Image / build-and-push (push) Successful in 28m54s Details	2026-04-10 00:23:05 +02:00
joakimp	23a2c3da8c	Fix CI: remove duplicate Docker socket mount Publish Docker Image / build-and-push (push) Failing after 11m17s Details	2026-04-10 00:04:26 +02:00
joakimp	489fa5c60f	Fix CI: use container image with Docker CLI and mount Docker socket Publish Docker Image / build-and-push (push) Failing after 1m7s Details	2026-04-10 00:00:51 +02:00
joakimp	713a1a6f97	Add Gitea Actions workflow for Docker Hub publishing and Docker Hub usage docs Publish Docker Image / build-and-push (push) Failing after 27s Details	2026-04-09 19:26:32 +02:00

37 Commits