feat: split-base build pipeline (parallel, manual-trigger only)
Validate / docs-check (push) Successful in 15s
Validate / validate-base (push) Successful in 12m13s
Validate / validate-omos (push) Failing after 15m48s
Validate / validate-with-pi (push) Successful in 13m43s
Validate / validate-omos-with-pi (push) Has been cancelled

Two-Dockerfile split-base build alongside the existing single-Dockerfile
pipeline. Goal: cut CI wall clock from ~165-180min to ~30-40min on
typical version-bump-only releases by reusing a base image across the
four variants.

Files added:
- Dockerfile.base       variant-independent layers (apt, locales, AWS
                        CLI, Node.js, mempalace, gitea-mcp, user setup,
                        chromadb prewarm, ENVs, entrypoints).
- Dockerfile.variant    FROMs ${BASE_IMAGE} and adds opencode / pi /
                        omos / Go installs gated by INSTALL_* args.
                        Each npm install -g uses NPM_CONFIG_PREFIX=/usr
                        per-RUN to keep baked binaries off the volume-
                        shadowed ~/.pi/npm-global path inherited from
                        base.
- .gitea/workflows/docker-publish-split.yml
                        workflow_dispatch-only pipeline:
                        base-decide -> build-base (conditional) ->
                        smoke-* (4 parallel) -> build-variant-*
                        (4 parallel) -> promote-base-latest ->
                        update-description. Hash-driven base reuse:
                        if base-<sha> already exists on Docker Hub,
                        the build is skipped entirely. Inputs:
                        release_tag (test tag suffix, default
                        v0.0.0-split-test) and promote_latest
                        (default false; gates latest-* aliases and
                        Hub description update).

Files unchanged:
- Dockerfile, docker-publish.yml, validate.yml all left in place so
  the production tag-push pipeline keeps working untouched.

Migration plan (in CHANGELOG Unreleased):
1. workflow_dispatch test run with promote_latest=false; verify the
   four variant images smoke-pass and have plausible sizes.
2. Compare manifest digests against the same-version output from the
   production pipeline (independent test run on the same commit).
3. Once verified across 1-2 release cycles, swap docker-publish-split.yml
   to on: push: tags: v* and retire docker-publish.yml.

AGENTS.md and CHANGELOG.md updated with file roles and the migration
plan. Production pipeline behavior is bit-for-bit unchanged on this
branch.
This commit is contained in:
2026-05-09 16:16:25 +02:00
parent b5da6a5cf8
commit 4c27e6fd8a
5 changed files with 1017 additions and 2 deletions
+4
View File
@@ -19,6 +19,10 @@ Image changes (will ship in the next tagged release):
- **Fix:** `pi install npm:<pkg>` (and any `npm install -g`) by the `developer` user no longer EACCES against the system npm prefix. `NPM_CONFIG_PREFIX` is now `/home/developer/.pi/npm-global` and the prefix's `bin/` is prepended to `PATH`. The directory lives on the `devbox-pi-config` named volume, so user-installed pi packages (themes, skills, extensions) survive container recreation and image rebuilds. Build-time `npm install -g` calls (opencode, pi, oh-my-opencode-slim) are unaffected because the new ENVs are declared after those steps in the Dockerfile, so the baked binaries still install to `/usr` and are not shadowed by the volume mount.
Build pipeline (branch `feat/split-build`, not yet merged):
- **New: split-base build pipeline.** `Dockerfile.base` (variant-independent layers — apt, locales, AWS CLI, Node.js, mempalace, gitea-mcp, user setup, chromadb prewarm, ENVs, entrypoints) builds once and is published as `joakimp/opencode-devbox:base-<sha>`. `Dockerfile.variant` `FROM`s that base and adds only opencode/omos/pi installs (or skips them per build-args). Companion workflow `.gitea/workflows/docker-publish-split.yml` runs as a `workflow_dispatch`-only pipeline alongside the existing `docker-publish.yml` so they don't conflict. Hash-driven base reuse: a content hash of `Dockerfile.base + rootfs/ + entrypoint*.sh` becomes the base tag; if the tag already exists on Docker Hub, the base build is skipped entirely. Estimated wall clock: version-bump-only release ~3040 min (vs ~165180 min today); base-touching release ~6070 min. Trade-off: two Dockerfiles to maintain, and `npm install -g` in the variant must override `NPM_CONFIG_PREFIX=/usr` per-RUN to keep baked binaries off the volume-shadowed path. Once 12 successful workflow_dispatch runs validate the output against the existing pipeline, the new workflow takes over `on: push: tags: v*` and the original is retired.
Docs-only (the DOCKER_HUB.md change can be patched live to Hub without a CI rebuild; AGENTS.md change is repo-internal):
- **Hub doc rewrite:** `DOCKER_HUB.md` is now generated from a hand-maintained `HUB_TEMPLATE` constant in `scripts/generate-dockerhub-md.py` instead of a section-by-section transformation of `README.md`. Drops from 24 997 bytes (3 byte headroom) to 5 551 bytes (~78% headroom). The old derive-from-README mechanism (`SECTION_RULES`, `TRIM_SUBSECTIONS`, `REPLACEMENTS`, `split_sections`, `trim_subsections`) is gone — README and Hub doc are now independent surfaces. Hub copy stays slim and links out to the gitea README for full depth (build args, multi-user setup, AWS Bedrock walkthrough, MemPalace deep-dive, language-specific dev sections). Trade-off: image-variants table and quick-start flow are now coupled to `HUB_TEMPLATE` and need a manual edit when they change — explicit and local rather than spread across rules.