# Manual host-side publish — escape hatch when CI is broken

This runbook is the procedure for publishing an opencode-devbox release **directly from a developer host** when the Gitea Actions → Docker Hub path is broken. Used in anger on 2026-05-28 to ship `v1.15.12` after five consecutive CI publish failures (runs #332/333/334/336 + a rerun) and as a parallel diagnostic that pinpointed the root cause (buildkit `cache-export mode=max` returning HTTP 400 from the Hub CDN).

The procedure is also a **diagnostic probe**. If the host-side publish succeeds where CI fails, the failure is somewhere in the runner → Hub path (cache-export, runner egress, runner-image, action versions). If host-side fails the same way, the failure is in your local buildx + Hub combination and you need a different escape (different network, different account, file an upstream).

## When to reach for this

- Tag pushed, CI keeps failing on `docker buildx build --push`, the failure shape is stable across reruns.
- Failure body looks like a registry-tier rejection (HTTP 4xx, HTML response body, repeats on every retry) — i.e. not a transient.
- You've already disproved the obvious suspects (action pin, runner image, network) per the [`ci-release-watcher` skill](../../../.agents/skills/ci-release-watcher/SKILL.md) playbook.
- You need the release **shipped today** and don't want to wait for a CI fix to land + re-trigger.

If CI is broken because **a workflow change you just made is bad**, fix the workflow and re-tag with a letter suffix. This runbook is for when the workflow looks correct but the publish path itself is broken.

## Prerequisites on the host

- Docker (or Orbstack on macOS) with `docker buildx` available — multi-arch publish needs `setup-qemu` equivalent. Orbstack ships QEMU emulators for both archs by default; on Linux install `qemu-user-static` and run `docker run --privileged --rm tonistiigi/binfmt --install all` once per host.
- `docker login` credentials for `joakimp` on Docker Hub (PAT or password). Confirm with `docker info | grep Username`.
- A clone of `opencode-devbox` checked out at the **exact tag** you want to publish. `git status` clean. `git describe --tags --exact-match HEAD` should print the tag.
- Network connectivity to `registry-1.docker.io` from the host. Verify with `curl -sI https://registry-1.docker.io/v2/ | head -1` (expects `401 Unauthorized` — that's the v2 API saying "auth required", which means you can reach it).

## How to use this runbook

A working reference script lives next to this doc: **[`docs/manual-host-publish.sh`](manual-host-publish.sh)**. It is the literal script that shipped opencode-devbox v1.15.12 on 2026-05-28 from a developer Mac via Orbstack, with the BASE_HASH and version pins of that release. To publish a different release, **copy it to a new file, edit four constants at the top, and run it**:

```bash
cp docs/manual-host-publish.sh /tmp/manual-publish-vX.Y.Z.sh
# Edit at top of file:
#   RELEASE_TAG="vX.Y.Z"
#   BASE_HASH="<12-char hash from CI's base-decide step>"
#   OMOS_VERSION="<from npm registry, see step 2 below>"
bash /tmp/manual-publish-vX.Y.Z.sh
```

The sections below explain what the script does and what you need to know to edit those three constants safely.

## 1. Pin RELEASE_TAG

The git tag you're publishing. Must match a tag in the local clone:

```bash
git fetch && git checkout v1.15.13   # whatever you're publishing
git describe --tags --exact-match HEAD
```

The script asserts `HEAD == ${RELEASE_TAG}^{commit}` before doing anything destructive. If you've drifted, fix it with `git checkout` before running.

## 2. Pin OMOS_VERSION

Gitea CI's `resolve-versions` job queries the npm registry at workflow time and threads the concrete version through the omos variant build, mitigating the silent same-bytes-across-releases regression class documented in `AGENTS.md`. Do the same by hand:

```bash
curl -sf https://registry.npmjs.org/oh-my-opencode-slim/latest | jq -r .version
```

Paste the version string into the script's `OMOS_VERSION` constant. Don't leave the script defaulting to `latest` — the registry buildcache will silently reuse a stale layer if the build-arg byte-equals a previous build.

## 3. Pin BASE_HASH

This is the 12-char hash that CI's `base-decide` job computes from `Dockerfile.base` + `rootfs/**` + `entrypoint*.sh`. Three ways to get it, in order of preference:

**A. From a prior CI run on the same commit** (cheapest — if the Gitea Actions run that triggered on this tag got far enough to log `base-decide`'s output, just read it):

```
Gitea Actions → the run for vX.Y.Z → base-decide job → "Compute base tag" step → last line:
   Computed base tag: base-XXXXXXXXXXXX
```

This is the canonical source. The whole reason for the manual escape is that *something later in CI broke* — `base-decide` itself is fast, deterministic, and almost always succeeds.

**B. From an existing image on the Hub** if a recent release already published a `base-<hash>` tag and the inputs haven't changed, you can copy that hash. Confirm with `docker manifest inspect joakimp/opencode-devbox:base-latest` and read the digest — if it matches a `base-<hash>` you already see on the Hub, that hash is yours.

**C. Compute it locally**, replicating CI's exact recipe (the script in `.gitea/workflows/docker-publish-split.yml` `base-decide.compute`):

```bash
{
  cat Dockerfile.base
  find rootfs -type f \
    ! -path '*/__pycache__/*' \
    ! -name '*.pyc' \
    ! -name '.DS_Store' \
    ! -name '._*' \
    -print0 2>/dev/null | sort -z | xargs -0 cat 2>/dev/null
  cat entrypoint.sh entrypoint-user.sh
} | sha256sum | cut -c1-12
```

The junk-file filters (`__pycache__`, `.DS_Store`, `._*` AppleDouble) matter — they are gitignored but `find -type f` picks them up locally and would diverge your hash from CI's clean checkout. Don't skip them.

If method C disagrees with method A, **trust A** and find out why your local tree differs. The hash in CI is what's on the Hub; that's what variants must FROM.

## What the script does (high level)

After the constants are set, the script runs a 5-step procedure. No editing needed inside the body; the whole flow is parameterised by the four constants above plus `IMAGE` (which is fixed to `joakimp/opencode-devbox`).

1. **Preflight** — buildx present, tag exists, `HEAD == tag`, multi-arch builder created if missing.
2. **Base build (conditional)** — probe `${IMAGE}:base-${BASE_HASH}` on the Hub; if missing, build it multi-arch and push. **No `--cache-from` / `--cache-to`.** That's the whole point of this escape. If the base push itself fails the same way CI did, stop — the regression has spread to image push and you need a different host or account, not this runbook.
3. **Promote `base-latest`** — `docker buildx imagetools create` re-tags by manifest reference. No rebuild.
4. **Variants × 2** — sequential (not parallel; one host's egress can't saturate multiple multi-arch pushes safely). Each variant is `Dockerfile.variant` `FROM ${IMAGE}:base-${BASE_HASH}` plus the appropriate `INSTALL_OPENCODE` / `INSTALL_OMOS` build-args, tagged `${RELEASE_TAG}${suffix}` and `latest${suffix}`.
5. **Verify** — prints the digest of all 6 expected tags (4 variant + base-hash + base-latest). Spot-check that each `vX.Y.Z*` and its `latest*` alias share a digest.

Expected wall time on a recent Mac: ~25-40 min (base ~3 min if rebuilt, each variant ~3-7 min mostly QEMU arm64 emulation).

## Optional: update DOCKER_HUB.md description

CI's `update-description` job posts the rendered Hub description via the Hub API. The manual script does **not** do this — the release works fine without it. If you want parity, copy the curl invocation from the `update-description` job in `.gitea/workflows/docker-publish-split.yml` and run it from the host with a Hub PAT loaded into `HUB_PAT`. Cosmetic; can wait until CI is healthy and the next release pushes a fresh description automatically.

## After: capture diagnostic value

The whole point of running this manually is the diagnostic. Three things to record before moving on:

1. **Did the host publish succeed?** If yes and CI was failing on the same exact code, you've localised the failure to the runner side (cache-export, network, runner image). If no, the failure is in your local buildx + Hub combination and CI is a victim, not a cause.
2. **What was different from CI?** Document at minimum: `docker buildx version`, the host's `buildx ls` output (driver name + version), whether you used `--cache-to` or not, and which network you were on.
3. **File the upstream.** If the diagnostic narrowed the failure to a specific buildkit/buildx behaviour, file at `moby/buildkit` or `docker/buildx` with: stable failure shape, the exact request URL fragment (`Offset:0` / `_state=...` / digest if visible), the timeline boundary when failures started, and what worked vs what failed in your repro. The 2026-05-28 cache-export-mode=max regression is a worked example.

Restore CI as the primary publish path as soon as the underlying regression is fixed or worked around at workflow level. This runbook should be exercised rarely.

## Variants of this runbook

- **pi-devbox** — same idea, simpler: only one image (`joakimp/pi-devbox`), one tag pair (`vX.Y.Z` + `latest`), no split base. Adapt the script: drop the `BASE_HASH` constant + steps 2-3 + the variant function; replace with a single `docker buildx build --file Dockerfile --build-arg PI_VERSION=... --tag joakimp/pi-devbox:${RELEASE_TAG} --tag joakimp/pi-devbox:latest --push .`.
- **opencode-devbox letter-suffix rebuild** (e.g. `v1.15.12b`) — same procedure end-to-end. The `BASE_HASH` will probably be unchanged from the prior release if no rootfs/entrypoint/Dockerfile.base changes shipped, so the base-build step skips itself automatically via the Hub probe.
- **Single-variant publish** for partial-failure recovery (e.g. CI succeeded for base + 3 variants but the 4th failed) — comment out the three completed `build_variant` calls in your copy of the script. Keep `imagetools create` for `base-latest` only if it didn't already promote. Then re-run.