fc74a8f906
Validate / docs-check (push) Successful in 17s
Validate / validate-omos (push) Successful in 14m21s
Validate / validate-base (push) Successful in 14m50s
Publish Docker Image / smoke-base (push) Successful in 11m12s
Publish Docker Image / smoke-omos (push) Successful in 22m0s
Publish Docker Image / build-base (push) Successful in 42m25s
Publish Docker Image / build-omos (push) Failing after 1h16m24s
Publish Docker Image / update-description (push) Has been cancelled
v1.14.31c's matrix jobs failed on Upload digest with GHESNotSupportedError — Gitea Actions doesn't support actions/upload-artifact@v4+. Separately, build-omos arm64 hung silently for 12 min in Set-up job, likely catthehacker pull contention between concurrent matrix children. Rather than downgrade artifacts to @v3, collapse the matrix entirely. docker/build-push-action@v7 with platforms: linux/amd64,linux/arm64 publishes a proper multi-arch manifest in one job, so the artifact-passing and imagetools create merge dance only existed to support a matrix split we no longer need. The matrix was designed around load: true disk exhaustion (v1.14.30b), but push-by-digest streams straight to the registry with fundamentally different disk profile. Reclaim step gives enough headroom for the combined amd64+arm64 push case. Workflow: 7 jobs → 5. docker-publish.yml: 263 → ~110 lines of YAML. Also: - timeout-minutes: 90 on build jobs so hung builds fail explicitly - BUILDKIT_PROGRESS=plain at workflow level for line-by-line arm64 logs - AGENTS.md §CI quirks documents the Gitea-specific traps (upload-artifact@v3-only, dash-not-bash, build-push-action@v7 multi-arch convention, reclaim requirement)
82 lines
9.4 KiB
Markdown
82 lines
9.4 KiB
Markdown
# AGENTS.md
|
|
|
|
## Project overview
|
|
|
|
Docker image packaging [opencode](https://opencode.ai) into a production-ready dev container. Two image variants (base and omos) are published to Docker Hub via Gitea Actions CI. Not a library or application — this is infrastructure (Dockerfile, entrypoint scripts, docker-compose, documentation).
|
|
|
|
## File roles
|
|
|
|
- `Dockerfile` — single multi-stage build for both variants. OMOS variant is controlled by `INSTALL_OMOS=true` build arg; mempalace is controlled by `INSTALL_MEMPALACE` (default `true`). All GitHub-sourced binaries are pinned with version ARGs.
|
|
- `entrypoint.sh` — runs as root: UID/GID adjustment, SSH permissions, volume ownership fixes (skipped via `.devbox-owner` sentinel when ownership is already correct). Then drops to developer via gosu.
|
|
- `entrypoint-user.sh` — runs as developer: git config, opencode.json generation (delegated to `generate-config.py`), OMOS setup.
|
|
- `rootfs/usr/local/lib/opencode-devbox/generate-config.py` — generates `~/.config/opencode/opencode.json` from env vars. Never overwrites an existing config. Auto-registers MCP servers for detected tools (mempalace via the `mempalace-mcp` entry point, gitea-mcp).
|
|
- `scripts/smoke-test.sh` — post-build image verification. Asserts binary presence, opencode startup, entrypoint correctness, config generation idempotency, and image size thresholds. Used by both CI workflows.
|
|
- `scripts/generate-dockerhub-md.py` — generates `DOCKER_HUB.md` from `README.md` using explicit section rules. `--check` fails if the committed file is out of sync (enforced by the `validate` workflow).
|
|
- `DOCKER_HUB.md` — **auto-generated** from README. Do not edit directly. Pushed to Docker Hub description via CI API call. Must stay under 25 kB. Short description field must be ≤100 bytes.
|
|
- `README.md` — authoritative source documentation. Sections are selected/dropped/replaced for DOCKER_HUB.md per `SECTION_RULES` in `scripts/generate-dockerhub-md.py`.
|
|
- `.gitea/workflows/validate.yml` — lightweight amd64 build + smoke test on push to main and PRs. Also runs the DOCKER_HUB.md sync check.
|
|
- `.gitea/workflows/docker-publish.yml` — CI pipeline on tag push: smoke-test each variant on amd64, then full multi-arch (amd64 + arm64) build-and-push, then update Docker Hub description.
|
|
|
|
## Versioning scheme
|
|
|
|
Tags follow `v{opencode_version}[letter]` — e.g. `v1.14.20` for the first build on a new opencode release, and `v1.14.20b`, `v1.14.20c`, … for subsequent rebuilds on the same opencode version.
|
|
|
|
- The number tracks the opencode npm version (see `OPENCODE_VERSION` ARG in `Dockerfile`).
|
|
- **No letter suffix** on the first build of a new opencode version — the bare `v{opencode_version}` tag is the canonical release.
|
|
- **Letter suffix is the build ordinal**, starting at `b` for the second build. The letter `a` is **never used** — think of the suffix as counting rebuilds: `b = 2nd, c = 3rd, d = 4th, …`. For opencode version `1.14.20`: first build `v1.14.20`, second `v1.14.20b`, third `v1.14.20c`, and so on.
|
|
- A letter suffix is only used for container-level rebuilds — tooling changes, CVE fixes, doc-driven rebuilds, entrypoint bugfixes — that don't change the underlying opencode version.
|
|
|
|
CI produces four Docker Hub tags per release: `vX.Y.Z[n]`, `latest`, `vX.Y.Z[n]-omos`, `latest-omos`.
|
|
|
|
When bumping the opencode version, also bump `OPENCODE_VERSION` in `Dockerfile` and update the comment in `.env.example` if it names a specific model/version for context.
|
|
|
|
## Critical conventions
|
|
|
|
- **entrypoint.sh volume ownership loop** — when adding a new named volume mount point, add it to the `for dir in ...` loop in `entrypoint.sh` so root-owned volumes get chowned on startup. The loop writes a `.devbox-owner` sentinel after a successful chown so subsequent starts skip the recursive walk. Users should not touch these files.
|
|
- **Two docs to keep in sync (automated)** — `README.md` is the source of truth. `DOCKER_HUB.md` is auto-generated by `scripts/generate-dockerhub-md.py`. When adding a new top-level section to README, either add it to `SECTION_RULES` in that script or the `--check` run will fail CI. `.env.example` must still be hand-updated to match Dockerfile/entrypoint behavior.
|
|
- **GitHub/Gitea-sourced binaries float by default** — gosu, fzf, git-lfs, nvim, bat, eza, zoxide, uv, gitea-mcp, Go, oh-my-opencode-slim all default to `latest`. Each build-time install step reads the `/releases/latest` Location redirect (or the go.dev JSON feed for Go) and derives the concrete version. Use the same `ARCH` case-switch pattern for multi-arch support (amd64/arm64). Intentional pins: `OPENCODE_VERSION` (drives the image tag), `NODE_VERSION=22` (major pin), `DEBIAN_VERSION=trixie-slim` (OS base). Adding a new upstream tool: follow the existing floated-version pattern, don't hardcode a specific tag.
|
|
- **Resolved versions are logged by the smoke test** — `scripts/smoke-test.sh` prints a "Resolved component versions" table as its first step. CI logs always capture what got baked into a given image even when ARGs default to `latest`.
|
|
- **Shell scripts use `set -euo pipefail`** — both entrypoints are strict. Errors in volume chown or SSH permission operations are intentionally suppressed with `|| true`.
|
|
- **MemPalace install path** — installed via `uv tool install` into `/opt/uv-tools/mempalace/`. Both the `mempalace` CLI and the `mempalace-mcp` MCP server binary are shipped as entry points by the mempalace package itself and placed on PATH by uv as shims whose shebangs point at the venv's Python. No hand-rolled wrapper is needed. Do not use `pip install --break-system-packages` — that was the previous approach and has been removed. Do not use `["python3", "-m", "mempalace.mcp_server"]` in `opencode.json` — system Python can't import from the uv venv.
|
|
- **generate-config.py idempotency** — the script MUST never overwrite an existing `opencode.json`. Users bind-mount their config directory or persist it across container recreations; accidentally clobbering that file would destroy hand-edits. The smoke test asserts this.
|
|
- **Docker Hub description update** — uses `/v2/auth/token` endpoint (not the deprecated `/v2/users/login`). Auth uses `identifier`/`secret` fields, returns `access_token`, sent as `Bearer`. Short description must be ≤100 bytes.
|
|
|
|
## CI quirks
|
|
|
|
- Both build jobs include an IPv4 preference step (`gai.conf` + `driver-opts: network=host` for buildx) to work around intermittent IPv6 failures on the Gitea runners.
|
|
- `update-description` job runs only when both builds succeed (`needs: [build-base, build-omos]`).
|
|
- Tags must be pushed to trigger the publish workflow. The validate workflow runs on push to main and PRs.
|
|
- Smoke tests run on amd64 only (single-arch load into the local daemon). The multi-arch push happens after smoke passes.
|
|
- **Gitea Actions runner has ~40 GB disk, often 70%+ used at job start.** All four `load: true` jobs (`validate-base`, `validate-omos`, `smoke-base`, `smoke-omos`) include a `Reclaim runner disk` step that strips catthehacker-resident toolchains and prunes stale docker state before `setup-buildx-action`. Build jobs use a lighter version (push-by-digest doesn't need `docker system prune`). Don't remove these steps without testing on a fresh runner.
|
|
- **`docker/build-push-action@v7` with `platforms: linux/amd64,linux/arm64` handles multi-arch push natively in a single job** — produces a proper manifest list, no matrix or merge step needed. An earlier revision split into per-arch matrix jobs with digest artifacts, but that pattern requires `actions/{upload,download}-artifact@v4+` which Gitea Actions doesn't support (see below).
|
|
- **`actions/upload-artifact` and `actions/download-artifact` must stay at @v3 on Gitea.** v4+ uses a GitHub-Enterprise-specific Artifact API; runs fail with `GHESNotSupportedError`. If you need artifacts for a new reason (build logs, SBOMs, etc.), pin @v3 explicitly.
|
|
- **Step scripts run under `/bin/sh` (dash), not bash.** Avoid bash-isms like `${VAR//a/b}` parameter-pattern substitution; use POSIX alternatives (`tr`, `sed`) or declare `shell: bash` on the step.
|
|
- **`BUILDKIT_PROGRESS=plain`** is set at workflow level on `docker-publish.yml` so arm64-under-QEMU builds log each layer line-by-line. The default collapsed progress UI hides which step is stalled, which made diagnosing earlier hangs expensive.
|
|
|
|
## Testing changes
|
|
|
|
The smoke test (`scripts/smoke-test.sh`) is the canonical check and runs automatically in CI. To run locally:
|
|
|
|
```bash
|
|
# Base image
|
|
docker compose build
|
|
bash scripts/smoke-test.sh opencode-devbox --variant base
|
|
|
|
# OMOS image
|
|
docker build --build-arg INSTALL_OMOS=true -t opencode-devbox:omos .
|
|
bash scripts/smoke-test.sh opencode-devbox:omos --variant omos
|
|
```
|
|
|
|
For manual/exploratory testing:
|
|
1. `docker compose run --rm devbox bash`
|
|
2. Check specific tools inside: `nvim --version`, `bat --version`, `uv --version`, `mempalace --help`, etc.
|
|
3. For entrypoint changes: test with a non-1000 UID workspace to verify UID adjustment, volume ownership fixes, and the `.devbox-owner` sentinel behavior.
|
|
4. For `generate-config.py` changes: run standalone with `HOME=/tmp/fake OPENCODE_PROVIDER=anthropic python3 rootfs/usr/local/lib/opencode-devbox/generate-config.py`.
|
|
|
|
## Commit style
|
|
|
|
Imperative mood, first line summarizes the change. Multi-line body explains "why" when non-obvious. Examples from history:
|
|
- `Fix ownership of named volume mount points in entrypoint`
|
|
- `Add uv package manager to base image for on-demand Python support`
|
|
- `Upgrade base image from Debian bookworm to trixie (current stable)`
|