pi 9ebb0643c7 docs: fix drift — sync compose/volumes, studio coverage, mempalace link
Audit found README/AGENTS carried a stale compose/volume set that
diverged from the shipped docker-compose.yml (DOCKER_HUB + compose +
.env.example were already consistent — README was the outlier):

- README compose block + 'Volumes and persistence' table: correct volume
  names (devbox-shell-history not -bash-history; devbox-uv at
  ~/.local/share/uv not devbox-uv-tools at /opt/uv-tools — the latter
  would SHADOW the baked mempalace install at UV_TOOL_DIR); add
  devbox-ssh-local + devbox-zoxide; mark devbox-palace/-chroma-cache
  optional; WORKSPACE_PATH/SSH_KEY_PATH (not HOST_WORKSPACE).
- README quickstart: 'compose exec -u developer' (no USER in image; bare
  exec lands a root shell).
- README: pi-studio now 'shipped' not 'planned'; build-pipeline + tag
  table cover -studio + smoke-studio/build-variant-studio.
- AGENTS: backward-compat volume names corrected; repo-layout bullets
  cover pi-studio install + studio-expose + STUDIO_EXPOSE bridge.
- DOCKER_HUB: MemPalace source link -> upstream MemPalace/mempalace
  (matches Dockerfile.base + CHANGELOG refs).

Note: the shipped v1.0.0 CHANGELOG migration note still lists the old
(incorrect) volume names; left as immutable released history.
2026-06-10 23:52:17 +02:00


# pi-devbox
A self-contained Docker image for running [pi](https://pi.dev) — the pi
coding-agent — in an isolated, reproducible Linux environment with a
curated set of developer tooling, AI memory, and shell improvements.
pi-devbox is opinionated about what's inside but unopinionated about how
you use it: a single `docker compose up` gives you an interactive
container with pi, a stack of modern CLI tools, MemPalace for persistent
agent memory across sessions, and a UID-aligned `/workspace` mount so
files you edit inside the container appear with your normal ownership
on the host.
## What's inside
### The pi coding-agent
- `pi` — the pi-coding-agent CLI (`@earendil-works/pi-coding-agent`)
- `pi-toolkit` — keybindings, AWS env loader, settings template
- `pi-extensions` — TypeScript extensions for pi (preview, MCP bridges,
mempalace integration, etc.)
- `pi-fork` — the `fork` tool for spawning sub-agents
- `pi-observational-memory` — the `recall` tool for session compaction
### MemPalace (AI memory)
- `mempalace` — local-first agent memory system (29 MCP tools)
- `mempalace-toolkit` — bash wrappers for session/docs mining
- ChromaDB embedding model pre-warmed at build time (`all-MiniLM-L6-v2`)

The host-mounted palace at `~/.mempalace` is shared across the host and
this container so all your agents share one brain.
### Modern CLI tooling
| Tool | Purpose |
|---|---|
| `nvim` | Neovim text editor |
| `tmux` | Terminal multiplexer (configured for 0-indexed windows and panes) |
| `ripgrep`, `fd` | Fast file content / filename search |
| `fzf` | Fuzzy finder |
| `bat` | Syntax-highlighted `cat` |
| `eza` | Modern `ls` |
| `zoxide` | Smart `cd` |
| `jq`, `yq` | JSON / YAML query and transformation |
| `tldr` (tealdeer) | Quick command examples |
| `git`, `git-lfs`, `git-crypt` | Git + extensions |
| `gitleaks` | Secret scanning pre-commit hook |
| `gosu` | Privilege de-escalation in entrypoint |
| `htop`, `tree`, `less` | Inspection utilities |
### Document and image tooling
- `pandoc` — universal Markdown↔HTML/Org/RST/etc. converter
- `graphviz` — `dot` rendering for diagram pipelines
- `imagemagick` — image conversion / resizing (invoked as `magick`)
### Language toolchains
- `python3` + `python3-venv` + `python3-pip` (system Python)
- `uv` + `uvx` — fast Python package manager (preferred over pip/venv)
- `nodejs` (v22) + `npm`
- `gcc`, `g++`, `make` — C/C++ build tools
- `rustup-init` — Rust toolchain installer (toolchains opt-in at runtime)
- Optional `INSTALL_GO=true` build arg for Go

For Python REPLs and notebooks beyond the system interpreter, see the
[uv-driven REPL recipes](#uv-driven-repl-recipes) section.
### Cloud + secrets
- AWS CLI v2 — for SSO + Bedrock auth
- `gitea-mcp` — MCP server for Gitea API
- `age`, `git-crypt` — encryption tooling
### SSH and networking
- OpenSSH client with **ControlMaster auto** preconfigured on a
writable socket path (`/tmp/sshcm/`). Mitigates ssh banner-exchange
failures behind CGNAT-restricted residential ISPs (~4-flow caps) by
multiplexing many ssh calls over one TCP flow.
- A LAN-access helper that auto-configures ssh jump-via-host on
VM-backed hosts (OrbStack / Docker Desktop on macOS) so the container
can reach the host's directly-attached LAN peers.
## Quickstart
### Prerequisites
- Docker or OrbStack (recommended on macOS)
- Optional: AWS credentials configured on the host if you'll use the
Bedrock LLM provider
### Pull and run
```bash
git clone https://gitea.jordbo.se/joakimp/pi-devbox
cd pi-devbox
cp .env.example .env # edit if needed
docker compose up -d
docker compose exec -u developer devbox bash
```
You're now in the container as user `developer` with `pi` on PATH and
your host workspace mounted at `/workspace`.
To start pi:
```bash
pi
```
First-run pi-toolkit and pi-extensions install steps run automatically
on container start; symlinks are written to `~/.pi/agent/` on the
named volume (so they persist across container recreations).
### Stop / recreate / update
```bash
docker compose down # stop, keep volumes
docker compose down -v # stop, wipe named volumes (a bind-mounted palace is unaffected; an optional devbox-palace volume is wiped)
docker compose pull # fetch latest image
docker compose up -d --force-recreate
```
## Image variants
Currently published:
| Tag | Includes | Size (approx.) |
|---|---|---|
| `joakimp/pi-devbox:latest` | base + pi + tooling | ~3.2 GB |
| `joakimp/pi-devbox:vX.Y.Z` | pinned-version equivalent | ~3.2 GB |
| `joakimp/pi-devbox:latest-studio` | `latest` + [pi-studio](https://github.com/omaclaren/pi-studio) (browser prompt editor, KaTeX/Mermaid preview, tmux-backed literate REPLs) | ~3.25 GB |
| `joakimp/pi-devbox:vX.Y.Z-studio` | pinned-version studio equivalent | ~3.25 GB |

Planned for an upcoming minor release:

- `joakimp/pi-devbox:latest-studio-tex` — `-studio` plus `texlive-xetex`
  for PDF export from Studio. Adds ~600 MB on top of `-studio`.
## Using pi-studio (`-studio` variant)
The `-studio` images bundle [pi-studio](https://github.com/omaclaren/pi-studio):
a two-pane browser workspace with a prompt/response editor, live
KaTeX/Mermaid preview, and tmux-backed literate REPLs (Shell / Python /
IPython / Julia / R / GHCi / Clojure). It is registered automatically on
container start (no `pi install` needed) and exposes the `/studio` slash
command plus the `studio_repl_send` / `studio_export_*` agent tools.
Inside a pi session in the container:
```
/studio --no-browser --port 8765 # pin a fixed port; STUDIO_PORT=8765 is the baked default
/studio --status # reprint the tokenized URL
```
### Reaching the UI from your browser (the container caveat)
pi-studio **hard-binds its server to `127.0.0.1` inside the container**
(`index.ts`: `.listen(port, "127.0.0.1")`) and serves a tokenized URL.
There is no `--host`/bind flag. This matters for a container: a plain
`docker run -p 8765:8765` publish forwards to the container's *external*
interface, **not** its loopback, so it will not reach Studio. Two paths
work:

**A. Host networking (simplest — OrbStack / single-host, no bridge).**
Run the container with host networking so the container's loopback is the
host's loopback:
```yaml
services:
  devbox:
    network_mode: host # container 127.0.0.1 == host 127.0.0.1
```
Then `http://127.0.0.1:8765/?token=…` works in a browser on the Docker
host. Studio never leaves loopback, so nothing is published beyond the
host; the trade-off is that host networking removes the container's
network isolation. Note: host networking also changes
`host.docker.internal` semantics, so weigh it against the LAN-jump SSH
feature if you use that.

**B. `studio-expose` bridge (portable — any networking mode).** Publish a
port and run the bundled `studio-expose` helper, which uses `socat` to
bridge the container's loopback to its external interface (binding the
egress IP on the same port, so the token URL Studio printed works
verbatim):
```yaml
services:
  devbox:
    ports:
      - "127.0.0.1:8765:8765" # host-localhost only
    environment:
      - STUDIO_EXPOSE=1 # auto-start the bridge on container boot
```
With `STUDIO_EXPOSE=1`, the entrypoint starts the bridge for you; just run
`/studio --port 8765` in your pi session. To bridge manually instead
(leave `STUDIO_EXPOSE` unset), run `studio-expose` in a container shell:
```bash
studio-expose # bridges $STUDIO_PORT (default 8765); --help for details
```
> **Security:** the bridge intentionally exposes Studio beyond loopback;
> its tokenized URL is the only auth. Keep the host-side publish on
> `127.0.0.1:` and use `ssh -L` for remote access. Default is **off**.
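
For intuition about what such a bridge involves, here is a minimal sketch of a `socat` invocation that listens on the container's egress address and forwards to Studio on loopback. It is not the shipped `studio-expose` script: the function name `bridge_cmd` and the example address `172.17.0.2` are assumptions for illustration, and the function only prints the command rather than running it.

```bash
# Hypothetical sketch of a loopback-to-egress bridge; NOT the shipped helper.
# Prints the socat command so it can be inspected before running it.
bridge_cmd() {
  # $1: container egress IP (assumed known), $2: port (default 8765)
  local ip="$1"
  local port="${2:-8765}"
  printf 'socat TCP-LISTEN:%s,bind=%s,fork,reuseaddr TCP:127.0.0.1:%s\n' \
    "$port" "$ip" "$port"
}

# 172.17.0.2 is a made-up example address.
bridge_line="$(bridge_cmd 172.17.0.2 8765)"
echo "$bridge_line"
```

Binding the listener to the egress address, rather than to all interfaces, is what lets it share the port number with Studio's own listener on `127.0.0.1`.
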
### Remote host (SSH / mosh)
When the Docker host is remote, keep Studio on localhost and forward the
port from your laptop:
```bash
ssh -L 8765:127.0.0.1:8765 user@docker-host # then open the token URL locally
```
**mosh cannot forward ports** (no `-L`/`-R` equivalent). To use Studio
over a mosh session, run a *separate* `ssh -L 8765:127.0.0.1:8765 host`
tunnel alongside mosh (mosh for the shell, ssh for the port), or reach the
host's published port directly over a trusted network (LAN / Tailscale /
WireGuard).
> PDF export (`/studio-pdf`, `studio_export_pdf`) needs a LaTeX engine,
> which is **not** in `-studio` (only the planned `-studio-tex`). HTML
> export, KaTeX, Mermaid, and all REPL features work without it.
## docker-compose.yml — basic shape
```yaml
name: pi-devbox
services:
  devbox:
    image: joakimp/pi-devbox:latest
    container_name: pi-devbox
    stdin_open: true
    tty: true
    env_file:
      - .env
    environment:
      - TERM=xterm-256color
      - GITEA_ACCESS_TOKEN=${GITEA_ACCESS_TOKEN:-}
      - GITEA_HOST=${GITEA_HOST:-}
      - GITHUB_PERSONAL_ACCESS_TOKEN=${GITHUB_PERSONAL_ACCESS_TOKEN:-}
    volumes:
      # Workspace: your host source tree
      - ${WORKSPACE_PATH:-.}:/workspace
      # SSH keys: read-only from host
      - ${SSH_KEY_PATH:-~/.ssh}:/home/developer/.ssh:ro
      # Per-container persistent state
      - devbox-pi-config:/home/developer/.pi
      - devbox-ssh-local:/home/developer/.ssh-local
      - devbox-shell-history:/home/developer/.cache/bash
      - devbox-zoxide:/home/developer/.local/share/zoxide
      - devbox-nvim-data:/home/developer/.local/share/nvim
      - devbox-uv:/home/developer/.local/share/uv
      # Optional (uncomment to enable):
      # - ~/.aws:/home/developer/.aws # AWS creds
      # - devbox-palace:/home/developer/.mempalace # persist palace
      # - devbox-chroma-cache:/home/developer/.cache/chroma # embedding cache
volumes:
  devbox-pi-config:
  devbox-ssh-local:
  devbox-shell-history:
  devbox-zoxide:
  devbox-nvim-data:
  devbox-uv:
  # devbox-palace:
  # devbox-chroma-cache:
```
See `docker-compose.yml` and `.env.example` in the repo for the full
template (build-from-source args, LAN-jump and skillset mounts, MemPalace
persistence). To share one palace between host pi and the container,
bind-mount your host `~/.mempalace` to `/home/developer/.mempalace`.
## uv-driven REPL recipes
uv is installed in the base image and is the recommended way to run
Python interpreters and notebooks without bloating the image:
| Goal | One-liner |
|---|---|
| IPython REPL | `uv run --with ipython ipython` |
| IPython + scientific stack | `uv run --with ipython --with numpy --with matplotlib --with pandas ipython` |
| JupyterLab (browser, port-forward needed) | `uv run --with jupyterlab jupyter lab --no-browser --port 8888` |
| Marimo (modern alternative) | `uv run --with marimo marimo edit --port 8889` |

For long-lived environments, prefer a project venv:
```bash
cd /workspace/myproj
uv init && uv add ipython numpy matplotlib
# then:
uv run ipython
```
`pyproject.toml` + `uv.lock` then capture the dependency state and
travel with the project in git.
uv only manages Python. For other languages:
| Toolchain | How to add |
|---|---|
| R | `sudo apt-get install r-base-core` (~200 MB) |
| GHCi (Haskell) | `sudo apt-get install ghc` (~700 MB) |
| Clojure | `sudo apt-get install clojure` (~150 MB + JVM) |
| Julia | `juliaup` is planned for an upcoming release |

These are runtime opt-ins and persist only in the container's writable
layer — they don't survive `docker compose down -v` or image updates.
## tldr — first-run cache
The `tldr` command (provided by tealdeer) shows a "Page cache not
found" message on first invocation. To populate the cache:
```bash
tldr --update
```
This fetches ~1500 command pages from the [tldr-pages](https://tldr.sh)
project and caches them in `~/.cache/tealdeer/`. After that, `tldr ls`,
`tldr docker`, etc. work instantly. Re-run `tldr --update` periodically
to refresh.
## Volumes and persistence
| Path inside container | Volume | What survives |
|---|---|---|
| `/workspace` | host bind-mount (`WORKSPACE_PATH`) | host filesystem |
| `~/.ssh` | host bind-mount (read-only, `SSH_KEY_PATH`) | host filesystem |
| `~/.pi` | named volume `devbox-pi-config` | `down -v` wipes |
| `~/.ssh-local` | named volume `devbox-ssh-local` | `down -v` wipes |
| `~/.cache/bash` | named volume `devbox-shell-history` | `down -v` wipes |
| `~/.local/share/zoxide` | named volume `devbox-zoxide` | `down -v` wipes |
| `~/.local/share/nvim` | named volume `devbox-nvim-data` | `down -v` wipes |
| `~/.local/share/uv` | named volume `devbox-uv` | `down -v` wipes |
| `~/.mempalace` | host bind-mount or `devbox-palace` (optional) | host / volume |
| `~/.cache/chroma` | `devbox-chroma-cache` (optional) | `down -v` wipes |

Anything not on a volume is on the writable layer and is lost on
container recreate.
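
Because `down -v` removes every named volume in the table, it can be worth archiving one first. The following is a minimal sketch, not part of the repo: `volume_name` and `backup_volume` are hypothetical helpers, and `debian:stable-slim` is just a convenient image that ships `tar`. It relies on Compose prefixing named volumes with the project name, assumed here to be `pi-devbox` (a `-p` flag or `COMPOSE_PROJECT_NAME` would change it).

```bash
# Compose names volumes "<project>_<volume>"; the project is assumed
# to be "pi-devbox", as set by `name:` in the compose file.
volume_name() {
  printf 'pi-devbox_%s\n' "$1"
}

# Hypothetical helper: archive one named volume into the current directory.
# Needs a running Docker daemon, so it is only defined here, not called.
backup_volume() {
  local vol
  vol="$(volume_name "$1")"
  docker run --rm \
    -v "$vol":/data:ro \
    -v "$PWD":/backup \
    debian:stable-slim \
    tar -czf "/backup/$1.tar.gz" -C /data .
}

full_name="$(volume_name devbox-pi-config)"
echo "$full_name"
# backup_volume devbox-pi-config   # run this before `docker compose down -v`
```
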
## MemPalace integration
MemPalace is installed in the base image and pre-warmed with the
ChromaDB ONNX embedding model so first-time semantic search is
instant.
The palace data lives at `~/.mempalace/palace` on the host
(bind-mounted into the container). This means:
- A pi running on the host and a pi running inside this container see
the same palace.
- SQLite's WAL mode handles concurrent reads + single writer cleanly,
so simultaneous use is safe in practice.

`mempalace-session` and `mempalace-docs` are on PATH for one-off
session/docs mining; the 29 MCP tools (search, kg-query, drawer-add,
diary-write, etc.) are wired into pi automatically by the pi-extensions
mempalace bridge.
## SSH and ControlMaster
The base image preconfigures `Host *` ssh defaults:
```
ControlMaster auto
ControlPath /tmp/sshcm/%r@%h:%p
ControlPersist 10m
```
The socket directory `/tmp/sshcm/` is created mode 700 on every
container start (per-container, tmpfs-friendly). Multiple ssh calls
to the same host within 10 minutes reuse the master TCP flow —
important on residential ISPs with CGNAT per-destination flow caps
(~4 flows on most European broadband; symptoms are
`kex_exchange_identification: Connection closed by remote host` on
the 5th+ concurrent ssh).
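
User-level overrides in `~/.ssh/config` win because ssh uses the first
value it obtains for each option and reads the user file before the
system-wide `/etc/ssh/ssh_config`. Debian's `/etc/ssh/ssh_config` in turn
includes `/etc/ssh/ssh_config.d/*.conf` before its own `Host *` block, so
drop-ins there take precedence over the distribution defaults.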
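
To see whether a later ssh call would reuse an existing master, one can look for the control socket directly. This is a rough sketch under stated assumptions: `control_socket` and `has_master` are hypothetical helpers, `gitea.example.com` is a placeholder host, and the path expansion assumes `%h` equals the name you typed (a `HostName` alias in your config would change it). The built-in `ssh -O check <host>` asks the master itself and is the more reliable test.

```bash
# Expand the ControlPath template from this section for one destination.
control_socket() {
  # $1: remote user, $2: host, $3: port (default 22)
  printf '/tmp/sshcm/%s@%s:%s\n' "$1" "$2" "${3:-22}"
}

# Succeeds when a master socket already exists for that destination.
has_master() {
  [ -S "$(control_socket "$@")" ]
}

socket_path="$(control_socket git gitea.example.com)"
if has_master git gitea.example.com; then
  echo "master active at $socket_path"
else
  echo "no master yet; the next ssh call opens one at $socket_path"
fi
```
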
## tmux and 0-indexed sessions
The image installs `/etc/tmux.conf` with:
```
set -g base-index 0
set -g pane-base-index 0
```
This is the default tmux indexing. It's baked here because `pi-studio`
(shipped in the `:latest-studio` variant) hard-codes its tmux send
target to `<session>:0.0`. If you override `base-index` to 1 in a
personal `~/.tmux.conf`, pi-studio will fail with "can't find window: 0".
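
A quick way to catch such an override before it bites is to scan the personal config for non-zero index settings. The helper below is a hypothetical sketch, not something the image ships: it only pattern-matches `set` / `setw` style lines and will miss values set through sourced files or conditionals.

```bash
# Hypothetical check: print lines of a tmux config that set base-index
# or pane-base-index to a non-zero value.
tmux_index_overrides() {
  # $1: path to a tmux config file; prints nothing when the file is clean
  [ -f "$1" ] || return 0
  grep -E '^[[:space:]]*(set|set-option|setw|set-window-option)[[:space:]]+(-[a-zA-Z]+[[:space:]]+)*(pane-)?base-index[[:space:]]+[1-9]' "$1" || true
}

# Demo against a throwaway file standing in for ~/.tmux.conf.
conf="$(mktemp)"
printf '%s\n' 'set -g base-index 1' 'setw -g pane-base-index 0' > "$conf"

overrides="$(tmux_index_overrides "$conf")"
[ -z "$overrides" ] || echo "pi-studio expects index 0, found: $overrides"
rm -f "$conf"
```
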
## AWS Bedrock auth
If you use Bedrock as pi's LLM provider:
1. Configure SSO on the host: `aws configure sso`
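2. Bind-mount `~/.aws:/home/developer/.aws` (add `:ro` only if you always log in on the host; `aws sso login` inside the container writes its token cache under `~/.aws/sso/cache`)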
3. Set `AWS_PROFILE` and `AWS_REGION` in `.env`
4. Inside the container: `aws sso login` if needed; pi picks up the
profile via the env vars.

The pi-toolkit AWS env loader (in `~/.pi/agent/`) prepares Bedrock
inference-profile model IDs (with `eu.` / `us.` prefixes) automatically.
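
The "login if needed" step can be wrapped so that a fresh login only happens when the cached session no longer works. This is a minimal sketch, not part of pi-toolkit: `ensure_aws_session` is a hypothetical name, and it assumes `AWS_PROFILE` is already exported from `.env` and that `~/.aws` is writable in the container.

```bash
# Hypothetical helper: reuse the cached SSO session when it still works,
# otherwise start an interactive `aws sso login` for the same profile.
ensure_aws_session() {
  local profile="${AWS_PROFILE:?set AWS_PROFILE in .env}"
  aws sts get-caller-identity --profile "$profile" >/dev/null 2>&1 \
    || aws sso login --profile "$profile"
}

# Usage inside the container, before starting pi:
#   ensure_aws_session && pi
```
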
## Build pipeline
pi-devbox is built from this repo's CI in two phases:
1. **Base** (`Dockerfile.base`) — produces `joakimp/pi-devbox:base-<hash>`
where `<hash>` is content-addressed over `Dockerfile.base`,
`rootfs/`, and `entrypoint*.sh`. Rebuilt only when these change.
2. **Variant** (`Dockerfile.variant`) — `FROM ${BASE_IMAGE}` and adds
the pi install (+ pi-studio when `INSTALL_STUDIO=true`). The `:latest`
/ `vX.Y.Z` and `:latest-studio` / `vX.Y.Z-studio` tags are produced
from this layer. The studio variant builds via independent
`smoke-studio` + `build-variant-studio` CI jobs that gate only the
`-studio` tags.
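
One way to picture the content-addressed base tag is a digest over the listed inputs in a stable order. The sketch below is an assumption about how such a hash could be computed, not the CI job's actual script: `base_hash` is a hypothetical helper, the 12-character truncation is arbitrary, and the demo runs on a throwaway tree rather than the real repo.

```bash
# Hypothetical helper: short digest over Dockerfile.base, rootfs/ and
# entrypoint*.sh. The real CI may hash these inputs differently.
base_hash() {
  # $1: repository root
  (
    cd "$1" || exit 1
    # Hash every input file, in a locale-independent order, then hash
    # the resulting list so paths and contents both affect the result.
    find Dockerfile.base rootfs entrypoint*.sh -type f -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 sha256sum \
      | sha256sum \
      | cut -c1-12
  )
}

# Demo on a throwaway tree so the sketch runs anywhere.
demo="$(mktemp -d)"
mkdir -p "$demo/rootfs/etc"
echo 'FROM debian:stable-slim' > "$demo/Dockerfile.base"
echo 'set -g base-index 0'     > "$demo/rootfs/etc/tmux.conf"
echo '#!/bin/sh'               > "$demo/entrypoint.sh"

hash_before="$(base_hash "$demo")"
hash_again="$(base_hash "$demo")"      # unchanged inputs give the same hash
echo '# tweak' >> "$demo/entrypoint.sh"
hash_after="$(base_hash "$demo")"      # a changed input gives a new hash

echo "joakimp/pi-devbox:base-$hash_before"
rm -rf "$demo"
```
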

Tag naming:

| Tag | Stage |
|---|---|
| `base-<hash>` | base image — internal building block |
| `base-latest` | promoted alias of the most recent base |
| `latest`, `vX.Y.Z` | variant: base + pi |
| `latest-studio`, `vX.Y.Z-studio` | variant: base + pi + pi-studio |

CI resolves `PI_VERSION` to a concrete version string before building
to defeat a registry-buildcache hit on `npm install -g
pi-coding-agent@latest` (the build-arg string would otherwise be
byte-identical across releases and the layer would silently reuse the
previous version's bytes).
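
The resolution step might look roughly like the following. This is a sketch under assumptions rather than the CI's real script: `resolve_pi_version` and `variant_build_cmd` are hypothetical helpers, `1.2.3` is a made-up version, and the package name is taken from the "What's inside" list. The second helper only prints the build command.

```bash
# Hypothetical helper: turn a floating request such as "latest" into a
# concrete version so the --build-arg value differs between releases.
resolve_pi_version() {
  case "$1" in
    latest|"")
      # `npm view <pkg> version` prints the version behind the latest dist-tag.
      npm view @earendil-works/pi-coding-agent version ;;
    *)
      printf '%s\n' "$1" ;;
  esac
}

# Print the variant build command instead of running it.
variant_build_cmd() {
  # $1: concrete pi version, $2: image tag
  printf 'docker build -f Dockerfile.variant --build-arg BASE_IMAGE=joakimp/pi-devbox:base-latest --build-arg PI_VERSION=%s -t joakimp/pi-devbox:%s .\n' "$1" "$2"
}

pinned="$(resolve_pi_version 1.2.3)"   # already concrete, passed through
variant_build_cmd "$pinned" latest
```

With a concrete version in the build arg, the layer's cache key changes whenever pi does, so a stale cached layer cannot be reused across releases.
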
## Troubleshooting
### Image grew unexpectedly
`docker history joakimp/pi-devbox:latest` shows per-layer sizes. The
biggest layers are typically the apt block (~600 MB), pi npm install
(~330 MB), MemPalace + ChromaDB (~315 MB), AWS CLI (~270 MB), Node.js
(~200 MB).
### pi can't reach LAN peers on macOS
The LAN-access helper (`/usr/local/lib/pi-devbox/setup-lan-access.sh`)
auto-runs on container start and writes `~/.ssh-local/config` with a
ssh-jump-via-host configuration. Set `DEVBOX_LAN_ACCESS=jump` and
`HOST_SSH_USER=<your-mac-user>` in `.env` if auto-detection fails.
### Smoke-testing a local build
```bash
./scripts/smoke-test.sh joakimp/pi-devbox:latest
```
## Versioning and release
pi-devbox follows semver-ish:
- **Major** — architectural changes. `v1.0.0` is the first decoupled
release (independent of opencode-devbox).
- **Minor** — new variants, significant base additions.
- **Patch** — pi version bumps, smaller fixes.

The `pi --version` inside the image is asserted by smoke tests to
match the release tag's pi component, so version drift between the
image and the tag is caught at CI time.
## Acknowledgements
pi-devbox was originally a thin re-brand of the `pi-only` variant of
[opencode-devbox](https://gitea.jordbo.se/joakimp/opencode-devbox).
It was decoupled at `v1.0.0` so it could evolve at its own pace, with
self-contained docs and a focused, pi-centric image. Significant base
infrastructure (the SSH ControlMaster setup, MemPalace integration,
the entrypoint UID/GID dance) was adopted from there.
The pi coding-agent itself is [@earendil-works/pi-coding-agent](https://www.npmjs.com/package/@earendil-works/pi-coding-agent).
## License
MIT