Pre-warm chromadb embedding model at build time

Mempalace's embedding function is chromadb's ONNXMiniLM_L6_V2, which downloads ~80 MB of all-MiniLM-L6-v2 ONNX weights from chromadb's CDN on first use. Without pre-warming, this download happened silently in the entrypoint init step (output redirected to /dev/null) and stalled the first container start by several minutes on slow networks. This is the symptom a user reported as 'hangs at Initializing MemPalace for workspace'.

Fix: invoke the embedding function once at build time as gosu developer, so the cache lands at the runtime user's ~/.cache/chroma/onnx_models/all-MiniLM-L6-v2/ with correct ownership and survives container recreate (the cache path is not on a named volume, so it lives in the image layer).

Build-time cost: ~3-5 s to download. Runtime saving: minutes per fresh container. Image size: 2110 → 2277 MB (+167 MB) for the with-pi variant, still within the 2700 MB smoke-test threshold.
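To check whether a given image or container already has the pre-warmed cache, something like the following sketch could be run with the runtime user's Python. It only inspects the cache directory named in this commit and does not import chromadb. The helper names `chroma_model_cache_dir` and `is_model_prewarmed` are hypothetical, and the check assumes that any file under the directory means the download completed, which is a simplification.

```python
from pathlib import Path


def chroma_model_cache_dir(home=None):
    """Build the cache path described in this commit.

    `home` is an illustrative override for testing; by default the
    current user's home directory is used, as with Path.home().
    """
    base = Path(home) if home is not None else Path.home()
    return base / ".cache" / "chroma" / "onnx_models" / "all-MiniLM-L6-v2"


def is_model_prewarmed(home=None):
    """Return True if the cache directory exists and holds at least one file.

    Assumption: the presence of any file is treated as a finished download.
    This does not validate the contents of the cache.
    """
    cache_dir = chroma_model_cache_dir(home)
    return cache_dir.is_dir() and any(p.is_file() for p in cache_dir.rglob("*"))


if __name__ == "__main__":
    state = "pre-warmed" if is_model_prewarmed() else "missing"
    print(f"{chroma_model_cache_dir()}: {state}")
```

Running it as the developer user matters for the same reason the build step uses gosu: the path is derived from the home directory of whoever executes it.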
2026-05-08 00:25:22 +02:00
parent f51e9f52a1
commit 5006b01170
2 changed files with 23 additions and 1 deletions
@@ -15,9 +15,11 @@ Tags follow `v{opencode_version}[letter]` — bare tag for the first build on a
 - **Feature:** New `devbox-pi-config` named volume mounted at `~/.pi/` persists pi user state (settings.json, `/ext`-disabled extensions) across container recreate. Mirrors the `devbox-opencode-config` pattern from v1.14.33.
 - **Feature:** Container clones [pi-toolkit](https://gitea.jordbo.se/joakimp/pi-toolkit) (keybindings, env loader, settings template) and [pi-extensions](https://gitea.jordbo.se/joakimp/pi-extensions) (6 extensions including ext-toggle, todo, ssh-controlmaster, notify, git-checkpoint, confirm-destructive) into `/opt/` at build time. New `PI_TOOLKIT_REF` and `PI_EXTENSIONS_REF` build args (default `main`) pin git refs. The mempalace pi-bridge `mempalace.ts` is symlinked from the existing `/opt/mempalace-toolkit/` clone.
 - **Behavior change:** Default container CMD changed from `["opencode"]` to `["bash", "-l"]`. `docker compose run --rm devbox` (no command) now drops to a login shell so users can pick `opencode` or `pi` (or run `aws sso login` first). To preserve the old behavior, pass the harness explicitly: `docker compose run --rm devbox opencode`. `docker compose exec` workflows are unaffected (they bypass the entrypoint and CMD).
-- **Smoke-test:** New `--variant with-pi` and `--variant omos-with-pi` selectors with size thresholds 2700 MB / 3400 MB respectively. Pi-specific assertions verify pi binary, pi-toolkit clone, pi-extensions clone, deployed keybindings symlink, extension count ≥ 4, mempalace bridge symlink, and settings.json bootstrap. Run via the entrypoint chain (not `--entrypoint=""`) so on-start install.sh deployment is exercised.
+- **Smoke-test:** New `--variant with-pi` (threshold 2700 MB) and `--variant omos-with-pi`. Pi-specific assertions verify pi binary, pi-toolkit clone, pi-extensions clone, deployed keybindings symlink, extension count ≥ 4, mempalace bridge symlink, and settings.json bootstrap. Run via the entrypoint chain (not `--entrypoint=""`) so on-start install.sh deployment is exercised.
 - **Docs:** README adds a "pi (alternative/complementary harness)" section. AGENTS.md codifies pi install contract, deploy ordering in entrypoint-user.sh, and rationale for not calling mempalace-toolkit's full `install.sh` from container.

+- **Performance:** chromadb's all-MiniLM-L6-v2 ONNX embedding model (~80 MB) is now pre-warmed at image build time under `~/.cache/chroma/onnx_models/`. Without this, mempalace's `init` step in entrypoint-user.sh would download the model silently on first container start (suppressed via `>/dev/null 2>&1`), stalling startup by minutes on a fresh image. Pre-warming runs as `gosu developer` so the cache lands at the right path and is owned by the runtime user. Adds ~167 MB to image size; thresholds bumped accordingly.
+
 No CI workflow changes in this entry — build matrix expansion (`-with-pi` / `-omos-with-pi` tag variants) is intentionally deferred to a follow-up release once the local smoke-test path stabilizes.

 ## v1.14.40 — 2026-05-07
@@ -398,6 +398,26 @@ RUN mkdir -p /workspace \
    /home/${USER_NAME}/.ssh && \
    chown -R ${USER_NAME}:${USER_NAME} /workspace /home/${USER_NAME}

+# ── Pre-warm chromadb embedding model ──────────────────────────────
+# Mempalace uses chromadb's ONNXMiniLM_L6_V2 embedding function, which
+# downloads ~80 MB of all-MiniLM-L6-v2 ONNX weights from chromadb's CDN
+# on first use. Without pre-warming this happens silently (output is
+# suppressed by the entrypoint init step) and stalls first container
+# start by minutes on a slow network. We bake the cache at build time
+# under the developer user's home so the runtime first-start is fast.
+#
+# Cache path comes from chromadb's hardcoded `Path.home() / .cache /
+# chroma / onnx_models / all-MiniLM-L6-v2`. Run as gosu developer so
+# Path.home() resolves correctly and ownership is right from the start.
+RUN if [ "${INSTALL_MEMPALACE}" = "true" ]; then \
+      gosu ${USER_NAME} /opt/uv-tools/mempalace/bin/python -c "\
+from chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2; \
+ef = ONNXMiniLM_L6_V2(); \
+_ = ef(['warmup']); \
+print('chromadb embedding model warmed: all-MiniLM-L6-v2')" && \
+      ls -lh /home/${USER_NAME}/.cache/chroma/onnx_models/all-MiniLM-L6-v2/ ; \
+    fi
+
 # ── Shell defaults (bash history, aliases, readline) ─────────────────
 # Shipped under /etc/skel-devbox/ rather than copied directly to the
 # user's home. The entrypoint copies them to /home/developer/ only if