Pre-warm chromadb embedding model at build time
Mempalace's embedding function is chromadb's ONNXMiniLM_L6_V2, which downloads ~80 MB of all-MiniLM-L6-v2 ONNX weights from chromadb's CDN on first use. Without pre-warming, the download happened silently in the entrypoint init step (output redirected to /dev/null) and stalled the first container start by several minutes on slow networks. This is the symptom a user reported as 'hangs at Initializing MemPalace for workspace'.

Fix: invoke the embedding function once at build time as gosu developer, so the cache lands at the runtime user's ~/.cache/chroma/onnx_models/all-MiniLM-L6-v2/ with correct ownership and survives container recreate (the cache path is not on a named volume, so it lives in the image layer).

Build-time cost: ~3-5 s to download.
Runtime saving: minutes per fresh container.
Image size: 2110 → 2277 MB for the with-pi variant, still within the 2700 MB smoke-test threshold.
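
One way to confirm that the pre-warm took effect is to inspect the cache directory from inside the container. The following is a minimal sketch, not part of this commit: the helper names (`onnx_cache_dir`, `cache_report`, `is_warmed`) are hypothetical, and the only thing taken from the description above is the cache path. It uses the standard library only, so it does not import chromadb or trigger a download.

```python
from pathlib import Path


def onnx_cache_dir(home: Path) -> Path:
    """Cache location named in the commit message, under the given home."""
    return home / ".cache" / "chroma" / "onnx_models" / "all-MiniLM-L6-v2"


def cache_report(home: Path) -> dict:
    """Summarize the cache: presence, file count, total bytes, owner uids."""
    cache = onnx_cache_dir(home)
    if not cache.is_dir():
        return {"present": False, "files": 0, "bytes": 0, "owner_uids": set()}
    files = [p for p in cache.rglob("*") if p.is_file()]
    return {
        "present": True,
        "files": len(files),
        "bytes": sum(p.stat().st_size for p in files),
        # Include the directory itself so a root-owned directory is noticed.
        "owner_uids": {p.stat().st_uid for p in [cache, *files]},
    }


def is_warmed(home: Path, expected_uid: int) -> bool:
    """True if the cache holds at least one file and everything is owned
    by expected_uid (hypothetical check; thresholds are up to the caller)."""
    report = cache_report(home)
    return (
        report["present"]
        and report["files"] > 0
        and report["owner_uids"] == {expected_uid}
    )
```

Run as the runtime user, `is_warmed(Path.home(), os.getuid())` (after `import os`) would be expected to return True on an image built with this change, and False on one where the model has not been downloaded yet or where the cache is owned by another user.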
+3
-1
@@ -15,9 +15,11 @@ Tags follow `v{opencode_version}[letter]` — bare tag for the first build on a
 - **Feature:** New `devbox-pi-config` named volume mounted at `~/.pi/` persists pi user state (settings.json, `/ext`-disabled extensions) across container recreate. Mirrors the `devbox-opencode-config` pattern from v1.14.33.
 - **Feature:** Container clones [pi-toolkit](https://gitea.jordbo.se/joakimp/pi-toolkit) (keybindings, env loader, settings template) and [pi-extensions](https://gitea.jordbo.se/joakimp/pi-extensions) (6 extensions including ext-toggle, todo, ssh-controlmaster, notify, git-checkpoint, confirm-destructive) into `/opt/` at build time. New `PI_TOOLKIT_REF` and `PI_EXTENSIONS_REF` build args (default `main`) pin git refs. The mempalace pi-bridge `mempalace.ts` is symlinked from the existing `/opt/mempalace-toolkit/` clone.
 - **Behavior change:** Default container CMD changed from `["opencode"]` to `["bash", "-l"]`. `docker compose run --rm devbox` (no command) now drops to a login shell so users can pick `opencode` or `pi` (or run `aws sso login` first). To preserve the old behavior, pass the harness explicitly: `docker compose run --rm devbox opencode`. `docker compose exec` workflows are unaffected (they bypass the entrypoint and CMD).
-- **Smoke-test:** New `--variant with-pi` and `--variant omos-with-pi` selectors with size thresholds 2700 MB / 3400 MB respectively. Pi-specific assertions verify pi binary, pi-toolkit clone, pi-extensions clone, deployed keybindings symlink, extension count ≥ 4, mempalace bridge symlink, and settings.json bootstrap. Run via the entrypoint chain (not `--entrypoint=""`) so on-start install.sh deployment is exercised.
+- **Smoke-test:** New `--variant with-pi` (threshold 2700 MB) and `--variant omos-with-pi`. Pi-specific assertions verify pi binary, pi-toolkit clone, pi-extensions clone, deployed keybindings symlink, extension count ≥ 4, mempalace bridge symlink, and settings.json bootstrap. Run via the entrypoint chain (not `--entrypoint=""`) so on-start install.sh deployment is exercised.
 - **Docs:** README adds a "pi (alternative/complementary harness)" section. AGENTS.md codifies pi install contract, deploy ordering in entrypoint-user.sh, and rationale for not calling mempalace-toolkit's full `install.sh` from container.

+- **Performance:** chromadb's all-MiniLM-L6-v2 ONNX embedding model (~80 MB) is now pre-warmed at image build time under `~/.cache/chroma/onnx_models/`. Without this, mempalace's `init` step in entrypoint-user.sh would download the model silently on first container start (suppressed via `>/dev/null 2>&1`), stalling startup by minutes on a fresh image. Pre-warming runs as `gosu developer` so the cache lands at the right path and is owned by the runtime user. Adds ~167 MB to image size; thresholds bumped accordingly.
+
No CI workflow changes in this entry — build matrix expansion (`-with-pi` / `-omos-with-pi` tag variants) is intentionally deferred to a follow-up release once the local smoke-test path stabilizes.
## v1.14.40 — 2026-05-07
+20
@@ -398,6 +398,26 @@ RUN mkdir -p /workspace \
     /home/${USER_NAME}/.ssh && \
     chown -R ${USER_NAME}:${USER_NAME} /workspace /home/${USER_NAME}

+# ── Pre-warm chromadb embedding model ──────────────────────────────
+# Mempalace uses chromadb's ONNXMiniLM_L6_V2 embedding function, which
+# downloads ~80 MB of all-MiniLM-L6-v2 ONNX weights from chromadb's CDN
+# on first use. Without pre-warming this happens silently (output is
+# suppressed by the entrypoint init step) and stalls first container
+# start by minutes on a slow network. We bake the cache at build time
+# under the developer user's home so the runtime first-start is fast.
+#
+# Cache path comes from chromadb's hardcoded `Path.home() / .cache /
+# chroma / onnx_models / all-MiniLM-L6-v2`. Run as gosu developer so
+# Path.home() resolves correctly and ownership is right from the start.
+RUN if [ "${INSTALL_MEMPALACE}" = "true" ]; then \
+        gosu ${USER_NAME} /opt/uv-tools/mempalace/bin/python -c "\
+from chromadb.utils.embedding_functions import ONNXMiniLM_L6_V2; \
+ef = ONNXMiniLM_L6_V2(); \
+_ = ef(['warmup']); \
+print('chromadb embedding model warmed: all-MiniLM-L6-v2')" && \
+        ls -lh /home/${USER_NAME}/.cache/chroma/onnx_models/all-MiniLM-L6-v2/ ; \
+    fi
+
 # ── Shell defaults (bash history, aliases, readline) ─────────────────
 # Shipped under /etc/skel-devbox/ rather than copied directly to the
 # user's home. The entrypoint copies them to /home/developer/ only if