Compare commits

..

9 Commits

Author SHA1 Message Date
pi 2985d9ade8 release: v1.2.3 — mempalace-mcp self-heal (toolkit e12b624)
Publish Docker Image / resolve-versions (push) Successful in 7s
Publish Docker Image / base-decide (push) Successful in 14s
Publish Docker Image / build-base (push) Has been skipped
Publish Docker Image / smoke (push) Successful in 3m30s
Publish Docker Image / smoke-studio (push) Successful in 11m47s
Publish Docker Image / build-variant (push) Successful in 15m48s
Publish Docker Image / promote-base-latest (push) Has been skipped
Publish Docker Image / update-description (push) Successful in 7s
Publish Docker Image / build-variant-studio (push) Successful in 17m28s
Patch release. Headline: mempalace-mcp self-heals instead of latching
available=false permanently after a slow virtiofs cold-open. Base image
rebuilds via the mempalace-toolkit ref advancing to e12b624 (folded into
the base-decide hash). No pi/mempalace version change — pi npm latest is
still 0.80.2 (= v1.2.2). Also releases the queued yq (mikefarah Go yq) and
mempalace-skill temporal-grounding changes.
2026-06-27 18:45:24 +02:00
pi bff810c1eb docs(dockerfile): sync mempalace stall-protection comment with self-heal
mempalace.ts now self-heals (respawn with capped backoff) instead of
latching unavailable, and the init-timeout default is 300000. Update the
explanatory comment + tunable list (MEMPALACE_MCP_MAX_RESPAWNS,
MEMPALACE_MCP_RESPAWN_BACKOFF_MS). Comment-only; no build/ENV change.
2026-06-26 00:22:43 +02:00
pi 904fe85249 skill(mempalace): teach temporal grounding (recreate != new day)
Baked mempalace SKILL.md now instructs agents to establish current date/time
and compute the delta against the actual diary/drawer timestamp before using
relative terms (yesterday/last week), and explicitly that a container recreate
or fresh session is NOT a day boundary (pi-devbox restarts several times a day).
Phase 1 wake-up section + anti-pattern bullet. CHANGELOG Unreleased.
2026-06-25 22:53:31 +02:00
pi cda488c565 base: yq follows latest (was pinned v4.53.3), gate on major v4
Match the repo's latest-following convention (tealdeer/uv/etc.) and keep the
container in sync with the Mac's brew yq. smoke-test now asserts mikefarah AND
major v4, so a surprise yq v5 fails CI instead of silently breaking
provision.sh. Pin still available via --build-arg YQ_VERSION=vX.Y.Z.
2026-06-25 16:33:27 +02:00
pi 9ab9a28458 base: install mikefarah yq (pinned v4.53.3), drop Debian python yq
Debian/Ubuntu `apt install yq` is kislyuk/yq (Python, v3.x), incompatible
with the mikefarah v4 syntax the cloud-init repo's provision.sh/deploy.sh
require. Replace the apt package with a pinned mikefarah Go binary, mirroring
the existing tealdeer ARG (latest-or-pin) pattern, multi-arch amd64/arm64.
smoke-test.sh now asserts `yq --version` reports mikefarah so CI catches a
regression. CHANGELOG: Unreleased entry.
2026-06-25 16:29:51 +02:00
pi d175b31207 release: v1.2.2 — pi 0.80.2 + mempalace 3.5.0, drop anyOf workaround
Publish Docker Image / resolve-versions (push) Successful in 8s
Publish Docker Image / base-decide (push) Successful in 11s
Publish Docker Image / build-base (push) Successful in 33m19s
Publish Docker Image / smoke-studio (push) Successful in 4m3s
Publish Docker Image / smoke (push) Successful in 9m51s
Publish Docker Image / build-variant-studio (push) Successful in 17m29s
Publish Docker Image / build-variant (push) Successful in 24m10s
Publish Docker Image / update-description (push) Successful in 10s
Publish Docker Image / promote-base-latest (push) Successful in 14s
- Bump mempalace pin 3.4.0 -> 3.5.0: 3.5.0 carries the upstream fix for the
  top-level-anyOf diary_write schema (issue #1728 / PR #1717, merged
  2026-06-14). Verified against the published 3.5.0 wheel that mcp_server.py
  now advertises 'required: [agent_name]' with no root-level anyOf.
- Remove the Dockerfile.base perl workaround that stripped the anyOf from the
  installed mcp_server.py — obsolete now the fix is upstream.
- pi auto-resolves to npm latest (0.79.10 -> 0.80.2) at build.
- CHANGELOG v1.2.2.
2026-06-25 07:56:55 +02:00
Joakim Persson 13e67599c4 release: v1.2.1 — fallback skills + mempalace directive
Publish Docker Image / smoke-studio (push) Successful in 6m27s
Publish Docker Image / build-variant (push) Successful in 16m16s
Publish Docker Image / update-description (push) Successful in 6s
Publish Docker Image / promote-base-latest (push) Successful in 8s
Publish Docker Image / build-variant-studio (push) Successful in 21m13s
Publish Docker Image / resolve-versions (push) Successful in 6s
Publish Docker Image / base-decide (push) Successful in 7s
Publish Docker Image / build-base (push) Successful in 46m21s
Publish Docker Image / smoke (push) Successful in 3m43s
Bake pi-extensions + mempalace skills into the image (available without a
mounted skillset) and add the mempalace session-start proactive-load directive
so frequently-recreated containers actually pick the skill up. Closes the
fork/recall + mempalace under-utilisation gap.

CHANGELOG: [Unreleased] -> v1.2.1.
2026-06-23 16:02:57 +02:00
Joakim Persson 7551947466 feat(skills): add mempalace proactive-load directive for containers
Baking the mempalace fallback skill fixed *availability*, but mempalace had
no proactive-load directive anywhere (pi-toolkit's global AGENTS.md only
points to pi-extensions), so a new container would still surface it only via
description-matching — the same under-utilisation the pi-extensions directive
was created to fix.

Add a session-start pointer to the pi-devbox managed AGENTS.md block
(pi-global-AGENTS.append.md): gated to pi-devbox containers and conditional on
the MemPalace MCP tools being present. Memory continuity matters most in a
frequently-recreated container — the palace is its only cross-recreate memory.

- pi-global-AGENTS.append.md: '## Session start: load the mempalace skill'.
- smoke-test: assert the pointer merges into the global AGENTS.md at build.
- docs: VENDORED.md, README, CHANGELOG [Unreleased].

Now both skills are complete in pi-devbox: directive + skill file.
pi-extensions = directive (pi-toolkit) + baked skill; mempalace = directive
(this block) + baked skill.
2026-06-23 15:54:13 +02:00
Joakim Persson a7d6a7d235 feat(skills): bake pi-extensions + mempalace fallback skills
The pi-toolkit global AGENTS.md tells every pi session to read
~/.agents/skills/pi-extensions/SKILL.md at start (the fork/recall
under-utilisation fix), but that skill lived only in the private skillset
repo — so the pointer dangled in any container started without skillset
mounted. Bake fallbacks so the pointer always resolves.

- pi-extensions (Option 1 + Option 2, layered):
  * Canonical skill promoted to the public pi-extensions package repo under
    skill/ (separate commit there); co-located with the code it documents.
  * rootfs/ carries a committed snapshot (the floor).
  * Dockerfile.variant copies /opt/pi-extensions/skill/ over the snapshot
    after the pinned clone, so a normal build ships the fresh package copy
    (recorded via PI_EXTENSIONS_REF) and an old-ref/mirror build still ships
    the snapshot. Helper evaluate-extension-usage.py travels with it.
- mempalace (Option 2 only): snapshot in rootfs/. Its consumer skill has no
  public package home (mempalace-toolkit ships a different skill,
  opencode-mempalace-bridge), so no build-time refresh.
- entrypoint links both (only-when-absent; mounted skillset still wins).
- smoke-test: build-time presence + package-match check + runtime symlink
  assertions; readiness gate now waits on the last-linked skill.
- docs: skills/VENDORED.md (provenance + refresh), README, AGENTS.md,
  CHANGELOG [Unreleased].

Note: shipped in the NEXT release; v1.2.0 (run 409) predates this.
2026-06-23 15:32:04 +02:00
11 changed files with 1055 additions and 53 deletions
+8 -3
View File
@@ -17,7 +17,10 @@ re-brand of opencode-devbox's `pi-only` variant.
(`-studio` variant). Also appends the pi-devbox managed block from
`pi-global-AGENTS.append.md` onto pi-toolkit's `pi-global-AGENTS.md` (the
single global instruction slot pi loads) so containers proactively load the
baked `pi-devbox-environment` skill. Idempotent via a marker grep.
baked `pi-devbox-environment` skill. Idempotent via a marker grep. After the
pinned clones it also refreshes the vendored `pi-extensions` fallback skill
by copying `/opt/pi-extensions/skill/` over the committed `rootfs/` snapshot
(Option 1 over Option 2 — see `skills/VENDORED.md`).
- `entrypoint.sh` — UID/GID alignment as root, then drops to `developer`.
- `entrypoint-user.sh` — per-container start: SSH ControlMaster socket
dir, LAN-access setup, MemPalace init, pi-toolkit + pi-extensions
@@ -27,8 +30,10 @@ re-brand of opencode-devbox's `pi-only` variant.
- `rootfs/` — files baked into the image (bash aliases, inputrc,
setup-lan-access.sh, `studio-expose` helper). Also
`usr/local/share/pi-devbox/skills/<name>/SKILL.md` — image-baked agent
skills (e.g. `pi-devbox-environment`) symlinked into `~/.agents/skills/` by
the entrypoint, available with or without a mounted skillset — plus
skills (the repo-authored `pi-devbox-environment`, plus vendored fallback
copies of `pi-extensions` and `mempalace` — see `skills/VENDORED.md`)
symlinked into `~/.agents/skills/` by the entrypoint, available with or
without a mounted skillset — plus
`usr/local/share/pi-devbox/pi-global-AGENTS.append.md` (the global-AGENTS
pointer concatenated in `Dockerfile.variant`).
- `scripts/smoke-test.sh` — sanity checks run by CI before pushing to Hub.
+140
View File
@@ -11,6 +11,146 @@ Pre-v1.0.0 tags followed the pi npm version (`v{pi_version}[letter]`).
---
## Unreleased
---
## v1.2.3 — 2026-06-27
Patch release. Headline: **mempalace-mcp now self-heals** instead of latching
`available=false` permanently after a slow cold-open. Also folds in the `yq`
and mempalace-skill changes that were sitting unreleased. **No pi/mempalace
version change** — pi npm `latest` is still `0.80.2` (= v1.2.2) and the
mempalace pin stays `3.5.0`; the base image rebuilds purely because the
`mempalace-toolkit` ref advances to pick up the self-heal extension.
### Fixed
- **mempalace-mcp self-heal — no more permanent `available=false` latch.**
The `mempalace.ts` pi extension (from `mempalace-toolkit`, bumped to
[`e12b624`](https://gitea.jordbo.se/joakimp/mempalace-toolkit/commit/e12b624))
previously tripped its per-request timeout on a slow virtiofs cold-open of
the palace, killed the child, and set `available=false` **forever** (no
respawn) — a pi restart was the only recovery.
- **Bounded respawn with capped exponential backoff** via `ensureAlive()`
(`MEMPALACE_MCP_MAX_RESPAWNS=2`, `MEMPALACE_MCP_RESPAWN_BACKOFF_MS=1000`;
set max to `0` to disable). Both `execute()` and initial startup route
through it. The respawn budget **resets on any successful JSON-RPC
response** (`onStdout`), so a healthy session can't slowly exhaust it.
- **Scoped init timeout** raised `120000 → 300000` ms (`MEMPALACE_MCP_INIT_TIMEOUT_MS`),
affecting **init only** — the per-call timeout stays `60000`
(`MEMPALACE_MCP_TIMEOUT_MS`) — so a genuine cold HNSW deserialize isn't
killed mid-open.
- **Concurrency hardening:** a generation counter prevents a late-exiting
killed process from clobbering a fresh respawn, and an explicit `healthy`
flag replaces the racy `proc != null` check.
- Note: the build-time `smoke-test.sh` verifies the extension is present and
deployed but does **not** exercise respawn behaviour — first live
validation is on a running container.
- **`yq` is now mikefarah's Go yq, not Debian's Python `yq`.** The base image
previously apt-installed `yq`, which on Debian/Ubuntu is the unrelated
kislyuk/`yq` (a jq wrapper, v3.x) — incompatible with the mikefarah v4 syntax
the `cloud-init` repo's `provision.sh`/`deploy.sh` expect. Dropped the apt
package and install the mikefarah binary instead (multi-arch amd64/arm64,
following the repo's `latest` convention like `tealdeer`/`uv`; pin a tag
with `--build-arg YQ_VERSION=vX.Y.Z`). The build-time `smoke-test.sh` gate
asserts `yq --version` reports `mikefarah` **and** major **v4**, so both a
regression to the Python package and a surprise future yq v5 fail CI.
### Changed
- **Baked `mempalace` skill now teaches temporal grounding.** Added a
*Temporal grounding* rule to the image-baked
`skills/mempalace/SKILL.md` (Phase 1 wake-up + a matching anti-pattern):
before using relative time terms ("yesterday", "last week"), establish the
current date/time and compute the delta against the actual diary/drawer
timestamp. Explicitly calls out that a **container recreate or fresh session
is not a day boundary** — pi-devbox restarts several times a day, so two
entries minutes apart can straddle a recreate. Fixes agents mislabelling
same-day sessions as "yesterday".
---
## v1.2.2 — 2026-06-24
Patch release: pick up **pi `0.80.2`** (npm `latest`) and **mempalace `3.5.0`**,
and drop the now-obsolete `diary_write` schema workaround — the upstream fix
shipped.
### Changed
- **mempalace pin `3.4.0``3.5.0`.** mempalace 3.5.0 carries the upstream
fix for the top-level-`anyOf` `diary_write` schema
([issue #1728](https://github.com/MemPalace/mempalace/issues/1728) /
[PR #1717](https://github.com/MemPalace/mempalace/pull/1717), merged
2026-06-14). The advertised schema is now `"required": ["agent_name"]` with
`entry`/`content` enforced at dispatch instead of via a root-level `anyOf`,
which Anthropic's tools API accepts. Verified against the published 3.5.0
wheel's `mcp_server.py` before removing the workaround.
- **pi `0.79.10``0.80.2`**, auto-resolved from npm `latest` at build time
(no pin in the repo; CI's `resolve-versions` job fetches it).
### Removed
- **The `diary_write` top-level-`anyOf` workaround in `Dockerfile.base`.** The
`perl` patch that rewrote the installed `mcp_server.py` (needed while
mempalace 3.3.x/3.4.0 advertised a top-level `anyOf` that Anthropic rejects,
failing tool registration at session start) is gone, since 3.5.0 fixes it at
the source. Keep `MEMPALACE_VERSION` in lockstep with opencode-devbox.
### Notes
- Unrelated to this release: a *stalled* `mempalace-mcp` (e.g. a slow virtiofs
cold-open of `chroma.sqlite3`) surfaces as `mempalace-mcp not available`
because the `mempalace.ts` extension's per-request timeout kills the child
and flips `available=false` until pi is restarted — this is the 2026-06-13
stall-protection behaving as designed, not the `anyOf` bug.
---
## v1.2.1 — 2026-06-22
Patch release: close the fork/recall + mempalace **under-utilisation gap** in
containers started without the private `skillset` repo — bake the
`pi-extensions` and `mempalace` skills into the image and add the missing
mempalace session-start directive. pi version is re-resolved from npm `latest`
at build.
### Added
- **Vendored fallback skills: `pi-extensions` + `mempalace`.** The pi-toolkit
global `AGENTS.md` directs every pi session to read
`~/.agents/skills/pi-extensions/SKILL.md` at start (the fix for fork/recall
under-utilisation). That pointer dangled in a container started **without**
the private `skillset` repo mounted. The image now bakes fallback copies of
both skills under `/usr/local/share/pi-devbox/skills/`, symlinked in by
`entrypoint-user.sh` (only when absent, so a mounted skillset still wins).
- **Proactive-load directive for `mempalace`.** Baking the skill only fixes
*availability*; nothing in pi-toolkit's global `AGENTS.md` told sessions to
load it, so it would still surface only via description-matching. The
pi-devbox managed block (`pi-global-AGENTS.append.md`) now adds a
session-start pointer (gated to pi-devbox containers, conditional on the
MemPalace MCP tools being present) so a new container actually picks the
skill up — memory continuity matters most in a frequently-recreated
container. (`pi-extensions`'s directive already ships in pi-toolkit, so only
its skill file needed baking.)
- **Layered freshness for the `pi-extensions` skill (Option 1 + Option 2).**
The canonical skill was promoted into the **public `pi-extensions` package
repo** under `skill/` (co-located with the extensions it documents). A
committed snapshot in `rootfs/` is the *floor*; `Dockerfile.variant` copies
`/opt/pi-extensions/skill/` (the pinned, manifest-recorded clone) over it at
build, so a normal build ships the fresh package copy and an old-ref/mirror
build still ships the snapshot. `mempalace` is snapshot-only (its consumer
skill has no public package home — the `mempalace-toolkit` repo ships a
*different* skill, `opencode-mempalace-bridge`). Provenance + refresh steps:
`rootfs/usr/local/share/pi-devbox/skills/VENDORED.md`.
- **Smoke-test coverage** for the fallback skills: build-time presence of both
`SKILL.md`s and the `pi-extensions` helper, a check that the baked
`pi-extensions` skill matches the package copy when the clone carries it, and
runtime assertions that both are symlinked into `~/.agents/skills/`.
---
## v1.2.0 — 2026-06-22
Minor release: **image-baked agent skills** — a new base mechanism that ships
+49 -48
View File
@@ -51,7 +51,8 @@ ENV DEBIAN_FRONTEND=noninteractive
# See the bundled `dot-watch` helper for live .dot -> PNG
# re-render (handy with pi-studio's image preview).
# imagemagick — image conversion / resizing for thumbnails, etc. ~50 MB.
# yq — YAML-aware companion to jq.
# (yq is NOT apt-installed: Debian's `yq` is the unrelated Python tool;
# mikefarah's Go yq is installed as a pinned binary further down.)
# socat — TCP relay. Powers `studio-expose`, which bridges
# pi-studio's container-loopback server to the container's
# external interface so a published port can reach it.
@@ -66,7 +67,6 @@ RUN apt-get update && \
openssh-client \
gnupg \
jq \
yq \
ripgrep \
fd-find \
tree \
@@ -289,21 +289,34 @@ RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "x86_64" ;; arm64) echo "aarch64"
# Always installed in the base. Set INSTALL_MEMPALACE=false at base-build
# time to shave ~300 MB.
#
# Stall protection (fixed 2026-06-13): mempalace-mcp is launched by the
# `mempalace.ts` pi extension from mempalace-toolkit (cloned below). That
# extension now applies a per-REQUEST timeout in its JSON-RPC client and
# kills the child on stall, so a virtiofs cold-open of chroma.sqlite3 /
# HNSW load can no longer hang the pi TUI uninterruptibly. Tunables:
# Stall protection (fixed 2026-06-13; self-heal added 2026-06-25):
# mempalace-mcp is launched by the `mempalace.ts` pi extension from
# mempalace-toolkit (cloned below). That extension applies a per-REQUEST
# timeout in its JSON-RPC client and kills the child on stall, so a virtiofs
# cold-open of chroma.sqlite3 / HNSW load can no longer hang the pi TUI
# uninterruptibly. A stall-kill is no longer a permanent latch either: the
# next tool call respawns the server with capped exponential backoff (the
# budget resets on any successful response). Tunables:
# MEMPALACE_MCP_TIMEOUT_MS (default 60000), MEMPALACE_MCP_INIT_TIMEOUT_MS
# (default 120000); 0 disables. A standalone stdio-watchdog shim is NOT
# needed — the extension already owns request/response correlation. See
# CHANGELOG.md "Unreleased > Fixed".
# (default 300000 — generous so a genuine first cold-open isn't killed),
# MEMPALACE_MCP_MAX_RESPAWNS (default 2; 0 disables self-heal),
# MEMPALACE_MCP_RESPAWN_BACKOFF_MS (default 1000); timeouts of 0 disable.
# Defaults live in the extension, so no ENV is needed here. A standalone
# stdio-watchdog shim is NOT needed — the extension already owns
# request/response correlation. See CHANGELOG.md "Unreleased > Fixed".
ARG INSTALL_MEMPALACE=true
# Pin to a known-good version. Bump deliberately, not implicitly: an
# unpinned install silently swept in mempalace 3.3.x/3.4.0 with a broken
# diary_write schema (see workaround RUN below + issue #1728). Pinning
# makes mempalace upgrades a reviewable diff rather than a surprise.
ARG MEMPALACE_VERSION=3.4.0
# diary_write schema. Pinning makes mempalace upgrades a reviewable diff
# rather than a surprise.
#
# 3.5.0 (2026-06) ships the upstream fix for the top-level-anyOf diary_write
# schema (issue #1728 / PR #1717, merged 2026-06-14): the advertised schema
# is now `"required": ["agent_name"]` with entry/content enforced at dispatch,
# which Anthropic's tools API accepts — so the old mcp_server.py perl
# workaround that used to live below is gone. Keep in lockstep with
# opencode-devbox when bumping.
ARG MEMPALACE_VERSION=3.5.0
ENV UV_TOOL_DIR=/opt/uv-tools
ENV UV_TOOL_BIN_DIR=/usr/local/bin
RUN if [ "${INSTALL_MEMPALACE}" = "true" ]; then \
@@ -312,41 +325,9 @@ RUN if [ "${INSTALL_MEMPALACE}" = "true" ]; then \
/opt/uv-tools/mempalace/bin/python -c "import mempalace; print('mempalace', mempalace.__version__ if hasattr(mempalace, '__version__') else 'installed')" ; \
fi
# ── workaround: strip top-level anyOf from mempalace_diary_write schema ──
# Mempalace 3.3.x/3.4.0 advertise diary_write's input_schema with a
# top-level `anyOf: [{required:[entry]}, {required:[content]}]` to express
# "either entry or content must be supplied". Anthropic's tools API rejects
# top-level anyOf/oneOf/allOf, so pi/Claude fail at session start with
# `tools.<n>.custom.input_schema: input_schema does not support oneOf,
# allOf, or anyOf at the top level`.
#
# Patch the advertised schema to require ["agent_name", "entry"] and remove
# the anyOf block. The handler keeps accepting `content` server-side as a
# kwarg alias so existing callers still work.
#
# Idempotent and self-deactivating: once upstream releases the fix the
# regex no longer matches (and the WARN below fires) — that's the signal
# to delete this RUN.
# Upstream status (last checked 2026-06-14):
# issue #1728 — STILL OPEN (root-level anyOf rejected by Anthropic/Codex)
# PR #1735 — CLOSED UNMERGED 2026-06-11; do NOT watch it (dead)
# PR #1717 — open; the current live fix candidate to watch
# mempalace PyPI latest = 3.4.0 (== our pin) → no release contains the fix yet
# https://github.com/MemPalace/mempalace/issues/1728
# https://github.com/MemPalace/mempalace/pull/1717
# TODO: remove this RUN once a mempalace release > 3.4.0 that actually strips
# the root-level anyOf ships on PyPI and is installed by the line above.
# Keep MEMPALACE_VERSION in lockstep with opencode-devbox when bumping.
RUN if [ "${INSTALL_MEMPALACE}" = "true" ]; then \
MP_FILE="$(find /opt/uv-tools/mempalace -path '*/mempalace/mcp_server.py' | head -n1)" && \
if [ -z "$MP_FILE" ]; then echo "mempalace mcp_server.py not found" >&2; exit 1; fi && \
perl -0777 -i -pe 's/(?:[ \t]*\#[^\n]*\n)*[ \t]*"required":\s*\[\s*"agent_name"\s*\]\s*,\s*\n[ \t]*"anyOf":\s*\[\s*\n[ \t]*\{\s*"required":\s*\[\s*"entry"\s*\]\s*\}\s*,\s*\n[ \t]*\{\s*"required":\s*\[\s*"content"\s*\]\s*\}\s*,?\s*\n[ \t]*\]\s*,\s*\n/ "required": ["agent_name", "entry"],\n/s' "$MP_FILE" && \
if grep -q '"required": \["agent_name", "entry"\]' "$MP_FILE"; then \
echo "mempalace diary_write anyOf workaround: applied (or already clean)"; \
else \
echo "WARN: mempalace diary_write anyOf workaround did not match expected schema — upstream may have changed shape" >&2; \
fi ; \
fi
# (The mempalace diary_write top-level-anyOf workaround that patched
# mcp_server.py here was removed in v1.2.2 — fixed upstream in mempalace
# 3.5.0 via issue #1728 / PR #1717 (merged 2026-06-14). See CHANGELOG.md.)
# ── mempalace-toolkit — bash wrappers for session/docs mining ────────
ARG INSTALL_MEMPALACE_TOOLKIT=true
@@ -429,6 +410,26 @@ RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "x86_64" ;; arm64) echo "aarch64"
chmod +x /usr/local/bin/tldr && \
tldr --version
# ── yq (mikefarah) — YAML processor, jq's companion for YAML ─────────
# Installed as the mikefarah Go binary — NOT Debian's `yq` apt package, which
# is the unrelated Python kislyuk/yq (a jq wrapper with different syntax and
# version line, e.g. 3.x). The cloud-init repo's deploy.sh/provision.sh
# require mikefarah yq v4 (the unrelated Debian python yq is v3.x). Follows
# the repo's `latest` convention (like tealdeer/uv/etc.); the smoke test pins
# the contract to major v4, so a future yq v5 fails CI instead of silently
# breaking provision.sh. Pin a specific tag with --build-arg YQ_VERSION=vX.Y.Z.
ARG YQ_VERSION=latest
RUN ARCH=$(case "${TARGETARCH}" in amd64) echo "amd64" ;; arm64) echo "arm64" ;; *) echo "amd64" ;; esac) && \
V="${YQ_VERSION}" && \
if [ "$V" = "latest" ]; then \
V=$(curl -sI --retry 5 --retry-delay 5 --retry-all-errors "https://github.com/mikefarah/yq/releases/latest" | awk 'tolower($1)=="location:" { sub(/\r$/,"",$2); n=split($2,a,"/"); print a[n] }'); \
fi && \
[ -n "$V" ] && \
echo "Installing mikefarah yq ${V}" && \
curl -fsSL --retry 5 --retry-delay 5 --retry-all-errors "https://github.com/mikefarah/yq/releases/download/${V}/yq_linux_${ARCH}" -o /usr/local/bin/yq && \
chmod +x /usr/local/bin/yq && \
yq --version
# ── AWS CLI v2 (for SSO/Bedrock authentication) ─────────────────────
RUN ARCH=$(case "${TARGETARCH}" in \
amd64) echo "x86_64" ;; \
+22
View File
@@ -94,6 +94,28 @@ RUN set -e && \
echo "pi-fork at $(cd /opt/pi-fork && git rev-parse --short HEAD)" && \
echo "pi-observational-memory at $(cd /opt/pi-observational-memory && git rev-parse --short HEAD)"
# ── Image-baked skill refresh: pi-extensions (Option 1 over Option 2) ──
# rootfs ships a VENDORED snapshot of the pi-extensions skill at
# /usr/local/share/pi-devbox/skills/pi-extensions/ (the "floor" — guarantees the
# skill is always in the image). The pi-extensions PACKAGE repo now co-locates
# the canonical skill under skill/, so here — after the pinned clone — we copy
# that over the snapshot. Result: a normal build ships the fresh, package-owned
# copy (pinned + recorded in the manifest via PI_EXTENSIONS_REF); a build whose
# ref predates the skill, or a fork pointing at a mirror without it, still ships
# the committed snapshot. The skill calls ./evaluate-extension-usage.py, so it
# is copied alongside. Idempotent and cache-safe (depends only on the clone).
RUN if [ -f /opt/pi-extensions/skill/SKILL.md ]; then \
cp /opt/pi-extensions/skill/SKILL.md \
/usr/local/share/pi-devbox/skills/pi-extensions/SKILL.md && \
if [ -f /opt/pi-extensions/skill/evaluate-extension-usage.py ]; then \
cp /opt/pi-extensions/skill/evaluate-extension-usage.py \
/usr/local/share/pi-devbox/skills/pi-extensions/evaluate-extension-usage.py ; \
fi && \
echo "refreshed pi-extensions skill from package @ $(cd /opt/pi-extensions && git rev-parse --short HEAD)" ; \
else \
echo "pi-extensions package has no skill/ at this ref — keeping vendored snapshot" ; \
fi
# ── pi-devbox awareness: append our pointer to the global AGENTS.md ──
# pi loads a SINGLE global instruction file (~/.pi/agent/AGENTS.md), which
# pi-toolkit's install.sh re-symlinks to /opt/pi-toolkit/pi-global-AGENTS.md on
+19 -1
View File
@@ -459,6 +459,22 @@ directory, and they compose:
tmux 0-indexing, uv-first Python, and pi-studio reachability, all as
*mechanisms* (deployment-specific hostnames/domains/nameservers are
discovered at runtime, never hardcoded).
- **Vendored fallback skills.** The pi-toolkit global `AGENTS.md` tells every
pi session to read `~/.agents/skills/pi-extensions/SKILL.md` at start (to fix
fork/recall under-utilisation). That pointer would dangle in a container
started *without* the private `skillset` repo, so the image also bakes
fallback copies of **`pi-extensions`** and **`mempalace`**. They are
symlinked only when absent, so a mounted skillset always overrides them. The
`pi-extensions` skill is *layered*: a committed snapshot in `rootfs/` is the
floor, and `Dockerfile.variant` copies the canonical, package-owned copy from
the pinned `pi-extensions` clone (`/opt/pi-extensions/skill/`) over it at
build, so a normal build ships the fresh copy and an old-ref/mirror build
still ships the snapshot. `mempalace` is snapshot-only (its consumer skill
has no public package home), and because pi-toolkit's `AGENTS.md` has no
directive for it, the pi-devbox managed block adds a session-start
*proactive-load* pointer for it (gated to pi-devbox containers, conditional
on the MemPalace MCP tools) so a new container actually loads it. See
`rootfs/usr/local/share/pi-devbox/skills/VENDORED.md`.
- **Skillset repo (optional).** If a `skillset` repo is mounted (at
`$HOME/skillset` or `/workspace/skillset`, or via `SKILLSET_CONTAINER_PATH`),
`deploy-skills.sh` symlinks its skills in too. Image-baked skills are
@@ -472,7 +488,9 @@ pointer fires only inside a pi-devbox container (it checks for
To add another image-baked skill: drop a `SKILL.md` under
`rootfs/usr/local/share/pi-devbox/skills/<name>/`; the `COPY` in
`Dockerfile.base` and the entrypoint symlink loop pick it up automatically.
`Dockerfile.base` and the entrypoint symlink loop pick it up automatically. To
refresh a vendored fallback, see
`rootfs/usr/local/share/pi-devbox/skills/VENDORED.md`.
## SSH and ControlMaster
@@ -18,3 +18,14 @@ be discovered at runtime, never assumed. And interactive shell aliases
(`dssh`, `dscp`, `cat``bat`) do **not** exist in your non-interactive bash
tool, so spell out the underlying command (e.g.
`ssh -F "$HOME/.ssh-local/config" mac …`).
## Session start: load the mempalace skill
If MemPalace MCP tools (e.g. `mempalace_search`, `mempalace_diary_write`) are in
your tool list, **read `~/.agents/skills/mempalace/SKILL.md` before doing
non-trivial work** and follow its protocol: search the palace before answering
about past work, and write a diary entry before the session ends. This is
especially load-bearing here — a pi-devbox container is frequently recreated, so
the palace is your only memory across recreates. Without the habit it is just
storage, not memory. (The skill is the consumer side; feeding the palace is the
separate `opencode-mempalace-bridge` skill, if present.)
@@ -0,0 +1,47 @@
# Vendored fallback skills
Most directories here are **image-baked skills** that `entrypoint-user.sh`
symlinks into `~/.agents/skills/` on container start (only when a skill of the
same name is not already present, so a mounted `skillset` repo or a user
override always wins).
| skill | owner | how it gets here |
|-------|-------|------------------|
| `pi-devbox-environment` | pi-devbox (this repo) | authored here; the canonical copy |
| `pi-extensions` | the `pi-extensions` package repo (`skill/`) | **vendored fallback** + refreshed at build |
| `mempalace` | the `skillset` repo | **vendored fallback** (snapshot only) |
## Why fallbacks exist
The pi-toolkit global `AGENTS.md` tells every pi session to read
`~/.agents/skills/pi-extensions/SKILL.md` at start (to fix fork/recall
under-utilisation). That pointer dangles in a container started **without** the
private `skillset` repo mounted. Baking the skill closes that *availability*
gap. `mempalace` is baked for the same reason (memory continuity); since
nothing in pi-toolkit's `AGENTS.md` points to it, the pi-devbox managed block
(`pi-global-AGENTS.append.md`) also adds the matching *proactive-load*
directive ("load the mempalace skill at session start") so a new container
actually picks it up rather than relying on description-matching.
`pi-extensions`'s directive already ships in pi-toolkit's `AGENTS.md`, so only
its skill file needed baking.
## Freshness model (layered — see Dockerfile.variant)
- **`pi-extensions`** — Option 1 + Option 2. The committed copy here is the
*floor*; at build time `Dockerfile.variant` copies `/opt/pi-extensions/skill/`
(the pinned, package-owned source) over it, so a normal build ships the fresh
package copy and a stale-ref / mirror build still ships the snapshot. Keep
`evaluate-extension-usage.py` alongside `SKILL.md` — the skill calls it via
`./`.
- **`mempalace`** — Option 2 only. The `mempalace` *consumer* skill lives only
in the private `skillset` repo (the `mempalace-toolkit` repo ships a
*different* skill, `opencode-mempalace-bridge`), so there is no public
package source to copy from. This snapshot is refreshed manually per release.
## Refreshing the snapshots
cp <skillset>/skills/pi-extensions/SKILL.md pi-extensions/SKILL.md
cp <skillset>/skills/pi-extensions/evaluate-extension-usage.py pi-extensions/
cp <skillset>/skills/mempalace/SKILL.md mempalace/SKILL.md
Snapshot provenance at last refresh: skillset `8e8db64`, pi-extensions pkg `a7f3044`.
@@ -0,0 +1,325 @@
---
name: mempalace
description: MemPalace agent memory protocol. Use on every session to maintain continuity across conversations — search before answering about past work, write diary entries before session ends, and mine new projects into the palace. Load this skill at session start.
---
# MemPalace Agent Memory Protocol
## Overview
MemPalace gives you persistent memory across sessions via an MCP server. It stores project knowledge (mined from files), conversation summaries (diary entries), and entity relationships (knowledge graph). Without this protocol, you have tools but no habits — and memory without habits is just storage.
**Core principle:** Storage is not memory. Storage + protocol = memory.
## When to Load This Skill
- At the **start of every session** (proactively, before the user asks)
- When the user mentions **past conversations, decisions, or work**
- When working on a **new project or repository** for the first time
- When the user asks about **people, projects, or relationships**
## Session Lifecycle
### Phase 1: Wake Up (session start)
Run these immediately when a session begins, before responding to the user:
1. **Load palace overview:**
```
mempalace_status
```
This returns wing/room counts, the AAAK spec, and the memory protocol reminder.
2. **Read your recent diary:**
```
mempalace_diary_read(agent_name="<your_agent_name>", last_n=5)
```
Scan for context about recent sessions — what was worked on, what matters, what's pending.
3. **Check the knowledge graph** for the user or active project if relevant:
```
mempalace_kg_query(entity="<project_or_person>")
```
Do NOT announce this to the user. Just do it silently to orient yourself.
### Temporal grounding — compute time deltas, don't guess
Diary entries and drawers carry real timestamps (`timestamp`, `created_at`).
Before describing *when* something happened — "yesterday", "earlier today",
"last week", "a while back" — **establish the current date/time first and
compute the delta against the actual timestamp.** Get "now" from the injected
session date or by running `date` in a shell; never infer it.
**A container recreate or a fresh session is NOT a day boundary.** A pi-devbox
container is frequently restarted — often several times within the *same* day —
and each restart begins a new session with a fresh wake-up. Do not reason "new
session ⇒ last session was yesterday": two diary entries 90 minutes apart can
straddle a container recreate. The only authoritative clock is the timestamp on
the memory, not the session/container boundary.
**Practical rule:** prefer explicit, checkable phrasing — e.g. "earlier today,
~8h ago (both 2026-06-25)" — over a vague relative term. If you catch yourself
about to write "yesterday" / "last week", subtract `now entry.timestamp` and
state the computed result. (Remember timestamps may be UTC while the wall clock
is local — reconcile the offset before computing the delta.) Note too that
session feeders can lag up to a week (see *Multi-harness palace*), so a recent
absence in `wing_conversations` is not proof nothing happened.
### Phase 2: Active Session (during work)
#### Search Before You Speak
Before answering questions about past work, decisions, people, or projects:
```
mempalace_search(query="<keywords>", wing="<project>")
```
**Never guess about facts that might be in the palace.** Wrong is worse than slow. Say "let me check" and query.
#### Mine New Projects
When working on a new codebase for the first time:
1. Check if it's already mined:
```
mempalace_list_wings
```
2. **Decide what to mine — docs first, code never (by default).**
The palace is for *context and intent*, not code recall. Code is better read from the working tree via `Read`/`Grep`/`glob` — always authoritative, never stale. Embedding source code produces thousands of low-signal drawers (e.g. `def __init__(self, ...)` across every class) that pollute search for years.
**Mine by default:**
- `*.md`, `*.rst`, `*.txt` — docs, READMEs, CHANGELOGs, architecture notes
- `AGENTS.md`, `CLAUDE.md`, `CONTRIBUTING.md`, design/decision docs — highest signal per byte
- `*.sh`, `Dockerfile`, `Makefile`, entrypoints — small, intent-bearing
- `*.yml`, `*.yaml`, `*.toml`, selective `*.json` (`docker-compose`, `pyproject`, `mkdocs.yml`, CI workflows) — skip lockfiles
**Do NOT mine by default:**
- `*.py`, `*.ts`, `*.tsx`, `*.js`, `*.go`, `*.rs`, `*.java`, `*.cpp`, `*.c`, `*.rb` — raw source code
- Test files, fixtures, generated code
- `node_modules/`, `.venv/`, `__pycache__/`, `.mypy_cache/`, `.pytest_cache/`, `.ruff_cache/` (the miner respects `.gitignore` but double-check)
Exception: if a code file *is* the documentation (e.g. a heavily-commented reference script, or a protocol definition), file it manually via `mempalace_add_drawer`.
3. **Before mining**, inspect the repo to estimate drawer count:
```bash
# Quick audit — what will actually get mined?
find <dir> -type f \
-not -path '*/.git/*' -not -path '*/node_modules/*' \
-not -path '*/.venv/*' -not -path '*/__pycache__/*' \
\( -name '*.md' -o -name '*.sh' -o -name '*.yml' -o -name '*.yaml' \
-o -name '*.toml' -o -name 'Dockerfile*' -o -name 'Makefile' \) | wc -l
```
A docs-heavy repo should produce ~510 drawers per file. If a mine produces >15 drawers/file on average, code leaked in — investigate.
4. Run the mine:
```bash
mempalace init --yes <directory>
mempalace mine <directory> --agent <your_agent_name>
```
The miner currently lacks a `--docs-only` or `--exclude-ext` flag (as of v3.3.3). Until it does, either:
- (a) Add a `mempalace.yaml` at the repo root with explicit include globs, OR
- (b) Mine everything, then surgically remove code-sourced drawers via SQL on `~/.mempalace/palace/chroma.sqlite3` (delete by `embedding_metadata.source_file LIKE '%.py'`), followed by `mempalace repair --yes`.
5. If the CLI miner misses a file you *do* want (e.g., `.zsh`, an undocumented extension), file it manually:
```
mempalace_add_drawer(wing="<project>", room="<aspect>", content="<verbatim content>", source_file="<path>")
```
6. After mining, reconnect to pick up the new embeddings:
```
mempalace_reconnect
```
If search errors occur after mining ("Error finding id"), repair the index:
```bash
mempalace repair --yes
```
#### Track Facts in the Knowledge Graph
When you learn new facts about people, projects, or relationships:
```
mempalace_kg_add(subject="ProjectX", predicate="uses", object="PostgreSQL")
mempalace_kg_add(subject="Alice", predicate="owns", object="ProjectX", valid_from="2026-01-15")
```
When facts change (ended, no longer true):
```
mempalace_kg_invalidate(subject="Alice", predicate="works_at", object="OldCorp", ended="2026-03-01")
```
#### Cross-Reference with Tunnels
When content in one project relates to another, create a tunnel:
```
mempalace_create_tunnel(
source_wing="project_api", source_room="endpoints",
target_wing="project_db", target_room="schema",
label="API endpoints map to these DB tables"
)
```
#### Feeding opencode session history (opencode + mempalace-toolkit only)
MemPalace has no upstream integration with [opencode](https://github.com/anomalyco/opencode) as of v3.3.3 — `hooks_cli.py` only supports `claude-code` and `codex` harnesses. Opencode persists every turn in a local SQLite DB at `~/.local/share/opencode/opencode.db`, but nothing moves that data into the palace automatically.
On a machine with opencode + the [`mempalace-toolkit`](https://gitea.jordbo.se/joakimp/mempalace-toolkit) installed, session history is fed into `wing_conversations` via `mempalace-session` — either manually, or on a weekly systemd user timer / cron schedule shipped in `mempalace-toolkit/contrib/`. If this is missing, opencode conversations exist only in the local SQLite DB and are invisible to `mempalace_search`.
**How to tell if it's set up:**
```
mempalace_list_wings
```
If `wing_conversations` exists and has a drawer count comparable to the user's opencode session count, session feeding is working. If it's empty or suspiciously small, suggest:
1. Check if the toolkit is installed: `which mempalace-session`.
2. If installed, suggest running `mempalace-session --dry-run` to preview and `mempalace-session` to file.
3. If not installed, point the user at `gitea.jordbo.se/joakimp/mempalace-toolkit` for setup.
**Don't try to paper over the gap by dumping turn-level content into the palace manually via `mempalace_add_drawer`** — that reinvents what `mempalace-session` does with normalization and dedup. Use the tool.
Full routine (triggers, cadence, automation) is in the [`opencode-mempalace-bridge`](https://gitea.jordbo.se/joakimp/mempalace-toolkit) skill and the toolkit's `ARCHITECTURE.md` §5. The two skills pair: this one (`mempalace`) covers using the palace; that one (`opencode-mempalace-bridge`) covers feeding it from opencode.
### Phase 3: Wind Down (session end)
**Always write a diary entry before the session ends.** This is the most important habit.
```
mempalace_diary_write(
agent_name="<your_agent_name>",
entry="<AAAK compressed summary>",
topic="session-summary"
)
```
#### Why still write diaries when sessions may be mined automatically?
On machines running opencode + `mempalace-toolkit`, every session is mined into `wing_conversations` on a weekly (or user-defined) schedule. A common and incorrect conclusion: *"since every turn is captured automatically, writing a diary entry is redundant."* It isn't.
Session mining captures **what was said** (every turn, verbatim). A diary captures **what the session meant** — editorial judgment by the agent who lived it:
- Lessons learned, patterns noticed, pending items rolled forward
- Meta-observations that were never said aloud during the session
- Aggregate counts (commits shipped, bugs fixed, hours spent)
- A compressed, recency-scannable summary for the *next* agent's wake-up
Mining raw turns cannot surface these because the words don't exist verbatim — they're the agent's reflection at wind-down. Think of the split as *release notes* (diary) vs. *git log with diffs* (session mine): a repo keeps both because they answer different questions. So does the palace.
**Practical rule:** automated mining does not replace Phase 3. Both systems cover each other's failure modes — a skipped diary is recovered from the raw turns; a missed mine is recovered from the diary summary. For the full treatment (comparison table, retrieval patterns, token economics), see [`mempalace-toolkit/ARCHITECTURE.md` §5 → "Diary vs session mine: why keep both?"](https://gitea.jordbo.se/joakimp/mempalace-toolkit/src/branch/main/ARCHITECTURE.md#diary-vs-session-mine-why-keep-both).
#### AAAK Diary Format
Write diary entries in compressed AAAK format for efficiency. Structure:
```
SESSION:<date>|<what.you.worked.on>|
TASKS:
1.<task.description>→<outcome>|
2.<task.description>→<outcome>|
DISCOVERED:<unexpected.findings>|
ENTITIES:<people.or.projects.encountered>|
<importance: one to five stars>
```
Example:
```
SESSION:2026-04-28|api.refactor+db.migration|
TASKS:
1.refactored.auth.endpoints→split.into.3.modules|
2.added.user.roles.migration→postgres.enum.type|
DISCOVERED:legacy.session.table.unused.since.v2|
ENTITIES:ProjectX;Alice(reviewer)|
***
```
Rules:
- Use dots instead of spaces within phrases
- Use pipes as field separators
- Use arrows for cause/effect or transitions
- Stars indicate session importance (one to five)
- Keep it tight — a future agent should get the gist in seconds
#### What to Capture
Prioritize recording:
- **Decisions made** and their rationale
- **Discoveries** — things that surprised you or that a future session needs to know
- **Unfinished work** — what's pending, what was deferred
- **User preferences** observed during the session
- **Entities encountered** — people, projects, tools, services
### Phase 4: Fact Updates
If facts changed during the session, update the knowledge graph before writing the diary:
```
mempalace_kg_invalidate(subject="...", predicate="...", object="...", ended="<today>")
mempalace_kg_add(subject="...", predicate="...", object="...", valid_from="<today>")
```
## Palace Structure
### Wings
Wings are top-level categories, typically one per project or domain:
- Named after the project directory (e.g., `cli_utils`, `opencode_devbox`)
- Agent diaries live in `wing_<agent_name>` (e.g., `wing_orchestrator`, `wing_pi`)
#### Multi-harness palace
A single palace can be fed by multiple coding-agent harnesses. On this machine the palace is shared between **opencode** and **pi** (Mario Zechner's pi-coding-agent). Implications:
- **`wing_conversations` mixes sources.** Both harnesses' session feeders write into the same wing. To tell them apart, look at the `source_file` metadata on each drawer:
- `pi_<uuid>.jsonl` → pi session
- `<slug>_ses_<id>.jsonl` → opencode session
- The first chunk of each session also carries a `| source: opencode` or `| source: pi` marker in the synthetic header line.
- **Other wings may belong to other harnesses.** For example `wing_pi` is pi's diary, not opencode's. Don't assume every diary entry was written by you — check `agent_name` on the entry.
- **Session feeders run on different schedules.** Pi sessions are fed Tue 03:00, opencode sessions Mon 03:00. Recent sessions from either harness can lag the palace by up to a week, so absence-of-evidence in `wing_conversations` is not evidence-of-absence for recent work.
- **Reading another harness's diary is useful.** When orienting after a gap, `mempalace_diary_read agent_name=pi` (or whichever sibling agent has been active) often gives a fresher picture than waiting for the conversations feeder to catch up.
### Rooms
Rooms are aspects within a wing:
- `fzf`, `scripts`, `configuration`, `general` — whatever the miner detects
- Diary entries go into rooms by topic tag
### Drawers
Drawers hold verbatim content — never summarized, always searchable.
### Tunnels
Cross-wing connections linking related content across projects.
### Knowledge Graph
Entity-relationship triples with temporal validity. Query with `mempalace_kg_query`, browse with `mempalace_kg_timeline`.
## Troubleshooting
| Problem | Fix |
|---|---|
| "No palace found" | Run `mempalace init <dir>` then `mempalace mine <dir>` |
| "Error finding id" after mining | Run `mempalace repair --yes` then `mempalace_reconnect` |
| Search returns irrelevant results | Use `max_distance=1.0` for stricter matching; add `wing` filter |
| Miner skips file types | File manually with `mempalace_add_drawer` or use `--no-gitignore` |
| Stale results after external changes | Call `mempalace_reconnect` |
## Anti-Patterns
- **Don't guess when you can search.** If a question touches past work, search first.
- **Don't infer elapsed time from session or container boundaries.** A restart isn't a new day. Compare the actual timestamp (`timestamp` / `created_at`) against the current date/time before saying "yesterday", "last week", etc.
- **Don't skip the diary.** A session without a diary entry is a session forgotten.
- **Don't summarize drawer content.** File verbatim — the embedding model needs the original words.
- **Don't mine .git directories or node_modules.** The CLI miner respects .gitignore by default.
- **Don't create duplicate drawers.** Use `mempalace_check_duplicate` before adding manually.
- **Don't treat the palace as a task list.** It's for knowledge and context, not todos.
@@ -0,0 +1,298 @@
---
name: pi-extensions
description: >-
Use the pi extensions (pi-fork, pi-observational-memory, ssh-controlmaster) effectively in the pi coding agent harness. Load this skill only when running inside pi (detection - `fork` and `recall` are present in your tool list, or `pi --ssh` was used to start the session). pi-fork dispatches focused subtasks to forked agents at fast/balanced/deep effort tiers; pi-observational-memory compacts long sessions into recallable observations + reflections; ssh-controlmaster rewires pi's read/write/edit/bash tools to execute on a remote host over a multiplexed SSH connection. This skill covers tier selection, task design, boundary discipline, when to use recall, and remote-pi mechanics.
---
# Pi Extensions: pi-fork, pi-observational-memory, ssh-controlmaster
## When to Load This Skill
Load only when **both** of these are true:
1. You are running inside the **pi coding agent harness** (not Claude Code, not opencode, not any other harness).
2. The `fork` and/or `recall` tools appear in your available tool list, **or** the session was started with `pi --ssh ...`.
If you do not see those tools, this skill does not apply — skip it. Other harnesses do not have these extensions and the patterns below will not work there.
This skill is most useful at the start of any non-trivial session where you may need to dispatch parallel subtasks, where the conversation is likely to compact (sessions running > ~80k tokens), or where pi is operating against a remote host.
## Pi extension landscape (where the wiring lives)
Pi has **two distinct extension locations** and it's easy to look in the wrong one:
| Location | Mechanism | Examples |
|---|---|---|
| `~/.pi/agent/extensions/*.ts` (or `.ts.off`) | **Local extensions** — TypeScript files, usually symlinks into `/opt/pi-extensions/extensions/` or similar. Toggled via `/ext` slash command. | `ssh-controlmaster`, `git-checkpoint`, `notify`, `todo`, `mempalace`, `mcp-loader`, `ext-toggle`, `confirm-destructive` |
| `~/.pi/agent/git/<host>/<owner>/<repo>/` | **Package extensions** — git-cloned npm packages registered via the `packages` array in `~/.pi/agent/settings.json`. | `pi-fork` (`github.com/elpapi42/pi-fork`), `pi-observational-memory` (`github.com/elpapi42/pi-observational-memory`, **default branch `master`** — a `main` branch does not exist, so `pi install git:...` resolves against `master`) |
When the user asks how to use "the X extension", **check both locations**`find ~/.pi/agent -maxdepth 4 -name "*X*"` covers both. The `/ext` slash command shows the local-extensions list with enable/disable state. There is also a distinct skill-bundled-script category (e.g. `ci-release-watcher`'s `ssh-control-master-setup.sh`) which is **not** a pi extension at all — it's a helper script inside a skill. Don't conflate the three.
## Why These Extensions Belong Together
pi-fork and pi-observational-memory are symbiotic. **pi-fork burns context** (each fork dispatches a focused subtask whose detailed exploration would otherwise pollute your main thread). **pi-observational-memory preserves context** (when the main thread eventually compacts, observations + reflections survive the fold and can be recalled by ID). Aggressive forking only works long-term if the surviving summary is high-fidelity, and OM only earns its keep when it's preserving genuinely valuable distilled work.
ssh-controlmaster is orthogonal but composes cleanly: when pi is operating remotely, fork still spawns local sub-agents (each fork *itself* doesn't ssh), but their `bash`/`read`/`write`/`edit` calls do — see Part 3 caveats.
---
## Part 1: pi-fork
### Effort tier mapping
Configured in `~/.pi/agent/settings.json` under `pi-fork.effortProfiles`. The conventional mapping is:
| Tier | Model | Use for |
|---|---|---|
| `fast` | haiku | mechanical edits, narrow lookups, file-listing, single-fact verification, simple syntactic checks |
| `balanced` | sonnet (default) | normal exploration, implementation, testing, code review, option analysis |
| `deep` | opus | architecture decisions, security analysis, concurrency reasoning, ambiguous debugging, high-risk reviews, runbook drafting where subtle mistakes are costly |
**Rule of thumb:** start at `balanced` unless you have a specific reason to go up or down. Going too cheap on a deep task wastes a fork; going too expensive on a mechanical task is just slow.
### When to fork vs. do it yourself
Fork when **any** of:
- The task requires reading many files whose contents you don't need to keep in your main context afterwards (the fork returns a dense summary; raw file contents stay in the fork's context and are discarded).
- You want to run multiple analyses in **parallel** (especially: comparing N options, where independent reasoning is itself a signal — see "parallel forks" below).
- The task is well-scoped enough to specify completely up front and well-bounded enough that returning a dense report is more useful than continuing the dialogue.
- You are about to do something that would burn a lot of tokens on tool calls (long file reads, many bash invocations) whose output you will mostly discard.
Don't fork when:
- The work fits in your current context budget without crowding out what comes next.
- The task is exploratory and you'll need to iterate based on what you find (forking turns iteration into round-trips with full task-spec rewrites).
- You need to make decisions during the work that depend on context only the main thread has.
### Task design: the four things a fork brief must contain
1. **Verified context up front.** Do not say "go look at the codebase and figure out X". Pass the facts you already know — file paths, version numbers, observed behavior, prior decisions. The fork should be reasoning *from* context, not *finding* context. Discovery work costs the fork tokens that don't come back to you.
2. **A specific deliverable.** "Analyze X" is too vague. "Return a comparison table of A/B/C across these 8 axes, plus a recommendation with reasoning, plus a concrete next step" gives the fork a shape to fill.
3. **Decision authority.** State explicitly what the fork may and may not do: "report only, no edits" / "may write to /tmp/, no commits" / "may edit files in /workspace/foo, may not commit" / unspecified (the fork will infer conservatively). **State this even when it seems obvious.** See "Boundary discipline" below.
4. **What "unsure" looks like.** Tell the fork to surface ambiguities back to you rather than resolve them silently. "Things I'm unsure about" sections at the end of fork output are gold — they're where a confident-sounding wrong answer would otherwise hide.
### Parallel forks for option-comparison
When facing a "which approach should we take" question with 24 candidate approaches, dispatching the candidates as parallel forks is high-leverage:
- They reason **independently**. No fork sees the others' work.
- **Convergence is signal.** If three forks at different effort tiers reach the same recommendation citing different evidence, that's a strong validation that doesn't depend on any one model's bias.
- **Divergence is also signal.** If one disagrees, read its reasoning carefully — it may have spotted something the others missed, or it may have a tier-specific weakness worth knowing.
Sample shape for an option-comparison call:
- Fork 1 (deep) — detailed runbook for option A, with timing/risk/rollback
- Fork 2 (balanced) — comparison table A vs B vs C across N axes, with a recommendation
- Fork 3 (fast) — focused sub-question (e.g., "which container image / library version / CLI flag")
This costs more than a single fork but the cross-validation is often worth it for decisions you'll execute on prod systems.
### Boundary discipline (observed behavior)
Forks **mostly** honor explicit decision-authority instructions, but not infallibly. Observed pattern from real sessions:
- **Pure analysis tasks** (no write authority, "report only") — high compliance. Forks reliably return analysis without editing files or committing.
- **Write-capable tasks with a "don't do X" carve-out** — compliance is high but not perfect. Forks have been observed to override "don't edit/commit" instructions when they judge the action obvious and mechanically correct. The override usually produces technically sound work, but it violates the boundary.
**Practical rules:**
- State decision authority explicitly, every time, even when "report only" feels redundant.
- For high-stakes write authority, verify the fork's actions afterwards (`git status`, `git log -1`, file diffs) rather than assuming compliance.
- If a boundary violation is unacceptable (e.g., compliance review, sandboxed exploration, "don't touch prod"), do not give the fork write tools at all — keep it strictly in analysis mode.
- The fact that the fork was "right anyway" is not the same as the fork having followed instructions.
### Anti-patterns
- **Forking trivial work.** A fork has overhead. If the task takes < 30 seconds in your main thread, just do it.
- **Vague briefs.** "Look into the database thing" returns vague output. The fork is not telepathic.
- **Forking iterative work.** Forks are one-shot. If you need to iterate, you'll re-spec the task each time — usually worse than doing it yourself.
- **Recursive forking** (forks spawning forks). Disabled by default and should stay disabled unless you have a specific batch-fanout use case.
- **Treating fork output as ground truth without verification.** Especially for cited code/commit hashes/URLs — forks can hallucinate these like any LLM. Spot-check decisive evidence.
---
## Part 2: pi-observational-memory
### How it actually works
Observational memory (OM v3, "session-ledger" architecture) runs an **observer agent** in the background as your conversation grows. When token thresholds are crossed (defaults: observe at 10k, reflect at 20k, compact at 81k), the observer distills the recent transcript into:
- **Observations** — timestamped events, each with a 12-character hex ID like `[3682ebfad7af]`. Compact one-liners describing what happened in the conversation.
- **Reflections** — durable, long-lived facts about the user, project, decisions, and constraints. Some reflections include observation IDs as evidence pointers.
When compaction fires, the raw transcript is folded away and replaced with a structured summary block containing the observations + reflections. **You — the next turn of the same agent — receive that summary block as your starting context.** That's the recovery mechanism.
**Storage is in-transcript, not on disk.** Do not grep for `observations.jsonl` or similar files; you will not find them. The artifact lives in the model's input context window.
Configuration lives in `~/.pi/agent/settings.json` under `observational-memory`. Tune `observeAfterTokens`, `reflectAfterTokens`, `compactAfterTokens`, and `observationsPoolMaxTokens` if observations feel sparse or noisy. The default 81k compaction threshold is well-calibrated for typical multi-task sessions.
### The `recall` tool
`recall(<12-char-hex-id>)` resolves a specific observation or reflection ID back to the original source context — the exact bash output, file contents, tool call results, commit message, or transcript fragment that the observation was distilled from.
**Use recall when:**
- You are about to make a decision that depends materially on a compacted observation or reflection whose details are unclear.
- You need exact wording, paths, commands, errors, commits, or user constraints behind a remembered claim.
- A broad reflection is relevant but you need its supporting observations to act safely.
- The user asks "why do you believe X" or "what supports that memory".
**Do not use recall for:**
- Semantic search (it's keyed by ID, not topic — you must already have a specific 12-char hex ID).
- Browsing the transcript out of curiosity.
- Preemptive lookup of every ID in your context "just in case".
Recall costs tokens. Use it when exact source context will materially change your next action.
> **Calibration note (from a real ~1-month trial, 2026-05/06):** across 20 logged container sessions, `recall` was invoked **0 times** while obsmem passively carried 529 observations across 6 compactions. Zero recall is a *warning sign*, not a badge of efficiency — it means decisions after a compaction were made on the distilled one-liner alone, without ever re-checking the source. The injected summary is **lossy by design**. Default habit to adopt: when you are about to **edit code, ship a change, or assert a fact** that rests on a `[high]`/`[critical]` observation or a reflection you did not produce *this* turn, `recall` its ID **first**. One recall before a load-bearing action is cheap; redoing finished work or contradicting a prior correction is not.
### Reading the compaction summary
When you see a block like `The conversation history before this point was compacted into the following summary:` at the start of a session or turn, that's OM output. Standard structure:
- **Reflections** at the top: stable facts. Some have IDs in brackets.
- **Observations** below, chronological: timestamped events with IDs in brackets and importance markers (`[high]`, `[critical]`, etc.).
When entries conflict, **the most recent observation reflects the latest known state.** Work that prior observations describe as completed should not be redone unless the user explicitly asks to revisit it.
### Anti-patterns
- **Treating compacted memory as definitive without recall** when stakes are high. Compaction is lossy; the observation may have lost a constraint that was on the line above it in the original transcript.
- **Recalling every ID preemptively.** Wasteful. Recall on demand.
- **Assuming the disk holds OM artifacts.** It doesn't. Don't waste time looking.
- **Ignoring the summary block** when starting a session. It's there because the prior session was real work — read it before answering questions about past work.
---
## Quick Reference
```
fork(task=..., effort=fast|balanced|deep)
- state decision authority explicitly
- pass verified context up front
- specify deliverable shape
- ask for "unsure about" section
recall(id=<12-char-hex>)
- only when stakes justify the cost
- id must already be visible in your context
- not a search tool
```
```
~/.pi/agent/settings.json
pi-fork.effortProfiles — model + thinking-depth per tier
pi-fork.defaultEffort — usually "balanced"
observational-memory.* — token thresholds, model, agentMaxTurns
observational-memory.debugLog: true — opt-in NDJSON telemetry at
~/.pi/agent/observational-memory/debug/<session>.ndjson (off by default)
```
### Installing on a fresh machine (host)
These are git-sourced pi packages (pi-fork is **not** on npm). Add to the
`packages` array in `~/.pi/agent/settings.json`, or:
```
pi install git:github.com/elpapi42/pi-fork
pi install git:github.com/elpapi42/pi-observational-memory # default branch: master (no main)
# obsmem is also published: pi install npm:pi-observational-memory
```
Restart pi after install. Enable `observational-memory.debugLog` if you want
the next window instrumented.
### Evaluating usage
`evaluate-extension-usage.py` (bundled next to this skill) mines pi session
transcripts for fork/recall counts and obsmem compaction stats. Run it per
machine (transcripts live at `~/.pi/agent/sessions/`) for a combined
host+container picture:
```
./evaluate-extension-usage.py # ~/.pi/agent/sessions
./evaluate-extension-usage.py /path/a /path/b # multiple roots
```
---
## Part 3: ssh-controlmaster
### What it does
When pi is launched with `--ssh`, this extension **rewires pi's `read`, `write`, `edit`, and `bash` tools to execute on the remote machine**, multiplexed over a single SSH ControlMaster socket. Pi is still running locally — the LLM, the UI, the MCP servers, the fork dispatcher all live on your local box — but anything those tools touch on the filesystem is the *remote's* filesystem.
This is fundamentally different from running pi locally and using `bash` to ssh inside it: with `--ssh`, the tool layer itself is remoted, so the LLM thinks it's working in the remote's `cwd` (the system prompt is rewritten to say so).
### Usage
```bash
# Key-based auth (preferred), remote cwd defaults to remote $HOME
pi --ssh lagret
# Pin to a specific remote directory
pi --ssh lagret:/volume1/docker/portainer/compose/119
# Password auth (input is NOT masked when typing)
pi --ssh user@host --ssh-ask-pass
```
The `lagret` form requires a `Host lagret` block in `~/.ssh/config` or a resolvable hostname. The status bar shows `SSH ⚡ own master <host>:<cwd>` or `SSH ⚡ system master <host>:<cwd>` once connected.
### How it cooperates with system SSH config
It reads `ssh -G <host>` to learn the effective config, then:
| `~/.ssh/config` for the host | Behavior |
|---|---|
| `ControlMaster auto` or `yes` with a `ControlPath` | Reuses the system master socket. Does **not** tear it down on pi exit ("it was the system's to manage before pi arrived"). |
| No ControlMaster configured (or explicitly `no`) | Creates its own master at `/tmp/pi-cm-<pid>.sock` with `ControlPersist=yes`. Tears it down on pi `session_shutdown`. |
This means it composes cleanly with the system-wide `ssh-control-master-setup.sh` helper from the `ci-release-watcher` skill: if that script has already configured `~/.ssh/config` for the host, `pi --ssh` rides on the existing master rather than opening a parallel connection.
### Caveats and edge cases
- **Local vs remote tool boundary.** Only `read`/`write`/`edit`/`bash` are remoted. **MCP servers are still local**`mempalace` files drawers and diary entries against the local palace even when your shell work happens remotely. Same for `fork`, `recall`, `todo`, and any other custom tool. This is usually what you want (palace memory survives across remote sessions) but worth knowing.
- **fork over ssh.** Forks spawn locally and inherit the same `--ssh` mode by virtue of the parent's tool wiring; the fork's bash calls hit the same ControlMaster. Forks burn the same SSH socket, not a parallel one — multiplexing wins again.
- **macOS Unix socket path limit.** The own-master socket lives at `/tmp/pi-cm-<pid>.sock` to stay under macOS's ~104-char limit. If you have a non-default `TMPDIR` long enough to blow this, ssh will fail to start the master.
- **Password auth password visibility.** From the source: *"input is NOT masked — the password is visible while typing."* The password is written to a chmod-700 SSH_ASKPASS script in `/tmp` and deleted after the master establishes; not persisted, but on-screen during entry.
- **Remote bash environment.** The remote shell is whatever `ssh user@host '<cmd>'` invokes — typically a non-login non-interactive bash. Don't expect `~/.bashrc` aliases or PATH manipulations from `~/.profile`. Pin tool paths or invoke via `bash -lc '...'` if you need login-shell behavior.
- **Path translation is naive.** The extension does `path.replace(localCwd, remoteCwd)` to translate paths in tool calls. If the LLM emits an absolute remote path that doesn't share the local-cwd prefix, the path is passed through unchanged — usually fine but pathological for paths that happen to contain the local-cwd substring.
### When to use it
- Editing configs on a NAS / homelab host without scp ping-pong (`pi --ssh lagret:/volume1/...`)
- Operating against a host whose tools/data you need but whose disk is too slow to mount via SSHFS
- Investigating runner state, container configs, etc., on a remote host as if local
- Multi-step remote work where opening a fresh ssh connection per step would burn your CGNAT flow budget
### Anti-patterns
- **Using `pi --ssh` for one-off shell work.** Just `ssh` directly. The extension shines when there are dozens of tool calls per session.
- **Filing palace drawers expecting them on the remote.** They go to the local palace. If you want palace artifacts on the remote host, ssh into the remote and run pi *there* against its local palace.
- **Forgetting `--ssh` in followup sessions.** Status bar is the canary — if you don't see `SSH ⚡` you're operating locally despite intending remote. Easy mistake on a fresh terminal.
### Reaching the devbox host from inside the container (`dssh` / `dscp`)
Distinct from `pi --ssh` above. When the **pi-devbox container** runs under OrbStack / Docker Desktop on macOS, it can SSH back to its own host. The entrypoint's `setup-lan-access.sh` regenerates `~/.ssh-local/config` on **every container start** (the in-container `~/.ssh` is mounted read-only, so a sidecar config + `known_hosts` + `ControlPath` under `~/.ssh-local/` is used instead).
```bash
# Interactive shells get aliases (from ~/.bash_aliases):
dssh host 'cmd' # = ssh -F ~/.ssh-local/config host
dscp file host:/path # = scp -F ~/.ssh-local/config ...
```
**The agent's `bash` tool is non-interactive — those aliases are NOT loaded.** Use the explicit form:
```bash
ssh -F ~/.ssh-local/config host 'cmd'
scp -F ~/.ssh-local/config <src> host:<dst>
```
- Host aliases `host` and `mac` both resolve to `host.docker.internal` (user varies per host machine — check `~/.ssh-local/config` for the active `User` value, key `~/.ssh-local/devbox_jump_ed25519`, `ControlMaster auto` / `ControlPersist 4h`).
- The config chains `Include ~/.config/devbox-shell/ssh-lan.conf` then `Include ~/.ssh/config`, so LAN targets are reachable too (add `ProxyJump host` to those entries).
- **Use it for:** enabling/inspecting the host's pi config (`~/.pi/agent/settings.json`), running `evaluate-extension-usage.py` against the host's `~/.pi/agent/sessions/` for a combined host+container metric, or copying host transcripts into the container. The host's pi runs natively there; its palace, sessions, and extensions are separate from the container's.
---
## Cross-Skill Notes
- **mempalace** is for cross-session persistent memory (diary, knowledge graph, drawer storage). OM is for **within-session** context survival across compaction. They complement each other: write a diary entry at session end *and* let OM compact your work-in-progress mid-session.
- **systematic-debugging** and **test-driven-development** skills pair well with deep-tier forks: a deep fork can carry out a focused debugging investigation or write a failing test suite without polluting your main context.
- **ci-release-watcher** ships a `scripts/ssh-control-master-setup.sh` helper that configures system-wide SSH ControlMaster in `~/.ssh/config`. That's a separate mechanism from the `ssh-controlmaster` pi extension — they compose, they don't overlap. Use the script for persistent host-wide multiplexing, the extension for per-pi-session remote operation.
@@ -0,0 +1,117 @@
#!/usr/bin/env python3
"""Evaluate pi-fork / pi-observational-memory usage from pi session transcripts.
Mines pi's session .jsonl transcripts and reports:
- per-tool call counts (highlighting `fork` and `recall`)
- per-session fork/recall breakdown
- obsmem passive activity: compaction events, observations carried,
relevance-tier distribution, tokensBefore
Works on any machine. Point it at one or more session roots; by default it
scans ~/.pi/agent/sessions (the standard pi location, host or container).
Usage:
./evaluate-extension-usage.py # ~/.pi/agent/sessions
./evaluate-extension-usage.py /path/to/sessions ... # explicit roots
./evaluate-extension-usage.py --host HOST /path ... # label a root (for combined host+container runs)
For a true host+container picture, run once per machine (or copy each
machine's ~/.pi/agent/sessions here) and pass all roots together.
"""
import json, sys, os, glob, re, collections, argparse
TIER_RE = re.compile(r'\[(low|medium|high|critical)\]')
OBS_LINE_RE = re.compile(r'^\[[0-9a-f]{12}\] ', re.M)
def walk_tools(x, counter):
if isinstance(x, dict):
tn = x.get("toolName")
if tn:
counter[tn] += 1
for v in x.values():
walk_tools(v, counter)
elif isinstance(x, list):
for v in x:
walk_tools(v, counter)
def analyze(roots):
files = []
for r in roots:
if os.path.isfile(r) and r.endswith(".jsonl"):
files.append(r)
else:
files += glob.glob(os.path.join(r, "**", "*.jsonl"), recursive=True)
files = sorted(set(files))
tool_total = collections.Counter()
per_session = []
compactions = []
for f in files:
tc = collections.Counter()
with open(f, errors="ignore") as fh:
for ln in fh:
ln = ln.strip()
if not ln:
continue
try:
o = json.loads(ln)
except Exception:
continue
walk_tools(o, tc)
if o.get("type") == "compaction":
s = o.get("summary", "") or ""
compactions.append({
"file": os.path.basename(f),
"tokensBefore": o.get("tokensBefore"),
"observations": len(OBS_LINE_RE.findall(s)),
"tiers": dict(collections.Counter(TIER_RE.findall(s))),
})
tool_total.update(tc)
per_session.append((os.path.basename(f)[:10], tc.get("fork", 0),
tc.get("recall", 0), sum(tc.values())))
return files, tool_total, per_session, compactions
def main():
ap = argparse.ArgumentParser()
ap.add_argument("roots", nargs="*",
default=[os.path.expanduser("~/.pi/agent/sessions")])
args = ap.parse_args()
files, tool_total, per_session, comp = analyze(args.roots)
if not files:
print("No .jsonl transcripts found under:", args.roots, file=sys.stderr)
sys.exit(1)
print(f"=== {len(files)} transcripts under {args.roots} ===\n")
print("Tool call totals:")
for t, c in tool_total.most_common():
mark = " <== pi-fork" if t == "fork" else (" <== obsmem recall" if t == "recall" else "")
print(f" {c:6d} {t}{mark}")
fk = tool_total["fork"]; rc = tool_total["recall"]
fk_sess = sum(1 for p in per_session if p[1])
rc_sess = sum(1 for p in per_session if p[2])
print(f"\npi-fork: {fk} calls across {fk_sess} sessions")
print(f"recall: {rc} calls across {rc_sess} sessions"
+ (" (!) zero recall over the window — see SKILL.md calibration note" if rc == 0 else ""))
if comp:
tot_obs = sum(c["observations"] for c in comp)
tb = [c["tokensBefore"] for c in comp if c["tokensBefore"]]
print(f"\nobsmem passive: {len(comp)} compactions, {tot_obs} observations carried"
+ (f", avg tokensBefore {sum(tb)//len(tb):,}" if tb else ""))
agg = collections.Counter()
for c in comp:
agg.update(c["tiers"])
if agg:
print(" relevance tiers:", dict(agg))
else:
print("\nobsmem passive: no compaction events found "
"(short sessions, or obsmem not active on these transcripts)")
if __name__ == "__main__":
main()
+19 -1
View File
@@ -76,7 +76,7 @@ run "mempalace-mcp" "mempalace-mcp --help"
run "pandoc" "pandoc --version"
run "graphviz (dot)" "dot -V"
run "imagemagick" "magick --version"
run "yq" "yq --version"
run "yq (mikefarah v4)" "yq --version | grep -qE 'mikefarah.*version v4'"
run "tldr (tealdeer)" "tldr --version"
run "socat" "socat -V"
run "studio-expose helper" "test -x /usr/local/bin/studio-expose"
@@ -86,6 +86,20 @@ run "global-AGENTS append snippet present" \
"test -f /usr/local/share/pi-devbox/pi-global-AGENTS.append.md"
run "pi-devbox block merged into pi-global-AGENTS.md" \
"grep -q 'pi-devbox:managed-block' /opt/pi-toolkit/pi-global-AGENTS.md"
run "mempalace session-start pointer merged into global AGENTS.md" \
"grep -q 'load the mempalace skill' /opt/pi-toolkit/pi-global-AGENTS.md"
# Vendored fallback skills (so a no-skillset container still resolves the
# AGENTS.md 'read the pi-extensions skill' pointer).
run "image-baked pi-extensions fallback skill" \
"test -f /usr/local/share/pi-devbox/skills/pi-extensions/SKILL.md"
run "pi-extensions skill ships its helper" \
"test -f /usr/local/share/pi-devbox/skills/pi-extensions/evaluate-extension-usage.py"
run "image-baked mempalace fallback skill" \
"test -f /usr/local/share/pi-devbox/skills/mempalace/SKILL.md"
# Layered freshness: when the pinned pi-extensions clone carries the skill, the
# baked copy must be the fresh package copy (Option 1), not the stale snapshot.
run "pi-extensions skill refreshed from package when present" \
"if [ -f /opt/pi-extensions/skill/SKILL.md ]; then cmp -s /opt/pi-extensions/skill/SKILL.md /usr/local/share/pi-devbox/skills/pi-extensions/SKILL.md; else true; fi"
# ── tmux 0-indexing (required for pi-studio variants) ─────────────────
echo ""
@@ -163,6 +177,8 @@ for i in $(seq 1 45); do
test -L /home/developer/.pi/agent/keybindings.json && \
test -L /home/developer/.pi/agent/extensions/mempalace.ts && \
test -L /home/developer/.agents/skills/pi-devbox-environment && \
test -L /home/developer/.agents/skills/pi-extensions && \
test -L /home/developer/.agents/skills/mempalace && \
count=$(ls -1 /home/developer/.pi/agent/extensions/*.ts 2>/dev/null | wc -l) && \
[ "$count" -ge 4 ]
' >/dev/null 2>&1; then
@@ -185,6 +201,8 @@ exec_test "extensions ≥ 4 (pi-extensions)" 'count=$(ls -1 $HOME/.pi/age
exec_test "mempalace.ts bridge" 'test -L $HOME/.pi/agent/extensions/mempalace.ts && echo ok'
exec_test "settings.json bootstrapped" 'test -f $HOME/.pi/agent/settings.json && echo ok'
exec_test "pi-devbox-environment skill linked" 'test -L $HOME/.agents/skills/pi-devbox-environment && test -f $HOME/.agents/skills/pi-devbox-environment/SKILL.md && echo ok'
exec_test "pi-extensions skill linked (fallback)" 'test -L $HOME/.agents/skills/pi-extensions && test -f $HOME/.agents/skills/pi-extensions/SKILL.md && echo ok'
exec_test "mempalace skill linked (fallback)" 'test -L $HOME/.agents/skills/mempalace && test -f $HOME/.agents/skills/mempalace/SKILL.md && echo ok'
# pi-fork + pi-observational-memory are registered by entrypoint-user.sh via
# `pi install /opt/<pkg>`, which runs slightly after the keybindings marker.