mempalace-toolkit/extensions/pi/README.md
pi e12b624cf7 feat(pi-ext): self-healing respawn + scoped init timeout for mempalace-mcp
A stall-kill (or any crash) of mempalace-mcp was a permanent latch:
available flipped off and stayed off until pi restart. Now the next tool
call transparently respawns the server and retries.

- ensureAlive(): bounded respawn with capped exponential backoff
  (MEMPALACE_MCP_MAX_RESPAWNS, default 2; MEMPALACE_MCP_RESPAWN_BACKOFF_MS,
  default 1000). Respawn budget resets on any successful JSON-RPC response,
  so a recovered server regains full patience while a persistently-broken
  one hits the cap and stays down (no hot-loop).
- Init timeout default raised 120000 -> 300000 (scoped to init only): a
  genuine virtiofs cold-open shouldn't be killed mid-progress only to
  respawn and re-pay the same cost. Per-call timeout stays 60000.
- Concurrency hardening: generation counter so a late exit from a killed
  old process can't tear down a fresh respawn; explicit healthy flag
  replaces racy proc!=null liveness check.
- README: document self-heal, new env vars, and why generous-init +
  bounded-respawn compose rather than overlap.
2026-06-26 00:22:21 +02:00


# pi ↔ MemPalace MCP bridge
The canonical source of `~/.pi/agent/extensions/mempalace.ts` — the TypeScript
extension that wires [MemPalace](https://github.com/MemPalace/mempalace)'s MCP
server into the [pi coding-agent](https://github.com/earendil-works/pi)
harness. Installs wake-up context injection, per-tool schema passthrough,
and a `/mempalace-diary` slash-command.

This directory **only** holds the bridge. Pi's own base config (keybindings,
environment loader, settings template) lives in the sibling
[`pi-toolkit`](https://gitea.jordbo.se/joakimp/pi-toolkit) repo — split out
2026-05-05 so [`opencode-devbox`](https://gitea.jordbo.se/joakimp/opencode-devbox)
can build slim containers that include pi without dragging in mempalace's
dependencies (~300 MB).

**Jump to:**
- [What it does](#what-it-does)
- [The `Type.Unsafe` gotcha](#the-typeunsafe-gotcha)
- [Deploying pi with mempalace on a new machine](#deploying-pi-with-mempalace-on-a-new-machine)
- [Fail-soft, identity, debugging](#fail-soft)
---
## What it does
1. **Spawns `mempalace-mcp`** as a subprocess and does the MCP stdio
JSON-RPC handshake (`initialize` + `notifications/initialized` +
`tools/list`).
2. **Registers each MCP tool** as a pi tool with its real `inputSchema`
passed through via `Type.Unsafe(...)` (see gotcha below).
3. **Wake-up auto-injection** (`before_agent_start`, one-shot per fresh
session): calls `mempalace_status` + `mempalace_diary_read` and
injects the result as a `mempalace-wakeup` system message so the
agent orients itself the way `~/.agents/skills/mempalace/SKILL.md`
describes. Skipped on resume/fork (context is already in the thread).
4. **Manual wind-down** via a `/mempalace-diary [topic]` slash command:
sends a prompt asking the LLM to call `mempalace_diary_write` with
an AAAK-formatted entry summarizing the session. This step stays manual
because pi sessions are typically short and tactical, and
`session_shutdown` fires too late to drive another LLM turn.
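
The handshake in step 1 can be sketched as three JSON-RPC messages. This is a minimal illustration, not the code in `mempalace.ts`: the method names come from the list above, while `buildHandshake`, `frame`, the request ids, the protocol version string, and the client name are assumptions made for the example.

```typescript
// Sketch only: helper names, ids, and version strings are illustrative
// assumptions, not taken from mempalace.ts.
type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: number; // omitted on notifications
  method: string;
  params?: Record<string, unknown>;
};

function buildHandshake(): JsonRpcMessage[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed version string
        capabilities: {},
        clientInfo: { name: "pi-mempalace-bridge", version: "0.0.0" }, // hypothetical
      },
    },
    // A notification: no id, and the server sends no response to it.
    { jsonrpc: "2.0", method: "notifications/initialized" },
    // The response to this request carries each tool's name and inputSchema.
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
}

// The stdio transport sends each message as a single line of JSON.
function frame(messages: JsonRpcMessage[]): string {
  return messages.map((m) => JSON.stringify(m) + "\n").join("");
}

const wire = frame(buildHandshake());
```

A real client would wait for the `initialize` response before sending the notification and the `tools/list` request; the sketch only shows the message shapes and their order.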
## Fail-soft
If `mempalace-mcp` can't be spawned (PATH missing, binary crashes at
startup, …) the extension logs to stderr and returns early. pi keeps
working without palace tools rather than refusing to start.
## Identity
`agent_name` for diary calls comes from `$MEMPALACE_AGENT_NAME`, defaulting
to `"pi"`. First diary write against that identity creates `wing_<name>`
in the palace. Set the env var if you want to run pi under a distinct
identity on a given machine (e.g. `pi-laptop` vs `pi-server`).
## Stall protection (per-request timeout)
Every JSON-RPC request to `mempalace-mcp` carries a timeout. Without it, a
wedged server (classically: an OrbStack/virtiofs cold-open of a large
`chroma.sqlite3` or an HNSW load) leaves the awaiting promise pending
*forever*, which freezes the pi TUI — ESC cancels the LLM stream, not a
pending tool `execute()`. On timeout the extension rejects the request
**and** kills the stalled child (SIGTERM→SIGKILL), so pi gets a clear
error instead of hanging. This is a per-REQUEST timeout, not a process-lifetime
one — the long-lived server is only killed when a request genuinely stalls.
- `MEMPALACE_MCP_TIMEOUT_MS` — tool-call/request timeout. Default `60000`.
Kept short on purpose: a *query* taking this long is genuinely wedged.
- `MEMPALACE_MCP_INIT_TIMEOUT_MS` — `initialize` + `tools/list` handshake
timeout. Default `300000`. Deliberately generous: a genuine first
cold-open over virtiofs can legitimately take minutes, and killing a
still-progressing init only to respawn and re-pay the same cold cost is
strictly worse than waiting.
- Set either to `0` to disable (legacy unbounded behavior).
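
A per-request timeout of this kind can be sketched as follows. The env var names and defaults are the ones listed above; `resolveTimeoutMs`, `withTimeout`, and the fallback for unparsable values are assumptions made for illustration and may not match what the extension does.

```typescript
// Sketch only: helper names and the handling of invalid values are assumptions.
function resolveTimeoutMs(raw: string | undefined, fallbackMs: number): number {
  if (raw === undefined || raw.trim() === "") return fallbackMs;
  const parsed = Number(raw);
  // Assumption: unparsable or negative values fall back to the default.
  if (!Number.isFinite(parsed) || parsed < 0) return fallbackMs;
  return parsed; // 0 means "no timeout"
}

function withTimeout<T>(
  work: Promise<T>,
  timeoutMs: number,
  onTimeout: () => void, // e.g. SIGTERM the child, then SIGKILL if it lingers
): Promise<T> {
  if (timeoutMs === 0) return work; // legacy unbounded behavior
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      onTimeout();
      reject(new Error(`request timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for process.env so the sketch stays self-contained.
const exampleEnv: Record<string, string | undefined> = {
  MEMPALACE_MCP_INIT_TIMEOUT_MS: "0",
};
const callTimeoutMs = resolveTimeoutMs(exampleEnv.MEMPALACE_MCP_TIMEOUT_MS, 60_000);
const initTimeoutMs = resolveTimeoutMs(exampleEnv.MEMPALACE_MCP_INIT_TIMEOUT_MS, 300_000);
```

With the stand-in environment above, `callTimeoutMs` falls back to `60000` and `initTimeoutMs` is `0`, so a handshake wrapped in `withTimeout` would wait indefinitely.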
### Self-heal (respawn instead of a permanent latch)
A stall-kill (or any crash) used to be a **permanent** latch: `available`
flipped off and stayed off until you restarted pi. It is now self-healing —
the next tool call transparently respawns `mempalace-mcp` and retries.
- Respawns use **capped exponential backoff** so a persistently-broken
server can't hot-loop: `MEMPALACE_MCP_MAX_RESPAWNS` attempts (default
`2`; set `0` to disable self-heal and keep the old fail-fast latch),
with `MEMPALACE_MCP_RESPAWN_BACKOFF_MS` (default `1000`) doubled per
attempt.
- The budget **resets on any successful JSON-RPC response**, which proves
the server is actually live. A server that recovers regains full
patience, while one that keeps dying hits the cap and stays down until
you restart pi.
- Why the long init timeout and bounded respawn compose rather than
overlap: once a server has opened the palace once, the OS page cache is
warm, so respawn cold-opens are fast. The long init timeout prevents
killing a healthy *first* cold-open; the respawn handles a genuinely
dead server cheaply afterwards. (Note that the HNSW deserialize is CPU
work that isn't page-cacheable across spawns, which is exactly why
respawn-warming alone is not enough and the generous init budget stays.)
- The initial startup is tolerant too: if the very first `start()` fails,
the extension runs the same bounded respawn before falling back to
fail-soft (pi keeps working without palace tools).
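
The respawn budget described above might look like the sketch below. The defaults (`2` attempts, `1000` ms base) are the documented ones; the `RespawnBudget` class, the choice that the first delay equals the base, and the `capMs` ceiling are assumptions made for illustration.

```typescript
// Sketch only: class name, capMs, and "first delay = base" are assumptions.
class RespawnBudget {
  private used = 0;
  private readonly maxRespawns: number;
  private readonly baseBackoffMs: number;
  private readonly capMs: number;

  constructor(maxRespawns = 2, baseBackoffMs = 1000, capMs = 30_000) {
    this.maxRespawns = maxRespawns; // MEMPALACE_MCP_MAX_RESPAWNS
    this.baseBackoffMs = baseBackoffMs; // MEMPALACE_MCP_RESPAWN_BACKOFF_MS
    this.capMs = capMs; // hypothetical upper bound on a single delay
  }

  /** Delay before the next respawn, or null once the budget is spent. */
  nextDelayMs(): number | null {
    if (this.used >= this.maxRespawns) return null; // stay down, no hot-loop
    const delay = Math.min(this.baseBackoffMs * 2 ** this.used, this.capMs);
    this.used += 1;
    return delay;
  }

  /** Call on any successful JSON-RPC response: the server is provably live. */
  reset(): void {
    this.used = 0;
  }
}

const budget = new RespawnBudget();
const firstDelay = budget.nextDelayMs(); // 1000
const secondDelay = budget.nextDelayMs(); // 2000
const exhausted = budget.nextDelayMs(); // null: cap reached
```

Constructing the budget with `maxRespawns = 0` makes the first `nextDelayMs()` return `null`, which corresponds to the old fail-fast latch.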
## Debugging
- `MEMPALACE_EXT_DEBUG=1` — surface `mempalace-mcp` stderr into pi's
stderr. Without this, stderr is drained silently so a misbehaving
server doesn't flood the TUI.
- If a tool call fails with a generic "Internal tool error", spawn
`mempalace-mcp` manually with raw JSON-RPC on stdin to read the
server-side error — much faster than guessing.
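
One way to produce that raw JSON-RPC is a small script that prints the messages one per line. This is a hedged sketch: `replayLines`, the ids, the protocol version, and the client name are made up for the example, and the argument object is only a guess at what `mempalace_diary_read` needs beyond the `agent_name` parameter mentioned in this README.

```typescript
// Sketch only: helper name, ids, version string, and arguments are illustrative.
function replayLines(toolName: string, args: Record<string, unknown>): string[] {
  const messages = [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed version string
        capabilities: {},
        clientInfo: { name: "manual-debug", version: "0.0.0" }, // hypothetical
      },
    },
    { jsonrpc: "2.0", method: "notifications/initialized" },
    // The call under investigation, with the arguments the LLM actually sent.
    {
      jsonrpc: "2.0",
      id: 2,
      method: "tools/call",
      params: { name: toolName, arguments: args },
    },
  ];
  return messages.map((m) => JSON.stringify(m));
}

const lines = replayLines("mempalace_diary_read", { agent_name: "pi" });
console.log(lines.join("\n"));
```

Piping the printed lines into a manually started `mempalace-mcp` should surface the Python traceback on stderr. Whether the server accepts all three lines without waiting for each response is an assumption; if it does not, paste them one at a time.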
## The `Type.Unsafe` gotcha
Earlier versions of this extension registered every MCP tool with
`parameters: Type.Object({}, { additionalProperties: true })`, which
discarded each tool's real `inputSchema`. The LLM then saw no parameter
names and had to guess, leading to bugs like `mempalace_diary_read`
being called with `agent=` instead of the required `agent_name=` and
crashing the Python server with `TypeError: missing 1 required
positional argument`.

The fix (≈ lines 160-170) is to wrap the incoming JSON Schema with
`Type.Unsafe<...>(tool.inputSchema)`. TypeBox schemas are plain JSON
Schema at runtime plus a `Symbol` marker, so wrapping an
externally-sourced schema with `Unsafe` is sufficient — no conversion
to a full TypeBox tree is needed, and the LLM now sees every tool's
real parameter names.

If you ever need to re-loosen the schema for debugging, fall back to
the `Type.Object({}, { additionalProperties: true })` default only for
that specific tool, not globally.
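
To see why the loose schema caused the `agent=` versus `agent_name=` bug, compare what each schema exposes to the model. The sketch below avoids importing TypeBox so it stays self-contained: `looseSchema` is roughly what the old `Type.Object({}, { additionalProperties: true })` serializes to, and `diaryReadSchema` is a hypothetical stand-in for the server's real `inputSchema` (only the required `agent_name` parameter is taken from this README).

```typescript
// Sketch only: describeParams and both schema literals are illustrative.
type JsonSchema = {
  type: "object";
  properties?: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
};

// Approximately the JSON Schema behind the old catch-all registration.
const looseSchema: JsonSchema = {
  type: "object",
  properties: {},
  additionalProperties: true,
};

// Hypothetical inputSchema as returned by tools/list for mempalace_diary_read.
const diaryReadSchema: JsonSchema = {
  type: "object",
  properties: { agent_name: { type: "string" } },
  required: ["agent_name"],
};

// Lists the parameter names a model can read off a schema.
function describeParams(schema: JsonSchema): string[] {
  const required = new Set(schema.required ?? []);
  return Object.keys(schema.properties ?? {}).map((name) =>
    required.has(name) ? `${name} (required)` : name,
  );
}

const before = describeParams(looseSchema); // []: the model has to guess
const after = describeParams(diaryReadSchema); // ["agent_name (required)"]

// In the extension, the passthrough is the Type.Unsafe<...>(tool.inputSchema)
// wrapper described above, which hands the second kind of schema to pi.
```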

---
## Deploying pi with mempalace on a new machine
This is the "pi + memory" recipe. For pi without mempalace, see
[`pi-toolkit`'s README](https://gitea.jordbo.se/joakimp/pi-toolkit/src/branch/main/README.md#deploying-pi-on-a-new-machine).
### 0. Prerequisites
- Shell: zsh + oh-my-zsh recommended (both toolkits install loaders into
`~/.oh-my-zsh/custom/`; bash works too, installers print the manual
`source` snippet).
- `git`, `node` ≥ 20, `uv`, `tmux` ≥ 3.2, pi installed upstream.
- AWS credentials reachable via `AWS_PROFILE` — only if using
`amazon-bedrock` as pi's provider.
### 1. Dotfiles (if you keep one)
Brings `~/.config/pi/.env` (AWS creds, git-crypt encrypted), tmux CSI-u
extended keys, and other machine state:
```bash
git clone <your-dotfiles> ~/src/dotfiles
cd ~/src/dotfiles
git-crypt unlock <key>
./provision.sh --profile <profile> # or your equivalent tool
```
### 2. Install pi upstream
```bash
brew install pi-coding-agent # macOS
# or see https://github.com/earendil-works/pi for Linux
pi --help # creates ~/.pi/agent/
```
### 3. Install pi-toolkit (base pi config)
```bash
git clone ssh://git@gitea.jordbo.se:2222/joakimp/pi-toolkit.git ~/pi-toolkit
cd ~/pi-toolkit && ./install.sh
```
Symlinks `keybindings.json`, copies `pi-env.zsh` into
`~/.oh-my-zsh/custom/`, and prints the `settings.json` bootstrap command.
### 4. Bootstrap pi settings
```bash
cp ~/pi-toolkit/settings.example.json ~/.pi/agent/settings.json
$EDITOR ~/.pi/agent/settings.json # eu./us./anthropic: prefix
```
### 5. Install mempalace CLI + this toolkit
```bash
uv tool install mempalace
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit && ./install.sh
```
Detects pi, symlinks `mempalace.ts` into `~/.pi/agent/extensions/`.
Also detects pi-toolkit artifacts and prints a green check (or a warning
telling you to install pi-toolkit first if you skipped step 3).
### 6. Register mempalace MCP with opencode (if applicable)
Skip if this box is pi-only. Otherwise:
- Install [`opencode-toolkit`](https://gitea.jordbo.se/joakimp/opencode-toolkit) so `~/.config/opencode/.env` is sourced into every shell (GitHub / Gitea / other MCP server tokens).
- Register the mempalace MCP server in `~/.config/opencode/opencode.json` — see [root README § Registering mempalace with opencode](../../README.md#registering-mempalace-with-opencode-or-other-mcp-clients).
### 7. First run
```bash
exec zsh
pi # should start with defaults; wake-up injection shows palace status
```
If the wake-up doesn't print, run `MEMPALACE_EXT_DEBUG=1 pi` to surface
`mempalace-mcp` stderr.
### Verification checklist
```bash
# MCP bridge in place
ls -la ~/.pi/agent/extensions/mempalace.ts # → this repo
# pi-toolkit artifacts also in place
ls -la ~/.pi/agent/keybindings.json # → pi-toolkit
ls -la ~/.oh-my-zsh/custom/pi-env.zsh # cp from pi-toolkit
# Env loaded
zsh -ic 'echo $AWS_PROFILE $AWS_REGION'
# Palace reachable
mempalace status
```
### Uninstall
```bash
cd ~/mempalace-toolkit && ./install.sh --uninstall --yes # bridge only
cd ~/pi-toolkit && ./install.sh --uninstall --yes # pi base config
# Leaves pi itself, mempalace CLI, and ~/.config/pi/.env alone.
```
---
## File layout
```
mempalace-toolkit/
└── extensions/
    └── pi/
        ├── README.md        ← this file
        └── mempalace.ts     ← symlinked into ~/.pi/agent/extensions/
```
Pi base config (keybindings, env loader, settings template) lives in
[`pi-toolkit`](https://gitea.jordbo.se/joakimp/pi-toolkit). `install.sh`
detects pi via `~/.pi/agent/extensions/` and runs a `check_pi_toolkit`
probe that warns if pi-toolkit's artifacts are missing.