# pi ↔ MemPalace MCP bridge

The canonical source of `~/.pi/agent/extensions/mempalace.ts` — the TypeScript
extension that wires [MemPalace](https://github.com/MemPalace/mempalace)'s MCP
server into the [pi coding-agent](https://github.com/earendil-works/pi)
harness. Installs wake-up context injection, per-tool schema passthrough,
and a `/mempalace-diary` slash-command.

This directory **only** holds the bridge. Pi's own base config (keybindings,
environment loader, settings template) lives in the sibling
[`pi-toolkit`](https://gitea.jordbo.se/joakimp/pi-toolkit) repo — split out
2026-05-05 so [`opencode-devbox`](https://gitea.jordbo.se/joakimp/opencode-devbox)
can build slim containers that include pi without dragging in mempalace's
dependencies (~300 MB).

**Jump to:**
- [What it does](#what-it-does)
- [The `Type.Unsafe` gotcha](#the-typeunsafe-gotcha)
- [Deploying pi with mempalace on a new machine](#deploying-pi-with-mempalace-on-a-new-machine)
- [Fail-soft, identity, debugging](#fail-soft)

---

## What it does

1. **Spawns `mempalace-mcp`** as a subprocess and does the MCP stdio
   JSON-RPC handshake (`initialize` + `notifications/initialized` +
   `tools/list`).
2. **Registers each MCP tool** as a pi tool with its real `inputSchema`
   passed through via `Type.Unsafe(...)` (see gotcha below).
3. **Wake-up auto-injection** (`before_agent_start`, one-shot per fresh
   session): calls `mempalace_status` + `mempalace_diary_read` and
   injects the result as a `mempalace-wakeup` system message so the
   agent orients itself the way `~/.agents/skills/mempalace/SKILL.md`
   describes. Skipped on resume/fork (context is already in the thread).
4. **Manual wind-down** via a `/mempalace-diary [topic]` slash command:
   sends a prompt asking the LLM to call `mempalace_diary_write` with
   an AAAK-formatted entry summarizing the session. Not fully automatic
   because pi sessions are typically short/tactical and
   `session_shutdown` fires too late to drive another LLM turn.

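
The handshake in step 1 can be sketched as three newline-delimited JSON-RPC 2.0 messages. This is a minimal illustration, not the extension's actual code: the `protocolVersion` string and the `clientInfo` values are assumptions, and a real client waits for the `initialize` response before sending the notification.

```typescript
// Sketch of the MCP stdio handshake as newline-delimited JSON-RPC 2.0.
// protocolVersion and clientInfo are illustrative assumptions.

type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: number; // absent on notifications
  method: string;
  params?: Record<string, unknown>;
};

function handshakeMessages(): JsonRpcMessage[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed; use what the server supports
        capabilities: {},
        clientInfo: { name: "pi-mempalace-bridge", version: "0.0.0" }, // hypothetical
      },
    },
    // A notification has no id, so the server sends no response to it.
    { jsonrpc: "2.0", method: "notifications/initialized" },
    // The response to this request carries each tool's name and inputSchema.
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
}

// The stdio transport expects one JSON object per line on the child's stdin.
function encodeForStdin(messages: JsonRpcMessage[]): string {
  return messages.map((m) => JSON.stringify(m)).join("\n") + "\n";
}

console.log(encodeForStdin(handshakeMessages()));
```

Keeping the messages as plain data makes the ordering easy to inspect; the extension itself also has to match responses to request ids, which this sketch leaves out.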
## Fail-soft

If `mempalace-mcp` can't be spawned (PATH missing, binary crashes at
startup, …) the extension logs to stderr and returns early. pi keeps
working without palace tools rather than refusing to start.

## Identity

`agent_name` for diary calls comes from `$MEMPALACE_AGENT_NAME`, defaulting
to `"pi"`. First diary write against that identity creates `wing_<name>`
in the palace. Set the env var if you want to run pi under a distinct
identity on a given machine (e.g. `pi-laptop` vs `pi-server`).

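
The identity lookup can be sketched as below. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: an empty variable is treated as unset, and the wing name is formed without any normalization of the agent name.

```typescript
// Sketch only: resolveAgentName and wingFor are hypothetical helper names.
// env is passed in to keep the sketch testable; the extension would read
// the process environment instead.
function resolveAgentName(env: Record<string, string | undefined>): string {
  const raw = env["MEMPALACE_AGENT_NAME"];
  // Assumption: an empty or whitespace-only value falls back to the default.
  return raw !== undefined && raw.trim() !== "" ? raw.trim() : "pi";
}

// Assumption: the wing is the agent name with a "wing_" prefix, unmodified.
function wingFor(agentName: string): string {
  return `wing_${agentName}`;
}

console.log(wingFor(resolveAgentName({})));
console.log(wingFor(resolveAgentName({ MEMPALACE_AGENT_NAME: "pi-laptop" })));
```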
## Stall protection (per-request timeout)

Every JSON-RPC request to `mempalace-mcp` carries a timeout. Without it, a
wedged server (classically: an OrbStack/virtiofs cold-open of a large
`chroma.sqlite3` or an HNSW load) leaves the awaiting promise pending
*forever*, which freezes the pi TUI — ESC cancels the LLM stream, not a
pending tool `execute()`. On timeout the extension rejects the request
**and** kills the stalled child (SIGTERM→SIGKILL), so pi gets a clear
error instead of hanging. This is a per-REQUEST timeout, not a process-lifetime
one — the long-lived server is only killed when a request genuinely stalls.

- `MEMPALACE_MCP_TIMEOUT_MS` — tool-call/request timeout. Default `60000`.
  Kept short on purpose: a *query* taking this long is genuinely wedged.
- `MEMPALACE_MCP_INIT_TIMEOUT_MS` — `initialize` + `tools/list` handshake
  timeout. Default `300000`. Deliberately generous: a genuine first
  cold-open over virtiofs can legitimately take minutes, and killing a
  still-progressing init only to respawn and re-pay the same cold cost is
  strictly worse than waiting.
- Set either to `0` to disable (legacy unbounded behavior).

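
A per-request timeout with kill escalation might look roughly like the sketch below. `withRequestTimeout`, the `Killable` interface, and the grace period between SIGTERM and SIGKILL are all assumptions made for illustration; the README does not state how long the extension waits before escalating.

```typescript
// Anything with a kill() method; Node's ChildProcess fits this shape.
interface Killable {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
}

// Hypothetical grace period between SIGTERM and SIGKILL.
const KILL_GRACE_MS = 2000;

function withRequestTimeout<T>(
  request: Promise<T>,
  timeoutMs: number,
  child: Killable,
  graceMs: number = KILL_GRACE_MS,
): Promise<T> {
  // 0 disables the timeout (legacy unbounded behavior).
  if (timeoutMs <= 0) return request;

  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      // Ask the stalled child to exit, then force it after the grace period.
      // A fuller implementation would cancel the SIGKILL once the child exits.
      child.kill("SIGTERM");
      setTimeout(() => child.kill("SIGKILL"), graceMs);
      reject(new Error(`mempalace-mcp request timed out after ${timeoutMs} ms`));
    }, timeoutMs);

    // Settle with the request's own outcome if it finishes in time.
    request.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

Because the timer is scoped to one request, a healthy long-lived server is never touched; only a request that exceeds its own budget triggers the kill.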
### Self-heal (respawn instead of a permanent latch)

A stall-kill (or any crash) used to be a **permanent** latch: `available`
flipped off and stayed off until you restarted pi. It is now self-healing —
the next tool call transparently respawns `mempalace-mcp` and retries.

- Respawns use **capped exponential backoff** so a persistently-broken
  server can't hot-loop: `MEMPALACE_MCP_MAX_RESPAWNS` attempts (default
  `2`; set `0` to disable self-heal and keep the old fail-fast latch),
  with `MEMPALACE_MCP_RESPAWN_BACKOFF_MS` (default `1000`) doubled per
  attempt.
- The budget **resets on any successful JSON-RPC response** — proof the
  server is actually live — so a server that recovers regains full
  patience, while one that keeps dying hits the cap and stays down until
  you restart pi.
- Why the long init timeout and bounded respawn compose rather than
  overlap: once a server has opened the palace once, the OS page cache is
  warm, so respawn cold-opens are fast. The long init timeout prevents
  killing a healthy *first* cold-open; the respawn handles a genuinely
  dead server cheaply afterwards. (Note the HNSW deserialize is CPU work
  that isn't page-cacheable across spawns, which is exactly why we can't
  rely on respawn-warming alone and keep the generous init budget.)
- The initial startup is tolerant too: if the very first `start()` fails,
  the extension runs the same bounded respawn before falling back to
  fail-soft (pi keeps working without palace tools).

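
The respawn budget described above can be sketched as a small counter. `RespawnBudget`, `respawnDelayMs`, and the `capMs` ceiling are hypothetical: the README says the backoff is capped but not at what value, and it is an assumption here that the first respawn waits the base delay before doubling.

```typescript
// Delay before respawn attempt n (0-based): base doubled per attempt,
// clamped to capMs. The 30000 ms ceiling is an assumed value.
function respawnDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

class RespawnBudget {
  private used = 0;
  private readonly maxRespawns: number;

  constructor(maxRespawns = 2) {
    this.maxRespawns = maxRespawns;
  }

  // Returns the delay to wait before the next respawn,
  // or null once the budget is spent (the server stays down).
  next(baseMs = 1000): number | null {
    if (this.used >= this.maxRespawns) return null;
    const delay = respawnDelayMs(this.used, baseMs);
    this.used += 1;
    return delay;
  }

  // Call on any successful JSON-RPC response: the server is proven live,
  // so it regains the full budget.
  reset(): void {
    this.used = 0;
  }
}

const budget = new RespawnBudget(2);
console.log(budget.next(), budget.next(), budget.next());
```

With the defaults, a server that keeps dying gets two respawns and is then left down, while a single successful response in between calls `reset()` and restores both attempts.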
## Debugging

- `MEMPALACE_EXT_DEBUG=1` — surface `mempalace-mcp` stderr into pi's
  stderr. Without this, stderr is drained silently so a misbehaving
  server doesn't flood the TUI.
- If a tool call fails with a generic "Internal tool error", spawn
  `mempalace-mcp` manually with raw JSON-RPC on stdin to read the
  server-side error — much faster than guessing.

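
When reading raw responses by hand, the server-side error can show up in two places: a JSON-RPC `error` object, or a `tools/call` result flagged with `isError`. The helper below is a hypothetical sketch for pulling the message out of one response line; the sample payloads in the usage lines are invented for illustration and are not real `mempalace-mcp` output.

```typescript
type RpcResponse = {
  id?: number;
  error?: { code: number; message: string };
  result?: { isError?: boolean; content?: { type: string; text?: string }[] };
};

// Returns a readable error string for one response line, or null if the
// line is a successful response.
function serverSideError(line: string): string | null {
  const msg = JSON.parse(line) as RpcResponse;

  // Protocol-level failure (bad method, invalid params, ...).
  if (msg.error) {
    return `JSON-RPC error ${msg.error.code}: ${msg.error.message}`;
  }

  // Tool-level failure: the call succeeded as RPC but the tool reported an error.
  if (msg.result?.isError) {
    const text = (msg.result.content ?? [])
      .filter((c) => c.type === "text" && typeof c.text === "string")
      .map((c) => c.text)
      .join("\n");
    return text || "tool reported an error without text content";
  }

  return null;
}

// Invented sample lines, only to show the two shapes.
console.log(serverSideError('{"jsonrpc":"2.0","id":2,"error":{"code":-32602,"message":"bad params"}}'));
console.log(serverSideError('{"jsonrpc":"2.0","id":2,"result":{"isError":true,"content":[{"type":"text","text":"TypeError: boom"}]}}'));
```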
## The `Type.Unsafe` gotcha

Earlier versions of this extension registered every MCP tool with
`parameters: Type.Object({}, { additionalProperties: true })`, which
discarded each tool's real `inputSchema`. The LLM then saw no parameter
names and had to guess, leading to bugs like `mempalace_diary_read`
being called with `agent=` instead of the required `agent_name=` and
crashing the Python server with `TypeError: missing 1 required
positional argument`.

The fix (≈ lines 160-170) is to wrap the incoming JSON Schema with
`Type.Unsafe<...>(tool.inputSchema)`. TypeBox schemas are plain JSON
Schema at runtime plus a `Symbol` marker, so wrapping an
externally-sourced schema with `Unsafe` is sufficient — no conversion
to a full TypeBox tree is needed, and the LLM now sees every tool's
real parameter names.

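
To see why the catch-all schema hid the problem, the sketch below compares the two schemas as plain JSON Schema objects, without TypeBox. The `inputSchema` shown for `mempalace_diary_read` is a hypothetical stand-in (the real one comes from `tools/list`), and `missingRequired` is an illustrative helper, not part of the extension. With the loose schema nothing is reported for the guessed `agent` argument, so the mistake only surfaces server-side; with the real schema passed through, `agent_name` is reported as missing.

```typescript
type JsonSchema = {
  type: "object";
  properties?: Record<string, unknown>;
  required?: string[];
  additionalProperties?: boolean;
};

// The old catch-all registration: no parameter names are visible.
const looseSchema: JsonSchema = { type: "object", additionalProperties: true };

// Hypothetical inputSchema for mempalace_diary_read.
const diaryReadSchema: JsonSchema = {
  type: "object",
  properties: { agent_name: { type: "string" } },
  required: ["agent_name"],
};

// Lists required parameters that the call did not supply.
function missingRequired(schema: JsonSchema, args: Record<string, unknown>): string[] {
  return (schema.required ?? []).filter((key) => !(key in args));
}

// What the LLM guessed when it could not see the parameter names.
const guessedArgs = { agent: "pi" };

console.log(missingRequired(looseSchema, guessedArgs)); // loose schema: nothing to check against
console.log(missingRequired(diaryReadSchema, guessedArgs)); // real schema: the gap is visible
```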
If you ever need to re-loosen the schema for debugging, fall back to
the `Type.Object({}, { additionalProperties: true })` default only for
that specific tool, not globally.

---

## Deploying pi with mempalace on a new machine

This is the "pi + memory" recipe. For pi without mempalace, see
[`pi-toolkit`'s README](https://gitea.jordbo.se/joakimp/pi-toolkit/src/branch/main/README.md#deploying-pi-on-a-new-machine).

### 0. Prerequisites

- Shell: zsh + oh-my-zsh recommended (both toolkits install loaders into
  `~/.oh-my-zsh/custom/`; bash works too, installers print the manual
  `source` snippet).
- `git`, `node` ≥ 20, `uv`, `tmux` ≥ 3.2, pi installed upstream.
- AWS credentials reachable via `AWS_PROFILE` — only if using
  `amazon-bedrock` as pi's provider.

### 1. Dotfiles (if you keep one)

Brings `~/.config/pi/.env` (AWS creds, git-crypt encrypted), tmux CSI-u
extended keys, and other machine state:

```bash
git clone <your-dotfiles> ~/src/dotfiles
cd ~/src/dotfiles
git-crypt unlock <key>
./provision.sh --profile <profile>   # or your equivalent tool
```

### 2. Install pi upstream

```bash
brew install pi-coding-agent   # macOS
# or see https://github.com/earendil-works/pi for Linux
pi --help                      # creates ~/.pi/agent/
```

### 3. Install pi-toolkit (base pi config)

```bash
git clone ssh://git@gitea.jordbo.se:2222/joakimp/pi-toolkit.git ~/pi-toolkit
cd ~/pi-toolkit && ./install.sh
```

Symlinks `keybindings.json`, copies `pi-env.zsh` into
`~/.oh-my-zsh/custom/`, and prints the `settings.json` bootstrap command.

### 4. Bootstrap pi settings

```bash
cp ~/pi-toolkit/settings.example.json ~/.pi/agent/settings.json
$EDITOR ~/.pi/agent/settings.json   # eu./us./anthropic: prefix
```

### 5. Install mempalace CLI + this toolkit

```bash
uv tool install mempalace
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit && ./install.sh
```

Detects pi, symlinks `mempalace.ts` into `~/.pi/agent/extensions/`.
Also detects pi-toolkit artifacts and prints a green check (or a warning
telling you to install pi-toolkit first if you skipped step 3).

### 6. Register mempalace MCP with opencode (if applicable)

Skip if this box is pi-only. Otherwise:
- Install [`opencode-toolkit`](https://gitea.jordbo.se/joakimp/opencode-toolkit) so `~/.config/opencode/.env` is sourced into every shell (GitHub / Gitea / other MCP server tokens).
- Register the mempalace MCP server in `~/.config/opencode/opencode.json` — see [root README § Registering mempalace with opencode](../../README.md#registering-mempalace-with-opencode-or-other-mcp-clients).

### 7. First run

```bash
exec zsh
pi   # should start with defaults; wake-up injection shows palace status
```

If the wake-up doesn't print, run `MEMPALACE_EXT_DEBUG=1 pi` to surface
`mempalace-mcp` stderr.

### Verification checklist

```bash
# MCP bridge in place
ls -la ~/.pi/agent/extensions/mempalace.ts   # → this repo

# pi-toolkit artifacts also in place
ls -la ~/.pi/agent/keybindings.json          # → pi-toolkit
ls -la ~/.oh-my-zsh/custom/pi-env.zsh        # cp from pi-toolkit

# Env loaded
zsh -ic 'echo $AWS_PROFILE $AWS_REGION'

# Palace reachable
mempalace status
```

### Uninstall

```bash
cd ~/mempalace-toolkit && ./install.sh --uninstall --yes   # bridge only
cd ~/pi-toolkit && ./install.sh --uninstall --yes          # pi base config
# Leaves pi itself, mempalace CLI, and ~/.config/pi/.env alone.
```

---

## File layout

```
mempalace-toolkit/
└── extensions/
    └── pi/
        ├── README.md       ← this file
        └── mempalace.ts    ← symlinked into ~/.pi/agent/extensions/
```

Pi base config (keybindings, env loader, settings template) lives in
[`pi-toolkit`](https://gitea.jordbo.se/joakimp/pi-toolkit). `install.sh`
detects pi via `~/.pi/agent/extensions/` and runs a `check_pi_toolkit`
probe that warns if pi-toolkit's artifacts are missing.