
# mempalace-toolkit

Producer-side tooling for [MemPalace](https://github.com/MemPalace/mempalace) — bridges that feed opencode session history and project documentation into the palace. Pairs with the consumer-side [`mempalace` agent skill](https://github.com/MemPalace/mempalace).

**What this repo contains:**

- `bin/mempalace-session` — exports [opencode](https://github.com/anomalyco/opencode) session history from its local SQLite DB to Claude Code JSONL, then mines it via `mempalace mine --mode convos`.
- `bin/mempalace-docs` — mines project directories into MemPalace while excluding source code, keeping the palace signal-dense.
- [`ARCHITECTURE.md`](ARCHITECTURE.md) — **canonical spec**: architecture diagram, component details, setup recipe, operational notes, upstream-retirement roadmap.
- [`SKILL.md`](SKILL.md) — the companion agent skill, symlinked into `~/.agents/skills/opencode-mempalace-bridge/` on install.

**If you're just trying to get this working on a new machine → jump to [Setup](#setup).**

**If you want the full architecture story → read [`ARCHITECTURE.md`](ARCHITECTURE.md).**

---

## Why this exists

MemPalace is the agent memory layer. Its stock CLI has two gaps that bite on a machine running opencode with a docs-first palace policy:

1. **`mempalace mine` floods the palace with source code**: every `__init__` fragment, every generated file, hundreds of low-signal drawers per project. `mempalace-docs` fixes this by staging only documentation-class files (`*.md`, `*.yml`, `Dockerfile`, etc.) before mining.
2. **`mempalace mine --mode convos` can't read opencode's SQLite DB**: it accepts only file-based chat formats (Claude Code JSONL, Claude.ai JSON, ChatGPT, Slack, Codex). Opencode persists every turn in `~/.local/share/opencode/opencode.db` and has no upstream hook into mempalace's auto-save. `mempalace-session` fixes this by exporting each session to Claude Code JSONL before mining.

Both wrappers follow the same **stage-to-cache-then-mine** idiom. Neither reimplements the miner; they curate input and delegate.
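
The stage-to-cache-then-mine idiom can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the wrappers' actual code: `stage_then_mine` is a hypothetical helper, and the miner is passed in as a callable so the sketch does not guess at any `mempalace` CLI arguments.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable


def stage_then_mine(
    root: Path,
    files: Iterable[Path],
    cache_dir: Path,
    mine: Callable[[Path], None],
) -> int:
    """Copy the selected files into cache_dir, then hand the cache to the miner.

    `files` are paths relative to `root`; the directory layout is preserved.
    Returns the number of files staged.
    """
    staged = 0
    for rel in files:
        dest = cache_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(root / rel, dest)  # copy2 also preserves timestamps
        staged += 1
    if staged:
        # The miner only ever sees the curated copy, never the raw project dir.
        mine(cache_dir)
    return staged
```

In a real wrapper the `mine` callable would shell out to the `mempalace` CLI; keeping it injectable also makes the staging step easy to test in isolation.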

Long-term, both should retire:
- `mempalace-docs` → retires when [MemPalace PR #1213](https://github.com/MemPalace/mempalace/pull/1213) (`exclude_patterns` in `mempalace.yaml`) merges.
- `mempalace-session` → retires when opencode session-stopping hooks ([PR #16598](https://github.com/anomalyco/opencode/pull/16598) et al.) merge **and** `hooks_cli.py` gains an `opencode` harness. Until both land, this repo fills the gap.

See [`ARCHITECTURE.md`](ARCHITECTURE.md) §6 for the full upstream roadmap.

---

## Setup

### Prerequisites

- [MemPalace](https://github.com/MemPalace/mempalace) CLI v3.3.3+ — **see [Installing mempalace itself](#installing-mempalace-itself-prerequisite) below if you haven't already**.
- Python 3 (stdlib `sqlite3` only — no extra deps)
- [opencode](https://github.com/anomalyco/opencode) with an active session DB at `~/.local/share/opencode/opencode.db` *(only needed for `mempalace-session`)*

### Installing mempalace itself (prerequisite)

mempalace-toolkit wraps the mempalace CLI but does not bundle it. The upstream [MemPalace repo](https://github.com/MemPalace/mempalace) documents `pip install mempalace` as the install method; `uv tool install` is cleaner and is the flow used in production containers like [opencode-devbox](https://gitea.jordbo.se/joakimp/opencode-devbox).

**Why uv over pip:**
- Isolated venv per tool — mempalace's dependencies (chromadb, embedding model runtime, …) don't leak into system Python or your project venvs.
- No PEP 668 fight — modern Debian / Ubuntu / Homebrew Python all refuse `pip install` into the system site-packages. `uv tool install` sidesteps this entirely.
- The shim (`~/.local/bin/mempalace` by default) is a thin wrapper that automatically activates the isolated venv on invocation, so `mempalace` is available from any bash or zsh terminal without manual `source venv/bin/activate`.

**Install uv** if it's not already on the machine:

```bash
# macOS / Linux, official installer — puts uv in ~/.local/bin
curl -LsSf https://astral.sh/uv/install.sh | sh

# Or: Homebrew on macOS
brew install uv

# Verify
uv --version
```

#### Personal machine (recommended default)

```bash
# Installs mempalace into an isolated venv under ~/.local/share/uv/tools/mempalace/,
# puts the `mempalace` shim into ~/.local/bin/.
uv tool install mempalace

# Make sure ~/.local/bin is on $PATH (uv prints this if it isn't)
export PATH="$HOME/.local/bin:$PATH"   # add to ~/.bashrc or ~/.zshrc

# Verify
mempalace --version     # should print the installed version
which mempalace         # should point into ~/.local/bin/
```

After this, `mempalace` works the same from any bash or zsh terminal — interactive shell, script, cron, systemd user service, launchd agent, all fine.

To upgrade later: `uv tool upgrade mempalace` (or `--all`).
To uninstall: `uv tool uninstall mempalace`.

#### System-wide / container install (opencode-devbox pattern)

For a Docker image or a multi-user box where the shim should live on the system `PATH` rather than in each user's `~/.local/bin`, use `UV_TOOL_DIR` + `UV_TOOL_BIN_DIR` to relocate both the venv and the shim:

```dockerfile
# In the Dockerfile — this is the pattern used by opencode-devbox
ENV UV_TOOL_DIR=/opt/uv-tools
ENV UV_TOOL_BIN_DIR=/usr/local/bin

RUN mkdir -p /opt/uv-tools && \
    uv tool install --no-cache mempalace && \
    /opt/uv-tools/mempalace/bin/python -c "import mempalace; print('mempalace installed')"
```

After this:
- `/opt/uv-tools/mempalace/` — the isolated venv.
- `/usr/local/bin/mempalace` — the CLI shim (globally on `PATH`, works for every user).

The last `python -c` line in the RUN step is a build-time sanity check: if the install silently failed, the build fails here rather than at runtime.

See [opencode-devbox/Dockerfile](https://gitea.jordbo.se/joakimp/opencode-devbox/src/branch/main/Dockerfile) §"MemPalace install" for the full production version (adds `INSTALL_MEMPALACE=true` build arg so the install can be skipped to shave ~300 MB off the image).

#### MCP server wrapper (required for MCP clients on a system install)

MCP clients (opencode, Claude Code, Kiro) spawn the mempalace MCP server as a subprocess. On a *personal-machine* install the command is just `mempalace-mcp` — the uv tool shim finds the venv's Python automatically.

**Pitfall that bit us during the first opencode-devbox attempt:** on a *system install* with `UV_TOOL_DIR=/opt/uv-tools`, the system `python3` cannot import `mempalace` because the modules live in the isolated venv under `/opt/uv-tools/mempalace/lib/...`, not in system site-packages. Any MCP config that reads

```json
{ "command": ["python3", "-m", "mempalace.mcp_server"] }
```

will fail at spawn with `ModuleNotFoundError: No module named 'mempalace'` — and because MCP failures are reported as "server unavailable" rather than surfacing the stderr, the root cause is easy to miss.

**Fix:** ship a thin wrapper on `PATH` that execs the venv's own Python. opencode-devbox ships this as `/usr/local/bin/mempalace-mcp-server`:

```sh
#!/bin/sh
# Launcher for the MemPalace MCP server on a uv-tool install.
# System python3 cannot import mempalace from the isolated venv,
# so exec the venv's python directly with the mcp_server module.
exec /opt/uv-tools/mempalace/bin/python -m mempalace.mcp_server "$@"
```

…and MCP configs reference the wrapper instead:

```json
{ "command": ["mempalace-mcp-server"] }
```

If you're on a personal-machine install (default `uv tool install` paths), you don't need the wrapper — `mempalace-mcp` is already a shim that does the right thing. The wrapper is specifically the workaround for the non-default `UV_TOOL_DIR` setup.

See [opencode-devbox/AGENTS.md](https://gitea.jordbo.se/joakimp/opencode-devbox/src/branch/main/AGENTS.md) ("Critical conventions" → "MemPalace install path") for the authoritative reference.

#### Verification checklist

After any install (personal or system-wide), confirm:

```bash
# CLI reachable from PATH
which mempalace                    # → a shim path
mempalace --version                # → v3.3.3+ without import errors

# CLI can import its own modules (catches venv vs site-packages mismatch)
mempalace status 2>&1 | head -3    # → either palace stats or "No palace found" — not a Python traceback

# MCP server reachable (system install — only relevant if you set up the wrapper)
which mempalace-mcp-server         # personal install: skip, uses `mempalace-mcp` directly
mempalace-mcp-server --help 2>&1 | head -5   # should show MCP server help, not import error
```

If any of these produce `ModuleNotFoundError`, you've hit the venv-mismatch pitfall. Re-read the MCP wrapper section above.
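
The venv-mismatch check can also be scripted per interpreter. The sketch below is illustrative and not part of the toolkit: `can_import` is a hypothetical helper that asks whichever Python runs it whether a top-level module is locatable, without importing it.

```python
import importlib.util


def can_import(module_name: str) -> bool:
    """Return True if the running interpreter can locate a top-level module."""
    # find_spec only searches sys.path for the module; it does not execute it.
    return importlib.util.find_spec(module_name) is not None
```

Calling `can_import("mempalace")` under the system `python3` and again under `/opt/uv-tools/mempalace/bin/python` should give different answers on a system install: only the venv's interpreter is expected to find the package.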

### Install mempalace-toolkit

```bash
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit
./install.sh
```

The installer symlinks `bin/*` into `~/.local/bin/` and optionally installs the agent skill into `~/.agents/skills/opencode-mempalace-bridge/`.

Ensure `~/.local/bin` is on `$PATH`:

```bash
export PATH="$HOME/.local/bin:$PATH"
```

**If `install.sh` reports `Skipping <name> — already exists`:** there's a leftover symlink or file at `~/.local/bin/<name>` from a previous install (e.g. the pre-split `cli_utils` days). The installer prints the exact `rm && ./install.sh` command to fix it: remove the stale entry and re-run. The installer never overwrites an existing file; you have to remove it explicitly first.
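
The no-clobber rule is easy to state in code. `install.sh` is a shell script, so the Python below is only a sketch of the same behavior under assumed names (`link_if_absent` is hypothetical, and the printed message is not the installer's exact output).

```python
import os
from pathlib import Path


def link_if_absent(source: Path, target: Path) -> bool:
    """Create a symlink at `target` pointing to `source`, unless `target` exists.

    Returns True if a link was created, False if the target was left untouched.
    """
    # lexists is also True for a dangling symlink, so stale links are
    # reported to the user rather than silently replaced.
    if os.path.lexists(target):
        print(f"Skipping {target.name}: already exists")
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.symlink_to(source)
    return True
```

Because an existing entry is always skipped, running the function twice with the same arguments is harmless, which is what makes a re-run of the installer safe.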

### First mine

```bash
# Mine opencode session history into wing_conversations (no init needed)
mempalace-session --dry-run      # preview qualifying sessions
mempalace-session                # do it (~20 min per 60 sessions)

# Mine a project (docs only). If you want to pre-init the project with a
# custom wing name or entity config, run `mempalace init --yes <dir>` first;
# otherwise `mempalace-docs` derives the wing from the directory name.
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project
```

> **Note:** mempalace has no one-time global init. The palace itself is created lazily on first write (at `~/.mempalace/palace/`). `mempalace init <dir>` is a *per-project* command that sets up a `mempalace.yaml` + entity list for a specific source directory — optional, not a prerequisite for either wrapper.

### Keeping it fresh (automation)

Manual invocation is fine while you're actively driving the machine, but long-running devboxes benefit from a weekly automated mine. [`contrib/`](contrib/) ships ready-to-install templates:

- **systemd user timer** (recommended on Linux): survives reboots, catches missed runs, logs to `journalctl`.
- **launchd user agent** (recommended on macOS): the native equivalent, with logs in `~/Library/Logs/`, a single-instance guarantee, and `ProcessType=Background` throttling.
- **cron**: simplest, works on BSD and systemd-less distros. No user-unit awareness needed.

Quick-start (Linux / systemd, weekly Mon 03:00 local):

```bash
mkdir -p ~/.config/systemd/user
cp contrib/systemd/*.{service,timer} ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now mempalace-session.timer
sudo loginctl enable-linger "$USER"   # optional, for headless boxes
```

Quick-start (macOS / launchd, same schedule):

```bash
sed "s|USER|$USER|g" contrib/launchd/se.jordbo.mempalace-session.plist \
  > ~/Library/LaunchAgents/se.jordbo.mempalace-session.plist
mkdir -p ~/Library/Logs
launchctl bootstrap "gui/$(id -u)" ~/Library/LaunchAgents/se.jordbo.mempalace-session.plist
launchctl enable "gui/$(id -u)/se.jordbo.mempalace-session"
```

See [`contrib/README.md`](contrib/README.md) for full install/verify/uninstall recipes, tuning, and devbox/container caveats. The full operational routine (triggers, cadence, verification) is in [`ARCHITECTURE.md`](ARCHITECTURE.md) §5.

### Containerized (devbox) notes

On a Docker-based devbox, the palace and opencode DB should live on named volumes so they survive container recreation:

- `devbox-palace` → `~/.mempalace/palace`
- `devbox-data` → `~/.local/share/opencode`

This repo is typically bind-mounted from the host, so the code survives recreation and syncs via git. After the container is recreated, `~/.local/bin` is wiped; re-run `./install.sh` (idempotent) to relink.

---

## `mempalace-docs`

Docs-only MemPalace miner. Stages documentation files into a cache dir and runs `mempalace mine` against the cache — never against the raw project dir.

```bash
mempalace-docs <directory>                     # mine with wing = dirname
mempalace-docs <directory> --wing my_project   # override wing name
mempalace-docs <directory> --agent alice       # record agent on drawers
mempalace-docs <directory> --dry-run           # list files, don't file
mempalace-docs <directory> --no-repair         # skip post-mine repair
mempalace-docs --help
```

**What gets mined:** `*.md`, `*.mdx`, `*.rst`, `*.txt`, `*.yml`, `*.yaml`, `*.toml`, `*.json`, `*.sh`, `*.bash`, `*.zsh`, `*.fish`, `Dockerfile*`, `Makefile*`, `*.conf`, `*.cfg`, `*.ini`, `LICENSE*`, `COPYING*`, `NOTICE*`.

**What gets skipped:** source files (`.py`, `.ts`, `.tsx`, `.js`, `.jsx`, `.go`, `.rs`, `.java`, `.cpp`, `.c`, `.rb`, `.kt`, `.swift`), VCS, dependency, cache, and build-output directories (`.git`, `.venv`, `node_modules`, `__pycache__`, `.mypy_cache`, `.pytest_cache`, `.ruff_cache`, `dist`, `build`, `.next`, `target`, `coverage`), and lockfiles.

**Rationale:** the palace is for *context and intent*. Agents already have `grep`/`glob`/`Read` for code — always authoritative, never stale. Embedding source code creates a parallel, lossier, drift-prone copy that pollutes semantic search for years.
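
A minimal sketch of the selection step, assuming glob-style matching on file names and directory pruning during the walk. The pattern and directory lists are abbreviated subsets of the ones above, and `is_doc_file` / `collect_docs` are hypothetical names, not the wrapper's actual functions.

```python
import fnmatch
import os
from pathlib import Path

# Abbreviated subsets of the documented lists, for illustration only.
INCLUDE_PATTERNS = [
    "*.md", "*.rst", "*.txt", "*.yml", "*.yaml", "*.toml",
    "Dockerfile*", "Makefile*", "LICENSE*",
]
SKIP_DIRS = {".git", ".venv", "node_modules", "__pycache__", "dist", "build"}


def is_doc_file(name: str) -> bool:
    """True if the file name matches any documentation-class pattern."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in INCLUDE_PATTERNS)


def collect_docs(root: Path) -> list[Path]:
    """Return documentation-class files under `root`, as paths relative to it."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if is_doc_file(name):
                found.append(Path(dirpath, name).relative_to(root))
    return sorted(found)
```

Source files are excluded simply because no include pattern matches them, so an allow-list like this fails closed: an unfamiliar extension stays out of the palace until someone adds it deliberately.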

---

## `mempalace-session`

Opencode → MemPalace session bridge. Reads `~/.local/share/opencode/opencode.db`, transforms each session into Claude Code JSONL, and files via `mempalace mine --mode convos`.

```bash
mempalace-session                                 # mine all sessions (≥3 msgs)
mempalace-session --wing my_convos                # custom wing (default: wing_conversations)
mempalace-session --session ses_abc123            # one session only
mempalace-session --since 2026-04-01              # only sessions updated on/after date
mempalace-session --min-messages 6                # stricter short-session filter
mempalace-session --db /custom/path/opencode.db   # non-default DB location
mempalace-session --dry-run                       # export + list, skip mine
mempalace-session --no-repair                     # skip post-mine index repair
mempalace-session --help
```

**What gets exported per session:**

- Synthetic header injected as the first user turn (`[session: <title> | <dir> | <date>]`) so the palace can find sessions by topic, not just by ID.
- Each message → Claude Code JSONL line (`{"type": "user"|"assistant", "message": {"content": ...}}`).
- Tool calls → `tool_use` blocks. Known tools (`Bash`, `Read`, `Grep`, `Edit`, `Write`) get formatted summaries; unknown tools are JSON-serialized.
- Tool outputs → `tool_result` blocks in a follow-up user message, folded back into the assistant turn by the mempalace normalizer.
- `step-start` / `step-finish` parts are dropped as noise. `reasoning` parts are kept with a `[reasoning]` prefix.
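
The export rules above can be sketched as follows. The output shape (`type` plus `message.content`) and the header format come from this README; the input shape (a `role` and a list of `parts` with `type` and `text` keys) is an assumption made for illustration and may not match opencode's real data model. Tool calls are left out to keep the sketch short.

```python
import json

DROPPED_PART_TYPES = {"step-start", "step-finish"}


def render_parts(parts: list[dict]) -> str:
    """Flatten one message's parts into a single text string."""
    chunks = []
    for part in parts:
        kind = part.get("type")
        if kind in DROPPED_PART_TYPES:
            continue  # dropped as noise
        text = part.get("text", "")
        if kind == "reasoning":
            # Assumption: the prefix is separated from the text by one space.
            text = f"[reasoning] {text}"
        chunks.append(text)
    return "\n".join(chunks)


def session_to_jsonl(title: str, directory: str, date: str, messages: list[dict]) -> str:
    """Render a session as JSONL, with the synthetic header as the first user turn."""
    header = f"[session: {title} | {directory} | {date}]"
    lines = [json.dumps({"type": "user", "message": {"content": header}})]
    for message in messages:
        content = render_parts(message["parts"])
        if not content:
            continue  # nothing left after dropping noise parts
        lines.append(json.dumps({"type": message["role"], "message": {"content": content}}))
    return "\n".join(lines) + "\n"
```

`json.dumps` escapes newlines inside strings, so each message stays on exactly one line regardless of its content.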

**Dedup:** staging at `~/.cache/mempalace-session/<wing>/` with deterministic per-session filenames (`<slug>_<id>.jsonl`). The convos miner keys on `source_file`, so re-runs skip unchanged sessions. To force re-mining a session, delete its JSONL from the staging dir.
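
Dedup by `source_file` only works if the same session always lands at the same path. A sketch of how such a deterministic name could be built; the slug rules (lowercase, hyphen-separated, 40-character cap) and both function names are assumptions, not the wrapper's documented behavior.

```python
import re
from pathlib import Path


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase the title and collapse every non-alphanumeric run into one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "untitled"


def staging_path(cache_root: Path, wing: str, title: str, session_id: str) -> Path:
    """Build <cache_root>/<wing>/<slug>_<id>.jsonl for one session."""
    return cache_root / wing / f"{slugify(title)}_{session_id}.jsonl"
```

For example, a session titled "Fix the Docker build!" with ID `ses_abc123` maps to `fix-the-docker-build_ses_abc123.jsonl` on every run, so the miner sees the same `source_file` each time.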

**`--dry-run` is dedup-aware.** Each session is tagged `[NEW]` (would be filed) or `[SKIP]` (already in the palace), and the summary breaks down the count:

```
Exported 62 session(s) to ~/.cache/mempalace-session/wing_conversations
  0 new   → will be filed on mine
  62 already filed → will be skipped (dedup by source_file)

--dry-run: no new sessions to mine. A real run would skip all 62.
```

If the palace is unreachable (fresh install, moved, permission-denied) the wrapper falls back to "everything is new" — the real mine step delegates dedup to `mempalace mine --mode convos`, which is always the source of truth. So running `mempalace-session` twice in a row is never destructive or wasteful: the second run's only cost is the post-mine HNSW `repair` step (~5 min on a ~5k-drawer palace).
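
The tagging and fallback logic can be sketched like this. How the wrapper actually asks the palace which files are already filed is not shown here; `tag_sessions` is a hypothetical helper that takes that answer as a set, or `None` when the palace could not be queried.

```python
from typing import Iterable, Optional


def tag_sessions(staged: Iterable[str], already_filed: Optional[set[str]]) -> dict[str, str]:
    """Tag each staged file as NEW or SKIP.

    `already_filed` is None when the palace is unreachable. In that case every
    file is tagged NEW and the miner's own dedup makes the final decision.
    """
    filed = already_filed if already_filed is not None else set()
    return {path: ("SKIP" if path in filed else "NEW") for path in staged}
```

Treating an unreachable palace as "nothing filed yet" only affects the preview: the worst case is a dry-run that over-reports `[NEW]`, never a duplicate drawer.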

**Filter:** sessions with fewer than `--min-messages` messages (default 3) are skipped — drops throwaway `/exit`'d sessions that would otherwise flood the palace. On a reference 140-session corpus, 78 were filtered this way.
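
A sketch of the short-session filter using stdlib `sqlite3`. The schema is hypothetical: the `session` and `message` tables and the `session_id` column are invented for illustration and should not be read as opencode's actual database layout.

```python
import sqlite3


def qualifying_sessions(conn: sqlite3.Connection, min_messages: int = 3) -> list[tuple[str, int]]:
    """Return (session_id, message_count) for sessions with enough messages.

    Assumes hypothetical tables session(id) and message(id, session_id).
    """
    rows = conn.execute(
        """
        SELECT s.id, COUNT(m.id) AS n
        FROM session AS s
        LEFT JOIN message AS m ON m.session_id = s.id
        GROUP BY s.id
        HAVING COUNT(m.id) >= ?
        ORDER BY s.id
        """,
        (min_messages,),
    ).fetchall()
    return [(session_id, count) for session_id, count in rows]
```

The `LEFT JOIN` keeps sessions with zero messages in the grouping so the `HAVING` clause can reject them explicitly, and the threshold is bound as a query parameter rather than formatted into the SQL string.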

**Cost profile:** ~20 minutes per 60-session batch. Scales roughly linearly with message count. Dedup re-run: mine step instant, only the post-mine `repair` runs (~5 min on 5k drawers).

---

## Companion agent skill

Installing this repo symlinks `SKILL.md` into `~/.agents/skills/opencode-mempalace-bridge/SKILL.md`, where it's auto-discovered by opencode (and by Claude Code / Kiro if you run `agents-sync` from [`cli_utils`](https://gitea.jordbo.se/joakimp/cli_utils)).

The skill is the *short-form checklist* for agents — when to use which wrapper, failure modes, setup recipes, anti-patterns. The canonical reference is always [`ARCHITECTURE.md`](ARCHITECTURE.md); the skill points there for deep context.

The skill pairs with the consumer-side [`mempalace` skill](https://github.com/MemPalace/mempalace) — that one covers using the palace (search, diary, KG); this one covers feeding it.

**Colocated skill pattern.** The skill lives here (not in [`skillset`](https://gitea.jordbo.se/joakimp/skillset)) because it moves in lockstep with the wrappers it documents. `install.sh` drops a `.skill-source` marker file in the deployed skill directory so sibling tooling (skillset's `deploy-skills.sh`, cli_utils's `agents-sync.zsh`) can tell the directory is externally owned. See [`AGENTS.md`](AGENTS.md) for the full convention and how to adopt it for future colocated skills.

---

## See also

- [`ARCHITECTURE.md`](ARCHITECTURE.md) — canonical spec: diagrams, setup recipe, failure modes, upstream roadmap.
- [`AGENTS.md`](AGENTS.md) — repo conventions for AI agents modifying this codebase.
- [MemPalace](https://github.com/MemPalace/mempalace) — the memory layer itself.
- [opencode](https://github.com/anomalyco/opencode) — the agent harness this bridges.
- [cli_utils](https://gitea.jordbo.se/joakimp/cli_utils) — sibling repo with shell quality-of-life tools (origin of these wrappers before the 2026-04-30 split).