Producer-side MemPalace tooling: two bash wrappers that bridge opencode session history and project documentation into the palace.
Originally developed in cli_utils (2026-04-28); split into its own repo on 2026-04-30 because the conceptual fit was weak: cli_utils is interactive shell tooling, while this is agent memory infrastructure with its own architecture, dependency surface, and growth trajectory.
Contents:
- bin/mempalace-docs: docs-only mining wrapper (originally a2ddcc9 in cli_utils). Bridges the gap until MemPalace PR #1213 (exclude_patterns) merges upstream.
- bin/mempalace-session: opencode → palace session bridge (originally dacca0e in cli_utils). Reads ~/.local/share/opencode/opencode.db, exports each session to Claude Code JSONL, and mines via 'mempalace mine --mode convos'. Bridges the gap until opencode session-stopping hooks + an opencode harness in hooks_cli.py land upstream.
- ARCHITECTURE.md: canonical spec (architecture diagram, component details, setup recipe, operational notes, upstream-retirement roadmap). Originally a4cf314 in cli_utils.
- SKILL.md: companion agent skill (producer side). Pairs with the consumer-side mempalace skill. Symlinked into ~/.agents/skills/opencode-mempalace-bridge/ by install.sh.
- install.sh: idempotent installer, also handles --uninstall.
- AGENTS.md: repo conventions.
History of the individual files is not preserved in this split; see cli_utils (gitea.jordbo.se/joakimp/cli_utils) commits a2ddcc9, dacca0e, and a4cf314 for the original authorship context.
MemPalace Feeding Architecture
This repository wires opencode and arbitrary project directories into MemPalace via two thin wrappers in bin/. This document explains why they exist and how they fit together.
Audience: someone setting up a new machine (or reviewing what's already set up) and asking "how does the palace actually get fed?". Pairs with the mempalace agent skill, which covers the consumer side (searching, diary, KG). This document covers the producer side.
1. The problem
MemPalace is a persistent memory layer for AI agents — vector search over drawers (chunks of verbatim content), a knowledge graph, and per-agent diaries, all behind an MCP server. To be useful it has to be fed: project docs, conversation transcripts, session summaries.
The stock mempalace CLI has two feeders:
| Feeder | What it ingests | Gap |
|---|---|---|
| mempalace mine (default mode) | Any "readable" file in a directory (code + docs + misc) | Mines source code indiscriminately → embedding index floods with low-signal __init__ fragments. |
| mempalace mine --mode convos | Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack, Codex JSONL | No opencode support. No SQLite support. Opencode persists its history in SQLite, not JSONL. |
And one auto-save path:
| Feeder | Harnesses supported | Gap |
|---|---|---|
| hooks_cli.py (session-stop hooks) | claude-code, codex | No opencode harness → /exit mid-session leaves no diary entry behind. |
So on a machine using opencode + the "docs-first palace hygiene" policy, three gaps bite:
1. Mining a project floods the palace with source code we don't want.
2. Opencode session history is trapped in SQLite, invisible to mine --mode convos.
3. There's no auto-save on session stop, so any persistence is a best-effort heuristic.
The two wrappers in bin/ close gaps 1 and 2. Gap 3 is upstream work (see §6).
2. The architecture
Project dirs (/workspace/*)                 Opencode SQLite DB
├── *.md                                    ~/.local/share/opencode/opencode.db
├── *.yaml                                  ├── session (id, title, directory, time_created/updated)
├── Dockerfile                              ├── message (session_id, data JSON w/ role)
└── …                                       └── part (message_id, data JSON w/ type: text|tool|…)
      │                                          │
      │                                          │
┌─────▼──────────┐                          ┌────▼──────────────┐
│ mempalace-docs │                          │ mempalace-session │
│ (bin/)         │                          │ (bin/)            │
│                │                          │                   │
│ stage docs     │                          │ export each       │
│ only via cp -p │                          │ session as Claude │
│ to cache dir   │                          │ Code JSONL to     │
│                │                          │ cache dir         │
└─────┬──────────┘                          └────┬──────────────┘
      │                                          │
      │  ~/.cache/mempalace-docs/<wing>/         │  ~/.cache/mempalace-session/<wing>/
      │                                          │
┌─────▼──────────┐                          ┌────▼──────────────┐
│ mempalace mine │                          │ mempalace mine    │
│                │                          │ --mode convos     │
└─────┬──────────┘                          └────┬──────────────┘
      │                                          │
      └───────────────────┬──────────────────────┘
                          │
                   ┌──────▼─────────┐
                   │ ChromaDB       │
                   │ ~/.mempalace/  │
                   │ palace/        │
                   └──────┬─────────┘
                          │
              MCP server (mempalace_*)
                          │
     AI agents (opencode, claude code, codex, …)
Shared idiom: stage-to-cache-then-mine.
Neither wrapper reimplements the mempalace miner. They each:
- Curate input (filter / transform / rename).
- Write it to a deterministic path under ~/.cache/…/<wing>/ with mtime preserved (via cp -p or explicit os.utime).
- Delegate actual embedding + filing to mempalace mine, which already dedups on source_file path.
This keeps the wrappers thin. A third wrapper following the same idiom would justify factoring a shared helper library — two does not.
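The stage-to-cache-then-mine idiom can be sketched in a few lines of Python. This is an illustration only: the real wrappers are bash scripts, stage_and_build_command is a hypothetical helper, and the argument order of the mempalace mine command line is an assumption. The sketch returns the command instead of running it.

```python
import shutil
from pathlib import Path


def stage_and_build_command(files, source_root, cache_root, wing, mine_args=()):
    """Copy curated files into <cache_root>/<wing>/ and return the mine command.

    Hypothetical helper for illustration; the `mempalace mine` argument
    order below is an assumption, not taken from the CLI documentation.
    """
    source_root = Path(source_root)
    stage_dir = Path(cache_root) / wing
    for src in files:
        src = Path(src)
        # Keep the relative layout so the staged path is deterministic.
        dest = stage_dir / src.relative_to(source_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        # copy2 preserves the modification time, like `cp -p`.
        shutil.copy2(src, dest)
    # Embedding and filing are delegated to the stock miner.
    return ["mempalace", "mine", *mine_args, str(stage_dir)]
```

Because the staged path depends only on the wing and the file's relative path, a re-run writes to the same location, which is what lets the miner's source_file dedup recognise already-filed content.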
3. Component details
bin/mempalace-docs (268 lines) — docs-first mining
Input: a project directory.
Output: palace drawers in wing_<directory-name> (or --wing override), only from documentation-class files.
What it files: *.md, *.mdx, *.rst, *.txt, *.yml, *.yaml, *.toml, selective *.json, shell scripts, Dockerfiles, Makefiles, license/notice files.
What it drops: source code (.py, .ts, .go, .rs, …), lockfiles, .git, .venv, node_modules, __pycache__, build output.
Implementation notes:
- Reads mempalace.yaml (if present) to discover the actual wing name, which avoids drift if someone renamed the wing after init.
- Uses cp -p (not symlinks) because the miner skips symlinks (miner.py line 828).
- Auto-purges pre-existing drawers whose source_file is under the workspace path before re-mining, to prevent doubling on re-runs.
- Upstream PR #1213 will add exclude_patterns to mempalace.yaml. When merged, this wrapper should shrink to a thin shim.
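The file/drop split described above amounts to a small classifier. A minimal sketch, assuming the suffix and name sets below: they are illustrative, not copied from the wrapper. The ".sh" suffix stands in for "shell scripts", the lockfile check is a guess at how lockfiles are recognised, and the "selective *.json" rule is left out because its criteria are not stated here.

```python
from pathlib import Path

# Illustrative sets only; the wrapper's actual lists may differ.
DOC_SUFFIXES = {".md", ".mdx", ".rst", ".txt", ".yml", ".yaml", ".toml", ".sh"}
DOC_NAMES = {"Dockerfile", "Makefile", "LICENSE", "NOTICE"}
SKIP_DIRS = {".git", ".venv", "node_modules", "__pycache__"}


def is_doc_file(path):
    """Return True if the path looks like a documentation-class file."""
    p = Path(path)
    # Anything below a skipped directory is dropped outright.
    if any(part in SKIP_DIRS for part in p.parts[:-1]):
        return False
    # Assumed lockfile heuristic: foo.lock or foo-lock.ext.
    if p.name.endswith(".lock") or "-lock." in p.name:
        return False
    return p.name in DOC_NAMES or p.suffix.lower() in DOC_SUFFIXES
```

For example, is_doc_file("docs/setup.md") is accepted while is_doc_file("src/app.py") and is_doc_file("node_modules/pkg/README.md") are rejected.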
bin/mempalace-session (349 lines) — opencode → palace bridge
Input: the opencode SQLite DB (default ~/.local/share/opencode/opencode.db).
Output: palace drawers in wing_conversations (or --wing override), one JSONL file per qualifying session.
Transform pipeline, per session:
1. Read session row (id, title, directory, time_created, time_updated).
2. Inject synthetic header as first user turn: [session: <title> | <directory> | <YYYY-MM-DD>] → makes title/dir/date semantically searchable.
3. For each message ordered by id:
   - Read JSON data → get role (user / assistant).
   - For each part under the message, read JSON data → dispatch on type:
     - text → text block.
     - tool → Claude Code tool_use block + deferred tool_result as synthetic human message (the mempalace normalizer folds it back into the assistant turn via its is_tool_only branch).
     - step-start / step-finish → dropped as noise.
     - reasoning → kept, prefixed with [reasoning].
4. Serialize as Claude Code JSONL ({"type": "user"|"assistant", "message": {"content": [...]}}), the one convos format the miner already understands.
5. Stage at ~/.cache/mempalace-session/<wing>/<slug>_<id>.jsonl with mtime = session.time_updated (deterministic, stable under dedup).
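The per-session transform can be sketched as below. This is a hedged illustration, not the wrapper's code: the opencode part field names (text, callID, tool, input, output) are assumptions about the JSON payload, dropping unknown part types is an assumption, and stage_session takes the timestamp in seconds because the unit of session.time_updated is not stated here. Only the outer JSONL shape and the tool_use / tool_result block types come from the description above.

```python
import json
import os
from pathlib import Path


def header_line(title, directory, date):
    """Synthetic first user turn that makes title/dir/date searchable."""
    return f"[session: {title} | {directory} | {date}]"


def part_to_blocks(part):
    """Map one decoded part to (content_blocks, deferred_tool_results)."""
    kind = part.get("type")
    if kind == "text":
        return [{"type": "text", "text": part.get("text", "")}], []
    if kind == "reasoning":
        return [{"type": "text", "text": "[reasoning] " + part.get("text", "")}], []
    if kind == "tool":
        call_id = part.get("callID", "")  # assumed field name
        use = {"type": "tool_use", "id": call_id,
               "name": part.get("tool", ""), "input": part.get("input", {})}
        result = {"type": "tool_result", "tool_use_id": call_id,
                  "content": str(part.get("output", ""))}
        return [use], [result]
    # step-start / step-finish are noise; unknown types are dropped too (assumption).
    return [], []


def session_to_jsonl(header, messages):
    """messages: iterable of (role, [part dicts]) already ordered by id."""
    lines = [{"type": "user", "message": {"content": [{"type": "text", "text": header}]}}]
    for role, parts in messages:
        blocks, deferred = [], []
        for part in parts:
            new_blocks, new_results = part_to_blocks(part)
            blocks.extend(new_blocks)
            deferred.extend(new_results)
        if blocks:
            lines.append({"type": role, "message": {"content": blocks}})
        if deferred:
            # Tool results follow as a synthetic user message.
            lines.append({"type": "user", "message": {"content": deferred}})
    return "\n".join(json.dumps(line) for line in lines) + "\n"


def stage_session(jsonl_text, stage_dir, slug, session_id, time_updated_s):
    """Write <slug>_<id>.jsonl and pin its mtime to the session's update time."""
    path = Path(stage_dir) / f"{slug}_{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(jsonl_text, encoding="utf-8")
    os.utime(path, (time_updated_s, time_updated_s))
    return path
```

Pinning the mtime with os.utime rather than leaving it at export time keeps the staged file byte-for-byte and timestamp-for-timestamp reproducible across runs.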
Filters:
- --min-messages N (default 3): drops throwaway/exit'd sessions that would flood the palace.
- --since YYYY-MM-DD: incremental catch-up.
- --session <id>: one-shot mode.
Then: invokes mempalace mine --mode convos against the cache dir, followed by mempalace repair (unless --no-repair).
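The three filters map naturally onto one SQL query over the session and message tables shown in the diagram. A minimal sketch under stated assumptions: qualifying_sessions is a hypothetical function, the since cutoff is compared against time_updated in whatever unit that column uses, and --session is assumed to bypass the message threshold (inferred from the failure-mode table, not confirmed).

```python
import sqlite3


def qualifying_sessions(conn, min_messages=3, since=None, session_id=None):
    """Return (id, title, message_count) rows that pass the filters."""
    sql = (
        "SELECT s.id, s.title, COUNT(m.id) AS n_messages "
        "FROM session AS s LEFT JOIN message AS m ON m.session_id = s.id "
    )
    where, params = [], []
    if session_id is not None:
        where.append("s.id = ?")
        params.append(session_id)
    if since is not None:
        # Assumption: cutoff uses the same unit as session.time_updated.
        where.append("s.time_updated >= ?")
        params.append(since)
    if where:
        sql += "WHERE " + " AND ".join(where) + " "
    sql += "GROUP BY s.id, s.title "
    if session_id is None:
        # Assumption: an explicit session id skips the threshold.
        sql += "HAVING COUNT(m.id) >= ? "
        params.append(min_messages)
    sql += "ORDER BY s.id"
    return conn.execute(sql, params).fetchall()
```

Opening the database with sqlite3.connect(path) and passing the connection in keeps the function testable against an in-memory database with the same two tables.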
4. Setup recipe (new machine)
Assumes: opencode already installed, ~/.local/share/opencode/opencode.db exists, mempalace CLI installed (v3.3.3+).
# 1. Clone mempalace-toolkit (holds the two wrappers in bin/)
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit
# 2. Install (symlinks bin/* into ~/.local/bin, adds loader to rc file)
./install.sh
# 3. Ensure ~/.local/bin is on PATH (installer warns if not)
export PATH="$HOME/.local/bin:$PATH"
# 4. Initialize palace if needed (one-time, platform-wide)
mempalace init --yes
# 5. Mine opencode history into the palace
mempalace-session --dry-run # preview scope
mempalace-session # do it for real (~20 min for ~60 sessions)
# 6. Mine project docs (per project)
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project
# 7. Restart any MCP-connected agent, or call mempalace_reconnect from inside one
Containerized setup (devbox)
The devbox uses two named Docker volumes so these persist across container recreate:
- devbox-palace → ~/.mempalace/palace (the palace itself)
- devbox-data → ~/.local/share/opencode (opencode's SQLite DB)
Code at /workspace/mempalace-toolkit is a bind mount from the host — survives container recreate and syncs via gitea. Staging directories (~/.cache/mempalace-{docs,session}/) are ephemeral but cheap to rebuild.
After container recreate, just re-run ./install.sh (idempotent) to relink bin/ into the fresh ~/.local/bin/.
5. Operational notes
Dedup behavior
Both wrappers dedup via mempalace mine's built-in key:
- mempalace-docs: keys on source_file path + mtime → edit a doc, it re-mines; unchanged files are skipped.
- mempalace-session: keys on source_file path alone (convos miner doesn't check mtime) → a session's JSONL filename is <slug>_<id>.jsonl, stable per session, so re-runs skip already-filed sessions. To force re-mining, delete the staging dir.
Verified: a second full mempalace-session run immediately after the first produces 0 new drawers. The only cost is the post-mine repair step (index rebuild — ~5 min on 5k drawers).
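The difference between the two dedup keys can be modelled in a few lines. This is a toy model of the behavior described above, not the miner's implementation; should_mine and the filed mapping are hypothetical names.

```python
def should_mine(source_file, mtime, filed, check_mtime):
    """Toy dedup model.

    filed maps source_file -> mtime recorded when it was last filed.
    check_mtime=True models the docs path, False models the convos path.
    """
    if source_file not in filed:
        return True
    # Known path: only the docs-style key notices a changed mtime.
    return check_mtime and filed[source_file] != mtime
```

Under this model an edited doc (same path, new mtime) is re-mined, while a re-exported session file with the same name is skipped regardless of its mtime.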
When to re-mine
- mempalace-docs: after significant doc changes in a project.
- mempalace-session: opportunistically. Running it every few days catches new opencode sessions. Or wire it to cron / a systemd timer for true auto-save coverage (not yet done).
Cost profile (reference)
Measured on a ~10-day opencode corpus of 140 sessions / 1491 messages / 4656 parts:
- Dry run: seconds.
- Full mine: 21 minutes (38 min user CPU). Produced 2378 drawers from 62 qualifying sessions.
- Dedup re-run: mine step instant; only the repair runs (~5 min).
Scaling is roughly linear in message count. Budget ~20 minutes per 60-session batch.
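The budget figure follows from the measured run: 21 minutes for 62 qualifying sessions is about 0.34 minutes per session. A small sketch of that arithmetic, assuming a per-session rate is a good enough proxy (the text says scaling follows message count, so sessions of unusual length will deviate); estimate_mine_minutes is a hypothetical helper and excludes the repair step.

```python
# Reference measurement from the corpus described above.
MINUTES_MEASURED = 21
SESSIONS_MEASURED = 62


def estimate_mine_minutes(qualifying_sessions):
    """Linear extrapolation of full-mine wall time, repair step excluded."""
    return qualifying_sessions * MINUTES_MEASURED / SESSIONS_MEASURED


print(round(estimate_mine_minutes(60), 1))  # 20.3
```

The 60-session result of roughly 20.3 minutes matches the rule of thumb of about 20 minutes per 60-session batch.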
Common failure modes
| Symptom | Cause | Fix |
|---|---|---|
| mempalace-session: command not found after container recreate | ~/.local/bin wiped with container | cd ~/mempalace-toolkit && ./install.sh |
| Search errors "Error finding id" post-mine | Stale HNSW index | mempalace repair --yes + mempalace_reconnect from MCP |
| Drawers doubled after re-mining a project | Someone renamed the wing or ran raw mempalace mine alongside the wrapper | Inspect embedding_metadata in chroma.sqlite3; purge duplicates by source prefix, then mempalace repair |
| Sessions missing from palace | Session has fewer than --min-messages messages | Lower the threshold or --session <id> explicitly |
6. Upstream roadmap
These gaps should ideally close upstream, making the wrappers thinner or obsolete:
1. MemPalace PR #1213: exclude_patterns in mempalace.yaml. When merged, mempalace-docs shrinks to a thin shim (or disappears) since exclude-by-extension becomes a first-class config.
2. Opencode session hooks: PR #16598 (session.stopping), PR #16769 (shutdown), PR #15224 (session.start), issue #23503 (session.turn.completed). When at least one merges, opencode can fire hooks mempalace can receive.
3. Opencode harness in hooks_cli.py: mempalace's hooks CLI only knows claude-code + codex today. Adding opencode would let the auto-save diary path work on opencode too. Pairs with #2 above.
4. SQLite mode for mempalace mine --mode convos: if upstream ever adds direct SQLite ingest for opencode, mempalace-session loses its reason to exist (the export-to-JSONL dance goes away).
When #1 merges, retire mempalace-docs to a thin shim. When #2 + #3 land together, mempalace-session becomes a manual-only fallback (cron / backfill) while hooks handle live saves.
7. See also
- README.md: human-facing quickstart + per-tool usage reference.
- AGENTS.md: repo conventions for AI agents modifying this codebase.
- SKILL.md: agent skill (producer side), symlinked into ~/.agents/skills/opencode-mempalace-bridge/ by install.sh.
- ~/.agents/skills/mempalace/SKILL.md: agent skill for the consumer side (searching, diary, KG). Pair with SKILL.md in this repo.
- cli_utils: sibling repo with shell quality-of-life tools. Origin of these wrappers before the 2026-04-30 split.