Initial commit — split out from cli_utils

Producer-side MemPalace tooling: two bash wrappers that bridge opencode
session history and project documentation into the palace. Originally
developed in cli_utils (2026-04-28); split into its own repo on
2026-04-30 because the conceptual fit was weak — cli_utils is
interactive shell tooling, while this is agent memory infrastructure
with its own architecture, dependency surface, and growth trajectory.

Contents:
- bin/mempalace-docs — docs-only mining wrapper (originally a2ddcc9 in
  cli_utils), bridges the gap until MemPalace PR #1213 (exclude_patterns)
  merges upstream.
- bin/mempalace-session — opencode → palace session bridge (originally
  dacca0e in cli_utils). Reads ~/.local/share/opencode/opencode.db,
  exports each session to Claude Code JSONL, mines via
  'mempalace mine --mode convos'. Bridges the gap until opencode
  session-stopping hooks + an opencode harness in hooks_cli.py land
  upstream.
- ARCHITECTURE.md — canonical spec: architecture diagram, component
  details, setup recipe, operational notes, upstream-retirement
  roadmap. Originally a4cf314 in cli_utils.
- SKILL.md — companion agent skill (producer side). Pairs with the
  consumer-side mempalace skill. Symlinked into
  ~/.agents/skills/opencode-mempalace-bridge/ by install.sh.
- install.sh — idempotent installer, also handles --uninstall.
- AGENTS.md — repo conventions.

History of the individual files is not preserved in this split; see
cli_utils (gitea.jordbo.se/joakimp/cli_utils) commits a2ddcc9, dacca0e,
and a4cf314 for the original authorship context.
Joakim Persson
2026-04-30 05:30:04 +00:00
commit 954c3f2ebb
8 changed files with 1438 additions and 0 deletions
+15
@@ -0,0 +1,15 @@
# Per-project mempalace state — shouldn't live in the tool repo
mempalace.yaml
entities.json
.mempalace/
# Editor / OS noise
*.swp
*.swo
.DS_Store
# Local caches
.mypy_cache/
.ruff_cache/
__pycache__/
*.pyc
+79
@@ -0,0 +1,79 @@
# AGENTS.md
## What this is
Producer-side tooling for [MemPalace](https://github.com/MemPalace/mempalace). Two thin wrappers in `bin/` plus the companion agent skill. Pairs with the consumer-side `mempalace` skill.
Read [`ARCHITECTURE.md`](ARCHITECTURE.md) first — it's the canonical spec for what this repo does and why.
## Structure
```
install.sh            # Idempotent installer: symlinks bin/* into ~/.local/bin
                      # and SKILL.md into ~/.agents/skills/opencode-mempalace-bridge/
ARCHITECTURE.md       # Canonical spec: diagrams, setup recipe, ops notes, upstream roadmap
README.md             # Human-facing quickstart + per-tool usage reference
SKILL.md              # Agent skill (symlinked into ~/.agents/skills/ on install)
bin/
  mempalace-docs      # Docs-only MemPalace miner (bash wrapper)
  mempalace-session   # Opencode session → MemPalace bridge (bash + inline Python)
```
## Conventions
- **Standalone executables** in `bin/` with `#!/usr/bin/env bash` shebang, no extension, `chmod +x`. Must work in non-interactive contexts (agent processes, cron, CI).
- **Thin wrappers only.** Neither tool reimplements the mempalace miner. Both follow the **stage-to-cache-then-mine** idiom: curate input to `~/.cache/…/<wing>/`, then delegate to `mempalace mine`.
- **Idempotent + dry-runnable.** Every tool supports `--dry-run`. Second invocation on unchanged input is a no-op (dedup via `source_file` path, optionally + `mtime`).
- **No external Python deps.** Stdlib only (`sqlite3`, `json`, `pathlib`). Inline in the bash wrapper via heredoc.
- Argument parsing: `--help`/`-h` first, then mode flags, then positional args.
- Comment sections use `# ── Section Name ──────` style (matches sibling `cli_utils` repo).
## Adding a new wrapper
A third wrapper would justify factoring a shared helper library. Until then, copy the pattern from `mempalace-session` (richest example):
1. Create `bin/<name>` with `#!/usr/bin/env bash` + `chmod +x`.
2. Implement `--help`, `--dry-run`, `--no-repair` flags.
3. Stage to `~/.cache/<name>/<wing>/` with deterministic filenames.
4. Invoke `mempalace mine ...` (choose `--mode convos` if input is chat-like).
5. End with `mempalace repair` unless `--no-repair`.
6. Update `README.md` with usage + rationale.
7. Update `install.sh`? No — `bin/*` is auto-linked.
8. Update `ARCHITECTURE.md` if the wrapper fills a new architectural gap.
9. Update `SKILL.md` if agents should know when to invoke it.
## Testing
Manual only. Integration-shaped:
```bash
# Smoke test — does it parse args and list what would happen?
./bin/mempalace-session --help
./bin/mempalace-session --dry-run
# Real test on a single session (safe, deterministic)
./bin/mempalace-session --session ses_<id> --dry-run
./bin/mempalace-session --session ses_<id> # file into palace
# Then, from an MCP-connected agent (not the shell), verify visibility:
#   mempalace_search "a phrase from that session"
./bin/mempalace-session --session ses_<id> # re-run → should skip
```
For `mempalace-docs`, test on a small repo (e.g. this one) first:
```bash
./bin/mempalace-docs "$PWD" --dry-run
```
## Gotchas
- `install.sh` is idempotent but interactive — use `--yes` in non-interactive contexts.
- `~/.local/bin` must be on `$PATH`. The installer warns if not.
- The companion skill lives at `~/.agents/skills/opencode-mempalace-bridge/SKILL.md` and is a **symlink into this repo**. Editing that file edits `SKILL.md` here. To propagate to Claude Code / Kiro, run `agents-sync` from [`cli_utils`](https://gitea.jordbo.se/joakimp/cli_utils).
- The opencode DB path defaults to `~/.local/share/opencode/opencode.db`. Override via `$OPENCODE_DB` or `--db`.
- The mempalace miner **skips symlinks** (as of v3.3.3 — `miner.py` line ~828). That's why the wrappers use `cp -p` / explicit file writes for staging, not symlinks.
- The convos miner dedups on `source_file` path only (no mtime check). Staging filenames must be stable per session; deleting a staged JSONL forces a re-mine.
- The docs miner dedups on `source_file` path + `mtime`. That's why staging uses `cp -p` (preserves mtime).
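The DB-path gotcha above can be sketched in the stdlib Python the wrappers already use. This is a minimal illustration, not the code in `bin/mempalace-session`: the helper name `resolve_db_path` is hypothetical, and the precedence (flag beats environment variable beats default) is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default location documented above.
DEFAULT_DB = "~/.local/share/opencode/opencode.db"


def resolve_db_path(cli_db: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the opencode DB path.

    Assumed precedence (not confirmed by the wrapper source):
    --db value, then $OPENCODE_DB, then the documented default.
    """
    env = os.environ if env is None else env
    chosen = cli_db or env.get("OPENCODE_DB") or DEFAULT_DB
    return Path(chosen).expanduser()
```

Passing `env` explicitly keeps the helper testable without touching the real environment.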
## History
Split out from [`cli_utils`](https://gitea.jordbo.se/joakimp/cli_utils) on 2026-04-30. The wrappers originated there but the conceptual fit was weak (`cli_utils` is interactive shell tools; these are agent memory infrastructure). Some older diary entries and KG facts in the palace reference the original paths.
+233
@@ -0,0 +1,233 @@
# MemPalace Feeding Architecture
This repository wires [opencode](https://github.com/anomalyco/opencode) and arbitrary project directories into [MemPalace](https://github.com/MemPalace/mempalace) via two thin wrappers in `bin/`. This document explains why they exist and how they fit together.
**Audience:** someone setting up a new machine (or reviewing what's already set up) and asking "how does the palace actually get fed?". Pairs with the `mempalace` agent skill, which covers the *consumer* side (searching, diary, KG). This document covers the *producer* side.
---
## 1. The problem
MemPalace is a persistent memory layer for AI agents — vector search over drawers (chunks of verbatim content), a knowledge graph, and per-agent diaries, all behind an MCP server. To be useful it has to be *fed*: project docs, conversation transcripts, session summaries.
The stock mempalace CLI has two feeders:
| Feeder | What it ingests | Gap |
| ------------------------------------- | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| `mempalace mine` (default mode) | Any "readable" file in a directory (code + docs + misc) | Mines source code indiscriminately → embedding index floods with low-signal `__init__` fragments. |
| `mempalace mine --mode convos` | Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack, Codex JSONL | No opencode support. No SQLite support. Opencode persists its history in SQLite, not JSONL. |
And one auto-save path:
| Feeder | Harnesses supported | Gap |
| ------------------------------------- | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| `hooks_cli.py` (session-stop hooks) | `claude-code`, `codex` | No `opencode` harness → `/exit` mid-session leaves no diary entry behind. |
So on a machine using opencode + the "docs-first palace hygiene" policy, three gaps bite:
1. Mining a project floods the palace with source code we don't want.
2. Opencode session history is trapped in SQLite, invisible to `mine --mode convos`.
3. There's no auto-save on session stop — any persistence is best-effort heuristic.
The two wrappers in `bin/` close gaps **1** and **2**. Gap **3** is upstream work (see §6).
---
## 2. The architecture
```
  Project dirs (/workspace/*)                 Opencode SQLite DB
  ├── *.md                                    ~/.local/share/opencode/opencode.db
  ├── *.yaml                                  ├── session (id, title, directory, time_created/updated)
  ├── Dockerfile                              ├── message (session_id, data JSON w/ role)
  └── …                                       └── part    (message_id, data JSON w/ type: text|tool|…)
        │                                          │
        │                                          │
  ┌─────▼──────────┐                          ┌────▼──────────────┐
  │ mempalace-docs │                          │ mempalace-session │
  │ (bin/)         │                          │ (bin/)            │
  │                │                          │                   │
  │ stage docs     │                          │ export each       │
  │ only via cp -p │                          │ session as Claude │
  │ to cache dir   │                          │ Code JSONL to     │
  │                │                          │ cache dir         │
  └─────┬──────────┘                          └────┬──────────────┘
        │                                          │
        │ ~/.cache/mempalace-docs/<wing>/          │ ~/.cache/mempalace-session/<wing>/
        │                                          │
  ┌─────▼──────────┐                          ┌────▼──────────────┐
  │ mempalace mine │                          │ mempalace mine    │
  │                │                          │ --mode convos     │
  └─────┬──────────┘                          └────┬──────────────┘
        │                                          │
        └───────────────────┬──────────────────────┘
                     ┌──────▼─────────┐
                     │ ChromaDB       │
                     │ ~/.mempalace/  │
                     │ palace/        │
                     └──────┬─────────┘
                MCP server (mempalace_*)
       AI agents (opencode, claude code, codex, …)
```
**Shared idiom:** *stage-to-cache-then-mine*.
Neither wrapper reimplements the mempalace miner. They each:
1. Curate input (filter / transform / rename).
2. Write it to a deterministic path under `~/.cache/…/<wing>/` with `mtime` preserved (via `cp -p` or explicit `os.utime`).
3. Delegate actual embedding + filing to `mempalace mine`, which already dedups on `source_file` path.
This keeps the wrappers thin. A third wrapper following the same idiom would justify factoring a shared helper library — two does not.
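A minimal sketch of the staging half of the idiom, assuming the staged layout mirrors the project's relative paths (an assumption; the wrappers' actual layout may differ). The real `mempalace-docs` stages with `cp -p`; `shutil.copy2` is swapped in here as the stdlib Python analogue, and `stage_file` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def stage_file(src: Path, project_root: Path, cache_root: Path, wing: str) -> Path:
    """Copy one curated file to <cache_root>/<wing>/<relative path>, keeping mtime.

    The destination path is deterministic for a given source file, which is
    what lets a path-based dedup key recognise it on the next run.
    """
    dest = cache_root / wing / src.relative_to(project_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # copy2 copies content plus metadata (including mtime), similar to `cp -p`.
    shutil.copy2(src, dest)
    return dest
```

After staging, the wrapper would hand the cache directory to `mempalace mine`; that call is left out so the sketch has no side effects on a palace.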
---
## 3. Component details
### `bin/mempalace-docs` (268 lines) — docs-first mining
**Input:** a project directory.
**Output:** palace drawers in `wing_<directory-name>` (or `--wing` override), only from documentation-class files.
What it files: `*.md`, `*.mdx`, `*.rst`, `*.txt`, `*.yml`, `*.yaml`, `*.toml`, selective `*.json`, shell scripts, Dockerfiles, Makefiles, license/notice files.
What it drops: source code (`.py`, `.ts`, `.go`, `.rs`, …), lockfiles, `.git`, `.venv`, `node_modules`, `__pycache__`, build output.
**Implementation notes:**
- Reads `mempalace.yaml` (if present) to discover the actual wing name — avoids drift if someone renamed the wing after init.
- Uses `cp -p` (not symlinks) because the miner skips symlinks (`miner.py`, around line 828 as of v3.3.3).
- Auto-purges pre-existing drawers whose `source_file` is under the workspace path before re-mining, to prevent doubling on re-runs.
- Upstream [PR #1213](https://github.com/MemPalace/mempalace/pull/1213) will add `exclude_patterns` to `mempalace.yaml` — when merged, this wrapper should shrink to a thin shim.
### `bin/mempalace-session` (349 lines) — opencode → palace bridge
**Input:** the opencode SQLite DB (default `~/.local/share/opencode/opencode.db`).
**Output:** palace drawers in `wing_conversations` (or `--wing` override), one JSONL file per qualifying session.
**Transform pipeline, per session:**
1. Read `session` row (`id`, `title`, `directory`, `time_created`, `time_updated`).
2. Inject synthetic header as first user turn: `[session: <title> | <directory> | <YYYY-MM-DD>]` → makes title/dir/date semantically searchable.
3. For each `message` ordered by `id`:
   - Read JSON `data` → get `role` (`user` / `assistant`).
   - For each `part` under the message, read JSON `data` → dispatch on `type`:
     - `text` → text block.
     - `tool` → Claude Code `tool_use` block + deferred `tool_result` as synthetic human message (the mempalace normalizer folds it back into the assistant turn via its `is_tool_only` branch).
     - `step-start` / `step-finish` → dropped as noise.
     - `reasoning` → kept, prefixed with `[reasoning]`.
4. Serialize as Claude Code JSONL (`{"type": "user"|"assistant", "message": {"content": [...]}}`) — the one convos format the miner already understands.
5. Stage at `~/.cache/mempalace-session/<wing>/<slug>_<id>.jsonl` with `mtime` = `session.time_updated` (deterministic, stable under dedup).
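Steps 3 and 4 above can be sketched as follows. This is an illustration under stated assumptions, not the wrapper's code: the keys read from each part (`text`, `tool`, `id`, `input`, `output`) are placeholder names, since opencode's actual part schema is not spelled out here, and `export_message` is a hypothetical helper. The `tool_use` / `tool_result` block shapes follow the Claude content-block convention.

```python
import json
from typing import Any, Dict, List

# Part types the pipeline drops as noise.
DROPPED = {"step-start", "step-finish"}


def export_message(role: str, parts: List[Dict[str, Any]]) -> List[str]:
    """Turn one message and its parts into one or two JSONL lines."""
    content: List[Dict[str, Any]] = []
    results: List[Dict[str, Any]] = []
    for index, part in enumerate(parts):
        kind = part.get("type")
        if kind in DROPPED:
            continue
        if kind == "text":
            content.append({"type": "text", "text": part["text"]})
        elif kind == "reasoning":
            content.append({"type": "text", "text": "[reasoning] " + part["text"]})
        elif kind == "tool":
            # Placeholder keys: the real part JSON may name these differently.
            call_id = part.get("id", f"call_{index}")
            content.append({"type": "tool_use", "id": call_id,
                            "name": part["tool"], "input": part.get("input", {})})
            results.append({"type": "tool_result", "tool_use_id": call_id,
                            "content": str(part.get("output", ""))})
        # Any other part type is skipped in this sketch.
    lines: List[str] = []
    if content:
        lines.append(json.dumps({"type": role, "message": {"content": content}}))
    if results:
        # Deferred tool results go out as a synthetic follow-up user message.
        lines.append(json.dumps({"type": "user", "message": {"content": results}}))
    return lines
```

Emitting the tool results as a separate follow-up line mirrors the "deferred `tool_result`" behaviour described in step 3.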
**Filters:**
- `--min-messages N` (default 3) — drops throwaway `/exit`'d sessions that would flood the palace.
- `--since YYYY-MM-DD` — incremental catch-up.
- `--session <id>` — one-shot mode.
**Then:** invokes `mempalace mine --mode convos` against the cache dir, followed by `mempalace repair` (unless `--no-repair`).
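The `--min-messages` and `--since` filters can be pictured as a single query. A rough sketch, assuming `time_updated` is stored as epoch milliseconds and that `--since` is interpreted as midnight UTC (both assumptions); `qualifying_sessions` is a hypothetical helper, and only the columns named in this document are used.

```python
import sqlite3
from datetime import datetime, timezone
from typing import List, Optional


def qualifying_sessions(conn: sqlite3.Connection, min_messages: int = 3,
                        since: Optional[str] = None) -> List[str]:
    """Ids of sessions with >= min_messages messages, updated on/after `since`."""
    cutoff_ms = 0
    if since is not None:
        day = datetime.strptime(since, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        cutoff_ms = int(day.timestamp() * 1000)  # assumes millisecond timestamps
    sql = (
        "SELECT s.id FROM session AS s "
        "JOIN message AS m ON m.session_id = s.id "
        "WHERE s.time_updated >= ? "
        "GROUP BY s.id HAVING COUNT(m.id) >= ? "
        "ORDER BY s.time_updated"
    )
    return [row[0] for row in conn.execute(sql, (cutoff_ms, min_messages))]
```

Sessions with no messages never appear because of the inner join, which matches the intent of dropping throwaway sessions.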
---
## 4. Setup recipe (new machine)
Assumes: opencode already installed, `~/.local/share/opencode/opencode.db` exists, `mempalace` CLI installed (v3.3.3+).
```bash
# 1. Clone mempalace-toolkit (holds the two wrappers in bin/)
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit
# 2. Install (symlinks bin/* into ~/.local/bin and SKILL.md into ~/.agents/skills/opencode-mempalace-bridge/)
./install.sh
# 3. Ensure ~/.local/bin is on PATH (installer warns if not)
export PATH="$HOME/.local/bin:$PATH"
# 4. Initialize palace if needed (one-time, platform-wide)
mempalace init --yes
# 5. Mine opencode history into the palace
mempalace-session --dry-run # preview scope
mempalace-session # do it for real (~20 min for ~60 sessions)
# 6. Mine project docs (per project)
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project
# 7. Restart any MCP-connected agent, or call mempalace_reconnect from inside one
```
### Containerized setup (devbox)
The devbox uses two named Docker volumes so these persist across container recreate:
- `devbox-palace` → `~/.mempalace/palace` (the palace itself)
- `devbox-data` → `~/.local/share/opencode` (opencode's SQLite DB)
Code at `/workspace/mempalace-toolkit` is a bind mount from the host — survives container recreate and syncs via gitea. Staging directories (`~/.cache/mempalace-{docs,session}/`) are ephemeral but cheap to rebuild.
**After container recreate**, just re-run `./install.sh` (idempotent) to relink `bin/` into the fresh `~/.local/bin/`.
---
## 5. Operational notes
### Dedup behavior
Both wrappers dedup via `mempalace mine`'s built-in key:
- `mempalace-docs`: keys on `source_file` path + `mtime` → edit a doc, it re-mines; unchanged files are skipped.
- `mempalace-session`: keys on `source_file` path alone (convos miner doesn't check mtime) → a session's JSONL filename is `<slug>_<id>.jsonl`, stable per session, so re-runs skip already-filed sessions. To force re-mining, delete the staging dir.
**Verified:** a second full `mempalace-session` run immediately after the first produces 0 new drawers. The only cost is the post-mine `repair` step (index rebuild — ~5 min on 5k drawers).
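The two dedup keys can be contrasted with a toy model. This is not the miner's code; it only mirrors the behaviour described above, with an in-memory dict and set standing in for whatever the palace records. Both function names are hypothetical.

```python
from typing import Dict, Set


def docs_needs_mining(seen: Dict[str, float], source_file: str, mtime: float) -> bool:
    """Docs-style key: path + mtime. A new path or a changed mtime means re-mine."""
    return seen.get(source_file) != mtime


def convos_needs_mining(seen: Set[str], source_file: str) -> bool:
    """Convos-style key: path only. mtime is never consulted."""
    return source_file not in seen
```

The asymmetry explains why staged session filenames have to be stable: for convos, the path is the entire identity of a session.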
### When to re-mine
- `mempalace-docs`: after significant doc changes in a project.
- `mempalace-session`: opportunistically. Every few days catches new opencode sessions. Or wire to cron / systemd timer for true auto-save coverage (not yet done).
### Cost profile (reference)
Measured on a ~10-day opencode corpus of 140 sessions / 1491 messages / 4656 parts:
- Dry run: seconds.
- Full mine: **21 minutes** (38 min user CPU). Produced 2378 drawers from 62 qualifying sessions.
- Dedup re-run: mine step instant; only the repair runs (~5 min).
Scaling is roughly linear in message count. Budget ~20 minutes per 60-session batch.
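The budget figure follows from the reference run by simple proportion. A back-of-envelope sketch, using qualifying-session count as a stand-in for message count (an approximation, since per-session message counts are not given here); `estimate_mine_minutes` is a hypothetical helper.

```python
REFERENCE_MINUTES = 21   # wall-clock time of the reference full mine
REFERENCE_SESSIONS = 62  # qualifying sessions in that run


def estimate_mine_minutes(qualifying_sessions: int) -> float:
    """Linear extrapolation from the reference run; a rough budget, not a guarantee."""
    return qualifying_sessions * REFERENCE_MINUTES / REFERENCE_SESSIONS
```

For a 60-session batch this lands just above 20 minutes, in line with the budget quoted above.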
### Common failure modes
| Symptom | Cause | Fix |
| ---------------------------------------------- | ----------------------------------------------------- | --------------------------------------------------------- |
| `mempalace-session: command not found` after container recreate | `~/.local/bin` wiped with container | `cd ~/mempalace-toolkit && ./install.sh` |
| Search errors "Error finding id" post-mine | Stale HNSW index | `mempalace repair --yes` + `mempalace_reconnect` from MCP |
| Drawers doubled after re-mining a project | Someone renamed the wing or ran raw `mempalace mine` alongside the wrapper | Inspect `embedding_metadata` in `chroma.sqlite3`; purge duplicates by source prefix, then `mempalace repair` |
| Sessions missing from palace | Session has fewer than `--min-messages` messages | Lower the threshold or `--session <id>` explicitly |
---
## 6. Upstream roadmap
These gaps should ideally close upstream, making the wrappers thinner or obsolete:
1. **[MemPalace PR #1213](https://github.com/MemPalace/mempalace/pull/1213)** — `exclude_patterns` in `mempalace.yaml`. When merged, `mempalace-docs` shrinks to a thin shim (or disappears) since exclude-by-extension becomes a first-class config.
2. **Opencode session hooks** — [PR #16598](https://github.com/anomalyco/opencode/pull/16598) (session.stopping), [PR #16769](https://github.com/anomalyco/opencode/pull/16769) (shutdown), [PR #15224](https://github.com/anomalyco/opencode/pull/15224) (session.start), [issue #23503](https://github.com/anomalyco/opencode/issues/23503) (session.turn.completed). When at least one merges, opencode can fire hooks mempalace can receive.
3. **Opencode harness in `hooks_cli.py`** — mempalace's hooks CLI only knows `claude-code` + `codex` today. Adding `opencode` would let the auto-save diary path work on opencode too. Pairs with #2 above.
4. **SQLite mode for `mempalace mine --mode convos`** — if upstream ever adds direct SQLite ingest for opencode, `mempalace-session` loses its reason to exist (the export-to-JSONL dance goes away).
When #1 merges, retire `mempalace-docs` to a thin shim. When #2 + #3 land together, `mempalace-session` becomes a manual-only fallback (cron / backfill) while hooks handle live saves.
---
## 7. See also
- [`README.md`](README.md) — human-facing quickstart + per-tool usage reference.
- [`AGENTS.md`](AGENTS.md) — repo conventions for AI agents modifying this codebase.
- [`SKILL.md`](SKILL.md) — agent skill (producer side), symlinked into `~/.agents/skills/opencode-mempalace-bridge/` by `install.sh`.
- `~/.agents/skills/mempalace/SKILL.md` — agent skill for the **consumer** side (searching, diary, KG). Pair with `SKILL.md` in this repo.
- [`cli_utils`](https://gitea.jordbo.se/joakimp/cli_utils) — sibling repo: shell quality-of-life tools. Origin of these wrappers before the 2026-04-30 split.
+153
@@ -0,0 +1,153 @@
# mempalace-toolkit
Producer-side tooling for [MemPalace](https://github.com/MemPalace/mempalace) — bridges that feed opencode session history and project documentation into the palace. Pairs with the consumer-side [`mempalace` agent skill](https://github.com/MemPalace/mempalace).
**What this repo contains:**
- `bin/mempalace-session` — exports [opencode](https://github.com/anomalyco/opencode) session history from its local SQLite DB to Claude Code JSONL, then mines it via `mempalace mine --mode convos`.
- `bin/mempalace-docs` — mines project directories into MemPalace while excluding source code, keeping the palace signal-dense.
- [`ARCHITECTURE.md`](ARCHITECTURE.md) — **canonical spec**: architecture diagram, component details, setup recipe, operational notes, upstream-retirement roadmap.
- [`SKILL.md`](SKILL.md) — the companion agent skill, symlinked into `~/.agents/skills/opencode-mempalace-bridge/` on install.
**If you're just trying to get this working on a new machine → jump to [Setup](#setup).**
**If you want the full architecture story → read [`ARCHITECTURE.md`](ARCHITECTURE.md).**
---
## Why this exists
MemPalace is the agent memory layer. Its stock CLI has two gaps that bite on a machine running opencode with a docs-first palace policy:
1. **`mempalace mine` floods the palace with source code** — every `__init__` fragment, every generated file, hundreds of low-signal drawers per project. `mempalace-docs` fixes this by staging only documentation-class files (`*.md`, `*.yml`, `Dockerfile`, etc.) before mining.
2. **`mempalace mine --mode convos` can't read opencode's SQLite DB** — only file-based chat formats (Claude Code JSONL, Claude.ai JSON, ChatGPT, Slack, Codex). Opencode persists every turn in `~/.local/share/opencode/opencode.db` and has no upstream hook into mempalace's auto-save. `mempalace-session` fixes this by exporting each session to Claude Code JSONL before mining.
Both wrappers follow the same **stage-to-cache-then-mine** idiom. Neither reimplements the miner; they curate input and delegate.
Long-term, both should retire:
- `mempalace-docs` → retires when [MemPalace PR #1213](https://github.com/MemPalace/mempalace/pull/1213) (`exclude_patterns` in `mempalace.yaml`) merges.
- `mempalace-session` → retires when opencode session-stopping hooks ([PR #16598](https://github.com/anomalyco/opencode/pull/16598) et al.) merge **and** `hooks_cli.py` gains an `opencode` harness. Until both land, this repo fills the gap.
See [`ARCHITECTURE.md`](ARCHITECTURE.md) §6 for the full upstream roadmap.
---
## Setup
### Prerequisites
- [MemPalace](https://github.com/MemPalace/mempalace) CLI v3.3.3+
- Python 3 (stdlib `sqlite3` only — no extra deps)
- [opencode](https://github.com/anomalyco/opencode) with an active session DB at `~/.local/share/opencode/opencode.db` *(only needed for `mempalace-session`)*
### Install
```bash
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit
./install.sh
```
The installer symlinks `bin/*` into `~/.local/bin/` and optionally installs the agent skill into `~/.agents/skills/opencode-mempalace-bridge/`.
Ensure `~/.local/bin` is on `$PATH`:
```bash
export PATH="$HOME/.local/bin:$PATH"
```
### First mine
```bash
# One-time palace init (if not done)
mempalace init --yes
# Mine opencode session history into wing_conversations
mempalace-session --dry-run # preview qualifying sessions
mempalace-session # do it (~20 min per 60 sessions)
# Mine a project (docs only)
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project
```
### Containerized (devbox) notes
On a Docker-based devbox, the palace and opencode DB should live on named volumes so they survive container recreate:
- `devbox-palace` → `~/.mempalace/palace`
- `devbox-data` → `~/.local/share/opencode`
This repo is typically bind-mounted from the host, so code survives recreate and syncs via git. After a container recreate, `~/.local/bin` is wiped — just re-run `./install.sh` (idempotent) to relink.
---
## `mempalace-docs`
Docs-only MemPalace miner. Stages documentation files into a cache dir and runs `mempalace mine` against the cache — never against the raw project dir.
```bash
mempalace-docs <directory> # mine with wing = dirname
mempalace-docs <directory> --wing my_project # override wing name
mempalace-docs <directory> --agent alice # record agent on drawers
mempalace-docs <directory> --dry-run # list files, don't file
mempalace-docs <directory> --no-repair # skip post-mine repair
mempalace-docs --help
```
**What gets mined:** `*.md`, `*.mdx`, `*.rst`, `*.txt`, `*.yml`, `*.yaml`, `*.toml`, `*.json`, `*.sh`, `*.bash`, `*.zsh`, `*.fish`, `Dockerfile*`, `Makefile*`, `*.conf`, `*.cfg`, `*.ini`, `LICENSE*`, `COPYING*`, `NOTICE*`.
**What gets skipped:** `.py`, `.ts`, `.tsx`, `.js`, `.jsx`, `.go`, `.rs`, `.java`, `.cpp`, `.c`, `.rb`, `.kt`, `.swift`, build output directories (`.git`, `.venv`, `node_modules`, `__pycache__`, `.mypy_cache`, `.pytest_cache`, `.ruff_cache`, `dist`, `build`, `.next`, `target`, `coverage`), lockfiles.
**Rationale:** the palace is for *context and intent*. Agents already have `grep`/`glob`/`Read` for code — always authoritative, never stale. Embedding source code creates a parallel, lossier, drift-prone copy that pollutes semantic search for years.
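The mined/skipped split above amounts to a filename classifier. A simplified sketch, assuming glob-style matching on the file name and a directory blocklist; it does not model the lockfile exclusion or any finer-grained JSON handling, and `is_doc_file` is a hypothetical helper rather than the wrapper's actual logic.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns and directories copied from the lists above.
DOC_PATTERNS = [
    "*.md", "*.mdx", "*.rst", "*.txt", "*.yml", "*.yaml", "*.toml", "*.json",
    "*.sh", "*.bash", "*.zsh", "*.fish", "Dockerfile*", "Makefile*",
    "*.conf", "*.cfg", "*.ini", "LICENSE*", "COPYING*", "NOTICE*",
]
SKIP_DIRS = {
    ".git", ".venv", "node_modules", "__pycache__", ".mypy_cache",
    ".pytest_cache", ".ruff_cache", "dist", "build", ".next", "target", "coverage",
}


def is_doc_file(rel_path: str) -> bool:
    """True if a project-relative path looks like a documentation-class file."""
    path = PurePosixPath(rel_path)
    # Anything under a skipped directory is out, whatever its extension.
    if any(part in SKIP_DIRS for part in path.parts[:-1]):
        return False
    return any(fnmatchcase(path.name, pattern) for pattern in DOC_PATTERNS)
```

An allowlist means unknown extensions are dropped by default, so source code stays out of the palace even for languages nobody thought to list.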
---
## `mempalace-session`
Opencode → MemPalace session bridge. Reads `~/.local/share/opencode/opencode.db`, transforms each session into Claude Code JSONL, and files via `mempalace mine --mode convos`.
```bash
mempalace-session # mine all sessions (≥3 msgs)
mempalace-session --wing my_convos # custom wing (default: wing_conversations)
mempalace-session --session ses_abc123 # one session only
mempalace-session --since 2026-04-01 # only sessions updated on/after date
mempalace-session --min-messages 6 # stricter short-session filter
mempalace-session --db /custom/path/opencode.db # non-default DB location
mempalace-session --dry-run # export + list, skip mine
mempalace-session --no-repair # skip post-mine index repair
mempalace-session --help
```
**What gets exported per session:**
- Synthetic header injected as the first user turn (`[session: <title> | <dir> | <date>]`) so the palace can find sessions by topic, not just by ID.
- Each message → Claude Code JSONL line (`{"type": "user"|"assistant", "message": {"content": ...}}`).
- Tool calls → `tool_use` blocks. Known tools (`Bash`, `Read`, `Grep`, `Edit`, `Write`) get formatted summaries; unknown tools are JSON-serialized.
- Tool outputs → `tool_result` blocks in a follow-up human message, folded back into the assistant turn by the mempalace normalizer.
- `step-start` / `step-finish` parts are dropped as noise. `reasoning` parts are kept with a `[reasoning]` prefix.
**Dedup:** staging at `~/.cache/mempalace-session/<wing>/` with deterministic per-session filenames (`<slug>_<id>.jsonl`). The convos miner keys on `source_file`, so re-runs skip unchanged sessions. To force re-mining a session, delete its JSONL from the staging dir.
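A sketch of deterministic staging for one session. The slug rule here (lowercase, non-alphanumerics collapsed to `-`, capped at 40 characters) is invented for illustration; the wrapper's real slug algorithm is not documented in this file. `stage_session` is a hypothetical helper and expects the timestamp in epoch seconds, so a millisecond value from the DB would need dividing by 1000 first.

```python
import os
import re
from pathlib import Path
from typing import List


def slugify(title: str, max_len: int = 40) -> str:
    """Illustrative slug rule only; not the wrapper's actual algorithm."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "untitled"


def stage_session(cache_root: Path, wing: str, title: str, session_id: str,
                  jsonl_lines: List[str], updated_epoch_s: float) -> Path:
    """Write <cache_root>/<wing>/<slug>_<id>.jsonl and pin its mtime."""
    dest = cache_root / wing / f"{slugify(title)}_{session_id}.jsonl"
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text("\n".join(jsonl_lines) + "\n", encoding="utf-8")
    # Pin atime and mtime to the session's last-updated time.
    os.utime(dest, (updated_epoch_s, updated_epoch_s))
    return dest
```

One caveat of this particular sketch: a title-derived slug only stays stable for as long as the session title does.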
**Filter:** sessions with fewer than `--min-messages` messages (default 3) are skipped — drops throwaway `/exit`'d sessions that would otherwise flood the palace. On a reference 140-session corpus, 78 were filtered this way.
**Cost profile:** ~20 minutes per 60-session batch. Scales roughly linearly with message count. Dedup re-run: mine step instant, only the post-mine `repair` runs (~5 min on 5k drawers).
---
## Companion agent skill
Installing this repo symlinks `SKILL.md` into `~/.agents/skills/opencode-mempalace-bridge/SKILL.md`, where it's auto-discovered by opencode (and by Claude Code / Kiro if you run `agents-sync` from [`cli_utils`](https://gitea.jordbo.se/joakimp/cli_utils)).
The skill is the *short-form checklist* for agents — when to use which wrapper, failure modes, setup recipes, anti-patterns. The canonical reference is always [`ARCHITECTURE.md`](ARCHITECTURE.md); the skill points there for deep context.
The skill pairs with the consumer-side [`mempalace` skill](https://github.com/MemPalace/mempalace) — that one covers using the palace (search, diary, KG); this one covers feeding it.
---
## See also
- [`ARCHITECTURE.md`](ARCHITECTURE.md) — canonical spec: diagrams, setup recipe, failure modes, upstream roadmap.
- [`AGENTS.md`](AGENTS.md) — repo conventions for AI agents modifying this codebase.
- [MemPalace](https://github.com/MemPalace/mempalace) — the memory layer itself.
- [opencode](https://github.com/anomalyco/opencode) — the agent harness this bridges.
- [cli_utils](https://gitea.jordbo.se/joakimp/cli_utils) — sibling repo with shell quality-of-life tools (origin of these wrappers before the 2026-04-30 split).
+157
@@ -0,0 +1,157 @@
---
name: opencode-mempalace-bridge
description: Set up the producer side of MemPalace by feeding opencode session history and project docs into the palace via the mempalace-toolkit wrappers. Use when provisioning a new machine, when the user asks how palace feeding works, when opencode sessions aren't showing up in searches, or when a project needs docs-only mining. Pairs with the `mempalace` skill (consumer side).
---
# Opencode ↔ MemPalace Bridge (producer side)
## Overview
The `mempalace` skill covers *using* the palace (search, diary, KG). This skill covers *feeding* it — specifically, how to wire opencode session history and project docs into the palace on a new machine or after a container recreate.
**Authoritative source:** `/workspace/mempalace-toolkit/ARCHITECTURE.md` (also at `ARCHITECTURE.md` in the root of the gitea repo). When in doubt, read that file; it is the canonical spec. This skill is the short-form checklist.
**Core idea:** two thin wrappers in `bin/` close gaps in the stock mempalace CLI:
| Gap | Wrapper |
| ---------------------------------------------------------------------------------------- | -------------------- |
| `mempalace mine` floods the palace with source code we don't want | `mempalace-docs` |
| `mempalace mine --mode convos` can't read opencode's SQLite DB | `mempalace-session` |
Both follow the same **stage-to-cache-then-mine** idiom — they curate input into `~/.cache/…/<wing>/`, then delegate to `mempalace mine`.
## When to Load This Skill
- User asks "how does the palace get fed?" or mentions setting up mempalace on a new machine.
- Opencode conversations are missing from palace searches (`wing_conversations` is empty or stale).
- A project needs to be mined but you want *docs only, no source code*.
- User asks about `mempalace-docs` or `mempalace-session`.
- After a container recreate on a devbox — the wrappers need reinstall.
- Planning to retire either wrapper once upstream PRs merge (see §6 of ARCHITECTURE.md).
## Setup Recipe (new machine)
Prerequisites: `opencode` installed with an active DB at `~/.local/share/opencode/opencode.db`, `mempalace` CLI v3.3.3+, Python 3 (stdlib `sqlite3` only — no extra deps).
```bash
# 1. Clone mempalace-toolkit (holds the two wrappers in bin/)
git clone <gitea-url>/mempalace-toolkit ~/mempalace-toolkit
cd ~/mempalace-toolkit
# 2. Install: symlinks bin/* into ~/.local/bin and SKILL.md into ~/.agents/skills/opencode-mempalace-bridge/
./install.sh
# 3. Verify ~/.local/bin is on PATH
which mempalace-session mempalace-docs
# 4. Initialize palace (one-time, platform-wide)
mempalace init --yes
# 5. Mine opencode session history into wing_conversations
mempalace-session --dry-run # preview: which sessions qualify?
mempalace-session # do it (~20 min per 60 sessions)
# 6. Mine project docs per project (docs only — no source code)
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project
# 7. If a long-lived MCP session is open, reconnect it
# (from inside the MCP client): mempalace_reconnect
```
### Containerized (devbox) specifics
Named Docker volumes preserve state across container recreate:
- `devbox-palace` → `~/.mempalace/palace`
- `devbox-data` → `~/.local/share/opencode`
Bind mount `/workspace/mempalace-toolkit` from the host: code survives recreate and syncs via gitea.
**After container recreate:** `~/.local/bin` is ephemeral. Just re-run `./install.sh` (idempotent) — everything else already persists.
## Key Operational Rules
### Always dry-run first on a cold system
```bash
mempalace-session --dry-run # shows qualifying sessions
mempalace-docs <dir> --dry-run # shows files that would be mined
```
A docs-heavy repo should produce roughly 5-10 drawers per file. An average above 15 drawers per file suggests code leaked in; investigate.
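The leak check above is a one-line ratio test. A tiny sketch, where the drawer and file counts are whatever the dry run or a later palace query reports (how those numbers are obtained is outside this example); `code_leak_suspected` is a hypothetical helper.

```python
def code_leak_suspected(drawer_count: int, file_count: int,
                        threshold: float = 15.0) -> bool:
    """Flag a mine whose average drawers-per-file exceeds the threshold."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return drawer_count / file_count > threshold
```

The comparison is strict, so an average of exactly 15 does not trip the flag.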
### Dedup is free — re-running is safe
- `mempalace-docs`: dedup keyed on `source_file` path + `mtime`. Unchanged files skipped.
- `mempalace-session`: dedup keyed on `source_file` path alone (no mtime check for convos). Staging filenames are deterministic per session (`<slug>_<id>.jsonl`), so re-runs skip already-filed sessions.
Second run immediately after first → 0 new drawers, only the post-mine `repair` step runs (~5 min on 5k drawers).
### Incremental catch-up
```bash
mempalace-session --since 2026-04-20 # only recent sessions
mempalace-session --session ses_abc123 # one specific session
```
### Force re-mine
```bash
rm -rf ~/.cache/mempalace-session/<wing>/ # nukes staging dir
mempalace-session # stages + mines fresh
```
Staging is ephemeral by design; the palace is the source of truth.
## Failure Modes & Fixes
| Symptom | Cause | Fix |
| ---------------------------------------------------------- | ----------------------------------------------- | ------------------------------------------------------- |
| `mempalace-session: command not found` | `~/.local/bin` wiped (container recreate) | `cd ~/cli_utils && ./install.sh` |
| Sessions missing from palace | Fewer messages than `--min-messages` (default 3)| Lower threshold or `--session <id>` explicitly |
| "Error finding id" on search after mining | Stale HNSW index | `mempalace repair --yes` + `mempalace_reconnect` |
| Drawers doubled for a project | Someone ran raw `mempalace mine` alongside wrapper, or renamed wing mid-flight | Inspect `embedding_metadata` in `chroma.sqlite3`, purge duplicates by source prefix, then `mempalace repair` |
| Post-mine ChromaDB search returns stale results in MCP | MCP server caches old index | Call `mempalace_reconnect` from MCP |
| Opencode DB not at default path | Non-standard `XDG_DATA_HOME` or opencode config | `export OPENCODE_DB=/custom/path/opencode.db` or `--db` |
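
For the "drawers doubled" row, it helps to see which source paths a wing's drawers came from before purging anything. The sketch below assumes the `embedding_metadata` layout that this repo's own purge step in `mempalace-docs` relies on (columns `id`, `key`, `string_value`, with `wing` and `source_file` keys); ChromaDB internals may differ between versions. The helper name and the demo rows are hypothetical.

```python
import sqlite3
from collections import Counter


def drawers_by_source(db: sqlite3.Connection, wing: str) -> Counter:
    """Count drawers in `wing`, grouped by their recorded source_file."""
    q = """
        SELECT s.string_value
        FROM embedding_metadata w
        JOIN embedding_metadata s ON w.id = s.id AND s.key = 'source_file'
        WHERE w.key = 'wing' AND w.string_value = ?
    """
    return Counter(row[0] for row in db.execute(q, (wing,)))


# Demo against a throwaway in-memory table with the assumed columns.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE embedding_metadata (id INTEGER, key TEXT, string_value TEXT)"
)
demo.executemany(
    "INSERT INTO embedding_metadata VALUES (?, ?, ?)",
    [
        (1, "wing", "my_project"),
        (1, "source_file", "/workspace/my_project/README.md"),
        (2, "wing", "my_project"),
        (2, "source_file", "/home/u/.cache/mempalace-docs/my_project/README.md"),
        (3, "wing", "other"),
        (3, "source_file", "/workspace/other/NOTES.md"),
    ],
)
for path, n in drawers_by_source(demo, "my_project").items():
    print(n, path)
```

Against a real palace, open the database read-only, for example `sqlite3.connect("file:/path/to/chroma.sqlite3?mode=ro", uri=True)`, so inspection cannot modify it. Two different path prefixes for the same file within one wing is the duplicate signature to look for.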
## What to File Under Which Wing
| Content type | Wing (convention) | Room | Tool |
| ----------------------------------- | ------------------------------ | ---------------- | ----------------------- |
| Opencode session transcripts | `wing_conversations` | auto (keyword) | `mempalace-session` |
| Project docs (md, yaml, Dockerfile) | `wing_<project-name>` | auto | `mempalace-docs` |
| Per-agent session diaries | `wing_<agent-name>` | `diary` | `mempalace_diary_write` (from the consumer-side `mempalace` skill) |
| Ad-hoc verbatim facts | any | any | `mempalace_add_drawer` |
## Cost Profile (reference)
From a 10-day opencode corpus (140 sessions / 1491 msgs / 4656 parts):
- Dry run: seconds.
- Full mine: ~21 min wall / ~38 min user CPU → 2378 drawers from 62 qualifying sessions.
- Dedup re-run: mine instant, repair ~5 min.
Budget **~20 minutes per 60-session batch**. Scales roughly linearly with message count.
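
The budget figure follows from a straight linear extrapolation of the reference run. In this sketch the function name is hypothetical, and session count stands in for message count, which is the quantity the cost actually tracks, so treat the result as a rough estimate.

```python
REF_WALL_MIN = 21   # reference full-mine wall time, in minutes
REF_SESSIONS = 62   # qualifying sessions in the reference corpus


def estimate_wall_minutes(qualifying_sessions: int) -> float:
    """Linear extrapolation from the reference run."""
    return REF_WALL_MIN / REF_SESSIONS * qualifying_sessions


print(round(estimate_wall_minutes(60), 1))   # 20.3
print(round(estimate_wall_minutes(200), 1))  # 67.7
```

A corpus with unusually long sessions will take longer than this estimate suggests.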
## Anti-Patterns
- **Don't run `mempalace mine` directly on a project.** Use `mempalace-docs` — otherwise source code floods the palace.
- **Don't try to point `mempalace mine --mode convos` at `opencode.db` directly.** The convos miner reads files (txt/md/json/jsonl) only — no SQLite support. Use `mempalace-session` to export first.
- **Don't delete staging dirs unnecessarily.** They're dedup anchors; deleting means a forced re-mine of everything in that wing.
- **Don't forget `mempalace_reconnect`** after a mine from inside a live MCP session — otherwise search hits the stale index.
- **Don't mine with `--min-messages 0` or `1`.** In the reference corpus, 78 of 140 sessions were throwaway `/exit`'d sessions that would flood the palace with noise. The default of 3 is sensible.
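
To see how many sessions a given `--min-messages` threshold would drop before committing to it, a count along these lines can help. It is a sketch under assumptions: the `session.id`, `message.id` and `message.session_id` names are taken from the export script in this repo, the helper name is hypothetical, and the demo rows are made up.

```python
import sqlite3


def count_short_sessions(db: sqlite3.Connection,
                         min_messages: int = 3) -> tuple[int, int]:
    """Return (short, total): sessions below the threshold, and all sessions."""
    q = """
        SELECT s.id, COUNT(m.id)
        FROM session s
        LEFT JOIN message m ON m.session_id = s.id
        GROUP BY s.id
    """
    counts = [n for _, n in db.execute(q)]
    return sum(1 for n in counts if n < min_messages), len(counts)


# Demo against a throwaway in-memory database with the assumed columns.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE session (id TEXT)")
demo.execute("CREATE TABLE message (id TEXT, session_id TEXT)")
demo.executemany("INSERT INTO session VALUES (?)",
                 [("ses_a",), ("ses_b",), ("ses_c",)])
demo.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [("m1", "ses_a")] + [(f"m{i}", "ses_b") for i in range(2, 7)],
)
print(count_short_sessions(demo, min_messages=3))  # (2, 3)
```

Run against a read-only copy of `opencode.db`, a large first number confirms that low thresholds would mostly add noise.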
## Upstream Roadmap (when to retire these wrappers)
- **[MemPalace PR #1213](https://github.com/MemPalace/mempalace/pull/1213)** merges → `mempalace-docs` becomes redundant (exclude patterns in `mempalace.yaml`). Retire to thin shim or delete.
- **Opencode session-stopping hooks merge** ([PR #16598](https://github.com/anomalyco/opencode/pull/16598) et al.) **AND** `hooks_cli.py` gains `opencode` harness → live auto-save works; `mempalace-session` becomes a manual-only backfill tool (cron / historic import).
- **SQLite mode lands in `mempalace mine --mode convos`** → `mempalace-session` loses its reason to exist entirely.
Check `ARCHITECTURE.md` §6 in `cli_utils/` for current upstream status before doing any retirement work.
## See Also
- `<cli_utils>/ARCHITECTURE.md`: **canonical spec** (diagrams, implementation notes, full troubleshooting).
- `<cli_utils>/README.md`: per-tool usage reference.
- `~/.agents/skills/mempalace/SKILL.md`: consumer-side skill (search, diary, KG); *pair this skill with that one*.
+268
@@ -0,0 +1,268 @@
#!/usr/bin/env bash
# mempalace-docs — mine a project into MemPalace with docs-only filtering
#
# Works around the fact that upstream `mempalace mine` has a hardcoded
# READABLE_EXTENSIONS list that includes .py / .ts / .js / .go / .rs etc,
# which pollutes the palace with low-signal code-fragment drawers.
#
# Strategy: stage a copy of only docs/config/script files into /tmp, then
# run `mempalace mine` against that staging dir. Wing is derived from the
# source directory name (override with --wing).
#
# Once MemPalace PR #1213 (exclude_patterns in mempalace.yaml) lands, this
# wrapper becomes a thin shim over `mempalace mine` with a default
# exclude_patterns injected.
#
# Usage:
# mempalace-docs <directory>
# mempalace-docs <directory> --wing <name>
# mempalace-docs <directory> --agent <name>
# mempalace-docs <directory> --dry-run
# mempalace-docs --help
#
# Exit codes:
# 0 success
# 1 usage / argument error
# 2 source directory missing
# 3 mempalace CLI not installed
# 4 mine failed
#
# Dependencies: bash 4+ (mapfile), find, cp, awk, sed, python3 (stdlib sqlite3),
# mempalace (v3.3.3+)
set -euo pipefail
# ── Defaults ─────────────────────────────────────────────────────────
AGENT="${USER:-mempalace}"
WING=""
SRC=""
DRY_RUN=0
NO_REPAIR=0
# File patterns to include. Docs + config + intent-bearing scripts.
# Everything else (code) is excluded by omission.
INCLUDE_GLOBS=(
'*.md' '*.mdx' '*.rst' '*.txt'
'*.yml' '*.yaml' '*.toml'
'*.json' # includes package.json, pyproject companions; lockfiles filtered below
'*.sh' '*.bash' '*.zsh' '*.fish'
'Dockerfile*' 'Makefile*' 'Containerfile*'
'*.conf' '*.cfg' '*.ini'
'LICENSE*' 'COPYING*' 'NOTICE*' 'AUTHORS*' 'CONTRIBUTORS*'
)
# Path segments to always skip. Note: .gitignore rules are not consulted;
# only this list and SKIP_FILES filter the tree.
SKIP_DIRS=(
'.git' '.venv' 'venv' '__pycache__' 'node_modules'
'.mypy_cache' '.pytest_cache' '.ruff_cache' '.tox' '.nox'
'dist' 'build' '.next' '.nuxt' 'target' 'coverage'
'.DS_Store'
)
# Filename patterns to skip even if caught by an include glob.
SKIP_FILES=(
'package-lock.json' 'yarn.lock' 'pnpm-lock.yaml' 'poetry.lock'
'Cargo.lock' 'Gemfile.lock' 'composer.lock'
'.gitignore' '.dockerignore'
)
# ── Usage ────────────────────────────────────────────────────────────
usage() {
cat <<'EOF'
mempalace-docs — mine a project into MemPalace, docs/config/scripts only
Usage:
mempalace-docs <directory> [options]
Options:
--wing <name> Override wing name (default: source directory name)
--agent <name> Agent name recorded on drawers (default: $USER)
--dry-run List files that would be mined; do not file
--no-repair Skip `mempalace repair` after mining
-h, --help Show this help
What gets mined:
Docs: *.md *.mdx *.rst *.txt
Config: *.yml *.yaml *.toml *.json *.conf *.cfg *.ini
Scripts: *.sh *.bash *.zsh *.fish Dockerfile* Makefile* Containerfile*
Legal: LICENSE* COPYING* NOTICE* AUTHORS* CONTRIBUTORS*
What gets skipped (by design):
Source code: .py .ts .tsx .js .jsx .go .rs .java .cpp .c .rb .kt .swift
Caches / deps: .git .venv venv node_modules __pycache__ .mypy_cache
.pytest_cache .ruff_cache dist build .next target coverage
Lockfiles: package-lock.json yarn.lock poetry.lock Cargo.lock ...
Rationale:
The palace is for context and intent. Agents read code directly via
grep/glob/Read — mining it creates a parallel, lossier, drift-prone
copy that pollutes semantic search.
This wrapper is a bridge until MemPalace PR #1213 (exclude_patterns)
lands upstream.
EOF
}
# ── Parse args ───────────────────────────────────────────────────────
while [[ $# -gt 0 ]]; do
case "$1" in
-h|--help) usage; exit 0 ;;
--wing) WING="${2:-}"; shift 2 ;;
--agent) AGENT="${2:-}"; shift 2 ;;
--dry-run) DRY_RUN=1; shift ;;
--no-repair) NO_REPAIR=1; shift ;;
--) shift; break ;;
-*) echo "error: unknown option: $1" >&2; usage >&2; exit 1 ;;
*) if [[ -z "$SRC" ]]; then SRC="$1"; shift; else echo "error: unexpected arg: $1" >&2; exit 1; fi ;;
esac
done
if [[ -z "$SRC" ]]; then usage >&2; exit 1; fi
if [[ ! -d "$SRC" ]]; then
echo "error: not a directory: $SRC" >&2; exit 2
fi
if ! command -v mempalace >/dev/null 2>&1; then
echo "error: mempalace CLI not found in PATH" >&2; exit 3
fi
SRC="$(cd "$SRC" && pwd)"
# Determine wing name with the following precedence:
# 1. explicit --wing flag (user override)
# 2. `wing:` value in $SRC/mempalace.yaml (respect existing project config)
# 3. sanitized source directory basename (hyphens → underscores, matching
# mempalace's convention for implicit wing names)
if [[ -z "$WING" && -f "$SRC/mempalace.yaml" ]]; then
WING="$(awk -F': *' '/^wing:/ { gsub(/["\x27 ]/,"",$2); print $2; exit }' "$SRC/mempalace.yaml" 2>/dev/null || true)"
fi
if [[ -z "$WING" ]]; then
WING="$(basename "$SRC" | tr '-' '_')"
fi
# ── Build staging directory ──────────────────────────────────────────
# Use a deterministic, per-wing cache path so re-runs produce the same
# source_file paths the miner saw last time. This is critical: mempalace
# dedup keys on source_file + source_mtime, so a mktemp path would cause
# every run to re-file the entire wing.
CACHE_ROOT="${XDG_CACHE_HOME:-$HOME/.cache}/mempalace-docs"
STAGE="$CACHE_ROOT/$WING"
mkdir -p "$CACHE_ROOT"
rm -rf "$STAGE"
mkdir -p "$STAGE"
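# Clean up only the per-wing stage on exit; leave $CACHE_ROOT itself
# alone in case other wings are staging concurrently. INT/TERM exit
# explicitly so the script stops (the EXIT trap then does the cleanup)
# instead of resuming after the handler returns.
trap 'rm -rf "$STAGE"' EXIT
trap 'exit 130' INT
trap 'exit 143' TERM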
# Build find expression
find_cmd=(find "$SRC" -type f)
# Prune unwanted dirs
for d in "${SKIP_DIRS[@]}"; do
find_cmd+=('!' -path "*/$d/*" '!' -path "*/$d")
done
# Include only matching names
find_cmd+=('(' -false)
for g in "${INCLUDE_GLOBS[@]}"; do
find_cmd+=('-o' '-name' "$g")
done
find_cmd+=(')')
# Gather matches, then filter skip_files
mapfile -t matches < <("${find_cmd[@]}")
filtered=()
for f in "${matches[@]}"; do
base="$(basename "$f")"
skip=0
for sf in "${SKIP_FILES[@]}"; do
if [[ "$base" == "$sf" ]]; then skip=1; break; fi
done
[[ $skip -eq 0 ]] && filtered+=("$f")
done
count="${#filtered[@]}"
if [[ $count -eq 0 ]]; then
echo "no matching files found in $SRC" >&2
exit 0
fi
if [[ $DRY_RUN -eq 1 ]]; then
echo "Would mine $count files into wing '$WING':"
printf ' %s\n' "${filtered[@]}" | sed "s#^ $SRC/# #"
exit 0
fi
# Copy into staging, preserving mtime (critical for mempalace dedup —
# the miner compares stored mtime against the staged copy's mtime).
for f in "${filtered[@]}"; do
# Quote $SRC inside the expansion so glob characters in the path are literal.
rel="${f#"$SRC"/}"
dest="$STAGE/$rel"
mkdir -p "$(dirname "$dest")"
cp -p "$f" "$dest"
done
# Purge any drawers in this wing that came from the original source
# directory. The miner records source_file = absolute path from the
# staging dir; this differs from a prior `mempalace mine <source>` run,
# so without this purge the wing would accumulate duplicates every time
# we switch between upstream `mempalace mine` and this wrapper.
# We only purge source_file paths matching $SRC/*, leaving other wings
# and other sources alone.
python3 - "$WING" "$SRC" <<'PY'
import sqlite3, sys, os
wing, src = sys.argv[1], sys.argv[2].rstrip("/")
db_path = os.path.expanduser("~/.mempalace/palace/chroma.sqlite3")
if not os.path.exists(db_path):
    sys.exit(0)
db = sqlite3.connect(db_path)
cur = db.cursor()
# Find embedding ids in target wing whose source_file is under $SRC/
q = """
SELECT DISTINCT w.id
FROM embedding_metadata w
JOIN embedding_metadata s ON w.id = s.id AND s.key = 'source_file'
WHERE w.key = 'wing'
AND w.string_value = ?
AND (s.string_value LIKE ? OR s.string_value LIKE ?)
"""
pats = (f"{src}/%", f"{src}")
ids = [r[0] for r in cur.execute(q, (wing, pats[0], pats[1]))]
if ids:
    ph = ",".join("?" * len(ids))
    for tbl in ("embedding_metadata", "embeddings"):
        try:
            cur.execute(f"DELETE FROM {tbl} WHERE id IN ({ph})", ids)
        except sqlite3.OperationalError:
            pass
    db.commit()
    print(f" purged {len(ids)} pre-existing drawers for {src} from wing '{wing}'")
db.close()
PY
# Write mempalace.yaml into staging dir so the miner uses the right wing
cat > "$STAGE/mempalace.yaml" <<EOF
wing: $WING
rooms:
  - name: general
    description: Docs, config, and scripts from $WING
    keywords: [general]
EOF
echo "Staging $count files into wing '$WING'..."
# ── Run the mine ─────────────────────────────────────────────────────
if ! mempalace mine "$STAGE" --agent "$AGENT" --wing "$WING"; then
echo "error: mempalace mine failed" >&2
exit 4
fi
# ── Repair index ─────────────────────────────────────────────────────
if [[ $NO_REPAIR -eq 0 ]]; then
echo ""
echo "Rebuilding HNSW index..."
mempalace repair --yes
fi
echo ""
echo "Done. Wing '$WING' is ready. Remember to reconnect any live MCP sessions."
+349
@@ -0,0 +1,349 @@
#!/usr/bin/env bash
# mempalace-session — mine opencode session history into MemPalace
#
# Opencode persists every session (verbatim user/assistant turns + tool calls)
# in a local SQLite DB at ~/.local/share/opencode/opencode.db. There is
# currently no opencode session-stopping hook upstream, so the diary-based
# auto-save is best-effort; this wrapper closes the gap by mining the SQLite
# directly.
#
# Strategy:
# 1. Read opencode.db and export each qualifying session to a Claude Code
# JSONL file (format the mempalace normalizer already understands).
# 2. Stage exports under ~/.cache/mempalace-session/<wing>/.
# 3. Run `mempalace mine --mode convos` against the staging dir.
#
# Dedup: mempalace convos mode keys on source_file (absolute staging path).
# The staging path is deterministic (per-wing under XDG_CACHE_HOME) so re-runs
# are idempotent as long as session content hasn't changed.
#
# Session filter: sessions with fewer than --min-messages messages (default 3)
# are skipped to avoid filing throwaway /exit'd sessions.
#
# Usage:
# mempalace-session
# mempalace-session --wing <name>
# mempalace-session --session <id>
# mempalace-session --since 2026-04-01
# mempalace-session --min-messages 6
# mempalace-session --dry-run
# mempalace-session --help
#
# Exit codes:
# 0 success
# 1 usage / argument error
# 2 opencode.db missing or unreadable
# 3 mempalace CLI not installed
# 4 mine failed
#
# Dependencies: bash, python3 (stdlib sqlite3), mempalace (v3.3.3+)
set -euo pipefail
# ── Defaults ─────────────────────────────────────────────────────────
AGENT="${USER:-mempalace}"
WING="wing_conversations"
SESSION_ID=""
SINCE=""
MIN_MESSAGES=3
DRY_RUN=0
NO_REPAIR=0
OPENCODE_DB="${OPENCODE_DB:-$HOME/.local/share/opencode/opencode.db}"
# ── Usage ────────────────────────────────────────────────────────────
usage() {
cat <<'EOF'
mempalace-session — mine opencode session history into MemPalace
Usage:
mempalace-session [options]
Options:
--wing <name> Target wing (default: wing_conversations)
--session <id> Export one session only (default: all qualifying)
--since <YYYY-MM-DD> Only sessions with time_updated on/after this date
--min-messages <N> Skip sessions with fewer than N messages (default: 3)
--agent <name> Agent name recorded on drawers (default: $USER)
--db <path> Path to opencode.db (default: $OPENCODE_DB or
~/.local/share/opencode/opencode.db)
--dry-run Export + list; do not mine into palace
--no-repair Skip `mempalace repair` after mining
-h, --help Show this help
What gets mined:
- Each qualifying session → one Claude Code JSONL file
- Staged under ~/.cache/mempalace-session/<wing>/
- Filed via `mempalace mine --mode convos`
Transcript shape per session:
- Synthetic header as first user turn:
[session: <title> | <directory> | <YYYY-MM-DD>]
- User/assistant messages extracted from message.data + part.data
- Tool calls → Claude Code `tool_use` blocks
- Tool outputs → `tool_result` blocks (folded into the assistant turn by the
mempalace normalizer)
- `step-start` / `step-finish` parts are dropped (noise)
- `reasoning` parts prefixed with `[reasoning]` and kept as text
Dedup:
- source_file = absolute staging path (deterministic per session ID)
- Re-runs skip unchanged sessions. To force re-mining, delete the staging
dir: rm -rf ~/.cache/mempalace-session/<wing>/
Rationale:
Opencode lacks a session-stopping hook (upstream PRs #16598, #16769 still
open). Until that lands + mempalace hooks_cli.py gains an opencode harness,
this wrapper is how we get automatic session capture.
EOF
}
# ── Parse args ───────────────────────────────────────────────────────
while [[ $# -gt 0 ]]; do
case "$1" in
-h|--help) usage; exit 0 ;;
--wing) WING="${2:-}"; shift 2 ;;
--session) SESSION_ID="${2:-}"; shift 2 ;;
--since) SINCE="${2:-}"; shift 2 ;;
--min-messages) MIN_MESSAGES="${2:-}"; shift 2 ;;
--agent) AGENT="${2:-}"; shift 2 ;;
--db) OPENCODE_DB="${2:-}"; shift 2 ;;
--dry-run) DRY_RUN=1; shift ;;
--no-repair) NO_REPAIR=1; shift ;;
--) shift; break ;;
-*) echo "error: unknown option: $1" >&2; usage >&2; exit 1 ;;
*) echo "error: unexpected arg: $1" >&2; exit 1 ;;
esac
done
# ── Preflight ────────────────────────────────────────────────────────
if [[ ! -f "$OPENCODE_DB" ]]; then
echo "error: opencode.db not found at $OPENCODE_DB" >&2
echo " override with --db <path> or OPENCODE_DB env var" >&2
exit 2
fi
if ! command -v mempalace >/dev/null 2>&1; then
echo "error: mempalace CLI not found in PATH" >&2
exit 3
fi
if ! [[ "$MIN_MESSAGES" =~ ^[0-9]+$ ]]; then
echo "error: --min-messages must be an integer" >&2
exit 1
fi
# ── Staging dir ──────────────────────────────────────────────────────
# Deterministic per-wing path so source_file dedup works across re-runs.
CACHE_ROOT="${XDG_CACHE_HOME:-$HOME/.cache}/mempalace-session"
STAGE="$CACHE_ROOT/$WING"
mkdir -p "$STAGE"
# ── Export sessions (Python heredoc) ────────────────────────────────
# Writes one JSONL file per qualifying session into $STAGE.
# Prints: EXPORTED <count> on stdout, plus per-session lines.
export_count=$(python3 - "$OPENCODE_DB" "$STAGE" "$SESSION_ID" "$SINCE" "$MIN_MESSAGES" <<'PY'
import sqlite3, json, sys, os
from datetime import datetime, timezone
from pathlib import Path
db_path, stage, session_filter, since, min_messages = sys.argv[1:6]
min_messages = int(min_messages)
stage = Path(stage)
# Convert --since YYYY-MM-DD to epoch ms (opencode uses ms timestamps)
since_ms = None
if since:
    try:
        since_ms = int(datetime.strptime(since, "%Y-%m-%d").replace(tzinfo=timezone.utc).timestamp() * 1000)
    except ValueError:
        print(f"error: --since must be YYYY-MM-DD, got {since!r}", file=sys.stderr)
        sys.exit(1)
conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
conn.row_factory = sqlite3.Row
cur = conn.cursor()
# Select sessions
q = "SELECT * FROM session WHERE 1=1"
params = []
if session_filter:
    q += " AND id = ?"
    params.append(session_filter)
if since_ms is not None:
    q += " AND time_updated >= ?"
    params.append(since_ms)
q += " ORDER BY time_updated"
cur.execute(q, params)
sessions = [dict(r) for r in cur.fetchall()]
if not sessions:
    print("EXPORTED 0")
    sys.exit(0)
# Prefetch messages + parts for qualifying sessions
exported = 0
skipped_short = 0
for sess in sessions:
    sid = sess["id"]
    cur.execute("SELECT COUNT(*) FROM message WHERE session_id=?", (sid,))
    msg_count = cur.fetchone()[0]
    if msg_count < min_messages:
        skipped_short += 1
        continue
    cur.execute(
        "SELECT * FROM message WHERE session_id=? ORDER BY time_created", (sid,)
    )
    messages = [dict(r) for r in cur.fetchall()]
    cur.execute(
        "SELECT * FROM part WHERE session_id=? ORDER BY time_created", (sid,)
    )
    parts_by_msg: dict[str, list] = {}
    for r in cur.fetchall():
        d = dict(r)
        parts_by_msg.setdefault(d["message_id"], []).append(d)
    # Build JSONL lines
    out_lines: list[dict] = []
    # Synthetic header as first user turn — injects title/directory/date
    # into the transcript so semantic search can find sessions by topic,
    # not just by session-id filename.
    title = sess.get("title") or "(untitled)"
    directory = sess.get("directory") or "?"
    date_str = datetime.fromtimestamp(
        sess["time_created"] / 1000, tz=timezone.utc
    ).strftime("%Y-%m-%d")
    header = f"[session: {title} | {directory} | {date_str}]"
    out_lines.append({"type": "user", "message": {"content": header}})
    for msg in messages:
        mdata = json.loads(msg["data"])
        role = mdata.get("role")
        if role not in ("user", "assistant"):
            continue
        parts = parts_by_msg.get(msg["id"], [])
        blocks = []
        tool_results = []
        for p in parts:
            try:
                pd = json.loads(p["data"])
            except json.JSONDecodeError:
                continue
            t = pd.get("type")
            if t == "text":
                txt = (pd.get("text") or "").strip()
                if txt:
                    blocks.append({"type": "text", "text": txt})
            elif t == "tool":
                # opencode tool part → tool_use block + deferred tool_result
                state = pd.get("state") or {}
                tool_name = pd.get("tool") or "Unknown"
                call_id = pd.get("callID") or p["id"]
                tool_input = state.get("input") or {}
                tool_output = state.get("output")
                blocks.append({
                    "type": "tool_use",
                    "id": call_id,
                    "name": tool_name,
                    "input": tool_input,
                })
                if tool_output:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": call_id,
                        "content": str(tool_output),
                    })
            elif t in ("step-start", "step-finish"):
                continue
            elif t == "reasoning":
                rtext = (pd.get("text") or "").strip()
                if rtext:
                    blocks.append({"type": "text", "text": f"[reasoning] {rtext}"})
        if not blocks:
            continue
        # Simplify single-text-block messages to a bare string (more tolerant
        # of normalizer edge cases; mempalace accepts either shape).
        if len(blocks) == 1 and blocks[0]["type"] == "text":
            content = blocks[0]["text"]
        else:
            content = blocks
        out_lines.append({
            "type": role,
            "message": {"content": content},
        })
        # For assistants, follow up with a synthetic human tool_result message
        # per tool call. The mempalace normalizer's `is_tool_only` branch
        # folds these back into the assistant turn (see normalize.py:211-214).
        if role == "assistant" and tool_results:
            out_lines.append({
                "type": "human",
                "message": {"content": tool_results},
            })
    # Must have at least 2 turns for the normalizer to accept the file
    if len(out_lines) < 2:
        skipped_short += 1
        continue
    slug = sess.get("slug") or "session"
    out_path = stage / f"{slug}_{sid}.jsonl"
    with open(out_path, "w", encoding="utf-8") as f:
        for obj in out_lines:
            f.write(json.dumps(obj, ensure_ascii=False) + "\n")
    # Set mtime to session time_updated so dedup sees a stable value.
    try:
        ts = sess["time_updated"] / 1000
        os.utime(out_path, (ts, ts))
    except Exception:
        pass
    exported += 1
    print(f" {out_path.name} ({msg_count} msgs, {len(out_lines)} turns)",
          file=sys.stderr)
print(f"EXPORTED {exported}")
if skipped_short:
    print(f"SKIPPED_SHORT {skipped_short}", file=sys.stderr)
PY
)
# Parse count from stdout
count="${export_count##*EXPORTED }"
count="${count%%[!0-9]*}"
count="${count:-0}"
if [[ "$count" -eq 0 ]]; then
echo "no sessions qualified for export"
exit 0
fi
echo ""
echo "Exported $count session(s) to $STAGE"
if [[ $DRY_RUN -eq 1 ]]; then
echo "--dry-run: skipping mine step"
exit 0
fi
# ── Run the mine ─────────────────────────────────────────────────────
echo ""
echo "Mining into wing '$WING'..."
if ! mempalace mine "$STAGE" --mode convos --wing "$WING" --agent "$AGENT"; then
echo "error: mempalace mine failed" >&2
exit 4
fi
# ── Repair index ─────────────────────────────────────────────────────
if [[ $NO_REPAIR -eq 0 ]]; then
echo ""
echo "Rebuilding HNSW index..."
mempalace repair --yes
fi
echo ""
echo "Done. Wing '$WING' updated. Remember to reconnect any live MCP sessions."
+184
@@ -0,0 +1,184 @@
#!/usr/bin/env bash
# install.sh — install mempalace-toolkit executables + companion agent skill
#
# Idempotent. Safe to re-run after container recreate.
set -euo pipefail
# ── locate self ──────────────────────────────────────
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# ── targets ──────────────────────────────────────────
BIN_SRC="${SCRIPT_DIR}/bin"
BIN_DEST="${HOME}/.local/bin"
SKILL_SRC="${SCRIPT_DIR}/SKILL.md"
SKILL_DEST_DIR="${HOME}/.agents/skills/opencode-mempalace-bridge"
SKILL_DEST="${SKILL_DEST_DIR}/SKILL.md"
# ── args ─────────────────────────────────────────────
ACTION="install"
ASSUME_YES="no"
while [[ $# -gt 0 ]]; do
case "$1" in
--uninstall) ACTION="uninstall"; shift ;;
-y|--yes) ASSUME_YES="yes"; shift ;;
-h|--help)
cat <<EOF
install.sh — install mempalace-toolkit
Usage:
./install.sh # install (interactive confirm)
./install.sh --yes # install without prompt
./install.sh --uninstall # remove symlinks
What install does:
- Symlinks each executable in bin/ into ~/.local/bin/
- Symlinks SKILL.md into ~/.agents/skills/opencode-mempalace-bridge/SKILL.md
(auto-discovered by opencode; run agents-sync from cli_utils to also
reach Claude Code and Kiro)
What uninstall does:
- Removes symlinks in ~/.local/bin/ that point into this repo
- Removes the skill symlink if it points into this repo
EOF
exit 0 ;;
*) echo "Unknown flag: $1" >&2; exit 2 ;;
esac
done
# ── helpers ──────────────────────────────────────────
ok() { printf ' \e[32m✓\e[0m %s\n' "$*"; }
note() { printf '==> %s\n' "$*"; }
warn() { printf ' \e[33m!\e[0m %s\n' "$*" >&2; }
err() { printf ' \e[31m✗\e[0m %s\n' "$*" >&2; }
confirm() {
[[ "$ASSUME_YES" == "yes" ]] && return 0
read -r -p "Proceed? [y/N] " ans
[[ "$ans" =~ ^[Yy]$ ]]
}
link_if_into_repo() {
# Return 0 if $1 is a symlink pointing into $SCRIPT_DIR
local target
[[ -L "$1" ]] || return 1
target=$(readlink -f "$1")
[[ "$target" == "$SCRIPT_DIR"/* ]]
}
# ── install ──────────────────────────────────────────
install_bin() {
mkdir -p "$BIN_DEST"
note "Symlinking bin/ executables into $BIN_DEST"
local count=0
for src in "$BIN_SRC"/*; do
[[ -x "$src" && -f "$src" ]] || continue
local name; name=$(basename "$src")
local dest="$BIN_DEST/$name"
if [[ -e "$dest" || -L "$dest" ]]; then
if link_if_into_repo "$dest"; then
ok "Already linked: $name"
count=$((count+1))
continue
else
warn "Skipping $name: $dest exists and is not our symlink"
continue
fi
fi
ln -s "$src" "$dest"
ok "Linked $name → $src"
count=$((count+1))
done
echo
ok "Installed $count executable(s)"
}
install_skill() {
note "Linking companion agent skill"
mkdir -p "$SKILL_DEST_DIR"
if [[ -e "$SKILL_DEST" || -L "$SKILL_DEST" ]]; then
if link_if_into_repo "$SKILL_DEST"; then
ok "Skill already linked"
return 0
else
warn "Skipping skill: $SKILL_DEST exists and is not our symlink"
return 0
fi
fi
ln -s "$SKILL_SRC" "$SKILL_DEST"
ok "Linked SKILL.md → $SKILL_SRC"
}
check_path() {
case ":$PATH:" in
*":$BIN_DEST:"*) : ;;
*) warn "$BIN_DEST is not on \$PATH. Add to your shell rc:";
printf ' export PATH="%s:$PATH"\n' "\$HOME/.local/bin" ;;
esac
}
do_install() {
echo
echo "mempalace-toolkit installer"
echo "Repository: $SCRIPT_DIR"
echo
echo "==> Installation plan:"
echo " Symlink executables in bin/ into $BIN_DEST"
echo " Symlink SKILL.md into $SKILL_DEST"
echo
confirm || { echo "Aborted."; exit 0; }
echo
install_bin
echo
install_skill
echo
check_path
echo
ok "Done."
echo
echo "Next: ./bin/mempalace-session --dry-run"
echo " or: ./bin/mempalace-docs /path/to/project --dry-run"
}
# ── uninstall ────────────────────────────────────────
do_uninstall() {
echo
echo "mempalace-toolkit uninstaller"
echo "Repository: $SCRIPT_DIR"
echo
confirm || { echo "Aborted."; exit 0; }
echo
note "Removing executable symlinks from $BIN_DEST"
local removed=0
for src in "$BIN_SRC"/*; do
[[ -x "$src" && -f "$src" ]] || continue
local name; name=$(basename "$src")
local dest="$BIN_DEST/$name"
if link_if_into_repo "$dest"; then
rm "$dest"
ok "Removed $name"
removed=$((removed+1))
fi
done
ok "Removed $removed executable symlink(s)"
echo
note "Removing skill symlink"
if link_if_into_repo "$SKILL_DEST"; then
rm "$SKILL_DEST"
ok "Removed skill symlink"
else
ok "No skill symlink to remove"
fi
echo
ok "Done."
}
case "$ACTION" in
install) do_install ;;
uninstall) do_uninstall ;;
esac