Joakim Persson 2f703a8ebc Docs: explain diary vs session mine (why keep both)

Automated session mining could plausibly lead a user (or a future
agent) to conclude that writing diary entries is redundant — mining
captures every turn, so why also write a compressed summary at
wind-down? That conclusion is wrong, and it's worth explaining why
in the docs so both disciplines survive.

ARCHITECTURE.md §5 gets a new subsection 'Diary vs session mine:
why keep both?' that presents this as a first-class concept:

- Comparison table — content, granularity, compression, authorship,
  signal density, retrieval pattern, and the question each answers.
- The defining property of a diary entry: editorial judgment by the
  author. Captures meta-observations that were never said aloud
  during the session (lessons, patterns, pending items, aggregate
  counts). Mining raw turns can never surface these because the
  words don't exist verbatim.
- Three practical scenarios where the distinction bites: wake-up
  token economics, 'what did we decide' vs 'what did we say',
  redundancy covering each other's failure modes.
- Practical implications: don't skip either habit, let them
  specialize (diary = release notes; mine = git log).

README.md gets a brief teaser in the 'First mine' area with a link
to the canonical ARCHITECTURE.md section — enough for a skim reader
to decide they want to keep writing diaries, and for a deep reader
to know where to go for the full explanation.

SKILL.md replaces the three-line 'Relationship to the mempalace
skill' note with a compact version of the comparison table and a
direct call-out of the 'session mining means I don't need diaries'
misconception agents fall into. Points agents at ARCHITECTURE.md
§5 for the full treatment when users ask the question.

Cross-references verified: anchor slug for the new section is
#diary-vs-session-mine-why-keep-both (standard slug rules: colon,
spaces, punctuation removed/hyphenated). Both linking docs use the
matching fragment.

2026-04-30 08:56:20 +00:00


name	description
opencode-mempalace-bridge	Set up the producer side of MemPalace — feed opencode session history and project docs into the palace via the wrappers in the mempalace-toolkit repo. Use when provisioning a new machine, when the user asks how palace feeding works, when opencode sessions aren't showing up in searches, or when a project needs docs-only mining. Pairs with the `mempalace` skill (consumer side).

Opencode ↔ MemPalace Bridge (producer side)

Overview

The mempalace skill covers using the palace (search, diary, KG). This skill covers feeding it — specifically, how to wire opencode session history and project docs into the palace on a new machine or after a container recreate.

Authoritative source: /workspace/mempalace-toolkit/ARCHITECTURE.md (also at the root of the mempalace-toolkit repo on gitea). When in doubt, read that file — it's the canonical spec. This skill is the short-form checklist.

Core idea: two thin wrappers in mempalace-toolkit/bin/ close gaps in the stock mempalace CLI:

Gap	Wrapper
`mempalace mine` floods the palace with source code we don't want	`mempalace-docs`
`mempalace mine --mode convos` can't read opencode's SQLite DB	`mempalace-session`

Both follow the same stage-to-cache-then-mine idiom — they curate input into ~/.cache/…/<wing>/, then delegate to mempalace mine.
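The staging half of that idiom can be sketched roughly as follows. This is an illustrative shell function, not the wrappers' actual code: `stage_docs` and its two arguments are hypothetical names, and the file patterns (md, yaml, Dockerfile) are borrowed from the wing table later in this skill.

```shell
# Hypothetical sketch of "stage to cache, then mine"; not the real wrapper.
# stage_docs SRC_DIR STAGE_DIR
#   Copies only doc-like files into STAGE_DIR, keeping relative paths.
stage_docs() {
  src=$1
  stage=$2
  mkdir -p "$stage"
  # List doc-like files relative to the source root (assumed patterns).
  ( cd "$src" && find . -type f \
      \( -name '*.md' -o -name '*.yaml' -o -name 'Dockerfile' \) -print ) |
  while IFS= read -r rel; do
    mkdir -p "$stage/$(dirname "$rel")"
    cp -p "$src/$rel" "$stage/$rel"   # -p preserves the modification time
  done
}
```

A real run would then hand the staging directory to `mempalace mine`; the exact arguments the wrappers pass are not shown here.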

When to Load This Skill

User asks "how does the palace get fed?" or mentions setting up mempalace on a new machine.
Opencode conversations are missing from palace searches (wing_conversations is empty or stale).
A project needs to be mined but you want docs only, no source code.
User asks about mempalace-docs or mempalace-session.
After a container recreate on a devbox — the wrappers need reinstall.
Planning to retire either wrapper once upstream PRs merge (see §6 of ARCHITECTURE.md).

Setup Recipe (new machine)

Prerequisites: opencode installed with an active DB at ~/.local/share/opencode/opencode.db, mempalace CLI v3.3.3+, Python 3 (stdlib sqlite3 only — no extra deps).

If mempalace itself isn't installed yet, suggest uv tool install mempalace (not pip install mempalace — it fights PEP 668 on modern distros and leaks deps into system site-packages). For a system-wide install on a container or shared box, set UV_TOOL_DIR=/opt/uv-tools + UV_TOOL_BIN_DIR=/usr/local/bin before uv tool install, and ship an MCP wrapper on PATH that exec's the venv's Python — otherwise MCP clients fail silently with ModuleNotFoundError. Full recipe in mempalace-toolkit/README.md#installing-mempalace-itself-prerequisite.

# 1. Clone mempalace-toolkit (holds the two wrappers in bin/)
git clone ssh://git@gitea.jordbo.se:2222/joakimp/mempalace-toolkit.git ~/mempalace-toolkit
cd ~/mempalace-toolkit

# 2. Install — symlinks bin/* into ~/.local/bin, adds loader to rc file
./install.sh

# 3. Verify ~/.local/bin is on PATH
which mempalace-session mempalace-docs

# 4. Mine opencode session history into wing_conversations
#    (No global init needed — the palace is created lazily on first write.
#     `mempalace init <dir>` is per-project and optional.)
mempalace-session --dry-run              # preview: which sessions qualify?
mempalace-session                        # do it (~20 min per 60 sessions)

# 5. Mine project docs per project (docs only — no source code)
#    Optional: `mempalace init --yes <dir>` first to customize wing/entities
mempalace-docs /workspace/my_project --dry-run
mempalace-docs /workspace/my_project

# 6. If a long-lived MCP session is open, reconnect it
#    (from inside the MCP client): mempalace_reconnect

Containerized (devbox) specifics

Named Docker volumes preserve state across container recreate:

devbox-palace → ~/.mempalace/palace
devbox-data → ~/.local/share/opencode

Bind mount /workspace/mempalace-toolkit from the host — code survives recreate, syncs via gitea.

After container recreate: ~/.local/bin is ephemeral. Just re-run ./install.sh (idempotent) — everything else already persists.
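A post-recreate check could look like this sketch. `wrappers_missing` is a hypothetical helper name; it relies only on `command -v`, so it assumes nothing about the wrappers beyond their names.

```shell
# Hypothetical helper: print the wrapper names that are not on PATH.
# Prints an empty line when both wrappers are found.
wrappers_missing() {
  missing=""
  for tool in mempalace-session mempalace-docs; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s\n' "${missing# }"   # drop the leading space, if any
}
```

If it prints anything, re-run `./install.sh` from the toolkit checkout.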

Key Operational Rules

Always dry-run first on a cold system

mempalace-session --dry-run       # shows qualifying sessions
mempalace-docs <dir> --dry-run    # shows files that would be mined

A docs-heavy repo should produce roughly 5–10 drawers per file. An average above 15 drawers per file usually means source code leaked in; investigate.
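As a quick sanity check, the ratio can be computed by hand. The sketch below assumes you already have the two counts (total drawers and files mined); `drawer_ratio_verdict` is a hypothetical helper, and only the 15-per-file threshold comes from the rule above.

```shell
# Hypothetical helper: drawer_ratio_verdict DRAWERS FILES
# Prints the average drawers per file and a verdict based on the >15 rule.
drawer_ratio_verdict() {
  awk -v d="$1" -v f="$2" 'BEGIN {
    if (f <= 0) { print "no-files"; exit }
    r = d / f
    if (r > 15) v = "investigate"; else v = "ok"
    printf "%.1f %s\n", r, v
  }'
}
```

For example, `drawer_ratio_verdict 200 10` reports an average of 20 drawers per file and flags it for investigation.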

Dedup is free — re-running is safe

mempalace-docs: dedup keyed on source_file path + mtime. Unchanged files skipped.
mempalace-session: dedup keyed on source_file path alone (no mtime check for convos). Staging filenames are deterministic per session (<slug>_<id>.jsonl), so re-runs skip already-filed sessions.

Second run immediately after first → 0 new drawers, only the post-mine repair step runs (~5 min on 5k drawers).
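Why path-only dedup makes re-runs safe can be illustrated with a toy ledger. This is a sketch under stated assumptions: the plain-text ledger file stands in for the palace's record of `source_file` values, and `staging_name` and `file_once` are hypothetical names, not functions from the toolkit.

```shell
# Deterministic staging name, following the <slug>_<id>.jsonl pattern.
staging_name() {
  printf '%s_%s.jsonl\n' "$1" "$2"
}

# file_once PATH LEDGER
#   Records PATH in LEDGER unless an identical line is already present.
file_once() {
  if grep -Fxq -- "$1" "$2" 2>/dev/null; then
    echo skipped
  else
    printf '%s\n' "$1" >> "$2"
    echo filed
  fi
}
```

Because the name depends only on the slug and the session id, a second export produces the same path, and the second `file_once` call for that path reports `skipped`.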

mempalace-session --dry-run is dedup-aware. Each session listed is tagged [NEW] (would be filed) or [SKIP] (already in the palace), and the summary reports the split:

Exported 62 session(s) to ~/.cache/...
  0 new   → will be filed on mine
  62 already filed → will be skipped (dedup by source_file)

So when a user asks "will it mine the same sessions again?", point them at mempalace-session --dry-run and read the summary line. If the "new" count is 0, nothing will be re-filed. The classification check is best-effort (it falls back to "everything is new" if the palace is unreachable); the real mine step delegates to mempalace mine --mode convos, which is always the authoritative dedup source.
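For scripting, the "new" count can be pulled out of that summary. The sketch below assumes the summary keeps the shape shown in the sample output above (a count followed by the word `new`), which may change between toolkit versions; `count_new_sessions` is a hypothetical helper name.

```shell
# Hypothetical helper: read dry-run output on stdin, print the "new" count.
# Exits non-zero if no "<N> new" line is found.
count_new_sessions() {
  awk '$2 == "new" { print $1; found = 1; exit }
       END { if (!found) exit 1 }'
}
```

Typical use would be `mempalace-session --dry-run | count_new_sessions`, skipping the real mine when the result is 0.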

Incremental catch-up

mempalace-session --since 2026-04-20          # only recent sessions
mempalace-session --session ses_abc123        # one specific session

Force re-mine

rm -rf ~/.cache/mempalace-session/<wing>/     # nukes staging dir
mempalace-session                              # stages + mines fresh

Staging is ephemeral by design; the palace is the source of truth.

Operational Routine (when to invoke)

Until upstream opencode session hooks land, mempalace-session is the entire mechanism that gets opencode conversations into the palace. If the user's opencode history isn't showing up in mempalace_search, the most likely cause is "mempalace-session hasn't been run recently".

Agent-level triggers

Suggest invoking the tool when any of these apply:

User asks "why can't you find our conversation from earlier?" / "you should remember this from last week" → the palace may not have the opencode session yet. Check mempalace_list_wings for wing_conversations; if the last drawer there is older than the referenced conversation, run mempalace-session.
User mentions an upcoming container recreate / system migration / OS reinstall → suggest a full mine first as a backup checkpoint. The opencode DB normally survives via named volumes, but an explicit mine is cheap insurance.
User has just set up a new machine and asks about mempalace → part of the recipe is the initial backfill. Run mempalace-session --dry-run first to show scope, then the real mine.
User completed a substantive session they want preserved past /exit → offer a targeted mempalace-session --session <id> right then.
User asks "how do I keep this up to date?" → point at contrib/ (systemd timer or cron recipes).

Cadence guidance

Situation	Suggested cadence
Active devbox, daily opencode use	Weekly automated (systemd timer or cron)
Occasional opencode user	Monthly manual or weekly automated
Fresh machine / first setup	One-shot full backfill, then schedule
"I'm about to rebuild the container"	Run now, as a checkpoint
Automated daily mines	Pass `--no-repair` + schedule weekly repair separately

Don't suggest running more often than daily — the post-mine HNSW repair (~5 min on 5k drawers) dominates cost, and session growth is slow enough that daily is already overkill.
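The "daily mine, weekly repair" split from the table could be scripted along these lines. This is a sketch, not one of the recipes in contrib/: `nightly_mine` and the `RUN` preview variable are hypothetical, while `--no-repair` and `mempalace repair --yes` are the flags named elsewhere in this skill. Monday is chosen to match the default weekly schedule.

```shell
# Hypothetical scheduler body: nightly_mine [DAY_OF_WEEK]
#   Mines without the repair step every day, repairs on Monday only.
#   Set RUN=echo to preview the commands instead of running them.
nightly_mine() {
  dow=${1:-$(date +%u)}      # 1 = Monday ... 7 = Sunday
  run=${RUN:-}
  $run mempalace-session --no-repair || return
  if [ "$dow" -eq 1 ]; then
    $run mempalace repair --yes
  fi
}
```

Calling it from a single daily cron or timer entry keeps the expensive repair off six of the seven runs.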

Relationship to the `mempalace` skill — and why diaries still matter

The consumer-side mempalace skill covers in-session habits (wake-up search, wind-down diary write). mempalace-session is between-session maintenance — automated, raw-turn capture. The two are complementary, not alternatives:

Aspect	`mempalace-session` (this tool)	Agent diary (`mempalace` skill)
Content	Every turn verbatim	Curated AAAK summary
Written by	Extraction from `opencode.db`	The agent, at wind-down
Answers	"What did we say?"	"What did we decide / learn?"
Retrieval	Semantic search	Recency scan

Common misconception agents fall into: "session mining runs weekly, so I don't need to write diary entries." Wrong. Diary entries capture meta-observations that were never said during the session — "lesson learned", "pattern confirmed", "user caught a bug before ship", "pending next time". Mining raw turns can never surface those because the words don't exist verbatim; they're the agent's editorial judgment at wind-down.

If a user asks whether diaries are redundant given automated mining, point them at ARCHITECTURE.md §5 → "Diary vs session mine: why keep both?" for the full treatment. Short answer: release notes vs. git log — a repo keeps both, and so should the palace.

Both systems cover each other's failure modes too: a missed diary (agent /exit'd without writing) → session mine catches the raw content; a missed mine (cron behind schedule, new machine) → diary captures the essence. Belt and braces.

Quick automation pitch

If the user has a devbox and hasn't set up automation, offer it. Pick the scheduler that matches the OS:

Linux (systemd user timer):

cd ~/mempalace-toolkit
mkdir -p ~/.config/systemd/user
cp contrib/systemd/*.{service,timer} ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now mempalace-session.timer
sudo loginctl enable-linger "$USER"   # optional, for headless boxes

macOS (launchd user agent):

cd ~/mempalace-toolkit
sed "s|USER|$USER|g" contrib/launchd/se.jordbo.mempalace-session.plist \
  > ~/Library/LaunchAgents/se.jordbo.mempalace-session.plist
mkdir -p ~/Library/Logs
launchctl bootstrap "gui/$(id -u)" ~/Library/LaunchAgents/se.jordbo.mempalace-session.plist
launchctl enable "gui/$(id -u)/se.jordbo.mempalace-session"

BSD or systemd-less Linux (cron): see contrib/cron/.

Full install/verify/uninstall recipes for all three are in contrib/README.md. All three default to weekly runs on Monday 03:00 local time.

Failure Modes & Fixes

Symptom	Cause	Fix
`mempalace-session: command not found`	`~/.local/bin` wiped (container recreate)	`cd ~/mempalace-toolkit && ./install.sh --yes`
Sessions missing from palace	Fewer messages than `--min-messages` (default 3)	Lower threshold or `--session <id>` explicitly
"Error finding id" on search after mining	Stale HNSW index	`mempalace repair --yes` + `mempalace_reconnect`
Drawers doubled for a project	Someone ran raw `mempalace mine` alongside wrapper, or renamed wing mid-flight	Inspect `embedding_metadata` in `chroma.sqlite3`, purge duplicates by source prefix, then `mempalace repair`
Post-mine ChromaDB search returns stale results in MCP	MCP server caches old index	Call `mempalace_reconnect` from MCP
Opencode DB not at default path	Non-standard `XDG_DATA_HOME` or opencode config	`export OPENCODE_DB=/custom/path/opencode.db` or `--db`
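For the last row, a small resolver can work out which path to export. This sketch assumes opencode places its DB under `$XDG_DATA_HOME/opencode/` when that variable is set, which the table implies but does not state outright; `resolve_opencode_db` is a hypothetical helper name.

```shell
# Hypothetical helper: print the opencode DB path to use.
# Precedence: explicit OPENCODE_DB, then XDG_DATA_HOME, then the default.
resolve_opencode_db() {
  if [ -n "${OPENCODE_DB:-}" ]; then
    printf '%s\n' "$OPENCODE_DB"
  else
    printf '%s\n' "${XDG_DATA_HOME:-$HOME/.local/share}/opencode/opencode.db"
  fi
}
```

The result can be passed along as `mempalace-session --db "$(resolve_opencode_db)" --dry-run` to confirm that the wrapper finds sessions at that location.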

What to File Under Which Wing

Content type	Wing (convention)	Room	Tool
Opencode session transcripts	`wing_conversations`	auto (keyword)	`mempalace-session`
Project docs (md, yaml, Dockerfile)	`wing_<project-name>`	auto	`mempalace-docs`
Per-agent session diaries	`wing_<agent-name>`	`diary`	`mempalace_diary_write` (from the consumer-side `mempalace` skill)
Ad-hoc verbatim facts	any	any	`mempalace_add_drawer`

Cost Profile (reference)

From a 10-day opencode corpus (140 sessions / 1491 msgs / 4656 parts):

Dry run: seconds.
Full mine: ~21 min wall / ~38 min user CPU → 2378 drawers from 62 qualifying sessions.
Dedup re-run: mine instant, repair ~5 min.

Budget ~20 minutes per 60-session batch. Scales roughly linearly with message count.

Anti-Patterns

Don't run mempalace mine directly on a project. Use mempalace-docs — otherwise source code floods the palace.
Don't try to point mempalace mine --mode convos at opencode.db directly. The convos miner reads files (txt/md/json/jsonl) only — no SQLite support. Use mempalace-session to export first.
Don't delete staging dirs unnecessarily. They're dedup anchors; deleting means a forced re-mine of everything in that wing.
Don't forget mempalace_reconnect after a mine from inside a live MCP session — otherwise search hits the stale index.
Don't mine with --min-messages 0 or 1: 78 of the 140 sessions in the reference corpus were throwaway /exit'd sessions that would flood the palace with noise. The default of 3 is sensible.

Upstream Roadmap (when to retire these wrappers)

MemPalace PR #1213 merges → mempalace-docs becomes redundant (exclude patterns in mempalace.yaml). Retire to thin shim or delete.
Opencode session-stopping hooks merge (PR #16598 et al.) AND hooks_cli.py gains opencode harness → live auto-save works; mempalace-session becomes a manual-only backfill tool (cron / historic import).
SQLite mode lands in mempalace mine --mode convos → mempalace-session loses its reason to exist entirely.

Check ARCHITECTURE.md §6 in mempalace-toolkit/ for current upstream status before doing any retirement work.
