Docs: explain diary vs session mine (why keep both)

Automated session mining could plausibly lead a user (or a future
agent) to conclude that writing diary entries is redundant — mining
captures every turn, so why also write a compressed summary at
wind-down? That conclusion is wrong, and it's worth explaining why
in the docs so both disciplines survive.

ARCHITECTURE.md §5 gets a new subsection 'Diary vs session mine:
why keep both?' that presents this as a first-class concept:

- Comparison table — content, granularity, compression, authorship,
  signal density, retrieval pattern, and the question each answers.
- The defining property of a diary entry: editorial judgment by the
  author. Captures meta-observations that were never said aloud
  during the session (lessons, patterns, pending items, aggregate
  counts). Mining raw turns can never surface these because the
  words don't exist verbatim.
- Three practical scenarios where the distinction bites: wake-up
  token economics, 'what did we decide' vs 'what did we say',
  redundancy covering each other's failure modes.
- Practical implications: don't skip either habit, let them
  specialize (diary = release notes; mine = git log).

README.md gets a brief teaser in the 'First mine' area with a link
to the canonical ARCHITECTURE.md section — enough for a skim reader
to decide they want to keep writing diaries, and for a deep reader
to know where to go for the full explanation.

SKILL.md replaces the three-line 'Relationship to the mempalace
skill' note with a compact version of the comparison table and a
direct call-out of the 'session mining means I don't need diaries'
misconception agents fall into. Points agents at ARCHITECTURE.md
§5 for the full treatment when users ask the question.

Cross-references verified: the anchor slug for the new section is
#diary-vs-session-mine-why-keep-both (standard slug rules: lowercase
the heading, drop the colon and question mark, replace spaces with
hyphens). Both linking docs use the matching fragment.
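
A minimal sketch of the slug check, assuming the renderer follows
GitHub-style heading rules. The helper name heading_slug is made up
for illustration, and real renderers add details this ignores (for
example, numeric suffixes for duplicate headings):

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate a GitHub-style anchor slug for a markdown heading."""
    text = heading.strip().lower()
    # Drop everything except word characters, spaces, and hyphens
    # (this removes the colon and the question mark).
    text = re.sub(r"[^\w\- ]", "", text)
    # Each remaining space becomes a hyphen.
    return text.replace(" ", "-")


print(heading_slug("Diary vs session mine: why keep both?"))
# prints: diary-vs-session-mine-why-keep-both
```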
Joakim Persson
2026-04-30 08:56:20 +00:00
parent 349a3a3d3d
commit 2f703a8ebc
3 changed files with 62 additions and 6 deletions
+40 -1
@@ -224,7 +224,46 @@ That makes the routine worth codifying:
| Wind-down diary write | Agent session end | Agent, during session |
| `mempalace-session` mine | Between sessions (manual or scheduled) | Operator or automation |
-The first two are live; the third is batched. They're complementary, not alternatives. A machine doing only wake-up/wind-down keeps a diary but loses the actual conversation turns. A machine doing only `mempalace-session` captures the raw turns but not the curated summaries. Do both.
+The first two are live; the third is batched. They're complementary, not alternatives. The next subsection explains why both matter.
#### Diary vs session mine: why keep both?
A reasonable question: *"if every session is mined into `wing_conversations` anyway, what's the point of the agent also writing a diary entry?"* They're not redundant. They answer different questions and cover each other's failure modes.
| | Session mine (`wing_conversations`) | Diary (`wing_<agent>`) |
|---|---|---|
| Content | Every turn verbatim — prompts, responses, tool calls, dead ends, typos | Curated summary — what was decided, discovered, left pending |
| Granularity | One session ≈ 50–200 drawers | One session ≈ 1 drawer |
| Compression | None (raw JSONL → normalized turns) | High (AAAK dialect — dots + pipes + entity codes, ~30× reduction) |
| Written by | Nothing — extracted from `opencode.db` | The agent that lived the session, at wind-down |
| Signal density | High noise (wrong turns, corrections, `/exit`'d threads) | High signal (agent's editorial judgment of what mattered) |
| Retrieval pattern | Semantic search (`mempalace_search("topic X")`) | Recency scan (`mempalace_diary_read(last_n=5)`) |
| Answers the question | *"What did we say exactly?"* | *"What did we accomplish / learn / decide?"* |
The distinguishing property of a diary entry is **editorial judgment by the author**. The diary captures things that were *never said aloud during the session* — meta-observations the agent made about the session as a whole:
- *"this pattern came up again, worth remembering"*
- *"user caught the bug before I shipped it — lesson: verify CLI examples against `--help` first"*
- *"10 commits across 3 repos today, all pushed"*
- *"healthy interruption: user stopped me before a long-running step"*
These are thoughts *about* the session, not utterances *during* it. Mining the raw turns will never surface them because the exact words were never spoken — they're the agent's reflection at wind-down.
**Three scenarios where the distinction matters in practice:**
1. **Wake-up token economics.** Reading `mempalace_diary_read(last_n=5)` returns five dense drawers, maybe 1–2k tokens total, 100% signal. Matching that orientation from the session mine would require semantic-searching for recent topics and reading chunks of raw turns: hundreds of drawers, tens of thousands of tokens, 90% noise.
2. **"What did we decide?" vs. "what did we say?"** If you ask *"when did we decide to split `mempalace-toolkit` from `cli_utils`?"* the diary gives you the crisp answer (date, trigger, rationale). The session mine gives you the actual seven-turn conversation that led up to the decision, including the turns where alternatives were considered. Both useful; different questions.
3. **Redundancy as safety.** If the agent `/exit`s without writing a diary (heuristic save missed it, no upstream hook), the session mine still catches the raw content. If `mempalace-session` hasn't run this week, the diary still captures the session's essence. The two systems cover each other's failure modes.
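
The first scenario can be put into rough numbers. The sketch below is illustrative only: the token counts and signal ratios are assumptions chosen to match the orders of magnitude described above, not measurements, and `OrientationCost` is a hypothetical helper rather than part of mempalace.

```python
from dataclasses import dataclass


@dataclass
class OrientationCost:
    """Rough cost model for one wake-up orientation pass (hypothetical)."""

    label: str
    tokens_read: int      # assumed tokens the agent must read
    signal_ratio: float   # assumed fraction of those tokens that is useful

    @property
    def useful_tokens(self) -> float:
        return self.tokens_read * self.signal_ratio

    @property
    def tokens_per_useful_token(self) -> float:
        return self.tokens_read / self.useful_tokens


# Assumed figures: five dense diary drawers vs. raw turns at about 90% noise.
diary = OrientationCost("diary, last 5 entries", tokens_read=1_500, signal_ratio=1.0)
mine = OrientationCost("session mine, raw turns", tokens_read=30_000, signal_ratio=0.10)

for cost in (diary, mine):
    print(
        f"{cost.label}: {cost.tokens_read} tokens read, "
        f"{cost.tokens_per_useful_token:.0f} read per useful token"
    )
```

Under these assumptions the mine costs about twenty times more tokens to read and ten times more per useful token, which is why the diary is the cheaper first stop at wake-up even though the mine holds more material overall.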
**Practical implications for how you work with mempalace:**
- **Don't skip diary writing** just because sessions are mined. A session without a diary entry is a session the next agent can read word-for-word but has no compressed summary of — expensive to orient against.
- **Don't skip session mining** just because agents write diaries. Diaries miss content (especially on `/exit`), and semantic search over raw turns is valuable when "what did we say exactly?" is the right question.
- **Do both, and let them specialize.** Treat the diary as your release notes (editorial, curated, recency-scanned) and the session mine as your git log (raw, searchable, complete). A repo keeps both; so should the palace.
If anything, automating session mining *increases* the value of diary entries. The agent can focus the diary on the parts mining cannot capture — meta-observations, self-critique, pattern noticing, pending work — rather than re-stating content the mine already has.
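
The way the two stores cover each other can be sketched as a diary-first lookup with the session mine as fallback. This is a sketch under stated assumptions: `orient` is a hypothetical helper, and the two callables merely stand in for `mempalace_diary_read` and `mempalace_search`, whose real signatures and return types are not shown in this document.

```python
from typing import Callable, Sequence


def orient(
    read_diary: Callable[[int], Sequence[str]],
    search_mine: Callable[[str], Sequence[str]],
    topic: str,
    last_n: int = 5,
) -> tuple[str, list[str]]:
    """Prefer recent diary entries; fall back to searching raw turns."""
    entries = list(read_diary(last_n))
    if entries:
        # Cheap path: curated summaries written at wind-down.
        return "diary", entries
    # The diary is empty (e.g. the agent exited without writing one),
    # so fall back to the noisier but complete session mine.
    return "session_mine", list(search_mine(topic))


# Stand-in stores for demonstration only (not the real mempalace tools).
def empty_diary(last_n: int) -> list[str]:
    return []


def fake_search(topic: str) -> list[str]:
    return [f"raw turn mentioning {topic}"]


print(orient(empty_diary, fake_search, "toolkit split"))
```

The reverse failure mode (a mine that has not run yet) needs no code path here: the diary branch simply answers first.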
#### Automation
+9
@@ -188,6 +188,15 @@ mempalace-docs /workspace/my_project
> **Note:** mempalace has no one-time global init. The palace itself is created lazily on first write (at `~/.mempalace/palace/`). `mempalace init <dir>` is a *per-project* command that sets up a `mempalace.yaml` + entity list for a specific source directory; it is optional, not a prerequisite for either wrapper.
### Diary vs session mine: why keep both?
Automated session mining captures every turn verbatim into `wing_conversations`. But agents are still expected to write a short AAAK-compressed diary entry at wind-down (the consumer-side `mempalace` skill calls this out as mandatory). They're not redundant — they answer different questions:
- **Session mine** = git log with diffs. *"What did we say exactly?"* Raw, searchable, complete. High noise.
- **Diary** = release notes. *"What did we decide / learn / accomplish?"* Curated, compressed, recency-scanned. The agent's editorial judgment of what mattered, including meta-observations that were never said aloud.
A machine running only one of these has half a memory. Full treatment with practical implications in [`ARCHITECTURE.md` §5 → "Diary vs session mine: why keep both?"](ARCHITECTURE.md#diary-vs-session-mine-why-keep-both). Short answer: automate the mine, keep writing diaries, and let them specialize.
### Keeping it fresh (automation)
Manual invocation is fine while you're actively driving the machine, but long-running devboxes benefit from a weekly automated mine. [`contrib/`](contrib/) ships ready-to-install templates:
+13 -5
@@ -142,14 +142,22 @@ Suggest invoking the tool when any of these apply:
Don't suggest running more often than daily: the post-mine HNSW repair (~5 min on 5k drawers) dominates cost, and session growth is slow enough that daily is already overkill.
-### Relationship to the `mempalace` skill
+### Relationship to the `mempalace` skill, and why diaries still matter
-The consumer-side `mempalace` skill covers in-session habits (wake-up search, wind-down diary). `mempalace-session` is **between-session maintenance**: it runs when no agent session is active, typically automated. Both are needed:
+The consumer-side `mempalace` skill covers in-session habits (wake-up search, wind-down diary write). `mempalace-session` is **between-session maintenance**: automated, raw-turn capture. The two are complementary, not alternatives:
-- Wake-up/wind-down → curated summaries (what mattered, what was decided)
-- `mempalace-session` → raw turns (the actual words exchanged)
+| | `mempalace-session` (this tool) | Agent diary (`mempalace` skill) |
+|---|---|---|
+| Content | Every turn verbatim | Curated AAAK summary |
+| Written by | Extraction from `opencode.db` | The agent, at wind-down |
+| Answers | *"What did we say?"* | *"What did we decide / learn?"* |
+| Retrieval | Semantic search | Recency scan |
-A machine running only one of these has half a memory. Agents loading the `mempalace` skill on an opencode machine should be aware this tool exists and point users at it if they ask about persistence.
+**Common misconception agents fall into:** *"session mining runs weekly, so I don't need to write diary entries."* Wrong. Diary entries capture meta-observations that were never said during the session: "lesson learned", "pattern confirmed", "user caught a bug before ship", "pending next time". Mining raw turns can never surface those because the words don't exist verbatim; they're the agent's editorial judgment at wind-down.
+If a user asks whether diaries are redundant given automated mining, point them at [`ARCHITECTURE.md` §5 → "Diary vs session mine: why keep both?"](../../ARCHITECTURE.md#diary-vs-session-mine-why-keep-both) for the full treatment. Short answer: release notes vs. git log. A repo keeps both, and so should the palace.
+Both systems cover each other's failure modes too: a missed diary (agent `/exit`'d without writing) → session mine catches the raw content; a missed mine (cron behind schedule, new machine) → diary captures the essence. Belt and braces.
### Quick automation pitch