Docs: explain diary vs session mine (why keep both)

Automated session mining could plausibly lead a user (or a future agent) to conclude that writing diary entries is redundant: mining captures every turn, so why also write a compressed summary at wind-down? That conclusion is wrong, and it's worth explaining why in the docs so both disciplines survive.

ARCHITECTURE.md §5 gets a new subsection 'Diary vs session mine: why keep both?' that presents this as a first-class concept:

- Comparison table: content, granularity, compression, authorship, signal density, retrieval pattern, and the question each answers.
- The defining property of a diary entry: editorial judgment by the author. It captures meta-observations that were never said aloud during the session (lessons, patterns, pending items, aggregate counts). Mining raw turns can never surface these because the words don't exist verbatim.
- Three practical scenarios where the distinction bites: wake-up token economics, 'what did we decide' vs 'what did we say', redundancy covering each other's failure modes.
- Practical implications: don't skip either habit, let them specialize (diary = release notes; mine = git log).

README.md gets a brief teaser in the 'First mine' area with a link to the canonical ARCHITECTURE.md section: enough for a skim reader to decide they want to keep writing diaries, and for a deep reader to know where to go for the full explanation.

SKILL.md replaces the three-line 'Relationship to the mempalace skill' note with a compact version of the comparison table and a direct call-out of the 'session mining means I don't need diaries' misconception agents fall into. It points agents at ARCHITECTURE.md §5 for the full treatment when users ask the question.

Cross-references verified: the anchor slug for the new section is #diary-vs-session-mine-why-keep-both (standard slug rules: colon, spaces, punctuation removed/hyphenated). Both linking docs use the matching fragment.
2026-04-30 08:56:20 +00:00
parent 349a3a3d3d
commit 2f703a8ebc
3 changed files with 62 additions and 6 deletions
@@ -224,7 +224,46 @@ That makes the routine worth codifying:
 | Wind-down diary write | Agent session end | Agent, during session |
 | `mempalace-session` mine | Between sessions (manual or scheduled) | Operator or automation |

-The first two are live; the third is batched. They're complementary, not alternatives. A machine doing only wake-up/wind-down keeps a diary but loses the actual conversation turns. A machine doing only `mempalace-session` captures the raw turns but not the curated summaries. Do both.
+The first two are live; the third is batched. They're complementary, not alternatives. The next subsection explains why both matter.
+
+#### Diary vs session mine: why keep both?
+
+A reasonable question: *"if every session is mined into `wing_conversations` anyway, what's the point of the agent also writing a diary entry?"* They're not redundant. They answer different questions and cover each other's failure modes.
+
+|  | Session mine (`wing_conversations`) | Diary (`wing_<agent>`) |
+|---|---|---|
+| Content | Every turn verbatim — prompts, responses, tool calls, dead ends, typos | Curated summary — what was decided, discovered, left pending |
+| Granularity | One session ≈ 50–200 drawers | One session ≈ 1 drawer |
+| Compression | None (raw JSONL → normalized turns) | High (AAAK dialect — dots + pipes + entity codes, ~30× reduction) |
+| Written by | No one (extracted from `opencode.db`) | The agent that lived the session, at wind-down |
+| Signal density | Low: mostly noise (wrong turns, corrections, threads abandoned with `/exit`) | High: the agent's editorial judgment of what mattered |
+| Retrieval pattern | Semantic search (`mempalace_search("topic X")`) | Recency scan (`mempalace_diary_read(last_n=5)`) |
+| Answers the question | *"What did we say exactly?"* | *"What did we accomplish / learn / decide?"* |
+
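The two retrieval patterns in the table can be illustrated with a toy in-memory model. This is only a sketch: `Drawer`, `recency_scan`, and `keyword_search` are hypothetical names, not part of mempalace; the keyword match is a crude stand-in for real semantic search; and the sample entries are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Drawer:
    """Hypothetical stand-in for one stored item (a diary entry or a raw turn)."""
    written_at: datetime
    text: str


def recency_scan(diary: list[Drawer], last_n: int) -> list[Drawer]:
    """Diary-style retrieval: the last_n most recent entries, newest first."""
    return sorted(diary, key=lambda d: d.written_at, reverse=True)[:last_n]


def keyword_search(turns: list[Drawer], query: str) -> list[Drawer]:
    """Mine-style retrieval (simplified): turns sharing any word with the query."""
    terms = set(query.lower().split())
    return [t for t in turns if terms & set(t.text.lower().split())]


# Illustrative data only.
diary = [
    Drawer(datetime(2026, 4, 28), "decided: split the toolkit into its own repo"),
    Drawer(datetime(2026, 4, 29), "lesson: verify CLI examples against --help first"),
    Drawer(datetime(2026, 4, 30), "10 commits across 3 repos today, all pushed"),
]
turns = [
    Drawer(datetime(2026, 4, 28, 9, 0), "should we split the toolkit out?"),
    Drawer(datetime(2026, 4, 28, 9, 1), "one option is a monorepo instead"),
    Drawer(datetime(2026, 4, 28, 9, 5), "ok, go with the split"),
]

latest = recency_scan(diary, last_n=2)   # orientation: what happened recently
hits = keyword_search(turns, "split")    # recall: what was said about a topic

for entry in latest:
    print("diary:", entry.text)
for turn in hits:
    print("turn: ", turn.text)
```

The recency scan needs no query at all, which is why it suits orientation at wake-up; the search needs a topic in hand, which is why it suits targeted recall.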
+The distinguishing property of a diary entry is **editorial judgment by the author**. The diary captures things that were *never said aloud during the session* — meta-observations the agent made about the session as a whole:
+
+- *"this pattern came up again, worth remembering"*
+- *"user caught the bug before I shipped it — lesson: verify CLI examples against `--help` first"*
+- *"10 commits across 3 repos today, all pushed"*
+- *"healthy interruption: user stopped me before a long-running step"*
+
+These are thoughts *about* the session, not utterances *during* it. Mining the raw turns will never surface them because the exact words were never spoken — they're the agent's reflection at wind-down.
+
+**Three scenarios where the distinction matters in practice:**
+
+1. **Wake-up token economics.** Calling `mempalace_diary_read(last_n=5)` returns five dense drawers, maybe 1–2k tokens total, almost all of it signal. Getting the same orientation from the session mine would require semantic-searching for recent topics and reading chunks of raw turns: hundreds of drawers, tens of thousands of tokens, most of it noise.
+
+2. **"What did we decide?" vs. "what did we say?"** If you ask *"when did we decide to split `mempalace-toolkit` from `cli_utils`?"* the diary gives you the crisp answer (date, trigger, rationale). The session mine gives you the actual seven-turn conversation that led up to the decision, including the turns where alternatives were considered. Both useful; different questions.
+
+3. **Redundancy as safety.** If the agent `/exit`s without writing a diary (heuristic save missed it, no upstream hook), the session mine still catches the raw content. If `mempalace-session` hasn't run this week, the diary still captures the session's essence. The two systems cover each other's failure modes.
+
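The arithmetic behind scenario 1 can be sketched as a back-of-the-envelope estimate. The drawer counts (one diary drawer per session, 50 to 200 mined drawers per session) come from the comparison table; the per-drawer token figures are assumptions chosen for illustration (300 tokens per diary entry, which keeps five entries inside the 1–2k range quoted above, and 100 tokens per raw drawer). The helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    drawers: int
    tokens: int


def diary_cost(sessions: int, tokens_per_entry: int) -> Estimate:
    """One diary drawer per session, as in the comparison table."""
    return Estimate(drawers=sessions, tokens=sessions * tokens_per_entry)


def mine_cost(sessions: int, drawers_per_session: int, tokens_per_drawer: int) -> Estimate:
    """Raw turns: many drawers per session, each read in full."""
    drawers = sessions * drawers_per_session
    return Estimate(drawers=drawers, tokens=drawers * tokens_per_drawer)


SESSIONS = 5                   # matches last_n=5 in the wake-up read
TOKENS_PER_DIARY_ENTRY = 300   # assumption
TOKENS_PER_RAW_DRAWER = 100    # assumption

diary = diary_cost(SESSIONS, TOKENS_PER_DIARY_ENTRY)
mine_low = mine_cost(SESSIONS, 50, TOKENS_PER_RAW_DRAWER)
mine_high = mine_cost(SESSIONS, 200, TOKENS_PER_RAW_DRAWER)

print(f"diary:       {diary.drawers} drawers, {diary.tokens} tokens")
print(f"mine (low):  {mine_low.drawers} drawers, {mine_low.tokens} tokens")
print(f"mine (high): {mine_high.drawers} drawers, {mine_high.tokens} tokens")
print(f"ratio: {mine_low.tokens / diary.tokens:.0f}x to {mine_high.tokens / diary.tokens:.0f}x")
```

Under these assumptions, reading five sessions from the mine in full costs roughly one to two orders of magnitude more tokens than reading five diary entries. In practice a search would return only a subset of drawers, so treat the mine figures as an upper bound.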
+**Practical implications for how you work with mempalace:**
+
+- **Don't skip diary writing** just because sessions are mined. Without a diary entry, the next agent can read the session word for word but has no compressed summary of it, which makes orientation expensive.
+- **Don't skip session mining** just because agents write diaries. Diaries miss content (especially on `/exit`), and semantic search over raw turns is valuable when "what did we say exactly?" is the right question.
+- **Do both, and let them specialize.** Treat the diary as your release notes (editorial, curated, recency-scanned) and the session mine as your git log (raw, searchable, complete). A repo keeps both; so should the palace.
+
+If anything, automating session mining *increases* the value of diary entries. The agent can focus the diary on the parts mining cannot capture — meta-observations, self-critique, pattern noticing, pending work — rather than re-stating content the mine already has.

 #### Automation