Docs: explain diary vs session mine (why keep both)

Automated session mining could plausibly lead a user (or a future
agent) to conclude that writing diary entries is redundant — mining
captures every turn, so why also write a compressed summary at
wind-down? That conclusion is wrong, and it's worth explaining why
in the docs so both disciplines survive.

ARCHITECTURE.md §5 gets a new subsection 'Diary vs session mine:
why keep both?' that presents this as a first-class concept:

- Comparison table — content, granularity, compression, authorship,
  signal density, retrieval pattern, and the question each answers.
- The defining property of a diary entry: editorial judgment by the
  author. Captures meta-observations that were never said aloud
  during the session (lessons, patterns, pending items, aggregate
  counts). Mining raw turns can never surface these because the
  words don't exist verbatim.
- Three practical scenarios where the distinction bites: wake-up
  token economics, 'what did we decide' vs 'what did we say',
  redundancy covering each other's failure modes.
- Practical implications: don't skip either habit, let them
  specialize (diary = release notes; mine = git log).

README.md gets a brief teaser in the 'First mine' area with a link
to the canonical ARCHITECTURE.md section — enough for a skim reader
to decide they want to keep writing diaries, and for a deep reader
to know where to go for the full explanation.

SKILL.md replaces the three-line 'Relationship to the mempalace
skill' note with a compact version of the comparison table and a
direct call-out of the 'session mining means I don't need diaries'
misconception agents fall into. Points agents at ARCHITECTURE.md
§5 for the full treatment when users ask the question.

Cross-references verified: anchor slug for the new section is
#diary-vs-session-mine-why-keep-both (standard slug rules: lowercase,
colon and question mark dropped, spaces replaced with hyphens). Both
linking docs use the matching fragment.
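
For anyone re-checking the fragment later, a minimal sketch of the
common GitHub-style slug rule. The helper name `heading_slug` is
hypothetical, and the forge rendering these docs may apply slightly
different rules, so treat this as an approximation rather than the
renderer's actual algorithm:

```python
import re


def heading_slug(heading: str) -> str:
    """Approximate a GitHub-style anchor slug for a markdown heading."""
    lowered = heading.strip().lower()
    # Keep word characters, whitespace, and hyphens; drop ':' and '?' and similar.
    kept = re.sub(r"[^\w\s-]", "", lowered)
    # Each whitespace character becomes one hyphen.
    return re.sub(r"\s", "-", kept)


slug = heading_slug("Diary vs session mine: why keep both?")
print("#" + slug)
```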
Joakim Persson
2026-04-30 08:56:20 +00:00
parent 349a3a3d3d
commit 2f703a8ebc
3 changed files with 62 additions and 6 deletions
+40 -1
@@ -224,7 +224,46 @@ That makes the routine worth codifying:
| Wind-down diary write | Agent session end | Agent, during session |
| `mempalace-session` mine | Between sessions (manual or scheduled) | Operator or automation |
The first two are live; the third is batched. They're complementary, not alternatives. A machine doing only wake-up/wind-down keeps a diary but loses the actual conversation turns. A machine doing only `mempalace-session` captures the raw turns but not the curated summaries. Do both.
The first two are live; the third is batched. They're complementary, not alternatives. The next subsection explains why both matter.
#### Diary vs session mine: why keep both?
A reasonable question: *"if every session is mined into `wing_conversations` anyway, what's the point of the agent also writing a diary entry?"* They're not redundant. They answer different questions and cover each other's failure modes.
| | Session mine (`wing_conversations`) | Diary (`wing_<agent>`) |
|---|---|---|
| Content | Every turn verbatim — prompts, responses, tool calls, dead ends, typos | Curated summary — what was decided, discovered, left pending |
| Granularity | One session ≈ 50–200 drawers | One session ≈ 1 drawer |
| Compression | None (raw JSONL → normalized turns) | High (AAAK dialect — dots + pipes + entity codes, ~30× reduction) |
| Written by | Nothing — extracted from `opencode.db` | The agent that lived the session, at wind-down |
| Signal density | High noise (wrong turns, corrections, `/exit`'d threads) | High signal (agent's editorial judgment of what mattered) |
| Retrieval pattern | Semantic search (`mempalace_search("topic X")`) | Recency scan (`mempalace_diary_read(last_n=5)`) |
| Answers the question | *"What did we say exactly?"* | *"What did we accomplish / learn / decide?"* |
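
The two retrieval patterns in the table can be sketched with an in-memory model. This is an illustration under stated assumptions, not the mempalace API: `Drawer`, `diary_read`, and `search` are hypothetical stand-ins for `mempalace_diary_read` and `mempalace_search`, the wing name `wing_agent` is made up, and plain keyword overlap stands in for real semantic search.

```python
from dataclasses import dataclass


@dataclass
class Drawer:
    wing: str
    written_at: int  # increasing counter standing in for a timestamp
    text: str


def diary_read(drawers, wing, last_n=5):
    """Recency scan: newest drawers in the wing first, no query needed."""
    entries = [d for d in drawers if d.wing == wing]
    return sorted(entries, key=lambda d: d.written_at, reverse=True)[:last_n]


def search(drawers, wing, query):
    """Topic lookup: keyword overlap standing in for semantic search."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(d.text.lower().split())), d)
        for d in drawers
        if d.wing == wing
    ]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0]


drawers = [
    Drawer("wing_conversations", 1, "user: should we split the toolkit repo"),
    Drawer("wing_conversations", 2, "assistant: running tests now"),
    Drawer("wing_agent", 3, "decided: split toolkit repo | pending: docs"),
    Drawer("wing_agent", 4, "lesson: verify cli examples first"),
]

latest = diary_read(drawers, "wing_agent", last_n=1)
hits = search(drawers, "wing_conversations", "split toolkit")
```

The diary read needs no query at all, which is what makes it cheap at wake-up; the search only pays off once you already know which topic you are looking for.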
The distinguishing property of a diary entry is **editorial judgment by the author**. The diary captures things that were *never said aloud during the session* — meta-observations the agent made about the session as a whole:
- *"this pattern came up again, worth remembering"*
- *"user caught the bug before I shipped it — lesson: verify CLI examples against `--help` first"*
- *"10 commits across 3 repos today, all pushed"*
- *"healthy interruption: user stopped me before a long-running step"*
These are thoughts *about* the session, not utterances *during* it. Mining the raw turns will never surface them because the exact words were never spoken — they're the agent's reflection at wind-down.
**Three scenarios where the distinction matters in practice:**
1. **Wake-up token economics.** Calling `mempalace_diary_read(last_n=5)` returns five dense drawers, maybe 1–2k tokens total, 100% signal. Matching that orientation from the session mine would require semantic-searching for recent topics and reading chunks of raw turns: hundreds of drawers, tens of thousands of tokens, 90% noise.
2. **"What did we decide?" vs. "what did we say?"** If you ask *"when did we decide to split `mempalace-toolkit` from `cli_utils`?"* the diary gives you the crisp answer (date, trigger, rationale). The session mine gives you the actual seven-turn conversation that led up to the decision, including the turns where alternatives were considered. Both useful; different questions.
3. **Redundancy as safety.** If the agent `/exit`s without writing a diary (heuristic save missed it, no upstream hook), the session mine still catches the raw content. If `mempalace-session` hasn't run this week, the diary still captures the session's essence. The two systems cover each other's failure modes.
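
The failure-mode coverage in scenario 3 can be made concrete with a small sketch. It assumes a hypothetical per-session record with a `diary` field and a `turns` field; neither the record shape nor the `coverage` helper comes from mempalace itself.

```python
def coverage(sessions):
    """Label each session by which memory store(s) captured it."""
    report = {}
    for session_id, record in sessions.items():
        has_diary = bool(record.get("diary"))
        has_turns = bool(record.get("turns"))
        if has_diary and has_turns:
            report[session_id] = "both"
        elif has_diary:
            report[session_id] = "diary only"
        elif has_turns:
            report[session_id] = "mine only"
        else:
            report[session_id] = "lost"
    return report


sessions = {
    # Normal case: wind-down diary written, session mined later.
    "mon": {"diary": "decided: split repo", "turns": ["turn 1", "turn 2"]},
    # Agent /exit'd before wind-down; the mine still has the raw turns.
    "tue": {"diary": None, "turns": ["turn 1"]},
    # Mining has not run yet; the diary still holds the essence.
    "wed": {"diary": "lesson: check --help", "turns": []},
    # Both habits skipped: nothing survives.
    "thu": {"diary": None, "turns": []},
}

report = coverage(sessions)
```

A session is only unrecoverable when both habits fail at once, which is the argument for keeping them independent.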
**Practical implications for how you work with mempalace:**
- **Don't skip diary writing** just because sessions are mined. A session without a diary entry is a session the next agent can read word-for-word but has no compressed summary of — expensive to orient against.
- **Don't skip session mining** just because agents write diaries. Diaries miss content (especially on `/exit`), and semantic search over raw turns is valuable when "what did we say exactly?" is the right question.
- **Do both, and let them specialize.** Treat the diary as your release notes (editorial, curated, recency-scanned) and the session mine as your git log (raw, searchable, complete). A repo keeps both; so should the palace.
If anything, automating session mining *increases* the value of diary entries. The agent can focus the diary on the parts mining cannot capture — meta-observations, self-critique, pattern noticing, pending work — rather than re-stating content the mine already has.
#### Automation