Commit Graph

4 Commits

Author SHA1 Message Date
joakimp 6352373a1f fix(feeders): make post-mine repair opt-in, not default
The three feeder wrappers (mempalace-docs, mempalace-pi-session,
mempalace-session) unconditionally ran 'mempalace repair --yes' after
mining, controllable only via --no-repair opt-out. The contrib launchd
and systemd templates did not pass --no-repair, so every scheduled tick
invoked the destructive in-place HNSW rebuild.

This has bitten us twice:
  - 2026-05-04 09:08: a kickstart triggered repair while an MCP
    subprocess held the DB open; the live collection was wiped (0
    drawers) and had to be restored from the palace.backup snapshot.
  - 2026-05-05 10:00: post-mine repair crashed mid-rebuild with
    'NotFoundError: Collection [<uuid>] does not exist' - chromadb's
    rebuild recreated the collection under a new UUID while the code
    still held the old handle. Live DB survived only by luck (crash
    hit before the swap).

Fix: flip the default.
  - New flag: --repair (opt-in). Prints a warning and sleeps 3s before
    invoking 'mempalace repair --yes'.
  - --no-repair is retained as a deprecated no-op alias for backward
    compatibility with any scripts/units still passing it.
  - Default behavior: no repair. Routine ChromaDB add() keeps HNSW
    consistent; repair is a recovery op, not a maintenance tick.

Docs updated to match: README, SKILL, ARCHITECTURE, AGENTS,
contrib/README. Scheduling guidance now explicitly warns against
enabling --repair on cron/launchd/systemd-timer runs.
2026-05-05 12:35:04 +02:00
joakimp 14d253f929 feat(session): tag opencode staging headers with '| source: opencode'
Complement to the mempalace-pi-session feeder: now that a second source
mines into wing_conversations, every session's synthetic header carries
an explicit source tag so the LLM can discriminate at read time when
searches return first-chunk content:

  [session: <title> | <directory> | <YYYY-MM-DD> | source: opencode]

The primary disambiguator in search results remains source_file basename
(opencode: '<slug>_<id>.jsonl', pi: 'pi_<uuid>.jsonl'), which is present
in every chunk's metadata regardless of where the search hit landed in
the session. This header tag is a cosmetic second signal on first-chunk
hits.

Caveat: existing drawers keep their old header — mempalace mine dedups
by source_file path, which didn't change, so old opencode sessions are
not re-mined. They are implicitly opencode (the only pre-pi source).
2026-05-05 08:48:27 +02:00
Joakim Persson 349a3a3d3d mempalace-session: make --dry-run dedup-aware
A --dry-run report showed all qualifying sessions without indicating
which would actually hit the palace on a real run. On a second run
against an already-mined corpus this was misleading — output said
'Exported 62 session(s)' but the real mine step would skip all 62.

The wrapper now queries the palace's chroma.sqlite3 (read-only, via
file:...?mode=ro URI) for source_file values under the staging dir,
then tags each exported session as [NEW] or [SKIP] during listing and
reports the split in the summary:

  Exported 62 session(s) to ~/.cache/mempalace-session/wing_conversations
    0 new   → will be filed on mine
    62 already filed → will be skipped (dedup by source_file)

  --dry-run: no new sessions to mine. A real run would skip all 62.

Implementation notes:
- Classification is best-effort. If the palace is unreachable (fresh
  install, moved, permission-denied, file missing) the wrapper falls
  back to treating all exports as NEW — the real mine step still
  delegates dedup to 'mempalace mine --mode convos' which is the
  authoritative source of truth. Getting the classification wrong
  in --dry-run is cosmetic; behaviour of a real run is unchanged.
- Palace path respects $MEMPALACE_PATH env var for non-default setups.
- Same classification also shown on a real (non-dry-run) mine so users
  see upfront how much of the export set is actually new before the
  miner runs.

Verified both directions:
- All-already-filed case (current box, 62 sessions in palace): reports
  0 new, 62 skipped. --dry-run message correctly says 'would skip all'.
- Partial case (simulated by deleting one session's metadata from
  palace): reports 1 new, 61 skipped. --dry-run message correctly
  says 'would file 1 new'. Palace was restored from backup
  immediately after the test.

README and SKILL.md both updated with the new dedup-aware output and
a direct answer to the FAQ 'will it mine the same sessions again?'
2026-04-30 08:33:36 +00:00
Joakim Persson 954c3f2ebb Initial commit — split out from cli_utils
Producer-side MemPalace tooling: two bash wrappers that bridge opencode
session history and project documentation into the palace. Originally
developed in cli_utils (2026-04-28); split into its own repo on
2026-04-30 because the conceptual fit was weak — cli_utils is
interactive shell tooling, while this is agent memory infrastructure
with its own architecture, dependency surface, and growth trajectory.

Contents:
- bin/mempalace-docs — docs-only mining wrapper (originally a2ddcc9 in
  cli_utils), bridges the gap until MemPalace PR #1213 (exclude_patterns)
  merges upstream.
- bin/mempalace-session — opencode → palace session bridge (originally
  dacca0e in cli_utils). Reads ~/.local/share/opencode/opencode.db,
  exports each session to Claude Code JSONL, mines via
  'mempalace mine --mode convos'. Bridges the gap until opencode
  session-stopping hooks + an opencode harness in hooks_cli.py land
  upstream.
- ARCHITECTURE.md — canonical spec: architecture diagram, component
  details, setup recipe, operational notes, upstream-retirement
  roadmap. Originally a4cf314 in cli_utils.
- SKILL.md — companion agent skill (producer side). Pairs with the
  consumer-side mempalace skill. Symlinked into
  ~/.agents/skills/opencode-mempalace-bridge/ by install.sh.
- install.sh — idempotent installer, also handles --uninstall.
- AGENTS.md — repo conventions.

History of the individual files is not preserved in this split; see
cli_utils (gitea.jordbo.se/joakimp/cli_utils) commits a2ddcc9, dacca0e,
and a4cf314 for the original authorship context.
2026-04-30 05:30:04 +00:00