fix(feeders): make post-mine repair opt-in, not default

The three feeder wrappers (mempalace-docs, mempalace-pi-session,
mempalace-session) ran 'mempalace repair --yes' after every mining run
by default; the only way to skip it was the --no-repair opt-out. The
contrib launchd and systemd templates did not pass --no-repair, so every
scheduled tick invoked the destructive in-place HNSW rebuild.

This has bitten us twice:
  - 2026-05-04 09:08: a kickstart triggered repair while an MCP
    subprocess held the DB open; the live collection was wiped (0
    drawers) and had to be restored from the palace.backup snapshot.
  - 2026-05-05 10:00: post-mine repair crashed mid-rebuild with
    'NotFoundError: Collection [<uuid>] does not exist' - chromadb's
    rebuild recreated the collection under a new UUID while the code
    still held the old handle. Live DB survived only by luck (crash
    hit before the swap).

Fix: flip the default.
  - New flag: --repair (opt-in). Prints a warning and sleeps 3s before
    invoking 'mempalace repair --yes'.
  - --no-repair is retained as a deprecated no-op alias for backward
    compatibility with any scripts/units still passing it.
  - Default behavior: no repair. Routine ChromaDB add() keeps HNSW
    consistent; repair is a recovery op, not a maintenance tick.
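
The flag semantics above can be sketched in isolation. This is a minimal
illustration, not code from the wrappers: parse_repair_flag is a
hypothetical helper that mirrors the new case arms and prints the
resulting DO_REPAIR value.

```shell
# Sketch only: parse_repair_flag is a hypothetical helper, not part of
# the wrappers. It prints 1 if repair was requested, 0 otherwise.
# The body is a subshell so do_repair does not leak into the caller.
parse_repair_flag() (
  do_repair=0                            # new default: no repair
  while [ $# -gt 0 ]; do
    case "$1" in
      --repair)    do_repair=1; shift ;; # explicit opt-in
      --no-repair) shift ;;              # deprecated alias, does nothing
      --)          shift; break ;;       # stop parsing options
      *)           shift ;;              # other args ignored in this sketch
    esac
  done
  echo "$do_repair"
)

parse_repair_flag                        # prints 0
parse_repair_flag --no-repair            # prints 0
parse_repair_flag --repair               # prints 1
parse_repair_flag --repair --no-repair   # prints 1
```

Because --no-repair is a pure no-op, it does not cancel an earlier
--repair on the same command line; the last call above still opts in.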

Docs updated to match: README, SKILL, ARCHITECTURE, AGENTS,
contrib/README. Scheduling guidance now explicitly warns against
enabling --repair on cron/launchd/systemd-timer runs.
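
For anyone who wants a harder guard than a documentation warning, one
possible approach is to refuse repair when stdin is not a terminal,
which is typically the case under cron, launchd, and systemd timers.
This commit does not do that; should_repair below is a hypothetical
helper shown only as a sketch.

```shell
# Sketch only: should_repair is hypothetical and not part of this commit.
# should_repair REQUESTED INTERACTIVE
# Succeeds (exit 0) only when repair was requested AND the run is interactive.
should_repair() {
  [ "$1" -eq 1 ] && [ "$2" -eq 1 ]
}

DO_REPAIR=1                      # as if --repair had been passed
if [ -t 0 ]; then                # is stdin attached to a terminal?
  INTERACTIVE=1
else
  INTERACTIVE=0
fi

if should_repair "$DO_REPAIR" "$INTERACTIVE"; then
  echo "would run: mempalace repair --yes"
else
  echo "skipping repair (not requested, or no terminal on stdin)"
fi
```

A terminal check says nothing about whether an MCP subprocess still
holds the DB open, so it narrows the risk rather than removing it.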
2026-05-05 12:35:04 +02:00
parent 53d96adc65
commit 6352373a1f
8 changed files with 50 additions and 20 deletions
+14 -4
@@ -36,7 +36,7 @@ AGENT="${USER:-mempalace}"
WING=""
SRC=""
DRY_RUN=0
-NO_REPAIR=0
+DO_REPAIR=0
# File patterns to include. Docs + config + intent-bearing scripts.
# Everything else (code) is excluded by omission.
@@ -77,7 +77,13 @@ Options:
--wing <name> Override wing name (default: source directory name)
--agent <name> Agent name recorded on drawers (default: $USER)
--dry-run List files that would be mined; do not file
-  --no-repair        Skip `mempalace repair` after mining
+  --repair           Run `mempalace repair` after mining (opt-in).
+                     WARNING: repair does a destructive in-place HNSW
+                     rebuild. If it races a live MCP connection or crashes
+                     mid-rebuild, it can wipe the collection. Only pass
+                     this from a quiet, interactive context. Not safe for
+                     unattended cron/launchd schedules.
+  --no-repair        (Deprecated; no-repair is now the default.)
-h, --help Show this help
What gets mined:
@@ -109,7 +115,8 @@ while [[ $# -gt 0 ]]; do
--wing) WING="${2:-}"; shift 2 ;;
--agent) AGENT="${2:-}"; shift 2 ;;
--dry-run) DRY_RUN=1; shift ;;
-    --no-repair) NO_REPAIR=1; shift ;;
+    --repair) DO_REPAIR=1; shift ;;
+    --no-repair) shift ;; # deprecated alias; no-repair is the default
--) shift; break ;;
-*) echo "error: unknown option: $1" >&2; usage >&2; exit 1 ;;
*) if [[ -z "$SRC" ]]; then SRC="$1"; shift; else echo "error: unexpected arg: $1" >&2; exit 1; fi ;;
@@ -258,8 +265,11 @@ if ! mempalace mine "$STAGE" --agent "$AGENT" --wing "$WING"; then
fi
# ── Repair index ─────────────────────────────────────────────────────
-if [[ $NO_REPAIR -eq 0 ]]; then
+if [[ $DO_REPAIR -eq 1 ]]; then
 echo ""
+echo "WARNING: --repair runs an in-place HNSW rebuild that has wiped"
+echo " live palaces on past runs. Proceeding in 3 seconds..."
+sleep 3
echo "Rebuilding HNSW index..."
mempalace repair --yes
fi
+14 -4
@@ -51,7 +51,7 @@ SESSION_ID=""
SINCE=""
MIN_MESSAGES=3
DRY_RUN=0
-NO_REPAIR=0
+DO_REPAIR=0
PI_SESSIONS_DIR="${PI_SESSIONS_DIR:-$HOME/.pi/agent/sessions}"
# ── Usage ────────────────────────────────────────────────────────────
@@ -74,7 +74,13 @@ Options:
--dry-run Export + list; do not mine into palace. Each session
is tagged [NEW] or [SKIP] based on whether its
source_file is already in the palace.
-  --no-repair        Skip `mempalace repair` after mining
+  --repair           Run `mempalace repair` after mining (opt-in).
+                     WARNING: repair does a destructive in-place HNSW
+                     rebuild. If it races a live MCP connection or
+                     crashes mid-rebuild, it can wipe the collection.
+                     Only pass this from a quiet, interactive context.
+                     Not safe for unattended cron/launchd schedules.
+  --no-repair        (Deprecated; no-repair is now the default.)
-h, --help Show this help
Idempotency:
@@ -118,7 +124,8 @@ while [[ $# -gt 0 ]]; do
--agent) AGENT="${2:-}"; shift 2 ;;
--sessions-dir) PI_SESSIONS_DIR="${2:-}"; shift 2 ;;
--dry-run) DRY_RUN=1; shift ;;
-    --no-repair) NO_REPAIR=1; shift ;;
+    --repair) DO_REPAIR=1; shift ;;
+    --no-repair) shift ;; # deprecated alias; no-repair is the default
--) shift; break ;;
-*) echo "error: unknown option: $1" >&2; usage >&2; exit 1 ;;
*) echo "error: unexpected arg: $1" >&2; exit 1 ;;
@@ -463,8 +470,11 @@ if ! mempalace mine "$STAGE" --mode convos --wing "$WING" --agent "$AGENT"; then
fi
# ── Repair index ─────────────────────────────────────────────────────
-if [[ $NO_REPAIR -eq 0 ]]; then
+if [[ $DO_REPAIR -eq 1 ]]; then
 echo ""
+echo "WARNING: --repair runs an in-place HNSW rebuild that has wiped"
+echo " live palaces on past runs. Proceeding in 3 seconds..."
+sleep 3
echo "Rebuilding HNSW index..."
mempalace repair --yes
fi
+14 -4
@@ -47,7 +47,7 @@ SESSION_ID=""
SINCE=""
MIN_MESSAGES=3
DRY_RUN=0
-NO_REPAIR=0
+DO_REPAIR=0
OPENCODE_DB="${OPENCODE_DB:-$HOME/.local/share/opencode/opencode.db}"
# ── Usage ────────────────────────────────────────────────────────────
@@ -69,7 +69,13 @@ Options:
--dry-run Export + list; do not mine into palace. Each session
is tagged [NEW] or [SKIP] based on whether its
source_file is already present in the palace.
-  --no-repair        Skip `mempalace repair` after mining
+  --repair           Run `mempalace repair` after mining (opt-in).
+                     WARNING: repair does a destructive in-place HNSW
+                     rebuild. If it races a live MCP connection or
+                     crashes mid-rebuild, it can wipe the collection.
+                     Only pass this from a quiet, interactive context.
+                     Not safe for unattended cron/launchd schedules.
+  --no-repair        (Deprecated; no-repair is now the default.)
-h, --help Show this help
Idempotency:
@@ -117,7 +123,8 @@ while [[ $# -gt 0 ]]; do
--agent) AGENT="${2:-}"; shift 2 ;;
--db) OPENCODE_DB="${2:-}"; shift 2 ;;
--dry-run) DRY_RUN=1; shift ;;
-    --no-repair) NO_REPAIR=1; shift ;;
+    --repair) DO_REPAIR=1; shift ;;
+    --no-repair) shift ;; # deprecated alias; no-repair is the default
--) shift; break ;;
-*) echo "error: unknown option: $1" >&2; usage >&2; exit 1 ;;
*) echo "error: unexpected arg: $1" >&2; exit 1 ;;
@@ -392,8 +399,11 @@ if ! mempalace mine "$STAGE" --mode convos --wing "$WING" --agent "$AGENT"; then
fi
# ── Repair index ─────────────────────────────────────────────────────
-if [[ $NO_REPAIR -eq 0 ]]; then
+if [[ $DO_REPAIR -eq 1 ]]; then
 echo ""
+echo "WARNING: --repair runs an in-place HNSW rebuild that has wiped"
+echo " live palaces on past runs. Proceeding in 3 seconds..."
+sleep 3
echo "Rebuilding HNSW index..."
mempalace repair --yes
fi