Releases: mnemos-dev/mnemos

v1.2.1 — Refine race + identity isolation + SessionEnd persistence hot-fix

01 May 13:31

Changelog

All notable changes to Mnemos are documented here. For the narrative
version of how the project evolved across paradigms, see
HISTORY.md.

v1.2.1 (rolling) — SessionEnd Persistence + X-Close Coverage Hot-Fix (2026-04-29)

Bug

The user reported that on every fresh session the injected briefing was
one /exit cycle stale, despite the previous session having refined
its transcript correctly. Two related root causes:

  1. mnemos.session_end_hook._run_brief_regen and
    _run_identity_refresh_if_due both invoked claude --print /mnemos-briefing <cwd> (and /mnemos-identity-refresh) with
    subprocess.call(..., stdout=subprocess.DEVNULL). The skill contract
    is "emit body to stdout, the wrapper persists" — but the wrapper was
    piping stdout to /dev/null, so the briefing body and refreshed
    identity body were both discarded. Cwd cache and
    _identity/L0-identity.md only ever updated via the next
    SessionStart's bg catchup, which is by-design async and therefore
    one-session-late on every /exit.
  2. mnemos.recall_briefing.handle_session_start's sync fallback fired
    only on pending and not cache_exists. With a stale cache present,
    X-close (or any partial-fail SessionEnd) left the next SessionStart
    to inject the stale body and defer regen to bg catchup — same
    one-session-late symptom on every X-close cycle.

Same-class bug as the v1.2.1 stale-OK skip (LLM output never reached the
vault). Surfaced empirically because v1.2.0 + v1.2.1 had eliminated the
other LLM-output-loss paths, leaving SessionEnd persistence as the
remaining gap.
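
The first root cause comes down to where the child's stdout goes. The sketch below contrasts the two shapes under stated assumptions: a trivial Python child stands in for `claude --print`, and `regen_discarding` / `regen_and_cache` are hypothetical names, not the Mnemos functions.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for the skill subprocess: a child that emits a body on stdout.
CHILD = [sys.executable, "-c", "print('**Current State:** demo body')"]


def regen_discarding() -> int:
    """Pre-fix shape: only the exit code survives; the body goes to the null device."""
    return subprocess.call(CHILD, stdout=subprocess.DEVNULL)


def regen_and_cache(cache_path: Path) -> bool:
    """Post-fix shape: capture stdout, then persist it via a temp file and rename."""
    proc = subprocess.run(CHILD, capture_output=True, text=True)
    if proc.returncode != 0 or not proc.stdout.strip():
        return False
    tmp = cache_path.with_suffix(".tmp")
    tmp.write_text(proc.stdout, encoding="utf-8")
    tmp.replace(cache_path)  # rename over the target so readers never see a partial file
    return True


with tempfile.TemporaryDirectory() as d:
    cache = Path(d) / "demo-cwd-slug.md"  # hypothetical cache file name
    exit_code = regen_discarding()
    exists_after_discard = cache.exists()
    ok = regen_and_cache(cache)
    cached_body = cache.read_text(encoding="utf-8")
```

In the discarding variant the call succeeds and nothing is written, which matches the "refined correctly but briefing still stale" symptom described above.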

Fix

  • _run_brief_regen(cwd, vault) calls
    recall_briefing.brief_and_cache, which wraps the skill subprocess
    via run_brief_sync (stdout → tempfile → write_cache) so the body
    lands in <vault>/.mnemos-briefings/<cwd-slug>.md.
  • _run_identity_refresh_if_due calls mnemos.identity.refresh(vault, force=False) directly. refresh() uses _invoke_claude_print with
    capture_output=True, writes the refreshed body atomically to
    _identity/L0-identity.md, and snapshots a history copy under
    _identity/_history/. The auto_refresh and refresh_min_days
    gates remain in the worker; delta and tag-relevance gates live
    inside refresh().
  • handle_session_start sync fallback gate relaxed from pending and not cache_exists to pending. Pending JSONLs always trigger sync
    refine + brief at SessionStart regardless of cache state.
    MIN_USER_TURNS (default 3) inside
    find_unrefined_jsonls_for_cwd already filters trivial sessions out
    of pending, so this never fires for noise.
  • recall_briefing._strip_briefing_preamble defensive scan: trim any
    LLM preamble emitted before the first **Label:** bold line.
    Reproduced empirically in kasamd's C--Projeler-mnemos-v1-1.md
    cache, where "I have all the layers I need. Let me synthesize the briefing." had bled in despite the skill prompt's "Start directly
    with **Current State:**" instruction.
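
The defensive preamble scan in the last bullet can be sketched as follows. This is a minimal illustration, assuming a "Label" line is any line that starts with a bold `**Something:**` marker; `strip_preamble` is a hypothetical name and the real `_strip_briefing_preamble` may use a different pattern.

```python
import re

# Assumption: a label line begins with **Label:** at the start of a line.
_LABEL_LINE = re.compile(r"^\*\*[^*\n]+:\*\*", re.MULTILINE)


def strip_preamble(body: str) -> str:
    """Drop everything before the first bold label line; leave label-free text untouched."""
    match = _LABEL_LINE.search(body)
    if match is None:
        return body
    return body[match.start():]


raw = (
    "I have all the layers I need. Let me synthesize the briefing.\n\n"
    "**Current State:** refine pipeline is green.\n"
    "**Next:** publish.\n"
)
cleaned = strip_preamble(raw)
```

Returning the input unchanged when no label is found keeps the scan safe for bodies that do not follow the expected layout.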

Four new tests:
test_worker_brief_regen_writes_cache_to_disk,
test_worker_identity_refresh_persists_to_disk,
test_brief_and_cache_strips_llm_preamble,
test_return_visit_pending_runs_sync_fallback_even_with_stale_cache
(rename + retarget of the prior ..._injects_and_bg_catches_up test
that asserted the now-deprecated stale-cache-inject behavior). Two
existing test monkeypatches updated for the new (cwd, vault)
signature on _run_brief_regen. Suite: 556 pass (+3 net vs 553).

v1.2.1 (rolling) — Stale-OK Skip Hot-Fix (2026-04-28)

Bug

A second class of refine-pipeline data loss surfaced after the
duplicate-refine race fix shipped: when a JSONL had been refined
while still active (e.g. a sibling SessionStart auto_refine
picked it during an idle window, marking the ledger OK), any later
/exit for that same session silently skipped re-refining via
claim_jsonl_for_refine's ledger=OK gate — losing every byte the
user had typed after the premature refine. The author's planning
session for the v1.2.1 PyPI publish reproduced this: ~6 hours of
post-refine work survived only in the raw JSONL.

Fix

mnemos.session_end_hook.supersede_stale_refine_if_needed runs
inside the SessionEnd worker before claim_jsonl_for_refine. When
the JSONL's mtime exceeds the prior Sessions/<date>-<slug>.md's
mtime by more than 60 s (typical /resume gap), the helper:

  • Renames the stale Session/.md to
    <name>.bak-superseded-<utc-iso> (preserves the partial summary
    for archaeology).
  • Drops the OK row from the ledger so the per-JSONL filelock can
    re-grant a fresh claim.

The follow-up refine then writes a Session/.md describing the full
final transcript. SKIP rows are left untouched (sticky decisions).
Six new tests in tests/test_session_end_supersede.py cover the
threshold, missing-prior-entry, deleted-Session/.md, and SKIP-row
cases.
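
A rough sketch of the supersede check, under several assumptions that are not confirmed by the entry above: the ledger layout is taken to be `status<TAB>path<TAB>detail`, the backup suffix uses a compact UTC stamp (no colons, so it is a valid Windows file name), and a missing summary file is treated as "nothing to do". `supersede_if_stale` is a hypothetical name.

```python
import os
import tempfile
import time
from datetime import datetime, timezone
from pathlib import Path

STALE_GAP_SECONDS = 60  # threshold quoted in the entry above


def supersede_if_stale(jsonl: Path, session_md: Path, ledger: Path) -> bool:
    """Return True when the stale summary was set aside and its OK row dropped."""
    rows = ledger.read_text(encoding="utf-8").splitlines() if ledger.exists() else []

    def is_ok_row(row: str) -> bool:
        return row.split("\t")[:2] == ["OK", str(jsonl)]

    if not any(is_ok_row(r) for r in rows):
        return False  # no prior entry, or a sticky SKIP row: leave everything alone
    if not session_md.exists():
        return False  # assumption: nothing to rename, so nothing to supersede
    if jsonl.stat().st_mtime - session_md.stat().st_mtime <= STALE_GAP_SECONDS:
        return False  # transcript did not grow meaningfully after the refine
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    session_md.rename(session_md.with_name(f"{session_md.name}.bak-superseded-{stamp}"))
    kept = [r for r in rows if not is_ok_row(r)]
    tmp = ledger.with_name(ledger.name + ".tmp")
    tmp.write_text("".join(r + "\n" for r in kept), encoding="utf-8")
    os.replace(tmp, ledger)
    return True


with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    jsonl = root / "session.jsonl"
    summary = root / "2026-04-28-demo.md"
    ledger = root / "ledger.tsv"
    jsonl.write_text("{}\n", encoding="utf-8")
    summary.write_text("partial summary\n", encoding="utf-8")
    now = time.time()
    os.utime(summary, (now - 3600, now - 3600))  # summary is an hour older than the JSONL
    os.utime(jsonl, (now, now))
    ledger.write_text(f"OK\t{jsonl}\tdemo\nSKIP\tother.jsonl\tnoise\n", encoding="utf-8")
    superseded = supersede_if_stale(jsonl, summary, ledger)
    backups = [p.name for p in root.glob("*.bak-superseded-*")]
    remaining_rows = ledger.read_text(encoding="utf-8").splitlines()
    summary_still_there = summary.exists()
```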

v1.2.1 — Refine-Pipeline Race + Identity Isolation Hot-Fix (2026-04-28)

Spec: docs/specs/2026-04-28-v1.2.1-duplicate-refine-race.md

Bug

Three independent code paths could call
claude --print /mnemos-refine-transcripts <jsonl> concurrently for
the same JSONL: the SessionEnd worker (graceful /exit path), the
SessionStart recall_briefing --catchup (CASE B X-close fallback),
and the SessionStart auto_refine_hook (cross-project backlog
scan). With no per-JSONL coordination, the LLM produced
non-deterministic slugs and two parallel writes either:

  • converged on the same slug → second write overwrote the first
    → silent data loss (one summary disappeared, the user observed
    "a file briefly appeared then vanished"); or
  • diverged on different slugs → two Sessions/<date>-<slug>.md
    siblings persisted (visible duplicate).

The race also corrupted the ledger TSV (concurrent open(…, "a")
appends spliced TAB columns together, producing unparseable lines).

This was a pre-existing v1.1.0 bug; v1.2.0's locale-aware F6.3
empirical smoke surfaced it.

Fix

Three hooks coexist by design — graceful /exit fires
SessionEnd; X-close skips SessionEnd and the next SessionStart's
safety nets catch up. Removing any of them would re-open the
X-close coverage gap. The fix makes that coexistence safe:

  • mnemos/refine_lock.py:claim_jsonl_for_refine(jsonl, ledger)
    — pre-check the ledger (fast skip on existing OK/SKIP), acquire
    filelock.FileLock at <ledger-dir>/locks/<stem>.lock with
    timeout=0 (fail-fast), recheck the ledger inside the lock so
    workers that waited briefly behind a finisher observe the OK
    entry and bail. Returns a context manager on success, None on
    any skip path.
  • All three callers funnel through the gate:
    session_end_hook._run_refine,
    recall_briefing.run_refine_sync (catchup path),
    auto_refine.run (per-picked-JSONL loop).
  • mnemos refine-ledger --normalize — one-shot CLI to repair
    ledgers corrupted before v1.2.1. Drops malformed lines (not
    exactly 3 TAB columns), dedups same-path entries (OK supersedes
    SKIP, last-seen wins among same-status), optionally drops
    entries whose JSONL no longer exists (--validate-paths).
    Atomic via tmp+rename. --dry-run previews counts without
    writing.
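
The normalize rules in the last bullet can be illustrated with a small sketch. The column order (`status<TAB>path<TAB>detail`) and the function names are assumptions for illustration; only the rules themselves (exactly three TAB columns, OK supersedes SKIP, last-seen wins among same-status rows, tmp+rename write) come from the text above.

```python
import os
import tempfile
from pathlib import Path


def normalize_rows(lines, validate_paths=False):
    """Return (kept_rows, dropped_count) after applying the repair rules."""
    best = {}  # path -> (status, original line); dict keeps first-seen order
    dropped = 0
    for line in lines:
        cols = line.split("\t")
        if len(cols) != 3:
            dropped += 1  # malformed, e.g. two appends spliced together
            continue
        status, path, _detail = cols
        if validate_paths and not Path(path).exists():
            dropped += 1
            continue
        prev = best.get(path)
        if prev is not None and prev[0] == "OK" and status != "OK":
            continue  # an existing OK row is never replaced by a SKIP row
        best[path] = (status, line)  # otherwise the last-seen row wins
    return [line for _status, line in best.values()], dropped


def write_atomic(ledger: Path, rows) -> None:
    """Write to a sibling temp file, then rename it over the ledger."""
    tmp = ledger.with_name(ledger.name + ".tmp")
    tmp.write_text("".join(r + "\n" for r in rows), encoding="utf-8")
    os.replace(tmp, ledger)


corrupted = [
    "OK\ta.jsonl\tfirst",
    "SKIP\ta.jsonl\tnoise",
    "SKIP\tb.jsonl\told",
    "SKIP\tb.jsonl\tnew",
    "OK\tc.jsonl\tdoneSKIP\td.jsonl\tshort",  # spliced line: five columns
]
rows, dropped = normalize_rows(corrupted)

with tempfile.TemporaryDirectory() as d:
    ledger = Path(d) / "ledger.tsv"
    write_atomic(ledger, rows)
    on_disk = ledger.read_text(encoding="utf-8").splitlines()
```

A dry run in this sketch would simply call `normalize_rows` and report the counts without calling `write_atomic`.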

Tests

  • 11 new tests in tests/test_refine_lock.py covering: happy-path
    claim, pre-acquire skip on existing ledger entry, post-acquire
    recheck on race-finished ledger entry, per-stem isolation,
    10-thread concurrency stress (exactly one winner), and four
    normalize cases (dedup, OK-over-SKIP, TAB-corrupted-line drop,
    missing-path drop) + atomic-write contract.
  • 3 new tests in tests/test_cli_refine_ledger.py covering the
    CLI happy path, --dry-run, and missing-ledger error.
  • Total suite: 543 passed, 2 skipped, 3 deselected (was 529 at
    start of v1.2.1 work; +14 new).
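
The "exactly one winner" property that the concurrency stress test checks can be illustrated with a standard-library stand-in. This sketch does not use `filelock`; an exclusive-create lock file (`os.O_CREAT | os.O_EXCL`) plays the role of `FileLock` with `timeout=0`, and the ledger layout and function names are hypothetical. The three steps (pre-check, fail-fast acquire, recheck inside the lock) follow the description in the Fix section.

```python
import os
import tempfile
import threading
from contextlib import contextmanager
from pathlib import Path


def ledger_has(ledger: Path, jsonl: str) -> bool:
    """True if the ledger already holds a row for this JSONL (assumed 3-column layout)."""
    if not ledger.exists():
        return False
    for line in ledger.read_text(encoding="utf-8").splitlines():
        cols = line.split("\t")
        if len(cols) == 3 and cols[1] == jsonl:
            return True
    return False


@contextmanager
def _release_on_exit(lock_path: Path):
    try:
        yield
    finally:
        lock_path.unlink()


def claim_jsonl(jsonl: str, ledger: Path):
    """Return a context manager on success, None on any skip path."""
    if ledger_has(ledger, jsonl):  # 1. fast pre-check
        return None
    lock_dir = ledger.parent / "locks"
    lock_dir.mkdir(exist_ok=True)
    lock_path = lock_dir / f"{Path(jsonl).stem}.lock"
    try:  # 2. fail-fast acquire: never wait for the holder
        os.close(os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
    except FileExistsError:
        return None
    if ledger_has(ledger, jsonl):  # 3. recheck: a finisher may have just released
        lock_path.unlink()
        return None
    return _release_on_exit(lock_path)


winners = []


def worker(ledger: Path, jsonl: str) -> None:
    claim = claim_jsonl(jsonl, ledger)
    if claim is None:
        return
    with claim:
        with ledger.open("a", encoding="utf-8") as fh:
            fh.write(f"OK\t{jsonl}\trefined\n")
        winners.append(threading.current_thread().name)


with tempfile.TemporaryDirectory() as d:
    ledger = Path(d) / "ledger.tsv"
    threads = [threading.Thread(target=worker, args=(ledger, "session-a.jsonl")) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    ledger_rows = ledger.read_text(encoding="utf-8").splitlines()
```

The recheck inside the lock is what prevents a second winner: a worker can only pass it while holding the lock, and the previous holder writes its row before releasing.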

Identity bootstrap follow-up fixes (same-day)

Empirical bootstrap pilot on kasamd surfaced three more bugs in the
identity pipeline that v1.2.1 also fixes:

  • identity._invoke_claude_print did not strip ANTHROPIC_API_KEY.
    Hard-invariant violation — every other claude --print site (auto_refine,
    recall_briefing, session_end_hook) strips the key so the call falls
    through to subscription quota. This one site was missed in v1.0/v1.1
    and silently routed bootstrap + refresh through API credits when the
    user had ANTHROPIC_API_KEY set as an env var. Surface symptom:
    claude --print failed (exit 1): with empty stderr (after the fix the
    error also includes stderr in the message). Fix: copy env, pop the
    key, pass to subprocess.run.
  • Bootstrap was contaminated by parent cwd + SessionStart hooks.
    The nested claude --print inherited the parent's cwd, which loaded
    the project CLAUDE.md and let recall_briefing's SessionStart hook
    inject its briefing context. The LLM treated the bootstrap prompt as
    a continuation of the parent dev conversation and emitted a chat
    summary instead of the seven-section profile. Fix:
    _invoke_claude_print now sets MNEMOS_RECALL_HOOK_ACTIVE=1 to
    short-circuit recall_briefing's re-entry, and runs from a fresh
    tempfile.TemporaryDirectory so no project context leaks in.
  • docs/prompts/identity-bootstrap.md OUTPUT section strengthened.
    The original "Only the markdown body to stdout" was too subtle;
    rewritten as a four-bullet "strict" contract (no tools, no chat,
    start with --- frontmatter, fall back to minimal stubs rather
    than refusing).
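
The first two fixes (key stripping, plus cwd and hook isolation) can be sketched together. This is an illustration under stated assumptions: `run_isolated` is a hypothetical helper, and a small Python probe stands in for `claude --print` so the effect on the child's environment is visible.

```python
import os
import subprocess
import sys
import tempfile


def run_isolated(cmd):
    """Run cmd without the API key, with the re-entry guard set, from a scratch cwd."""
    env = os.environ.copy()
    env.pop("ANTHROPIC_API_KEY", None)       # child falls back to subscription quota
    env["MNEMOS_RECALL_HOOK_ACTIVE"] = "1"   # short-circuits the recall hook's re-entry
    with tempfile.TemporaryDirectory() as scratch:
        proc = subprocess.run(cmd, cwd=scratch, env=env, capture_output=True, text=True)
    if proc.returncode != 0:
        # Include stderr so failures are diagnosable, as the fixed error message does.
        raise RuntimeError(f"child failed (exit {proc.returncode}): {proc.stderr.strip()}")
    return proc.stdout


PROBE = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('ANTHROPIC_API_KEY')); "
    "print(os.environ.get('MNEMOS_RECALL_HOOK_ACTIVE'))",
]

os.environ["ANTHROPIC_API_KEY"] = "sk-demo-not-a-real-key"  # simulate a user with the key set
try:
    seen_by_child = run_isolated(PROBE).split()
finally:
    os.environ.pop("ANTHROPIC_API_KEY", None)
```

Copying the environment rather than mutating `os.environ` keeps the parent process unaffected by the strip.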

Plus a small new feature shipped in the same fix wave:

  • mnemos identity bootstrap --limit N — pilot mode that restricts
    input to the most-recent N Sessions. Useful for validating prompt
    quality on a small subset before committing the full ...

v1.1.0 — SessionEnd-Driven Memory

27 Apr 09:41

📜 For the broader story — how Mnemos started as an atomic-fragmentation bet, why measurements pivoted it to narrative-first, and how SessionEnd became the trigger of choice — see HISTORY.md.


Spec: docs/specs/2026-04-26-v1.1.0-sessionend-driven-memory-design.md
Plan: docs/plans/2026-04-26-v1.1.0-sessionend-driven-memory.md

Issue 1 — Refine pipeline configurability + Settings TUI

  • New mnemos settings numbered TUI for unified config (20 fields + refinement progress).
  • Configurable refine batch size (refine.per_session, default 3).
  • Configurable refine direction (refine.direction, default newest).
  • Configurable noise floor (refine.min_user_turns, default 3).
  • mnemos init now includes a quota dialog (subscription cost reality + per-session config) before writing yaml.

Issue 2 — Identity bootstrap + auto-refresh

  • Identity bootstrap eligibility gate (identity.bootstrap_threshold_pct, default 25%).
  • Auto-refresh from SessionEnd worker (identity.auto_refresh, identity.refresh_session_delta, identity.refresh_min_days).
  • New skill mnemos-identity-refresh for delta-based identity update.
  • Bootstrap + refresh prompts gain GOOD/BAD/EDGE classification examples + final self-check.

Issue 3 — Briefing readiness gate

  • New config: briefing.readiness_pct (default 60%) — below threshold the SessionStart inject path is silent (avoids anchoring the AI on partial history).
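
A minimal sketch of how such a gate could work, assuming readiness is the share of eligible transcripts that already have a refined Session. The function names and signatures here are hypothetical, and the handling of an empty backlog (treated as 0%) is an assumption.

```python
def readiness_pct(refined: int, eligible: int) -> float:
    """Percentage of eligible transcripts already refined, capped at 100."""
    if eligible <= 0:
        return 0.0  # assumption: nothing eligible means nothing to brief on
    return min(100.0, 100.0 * refined / eligible)


def should_inject(refined: int, eligible: int, threshold_pct: float = 60.0) -> bool:
    """Stay silent below the threshold so the AI is not anchored on partial history."""
    return readiness_pct(refined, eligible) >= threshold_pct


below = should_inject(refined=5, eligible=10)   # 50% of history refined
at_gate = should_inject(refined=6, eligible=10)  # 60% of history refined
```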

Issue 4 — Smart-layered revision-aware briefing

  • Briefing prompt rewritten as v3: previous brief as anchor + all-cwd Sessions decision-only + recent 5 sessions full body.
  • Revision detection: contradicting decisions explicitly marked in "Revize/iptal edilen kararlar" (Turkish for "Revised/cancelled decisions").
  • Token budget raised to 25K hard cap with priority-driven truncation.

Issue 5 — In-session briefing usage

  • New config: briefing.show_systemmessage (default true) — visible "Mnemos: briefing loaded · N sessions" line at session start.
  • New config: briefing.enforce_consistency (default true) — prepends a cross-check directive to additionalContext so Claude pauses when the user contradicts an established decision.

Architectural foundation

  • NEW: SessionEnd hook + detached worker (mnemos.session_end_hook).
    • Hook returns under 100 ms (fits Claude Code's 5 s X-close grace window).
    • Worker uses CREATE_BREAKAWAY_FROM_JOB (Windows) / start_new_session (POSIX) to survive Claude Code termination.
    • 3-stage sequential pipeline: refine THIS transcript -> regen brief -> identity refresh check.
  • NEW: SessionStart sync fallback for missed SessionEnd cases (mid-stream X-close, kill -9).
  • NEW: CASE A first-visit vault-aware sync brief — if the vault already has Sessions for the cwd, brief inline instead of staying silent.
  • NEW: mnemos install-end-hook CLI (atomic install/uninstall, idempotent, surgical).
  • NEW: mnemos/readiness.py helpers — count_eligible_jsonls, count_refined_sessions, compute_readiness_pct, per_cwd_readiness.
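
The detached-worker idea can be sketched with the standard library. The helper names are hypothetical, and only the two mechanisms named above are used (`CREATE_BREAKAWAY_FROM_JOB` on Windows, `start_new_session` on POSIX); the real worker may pass additional flags.

```python
import subprocess
import sys


def detached_popen_kwargs() -> dict:
    """Popen keyword arguments that let the child outlive the parent."""
    kwargs = {
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.DEVNULL,
        "stderr": subprocess.DEVNULL,
    }
    if sys.platform == "win32":
        kwargs["creationflags"] = subprocess.CREATE_BREAKAWAY_FROM_JOB
    else:
        kwargs["start_new_session"] = True  # new session, detached from the parent's group
    return kwargs


def spawn_worker(cmd) -> subprocess.Popen:
    """Start the worker and return immediately; the hook itself does no heavy work."""
    return subprocess.Popen(cmd, **detached_popen_kwargs())


kwargs = detached_popen_kwargs()
proc = spawn_worker([sys.executable, "-c", "pass"])
exit_code = proc.wait()  # demo only: a real hook would return without waiting
```

Returning right after `Popen` is what keeps the hook itself fast; the three pipeline stages then run sequentially inside the worker.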

Hard invariant

No Anthropic API calls anywhere. All LLM operations route through claude --print subscription quota. CI grep enforces. _child_env() strips ANTHROPIC_API_KEY from every spawned subprocess.

Bug fixes

  • Re-entry guard placement regression coverage carried over from the v1.0 a19cfb9 lesson — the SessionEnd worker has its own guard test ensuring --worker mode bypasses HOOK_ACTIVE_ENV.
  • Briefing junction ~/.claude/skills/mnemos-briefing re-pointed at the v1.1 worktree path (post-migration cleanup) so the canonical-prompt zero-drift test no longer skips.

Test coverage

  • 65+ new tests; suite pass count grows from 455 (v1.0 baseline) to 527 with v1.1 G1-G10 implemented (G12 empirical validation pending).

v0.3.3 — Post-v0.3.2 cleanup

19 Apr 07:03

Close four deferred follow-ups from the v0.3.1 and v0.3.2 pilots so the
tree is green before Phase 1 starts. No new features — UX + reliability only.

Highlights

mnemos migrate is now atomic

  • New MigrateError + rollback path: if rebuild on the new backend fails, the old backend's storage, mine_log, and mnemos.yaml are restored and the partial new-backend storage is wiped. Ctrl+C-safe (BaseException guard).
  • Concurrent migrates on the same vault are blocked by .migrate.lock.flock (filelock, advisory, auto-releases on process exit — no stale-file trap).

Dry-run estimate readable on tiny vaults

  • Estimates that previously rendered as ~0–0 minutes now read ~2–3 seconds / ~29–54 seconds below the minute threshold. Single-drawer vaults floor at 1 second instead of rounding to zero.

sqlite-vec score display parity with ChromaDB

  • _l2_to_score = 1 − L2 / 2 instead of _l2_to_cosine_sim = 1 − L2² / 2. Both monotonic in L2 distance → ranking unchanged, benchmark recall identical. The linear form places sqlite-vec scores in the same visual 0.3–0.7 band as ChromaDB, so users stop mistaking 0.016 for a broken index.
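
The two conversions can be compared directly. For unit-normalized vectors, cosine similarity equals 1 − L2² / 2, so the quadratic form is the exact cosine while the linear form is a display score. Both decrease as L2 grows over the 0 to 2 range that unit vectors allow, which is why ranking is unaffected. The sample distances below are made up for illustration.

```python
def l2_to_cosine_sim(l2: float) -> float:
    """Exact cosine similarity for unit vectors at Euclidean distance l2."""
    return 1.0 - (l2 * l2) / 2.0


def l2_to_score(l2: float) -> float:
    """Linear display score: same ordering, gentler falloff at larger distances."""
    return 1.0 - l2 / 2.0


distances = [1.4, 0.6, 1.0]  # hypothetical L2 distances for three hits
by_cosine = sorted(distances, key=l2_to_cosine_sim, reverse=True)
by_linear = sorted(distances, key=l2_to_score, reverse=True)

far_hit_cosine = l2_to_cosine_sim(1.4)  # close to 0.02, easy to mistake for a broken index
far_hit_linear = l2_to_score(1.4)       # close to 0.3
```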

Test suite finishes in ~5 min

  • slow pytest marker is now registered in pyproject.toml, --strict-markers catches typos, default addopts = "-m 'not slow'". test_write_without_close_can_lose_hnsw_segments is tagged slow and its subprocess timeout is bumped 300 s → 600 s. Full run: pytest tests/ --override-ini="addopts=" or pytest -m slow.

Tests

  • +7 new unit tests in tests/test_migrate.py covering rollback, mine_log restore, lock contention, and four format_estimate edge cases.
  • Full default suite: 463 passed, 2 skipped, 3 deselected.

Install

pip install --upgrade mnemos-dev

PyPI: https://pypi.org/project/mnemos-dev/0.3.3/

v0.3.2 — Palace Hygiene

18 Apr 19:09

Pipeline hygiene fixes for wing/room/drawer production, plus atomic mnemos mine --rebuild. See CHANGELOG for the full list.

Highlights

Pipeline hygiene (Group A)

  • Wing canonicalization normalizes Turkish diacritics + delimiters (Satin / Satın / Satin-Alma-Otomasyonu → same wing)
  • Lazy hall / _wing.md / _room.md creation — no more empty wings or phantom rooms in the graph view
  • tags[0] no longer promoted to room name (room comes from folder/keyword detection only)
  • Drawer filenames use source date, not mining date (kills YYYY-MM-DD-YYYY-MM-DD- double prefix)
  • Drawer body gains # <smart-title> H1 + > Source: [[wikilink]] — real graph node titles
  • Entity hygiene — no frontmatter tags as entities, case-preserve dedup

Atomic rebuild (Group B)

  • mnemos mine --rebuild is now genuinely atomic: resolve → plan → dry-run gate → confirm → lock → backup (wings/index/graph) → drop_and_reinit → re-mine → verify → rollback on failure
  • --dry-run / --yes / --no-backup flags; path arg optional (auto-discovers Sessions/ + Topics/ or reads mining_sources)
  • SearchBackend.drop_and_reinit() on both backends; Palace.backup_wings() + KnowledgeGraph.reset()
  • Auto-refine hook respects .rebuild.lock.flock (no concurrent rebuild vs session-start collisions)

Distribution-ready memory-source handling

  • MEMORY.md + leading-underscore .md files skipped by the miner
  • mnemos import markdown|memory <path> now persists to mnemos.yaml (+ external flag)
  • _resolve_sources is additive — UNION of auto-discovered + configured, never replacement

Author-vault state after release: 683 drawers, 16 wings, 5 external mining_sources, source sha256 unchanged (rebuild never mutates Sessions/Topics).

PyPI: https://pypi.org/project/mnemos-dev/0.3.2/

v0.3.1 — Backend UX

17 Apr 18:27

First-class discovery, migration, and corruption recovery for the two vector backends mnemos has been shipping since v0.2: ChromaDB (default) and sqlite-vec. A 2026-04-17 parity benchmark confirmed they produce identical recall at R@5=0.90 (LongMemEval 10q, 8027 drawers) — so the user-facing question is now reliability/environment fit, not quality.

New

  • mnemos init asks which backend to use (i18n TR+EN, Windows + Python 3.14 platform hint)
  • mnemos migrate --backend <name> — safe switch with pre-flight plan, --dry-run, dated backups (never overwritten), rebuild from source markdown
  • mnemos status now includes a one-line backend summary: Backend: sqlite-vec (search.sqlite3 · 8027 drawers · 42.3 MB)
  • BackendInitError — runtime corruption surfaces as a single-line "use mnemos migrate --backend <other>" suggestion instead of a traceback
  • README Troubleshooting section + hero tweak, CONTRIBUTING backend-count architectural line

Fixed

  • Miner regression: bulk indexer now tolerates duplicate drawer IDs in one batch (was a crash on both backends after v0.2's bulk API)

Links

  • Canonical spec: docs/specs/2026-04-17-v0.3.1-backend-ux-design.md
  • Pilot report: docs/pilots/2026-04-17-v0.3.1-backend-pilot.md
  • Parity benchmark: benchmarks/results/20260417T162632_sqlite-vec_* + 20260417T162706_chromadb_*

Install

pip install --upgrade mnemos-dev

v0.3.0 — First-Run Experience

16 Apr 13:12

[0.3.0] — 2026-04-16 — First-Run Experience

Goal: Make the path from pip install to "my AI remembers my history" a single command.

Added

  • refine-transcripts Claude Code skill — bundled in the repo at skills/mnemos-refine-transcripts/, junctioned/symlinked into ~/.claude/skills/. Reads JSONL transcripts under ~/.claude/projects/, runs the canonical refinement prompt at docs/prompts/refine-transcripts.md, writes high-signal Sessions/-.md. Ledger-based resume; 5-piece pilot before full batches; zero LLM cost (runs inside the user's Claude Code session). (commit a74c10f)
  • .mnemos-pending.json schema + mnemos.pending module — atomic JSON state at vault root tracking per-source onboarding progress. PendingSource (status enum, file counts, last action), PendingState, load/save (atomic via tmp + os.replace) / upsert_source. Schema versioning + status enum validation in __post_init__. (commit 0783ba2)
  • mnemos init 5-phase onboarding wizard — replaces the legacy "mine vault now?" prompt. Phase 1: intro. Phase 2: discover Claude Code JSONL transcripts + vault Sessions/ / memory/ / Topics/ with file counts and time estimates. Phase 3: [A]ll / [S]elective / [L]ater choice. Phase 4: per-source process loop (curated → mine immediately, raw → register pending with refine-skill hint, skip / later branches). Phase 5: hook activation placeholder. Re-run safe via .mnemos-pending.json. (commit fc17751)
  • mnemos import <kind> command family: claude-code (registers JSONL transcripts as pending, prints refine-skill instructions), chatgpt / slack (single-file JSON exports → mine), markdown / memory (curated .md directories → mine). Every import updates .mnemos-pending.json. Shared _mine_and_record helper consolidates the in-progress → handle_mine → done flow. (commit d9e97a9)
  • CLI i18n (mnemos.i18n) — locale-aware string lookup with TR + EN translations for intro, discovery prompts, choice options, per-source prompts, outcomes, and hook placeholder. t(key, lang, **fmt) + resolve_lang(cfg) API. Locale picked from mnemos.yaml's languages setting (first supported wins; EN fallback). Windows cp1252 console safe via auto stdout UTF-8 reconfigure in main(). (commit 0ddaae9)
  • mnemos install-hook SessionStart auto-refine hook — registers a ~/.claude/settings.json SessionStart entry that refines the last 3 unprocessed Claude Code transcripts in a detached background worker, then mines <vault>/Sessions/. Vault-root .mnemos-hook-status.json drives a live statusline; weekly backlog reminder via additionalContext. Subagent JSONLs filtered. filelock advisory locking prevents overlapping sessions from duplicating work. Strips ANTHROPIC_API_KEY from the claude --print subprocess so it falls back to the user's subscription quota (zero API cost). (commit 725d569, hardened in 512e3dd/138a4cf/4ad8505/47f58af/96aa07f)
  • mnemos install-statusline CLI — idempotently wires the auto-refine progress snippet into ~/.claude/settings.json. Two modes: append a fenced # --- mnemos-auto-refine-statusline --- block to a user-owned bash/.cmd statusline script (settings untouched) or fresh-install ~/.claude/mnemos-statusline.{sh,cmd} and point statusLine.command at it. --uninstall removes the block (and the owned script + statusLine key in fresh mode). .bak-YYYY-MM-DD backups. mnemos init prompts for it after the hook step (i18n TR+EN). (commit 15a21fa)
  • README repositioned around the Claude Code history use case — hero claim "Turn your Claude Code history into a searchable memory palace", refinement skill section, "Why Not Just Raw Transcripts?" comparison table. (commit 0fd64fc)
  • STATUS.md external status doc — single-glance "why does Mnemos exist, what works today, where the roadmap ends up". Linked from README header. (commit af6f60f)
  • CONTRIBUTING.md — dev setup, ROADMAP discipline, commit style, coding conventions, marker language addition guide, refinement skill workflow, four architectural lines that should not be crossed. (commit 4eef132)
  • Project-level CLAUDE.md — one-word mnemos resume protocol for Claude Code. (commit 655ce11)
  • Migration guide for legacy session-memory + mnemos-session-mine.py hooks — README §"Migrating from older session-memory setups" lists the exact files early adopters can remove now that the auto-refine hook captures everything automatically. CONTRIBUTING gains a sibling note so the legacy patterns don't sneak back into the repo. (commit 77f1b78)
  • New-user simulation pilot report: docs/pilots/2026-04-16-new-user-pilot.md documents an end-to-end clean-vault run of the README onboarding (init → mining → search → install-hook → install-statusline), with what worked, what was caught, and what couldn't be tested from inside Claude Code. (commit d65384f)
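
The atomic save described for .mnemos-pending.json (tmp file + os.replace) follows a common pattern, sketched here. `save_atomic` and the example state fields are hypothetical; the actual schema lives in the mnemos.pending module.

```python
import json
import os
import tempfile
from pathlib import Path


def save_atomic(path: Path, state: dict) -> None:
    """Write JSON to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(state, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())  # make sure bytes hit disk before the rename
        os.replace(tmp_name, path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / ".mnemos-pending.json"
    save_atomic(target, {"version": 1, "sources": {}})  # illustrative fields only
    save_atomic(target, {"version": 1, "sources": {"demo": {"status": "pending"}}})
    loaded = json.loads(target.read_text(encoding="utf-8"))
    leftovers = [p.name for p in Path(d).iterdir() if p.name.endswith(".tmp")]
```

Creating the temp file in the target's own directory matters: os.replace is only atomic when source and destination are on the same filesystem.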

Fixed

  • Auto-refine no longer flickers between sessions: auto_refine.run() returns silently on lock timeout instead of writing a destructive phase=busy over the lock-holder's refining 2/3 row. The SessionStart wrapper short-circuits subagent dispatches (transcript_path under /subagents/) so agent-heavy workflows don't spawn fresh bg workers. When there's nothing to refine, the bg skips mnemos mine and the wrapper writes no status. Wrapper writes phase=refining, current=0 directly (no starting snapshot). The idle render uses new last_outcome + last_finished_at fields to show mnemos: last refine Xm ago · N notes · OK · backlog Y for 10 minutes (was 30s). (commit ef69170)
  • Auto-refine no longer re-fires mid-conversation — wrapper whitelists SessionStart source values ({"", "startup", "resume", "clear"}) so auto-compaction (source=compact) and any future ephemeral event types short-circuit. pick_recent_jsonls(exclude=...) accepts the current session's own transcript_path from the hook input, so the in-progress conversation is never marked OK in the ledger before it actually ends — fixes the silent loss of post-refine turns. (commit d6cbeed)
  • mnemos search CLI showed wing=? hall=? for every hit — formatter read top-level r.get("wing"), but search results carry wing/hall under metadata. Caught by the new-user pilot, fixed by reading r.get("metadata") or {} first with ? fallback for ancient indexes. (commit d65384f)
  • install-hook and install-statusline were broken under pip install mnemos-dev — both wrote paths to <repo>/scripts/* resources that only existed in dev installs. Three changes: (1) auto_refine_hook.py moved into the package and invoked as python -m mnemos.auto_refine_hook (no filesystem path in settings.json); (2) statusline snippets moved to mnemos/_resources/ so they ship in the wheel; (3) _parse_existing_target recognises Git Bash POSIX paths (/c/Users/...) and _build_block picks bash-vs-cmd syntax from the target script's suffix, not the host OS — fixes a Windows + Git Bash regression where the appended block used rem/set/call inside a .sh script.
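
The wrapper-side guards from the first two fixes can be sketched as a pair of small functions. The whitelist values and the /subagents/ check come from the entries above; the function names, the shape of the hook input, and the `(path, mtime)` candidate tuples are assumptions for illustration.

```python
ALLOWED_SOURCES = {"", "startup", "resume", "clear"}  # whitelist quoted above


def should_spawn_worker(hook_input: dict) -> bool:
    """Skip auto-compaction, unknown event types, and subagent dispatches."""
    if hook_input.get("source", "") not in ALLOWED_SOURCES:
        return False
    transcript = str(hook_input.get("transcript_path", "")).replace("\\", "/")
    return "/subagents/" not in transcript


def pick_recent(candidates, exclude=None, limit=3):
    """Newest transcripts first, never the session that is still in progress."""
    eligible = [(path, mtime) for path, mtime in candidates if path != exclude]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [path for path, _mtime in eligible[:limit]]


startup = should_spawn_worker({"source": "startup", "transcript_path": "/p/a.jsonl"})
compact = should_spawn_worker({"source": "compact", "transcript_path": "/p/a.jsonl"})
subagent = should_spawn_worker({"source": "startup", "transcript_path": "/p/subagents/x.jsonl"})

picked = pick_recent(
    [("/p/a.jsonl", 30.0), ("/p/b.jsonl", 20.0), ("/p/c.jsonl", 10.0), ("/p/d.jsonl", 5.0)],
    exclude="/p/a.jsonl",  # the current session's own transcript
)
```

Excluding the live transcript is what keeps the in-progress conversation from being marked OK before it ends.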

Tests

  • +99 new tests on top of v0.2.0's 226 (10 pending + 25 onboarding + 14 i18n + 9 install-hook + 13 install-statusline + 27 auto-refine behavior + 4 hook-script integration + 2 cli-search formatter). Full suite: 326 passed, 2 skipped.

Workflow

  • ROADMAP [ ] → [~] → [x] discipline — every task carries a commit hash and date when delivered. Delivered v0.1 + Phase 0 design/plan artifacts archived under docs/archive/. (commits 1394be5, 1dfeb66)

v0.2.0 — Phase 0: Foundation

13 Apr 10:55

Phase 0: Foundation — Raw Storage, Mining Overhaul, Benchmark

Goal: Match MemPalace's 96.6% recall without API calls by building the storage and mining foundation.

Highlights

🔍 Dual ChromaDB Collection — Raw verbatim content + classified fragments, merged via Reciprocal Rank Fusion (RRF). Search across both for maximum recall.

💬 5 Conversation Formats — Auto-detects and normalizes Claude Code JSONL, Claude.ai JSON, ChatGPT JSON, Slack JSON, and plain text.

⛏️ 10-Step Mining Pipeline — Format detection → normalization → prose extraction → exchange-pair chunking → room detection → entity detection → classification → scoring → indexing.

🏷️ 172 Mining Markers — 87 English + 85 Turkish markers across 4 halls (decisions, preferences, problems, events). Up from 49 in v0.1.

🏠 72+ Room Detection Patterns — 13 categories (frontend, backend, planning, testing, security, etc.) with folder path matching and keyword scoring.

👤 Heuristic Entity Detection — Two-pass person/project classification using weighted dialogue, action, code, and Turkish title signals.

🧹 Prose Extraction — Filters code blocks, shell commands, and low-alpha lines before mining to prevent embedding pollution.

📊 LongMemEval Benchmark — Same dataset as MemPalace (500 questions, 54 sessions) for apples-to-apples comparison. Includes dataset loader, Recall@K/NDCG metrics, and CLI runner.
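
Reciprocal Rank Fusion, used above to merge the raw and mined collections, scores each document by summing 1 / (k + rank) over the rankings it appears in. The sketch below is a generic illustration, not the Mnemos implementation; k = 60 is the conventional constant and is an assumption here, as are the document IDs.

```python
def rrf_merge(rankings, k=60):
    """Fuse several ranked ID lists; higher fused score comes first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


raw_hits = ["d3", "d1", "d7"]    # hypothetical hits from the raw collection
mined_hits = ["d1", "d9", "d3"]  # hypothetical hits from the mined collection
merged = rrf_merge([raw_hits, mined_hits])
```

Documents that appear in both lists (d1, d3) rise above documents found by only one collection, since RRF only looks at ranks and never compares raw scores across collections.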

New Features

  • mnemos_search now accepts collection parameter: "raw", "mined", or "both" (default)
  • $in / $nin metadata filters: search across multiple wings or exclude specific ones
  • mnemos mine --rebuild clears mine log and re-mines all sources
  • mnemos benchmark longmemeval runs the LongMemEval benchmark

Stats

  • 208 tests (up from 51)
  • 14 Python modules (up from 11)
  • 172 mining markers (up from 49)
  • 72+ room patterns (up from 0)

Install

pip install mnemos-dev==0.2.0

# With benchmark support
pip install "mnemos-dev[benchmark]==0.2.0"

What's Next (Phase 1: AI Engine)

  • Claude API mining for what regex can't catch
  • LLM-powered search reranking
  • Contradiction detection
  • Target: 96%+ → 100% recall

Full Changelog: v0.1.0...v0.2.0