diff --git a/CHANGELOG.md b/CHANGELOG.md
index c4647fa..fac3733 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,31 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/).

 ## [Unreleased]

+## [0.7.5] — 2026-04-10
+
+### Added
+- **GraphRAG hybrid retrieval (KS61-KS64):** Full hybrid GraphRAG pipeline with label-graph navigation, 14 MCP tools (12 core plus 2 feature-gated multimodal), 517 tests
+- **GraphRAG Viz MVP (KS65):** Tauri + Sigma.js graph visualization app with 3 daemon endpoints and LOD (level-of-detail) architecture
+- **Schema-driven fact extraction (KS67):** LLM-based structured extraction with supersession for knowledge updates; 80% micro-benchmark recall
+- **Entity unification (KS73):** EntityFrame and EntityId-based supersession for structured entity tracking
+- **Configurable embedding (KS75):** EmbeddingProvider trait with 10 fastembed models and OpenAI API support
+- **Universal prompt (KS76):** Single prompt template for all reader models (no per-model tuning); 5-signal importance scoring
+- **Temporal boost (KS76):** Temporal-aware retrieval weighting for time-sensitive queries
+- **Multiplicative supersession demotion (KS78):** Superseded memories receive a 0.40x multiplicative penalty (configurable)
+- **New MCP tools:** Added `memory_graph`, `memory_related`, `memory_get` for graph navigation; `config_set` and `persist` (management) are now part of the documented core tool set
+
+### Changed
+- **MCP tool count:** 9 to 12 tools (graph navigation + management tools)
+- **Benchmark results:** 19/20 seeded micro-benchmark, 5/5 abstention, 3/3 negative retrieval, 24.2% LME-S (GPT-4o judge)
+- **Default reader model:** qwen2.5:1.5b for consolidation
+
+### Fixed
+- **Persistence format bug (Issue #16):** Format version mismatch caused MCP store/echo to fail
+- **KU-3 knowledge update (KS77):** Fixed retrieval for updated knowledge entries
+- **Temporal label dedup trap (KS77):** Prevented an adverse dedup interaction when the parent memory has temporal content
+- **Child memory pipeline rewrite (KS69):** Consolidation redesign fixing the IE-1 and KU-1 failure categories
+- **`superseded_count` declaration (KS78):** The variable is now declared in both the temporal and standard code paths
+
 ## [0.7.0] — 2026-04-02

 ### Added
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4d61ee5..7c3c8dd 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -30,7 +30,7 @@ Unit tests run entirely in-memory and complete in seconds. Integration tests dow
 | `shrimpk-security` | Sandbox, permissions | Planned (stub) |
 | `shrimpk-kernel` | Integration facade | Stable |
 | `shrimpk-python` | PyO3 bindings | Exists (untested in CI) |
-| `shrimpk-mcp` | MCP server (9 tools) | Stable |
+| `shrimpk-mcp` | MCP server (12 tools) | Stable |
 | `shrimpk-daemon` | HTTP daemon + proxy | Stable |
 | `shrimpk-tray` | System tray app | Stable |

diff --git a/SECURITY.md b/SECURITY.md
index 402e1a4..437fc8f 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -4,8 +4,8 @@

 | Version | Supported |
 |---------|-----------|
-| 0.5.x (latest) | Yes |
-| < 0.5.0 | No |
+| 0.7.x (latest) | Yes |
+| < 0.7.0 | No |

 Only the latest released version receives security fixes. If you are running an older version, please upgrade before reporting.

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index c283c75..8b4ffee 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -662,7 +662,7 @@ Integration layer that wires together `shrimpk-memory`, `shrimpk-context`, and `

 ### shrimpk-mcp

-Model Context Protocol server. 
Exposes Echo Memory as MCP tools (`store`, `echo`, `stats`, `forget`, `status`, `config_show`, `dump`) via JSON-RPC 2.0 over stdio. Compatible with any MCP-aware AI client. +Model Context Protocol server. Exposes Echo Memory as 12 MCP tools (`store`, `echo`, `memory_graph`, `memory_related`, `memory_get`, `stats`, `forget`, `status`, `config_show`, `config_set`, `dump`, `persist`) via JSON-RPC 2.0 over stdio. Compatible with any MCP-aware AI client. Key design: the `EchoEngine` is lazily initialized on first tool call. The server starts in milliseconds; fastembed model loading (a few seconds) is deferred until a memory operation is actually requested. diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index 0468bf2..2f0d61a 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -1,331 +1,221 @@ # ShrimPK Roadmap This roadmap reflects the current state of the kernel and planned directions for future releases. -Dates are aspirational. Contributions are welcome at any stage — see the Contribution Opportunities +Dates are aspirational. Contributions are welcome at any stage -- see the Contribution Opportunities section for specific items you can pick up today. --- -## Current State — v0.5.0 +## Current State -- v0.7.5 -Released March 2026. The core pipeline is stable and benchmarked. +Released April 2026. The kernel is a mature push-based AI memory system with hybrid GraphRAG +retrieval, entity unification, configurable embedding, and universal prompt support. + +### Workspace + +11 crates + CLI binary: + +| Crate | Purpose | +|-------|---------| +| `shrimpk-core` | Types: MemoryEntry, EchoResult, EchoConfig, Modality | +| `shrimpk-memory` | Engine: EchoEngine, embedding, LSH, Bloom, Hebbian, labels, FSRS decay, ACT-R activation | +| `shrimpk-daemon` | HTTP server: axum, proxy, routes (/health, /debug, /v1/chat/completions) | +| `shrimpk-mcp` | MCP server (stdio): 12 tools for memory management and graph navigation | +| `shrimpk-context` | ContextAssembler: token-budgeted prompt compilation | +| `shrimpk-router` | CascadeRouter: provider routing (not yet wired in daemon) | +| `shrimpk-security` | PII masking (stub -- 6 categories, 14 regex patterns) | +| `shrimpk-kernel` | Facade crate re-exporting core + memory + context | +| `shrimpk-python` | PyO3 bindings (maturin) | +| `shrimpk-ros2` | ROS2 bridge (stub) | +| `shrimpk-tray` | Windows system tray (win32) | +| `cli/` | CLI binary: store, echo, status, explore (ratatui TUI) | ### What is shipped and working **Echo pipeline** The full retrieval chain is operational: Bloom filter pre-screening (O(1) topic elimination), -LSH candidate retrieval (sub-linear at scale), cosine reranking, Hebbian co-activation boosting, -and recency decay. Optional HyDE (hypothetical document expansion) and LLM reranking are -available via config flags. +LSH candidate retrieval (sub-linear at scale), label-based pre-filtering, cosine reranking, +Hebbian co-activation boosting, FSRS decay, ACT-R activation, temporal boost, importance +scoring, and multiplicative supersession demotion. Optional HyDE (hypothetical document +expansion) and LLM reranking are available via config flags. -**Text memory — BGE-small-EN-v1.5** +**Hybrid GraphRAG (KS61-KS64)** -Primary embedding model: `BAAI/bge-small-en-v1.5` via fastembed. 
The pipeline achieves 84% -top-3 recall (combined HyDE + LLM reranker config) on a realistic 41-memory, 25-query benchmark -spanning five LongMemEval categories: information extraction, multi-session reasoning, temporal -reasoning, knowledge update, and preference tracking. Temporal queries hit 100% (5/5) across -all pipeline configs. +Full hybrid GraphRAG pipeline combining vector similarity with label-graph traversal. +Label-graph navigation enables neighborhood exploration from any memory. 14 MCP tools +support store, retrieval, graph exploration, and management operations. 517 tests cover +the complete pipeline. -**Vision memory — CLIP ViT-B/32** +**Entity unification (KS73)** -Image memories are embedded using CLIP ViT-B/32 (512-dim) via fastembed's `ClipVitB32` variant. -Cross-modal retrieval (text queries retrieving image memories) works in the same embedding space. -The vision feature is gated behind `--features vision`. +EntityFrame and EntityId-based supersession for structured entity tracking. When new +information contradicts or updates an existing entity, the old memory is superseded and +receives a multiplicative demotion penalty (default 0.40x, configurable). This prevents +stale knowledge from outranking current facts. -**Sleep consolidation** +**Configurable embedding (KS75)** -A background consolidation pass runs during idle periods (configurable schedule). It uses a local -LLM via Ollama to extract atomic facts from raw memories, de-duplicate, and merge related entries. -In benchmarks, consolidation lifted top-3 recall from 72% to 76% over the baseline (no -consolidation) configuration. +EmbeddingProvider trait abstraction with 10 fastembed models and OpenAI API support. +Default model: BGE-small-EN-v1.5 (384-dim) via fastembed. The provider can be swapped +at configuration time without code changes. -**SHRM v2 storage format** +**Universal prompt (KS76)** -Memory-mapped binary format with 32-bit CRC per entry, atomic flush, and crash recovery. Stores -text embeddings (384-dim), optional vision embeddings (512-dim), optional speech embeddings -(640-dim field, populated from v0.6.0 onward), metadata, and sensitivity labels. +One prompt template for all reader models. No per-model tuning required. Validated with +qwen2.5:1.5b (default) and qwen2.5:3b. Includes temporal boost for time-sensitive queries +and a 5-signal importance scoring system. -**Speech architecture (structure only)** +**Multimodal SHRM v2** + +Memory-mapped binary format with 32-bit CRC per entry, atomic flush, and crash recovery. +Stores text embeddings (384-dim), optional vision embeddings (512-dim), optional speech +embeddings (640-dim), metadata, and sensitivity labels. Three-channel architecture: text +(BGE-small-EN-v1.5), vision (CLIP ViT-B/32), speech (ECAPA-TDNN 256 + Whisper-tiny 384). + +**Sleep consolidation** -`shrimpk-memory/src/speech.rs` defines the full `SpeechEmbedder` struct with dimension constants -(`SPEAKER_DIM=256`, `PROSODY_DIM=384`, `SPEECH_DIM=640`), Whisper log-Mel preprocessing, and -ONNX sessions wired in v0.6.0. The 16 kHz resampler uses linear interpolation. +Background consolidation using a local LLM via Ollama with schema-driven fact extraction. +Child memory pipeline creates atomic facts from raw memories, supports supersession for +knowledge updates. Default reader model: qwen2.5:1.5b. 
-**MCP server**
+**MCP server (12 tools)**

-`shrimpk-mcp` exposes nine tools over stdio: `store`, `echo`, `forget`, `stats`, `status`,
-`config_show`, `config_set`, `dump`, `persist` (plus `store_image` and `store_audio` when
-multimodal features are enabled). Compatible with Claude Desktop and any MCP client.
+`shrimpk-mcp` exposes 12 tools over stdio: `store`, `echo`, `memory_graph`,
+`memory_related`, `memory_get`, `stats`, `forget`, `status`, `config_show`, `config_set`,
+`dump`, `persist`. Additional multimodal tools (`store_image`, `store_audio`) are
+available when feature flags are enabled. Compatible with Claude Desktop and any MCP client.

 **Daemon + tray**

-`shrimpk-daemon` runs as a background HTTP service on `localhost:11435`. `shrimpk-tray` provides
-a system tray icon and launch/stop controls on Windows.
+`shrimpk-daemon` runs as a background HTTP service on `localhost:11435` with OpenAI-compatible
+proxy (`/v1/chat/completions`). `shrimpk-tray` provides a system tray icon and launch/stop
+controls on Windows.
+
+### Benchmark results
+
+| Benchmark | Score |
+|-----------|-------|
+| Seeded micro-benchmark | 19/20 |
+| Abstention (no-answer detection) | 5/5 |
+| Negative retrieval | 3/3 |
+| LME-S (GPT-4o judge) | 24.2% overall, 25.3% task-avg |

 **Performance (release build, i7-1165G7)**

 | Metric | Result |
 |--------|--------|
 | P50 echo latency at 10K memories | 3.50ms |
-| P50 echo latency at 100K memories | 23.79ms (regression — see Known Issues) |
 | Store throughput | ~128 memories/sec |
 | RAM (10K text memories) | ~85 MB |

 ---
-
-## v0.6.0 — Speech and Vision Upgrade
-
-Target: Q2 2026. Focus: wire the speech ONNX models and upgrade the vision model.
-
-### Speech: ONNX models wired (640-dim — DONE in KS51)
-
-The speech pipeline is **640-dim** (ECAPA-TDNN 256 + Whisper-tiny encoder 384). The emotion
-channel (Wav2Small, CC-BY-NC-SA-4.0) was dropped as license-incompatible. Both wired models
-carry permissive licenses: ECAPA-TDNN (Apache-2.0) and Whisper-tiny (MIT).
-
-#### ECAPA-TDNN 256-dim — speaker identification
-
-Model: `Wespeaker/wespeaker-cnceleb-resnet34-LM` (`cnceleb_resnet34_LM.onnx`, ~24 MB,
-Apache 2.0). Loaded via `ort` (ONNX Runtime Rust crate). Auto-downloads from HuggingFace Hub.
-
-Input: 80-bin FBank features, shape `(1, frames, 80)`, 25ms frame, 10ms hop, 16 kHz.
-Output: 256-dim L2-normalized speaker embedding (output name: `embs`).
-
-#### Whisper-tiny encoder 384-dim — prosody
-
-Model: `onnx-community/whisper-tiny` (`onnx/encoder_model.onnx`, 32.9 MB, MIT). The encoder
-takes 80-bin Whisper log-Mel spectrogram, shape `(batch, 80, 3000)`, padded to 30 seconds.
-Mean-pooling over the sequence dimension produces a 384-dim prosody vector.
-
-#### Spectrogram preprocessing
-
-Two spectrogram pipelines run in parallel:
-
-- **Kaldi fbank** for ECAPA-TDNN: 80 Mel bins, 25ms frame, 10ms hop, 16 kHz. Implementation via
-  the `mel-spec` crate (v0.3.4, MIT).
-- **Whisper log-Mel** for the encoder: 80 Mel bins, N_FFT=400, hop=160 samples, normalized as
-  `(log_spec + 4.0) / 4.0`. Also handled by `mel-spec`.
-
-#### Band-limited resampling
-
-The current `resample_linear()` stub in `speech.rs` introduces aliasing at high downsample ratios
-(e.g., 48 kHz → 16 kHz). v0.6.0 replaces it with the `rubato` crate (v1.0.1), which provides
-sinc-interpolation and FFT-based resamplers that are alias-free.
-
-#### VAD gate — Silero VAD
-
-A Voice Activity Detection pass runs before the ECAPA and Whisper sessions. 
Silent frames -(below a configurable threshold) are skipped entirely to avoid embedding noise as speech. -Silero VAD is loaded as a small ONNX model (~2 MB, MIT license) via a direct `ort::Session`. -The `silero-vad` crate on crates.io is GPL-2.0 and is explicitly avoided — the ONNX model -is loaded directly. +### Key milestones (KS67-KS78) -#### ort version pinning - -fastembed v5.x pins `ort = "=2.0.0-rc.11"`. The speech code must use the exact same version -to avoid Cargo dependency conflicts. Do not add `ort` as a direct workspace dependency with a -different version specifier. - -#### Model download on first use - -Models are downloaded on first `SpeechEmbedder::from_config()` call if not already cached, -following the fastembed pattern: `hf-hub` crate + `dirs::cache_dir()/shrimpk/models/speech/`. -Total first-use download: ~60 MB (ECAPA 25 MB + Whisper encoder 33 MB + Silero VAD 2 MB). - -### Vision: CLIP ViT-B/32 → Nomic Embed Vision v1.5 (512 → 768-dim) - -`NomicEmbedVisionV15` is already a first-class variant in fastembed v5 (`ImageEmbeddingModel` -enum). The swap is a single-line change in `embedder.rs`. The quality improvement is substantial: -+7.8 percentage points on ImageNet zero-shot (71.0% vs 63.2%) and dramatically better cross-modal -MTEB quality (62.28 vs 43.82 for the paired text model). The q4-quantized ONNX is 62 MB vs -CLIP's unquantized 352 MB — a 6x size reduction. - -The 512 → 768 dimension change is a **breaking migration** for stored vision embeddings. The -SHRM v2 format header records embedding dimensions per modality. On first launch after upgrade, -the kernel will detect the dimension mismatch, re-embed all stored vision memories, and rewrite -the store. For the v0.5.0 → v0.6.0 transition the user base is small and a hard-cut re-embed -is the correct strategy. A migration guide will be included in the release notes. - -Cross-modal text queries against vision memories must use Nomic Text v1.5 with the mandatory -`search_query:` prefix. This is handled internally by the embedder — callers do not need to -add the prefix manually. - -### Fix: 100K latency regression - -The P50 latency at 100K memories is 23.79ms against a 4.0ms target. Investigation is required -before v0.6.0 ships. See Known Issues for details. +| Sprint | Milestone | +|--------|-----------| +| KS67 | Schema-driven fact extraction, 80% micro-benchmark recall | +| KS68 | IE-1 + KU-1 fixed, 17/20 embedding-only, Greptile P1s resolved | +| KS69 | Consolidation redesign, child memory pipeline rewrite, 19/20 seeded | +| KS70 | 20/20 seeded, qwen2.5:1.5b default, first real consolidation validation | +| KS73 | Entity unification, EntityFrame, EntityId supersession | +| KS75 | Configurable embedding: EmbeddingProvider trait, 10 models, OpenAI API | +| KS76 | Universal prompt, temporal boost, importance scoring | +| KS77 | 19/20 seeded, 5/5 abstention, KU-3 fixed, temporal dedup trap found | +| KS78 | Multiplicative supersession demotion (0.40x default) | --- -## v0.7.0 — Robotics, Speaker Upgrade, and Quantization - -Target: Q3 2026. Focus: ROS2 integration, model quality improvements, and memory footprint. - -### ROS2 bridge — `shrimpk-ros2` crate - -A new workspace crate `crates/shrimpk-ros2` will provide a ROS2 node that exposes ShrimPK -memory over standard ROS2 topics and services. 
- -The node subscribes to: -- `/shrimpk/store/text` (`std_msgs/String`) — text memories -- `/shrimpk/store/image` (`sensor_msgs/CompressedImage`) — visual memories via CLIP -- `/shrimpk/store/audio` (`audio_common_msgs/AudioStamped`) — speech memories +## Next -- KS79: Multi-Resolution Retrieval -The node publishes to: -- `/shrimpk/echo` (`shrimpk_msgs/EchoResults`) — push-activated memories -- `/shrimpk/context` (`std_msgs/String`, latched) — current context string for downstream LLMs -- `/shrimpk/status` (`std_msgs/String`, JSON) — health and latency stats +Target: Q2 2026. Focus: retrieval quality at multiple granularity levels. -A `/shrimpk/query` service (`shrimpk_msgs/EchoQuery`) supports pull-based querying for nodes -that prefer request/response semantics over the push model. +Multi-resolution retrieval allows the echo pipeline to match queries against memories at +different levels of abstraction -- raw memories, consolidated facts, entity summaries, and +topic clusters. This enables both precise fact lookup and broad contextual recall within +the same query. -Primary integration path: `rclrs` 0.7+ with colcon on ROS2 Jazzy (Ubuntu 24.04). -Alternative: `r2r` for simpler `cargo build` integration without colcon. -Optional feature flag: `ros2-native` using `ros2-client` (pure Rust DDS, no ROS2 install needed) -for distribution to users who do not have a full ROS2 environment. - -The echo latency budget is feasible: 3.50ms ShrimPK echo is well within a 30 Hz camera frame -(33ms). The full pipeline including embedding and topic publish should stay under 15–20ms. - -No other push-based memory system has a ROS2 bridge. ReMEmbR (NVIDIA) is pull-based and -Python-only. `shrimpk-ros2` would be the first native-Rust, push-activated memory layer for ROS2. - -### Speaker upgrade: ECAPA-TDNN → CAM++ - -CAM++ (Context-Aware Masking) achieves lower equal error rate than ECAPA-TDNN on VoxCeleb1/2 -at comparable model size. The upgrade is a drop-in replacement at the 512-dim output level -provided an Apache 2.0-compatible ONNX export is available. If no suitable pre-built ONNX exists, -the ECAPA-TDNN model ships in v0.7.0 and CAM++ is deferred to v0.8.0. +--- -### f16 quantization for vision and speech embeddings +## Next -- KS80: Memory Lifecycle -Stored vision and speech embeddings currently use f32 (4 bytes/dimension). A v0.7.0 storage -format revision (SHRM v3) will store these as f16 (2 bytes/dimension) with promotion to f32 -at query time. Impact: ~50% reduction in disk and memory footprint for vision/speech memories, -no measurable quality loss for cosine similarity. +Target: Q2 2026. Focus: memory aging, archival, and lifecycle management. -SHRM v3 will include automatic migration from v2 on first launch. +Formalize the memory lifecycle from creation through active use, staleness detection, +archival, and eventual pruning. Integrate FSRS scheduling data with usage patterns to +make informed retention decisions. Provide user-facing controls for lifecycle policies. --- -## Future — No Fixed Timeline +## Future -- No Fixed Timeline These items are research directions or require dependencies that are not yet settled. +### ROS2 bridge production readiness + +`shrimpk-ros2` exists as a stub. Production readiness requires ROS2 Jazzy integration +via `rclrs`, topic/service wiring, and latency validation within a 30 Hz camera frame +budget. The push-based architecture maps naturally to ROS2 topic publishing. 
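+
+For a feel of the wiring, a minimal sketch using `r2r` (the plain-cargo alternative to
+`rclrs` mentioned in the draft removed above) follows. Everything here is a placeholder,
+not the `shrimpk-ros2` design: the `/shrimpk/*` topic names follow that draft's layout,
+the echo handler is stubbed out, and the `r2r`, `tokio`, and `futures` crates are assumed.
+
+```rust
+// Sketch only: bridge one store topic to a stub echo publisher.
+use futures::StreamExt;
+
+#[tokio::main]
+async fn main() -> Result<(), Box<dyn std::error::Error>> {
+    let ctx = r2r::Context::create()?;
+    let mut node = r2r::Node::create(ctx, "shrimpk_bridge", "")?;
+
+    // Text memories in, push-activated echoes out (draft topic names).
+    let mut store_sub = node.subscribe::<r2r::std_msgs::msg::String>(
+        "/shrimpk/store/text",
+        r2r::QosProfile::default(),
+    )?;
+    let echo_pub = node.create_publisher::<r2r::std_msgs::msg::String>(
+        "/shrimpk/echo",
+        r2r::QosProfile::default(),
+    )?;
+
+    // Pump the ROS2 executor on a blocking thread.
+    tokio::task::spawn_blocking(move || loop {
+        node.spin_once(std::time::Duration::from_millis(10));
+    });
+
+    while let Some(msg) = store_sub.next().await {
+        // Placeholder: a real bridge would call the EchoEngine store + echo
+        // path here, which must fit the 30 Hz (33ms) frame budget above.
+        let reply = r2r::std_msgs::msg::String { data: format!("echo: {}", msg.data) };
+        echo_pub.publish(&reply)?;
+    }
+    Ok(())
+}
+```
+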
+ ### Custom fine-tuned embedding model -The text embedding model (BGE-small) is a general-purpose model trained on web text. A model -fine-tuned specifically on personal memory data (short episodic sentences, user preferences, -recurring entities) could improve recall quality without increasing model size. This requires -a labeled dataset and an ML training pipeline — it is a research item, not an implementation task. +A model fine-tuned specifically on personal memory data (short episodic sentences, user +preferences, recurring entities) could improve recall quality without increasing model +size. This requires a labeled dataset and an ML training pipeline. ### crates.io publish Publishing `shrimpk-core`, `shrimpk-memory`, and (eventually) `shrimpk-ros2` to crates.io -is planned once the API stabilizes beyond v0.6.0. The current pre-1.0 semver signals that -breaking changes are expected. +is planned once the API stabilizes. The current pre-1.0 semver signals that breaking +changes are expected. ### Cloud sync -Optional encrypted sync of the memory store across devices. End-to-end encrypted, the server -sees only ciphertext. The key design question is key management — the server must never hold -decryption keys. This is a future research and design item. +Optional encrypted sync of the memory store across devices. End-to-end encrypted, the +server sees only ciphertext. The key design question is key management -- the server must +never hold decryption keys. -### Emotion channel +### Vision model upgrade -The 3-dim arousal/dominance/valence emotion channel is architecturally present in `speech.rs` -(`EMOTION_DIM=3`) but has no available ONNX model under a permissive license. If a suitable -Apache 2.0 or MIT model emerges, the emotion channel can be re-enabled without a breaking change -to the storage format (the slot is reserved). Alternatively, a categorical speech emotion -recognition model (4-class: angry, happy, sad, neutral) under a permissive license could -replace the dimensional approach. +Nomic Embed Vision v1.5 or SigLIP 2 as a CLIP replacement. The 512 to 768 dimension +change would be a breaking migration for stored vision embeddings. Deferred until the +user base is large enough to justify the migration complexity. + +### Speaker upgrade: ECAPA-TDNN to CAM++ + +CAM++ (Context-Aware Masking) achieves lower equal error rate than ECAPA-TDNN on +VoxCeleb1/2. Blocked on availability of an Apache 2.0-compatible ONNX export. --- ## Contribution Opportunities -All issues below are open for contribution. The project uses Apache 2.0. Opening a discussion -issue before starting significant work is encouraged to avoid duplication. +All issues below are open for contribution. The project uses Apache 2.0. Opening a +discussion issue before starting significant work is encouraged to avoid duplication. ### Good first issue -**Fix vision feature flag propagation** (difficulty: low, Rust knowledge required) -Vision benchmarks (`echo_multimodal_bench.rs`) are blocked because -`#[cfg(feature = "vision")]` checks the root test crate's features, not `shrimpk-memory`'s. -The fix is adding a forwarding `vision` feature to the root `Cargo.toml` that enables -`shrimpk-memory/vision`. Estimated: 1–2 hours. - -**Add `search_query:` prefix for cross-modal text queries** (difficulty: low, Rust) -When Nomic Embed Vision v1.5 is the active vision model (v0.6.0), text queries used in -cross-modal retrieval must be prefixed with `"search_query: "`. 
This should be applied -automatically in `MultiEmbedder` when the Nomic vision model is active, not pushed to callers. -Requires reading the fastembed API and adding a model-variant check. - **Extend the Tier 2 benchmark with a CrossEncoder config** (difficulty: low, Rust) -The realistic Tier 2 benchmark tests four pipeline configs (Baseline, HyDE, Reranker-LLM, -Combined). A CrossEncoder-only config was benchmarked separately and showed strong results -(2,823ms average at 100% recall on 6 regression cases). Adding it to the standard Tier 2 -suite would complete the comparison matrix. +The realistic Tier 2 benchmark tests four pipeline configs. Adding a CrossEncoder-only +config would complete the comparison matrix. ### Help wanted -**Investigate 100K latency regression** (difficulty: medium, Rust + profiling) -P50 at 100K memories is 23.79ms against a 4.0ms target. Likely causes: LSH bucket saturation -with BGE-small embedding distribution, brute-force fallback frequency, or Windows I/O interference -during the benchmark. The investigation should profile LSH hit rate, Bloom false-positive rate, -and brute-force fallback frequency at scale. Tools: `perf`, `cargo flamegraph`, or the -`tracing` spans already in the echo path. A fix might involve tuning LSH parameters -(hash count, bucket width) for the BGE-small distribution. - -**~~Wire ECAPA-TDNN ONNX session~~** — DONE (KS51). Wespeaker ResNet34 256-dim, FBank -preprocessing implemented in pure Rust (`compute_fbank_flat()`), `ort` version matches -fastembed's pinned `=2.0.0-rc.11`. - -**~~Wire Whisper-tiny encoder ONNX session~~** — DONE (KS51). Whisper-tiny encoder takes -`(1, 80, 3000)` log-Mel spectrogram, outputs `(1, 1500, 384)` hidden states, mean-pooled -to 384-dim. -Preprocessing uses the Whisper log-Mel formula implemented in `mel-spec`. Can be done in -parallel with the ECAPA item by a different contributor. - -**Implement band-limited resampling with `rubato`** (difficulty: medium, Rust + DSP) -Replace `resample_linear()` in `speech.rs` with sinc or FFT-based resampling from the `rubato` -crate (v1.0.1). The current linear resampler causes aliasing at high downsample ratios and is -documented as a placeholder. The replacement should pass the existing `resample_*` unit tests -and add a new test verifying that a 1 kHz sine wave downsampled from 48 kHz to 16 kHz does not -contain aliasing artifacts above 8 kHz. - **Linux CI hardening** (difficulty: medium, DevOps + Rust) -The kernel builds and tests pass on CI for Linux and macOS, but the test coverage is lower than -on the primary Windows development machine. Specifically: daemon startup tests, tray icon tests, -and file locking tests need Linux-specific validation. Contributions improving Linux CI coverage -are welcome. +The kernel builds and tests pass on CI, but test coverage is lower on Linux than on the +primary Windows development machine. Contributions improving Linux CI coverage are welcome. + +**100K latency profiling** (difficulty: medium, Rust + profiling) +P50 at 100K memories needs investigation. Likely causes: LSH bucket saturation with +BGE-small embedding distribution, or brute-force fallback frequency. Tools: `perf`, +`cargo flamegraph`, or the `tracing` spans in the echo path. 
### Research needed **Emotion model under permissive license** (difficulty: high, ML research) -The 3-dim arousal/dominance/valence emotion slot in the speech pipeline is reserved but empty -because all mature dimensional emotion models (Wav2Small, wav2vec2-large-robust) carry -CC-BY-NC-SA-4.0 licenses. Options: (1) identify an existing Apache 2.0 / MIT categorical -speech emotion model that can be exported to ONNX and mapped to a valence proxy, (2) train a -small distillation model on CC0 or public-domain audio corpora, or (3) propose an alternative -paralinguistic dimension that has available permissive models. +The 3-dim arousal/dominance/valence emotion slot in the speech pipeline is reserved but +empty because all mature dimensional emotion models carry CC-BY-NC-SA-4.0 licenses. **LSH parameter tuning for BGE-small distribution** (difficulty: high, information retrieval) -The LSH index was tuned for `all-MiniLM-L6-v2` embeddings. The upgrade to `BGE-small-EN-v1.5` -changed the embedding distribution in ways that may require different hash count, bucket width, -or candidate list size to maintain sub-10ms P50 at 100K scale. This is an empirical research -task: vary LSH parameters, run the 100K latency benchmark, and identify the configuration that -recovers the 4.0ms target. - -**CAM++ Apache 2.0 ONNX availability** (difficulty: medium, ML research) -The v0.7.0 speaker upgrade to CAM++ depends on finding or producing an Apache 2.0-compatible -ONNX export. WeSpeaker provides CAM++ checkpoints but the license status of any pre-built -ONNX exports needs verification. This research item should produce a clear verdict: model ID, -license, ONNX file location, and input/output specification. - -**SigLIP 2 fastembed support** (difficulty: high, ML + Rust) -SigLIP 2 ViT-B/16 achieves 78.2% ImageNet zero-shot (vs Nomic Vision v1.5 at 71.0%) but has -no official ONNX model and no fastembed support as of March 2026. If an Apache 2.0 ONNX export -emerges, contributing a `SigLIP2VitB16` variant to fastembed and then updating ShrimPK's -vision channel would be a meaningful quality improvement. +The LSH index was tuned for all-MiniLM-L6-v2 embeddings. The upgrade to BGE-small changed +the embedding distribution in ways that may require different hash count, bucket width, or +candidate list size to maintain sub-10ms P50 at 100K scale.
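+
+Since this is an empirical task, a sweep harness along the following lines would do.
+`LshConfig` and the benchmark hook are hypothetical stand-ins, not actual shrimpk-memory
+types:
+
+```rust
+/// Hypothetical LSH knobs; the real index parameters live in shrimpk-memory.
+#[derive(Clone, Copy, Debug)]
+struct LshConfig {
+    num_hashes: usize,
+    bucket_width: f32,
+}
+
+/// Stand-in benchmark hook: rebuild the index with `cfg`, replay the 100K
+/// query set, and return the measured P50 in milliseconds.
+fn p50_latency_ms(cfg: LshConfig) -> f32 {
+    // Placeholder formula so the sketch runs; wire to the real bench instead.
+    4.0 + 0.1 * cfg.num_hashes as f32 / cfg.bucket_width
+}
+
+fn main() {
+    let mut best: Option<(LshConfig, f32)> = None;
+    for &num_hashes in &[8, 12, 16, 24] {
+        for &bucket_width in &[2.0, 4.0, 8.0] {
+            let cfg = LshConfig { num_hashes, bucket_width };
+            let p50 = p50_latency_ms(cfg);
+            println!("{cfg:?} -> P50 {p50:.2} ms");
+            if best.map_or(true, |(_, b)| p50 < b) {
+                best = Some((cfg, p50));
+            }
+        }
+    }
+    // The goal from the text above: a config that holds sub-10ms P50 at 100K.
+    println!("best: {best:?}");
+}
+```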