diff --git a/.devin/skills/sync-upstream-mempalace/SKILL.md b/.devin/skills/sync-upstream-mempalace/SKILL.md new file mode 100644 index 0000000..1f39ee5 --- /dev/null +++ b/.devin/skills/sync-upstream-mempalace/SKILL.md @@ -0,0 +1,228 @@ +--- +name: sync-upstream-mempalace +description: How to keep the Rust port (`quangdang46/mempalace_rust`) in sync with the upstream Python repo (`milla-jovovich/mempalace`). Use whenever the user asks to port, sync, mirror, check gaps, or replay upstream fixes. Covers cloning both repos, building the diff plan against the reference commit recorded in `port.txt`, prioritising correctness fixes over features, and the smallest-blast-radius port checklist. +--- + +# Sync the Rust port with the upstream Python MemPalace + +The Rust port `quangdang46/mempalace_rust` shadows the Python implementation +`milla-jovovich/mempalace`. The ledger of the last shared commit lives in +`port.txt` at the repo root. Everything in this skill is a recipe for taking +the upstream code from "last synced commit" to "current HEAD", spotting the +gaps, and porting them safely. + +## Layout + +| Path | Role | +| ---- | ---- | +| `/home/ubuntu/repos/mempalace_rust` | Rust port — make all PRs here | +| `/home/ubuntu/repos/mempalace_upstream` | Upstream Python — read-only reference | +| `port.txt` (in the Rust repo) | Sync ledger: last reference commit + open gaps | +| `.devin/skills/sync-upstream-mempalace/SKILL.md` | This skill | + +The Python repo lives at `https://github.com/milla-jovovich/mempalace`. The +Rust port lives at `https://github.com/quangdang46/mempalace_rust`. + +## Step 1 — Refresh both checkouts + +```bash +# Rust port +cd /home/ubuntu/repos/mempalace_rust +git fetch origin +git checkout main && git pull --ff-only + +# Upstream Python (cloned read-only the first time) +if [ ! -d /home/ubuntu/repos/mempalace_upstream ]; then + git clone https://github.com/milla-jovovich/mempalace.git /home/ubuntu/repos/mempalace_upstream +fi +cd /home/ubuntu/repos/mempalace_upstream +git fetch origin && git checkout main && git pull --ff-only +``` + +Always work off the **upstream `main`**, never a feature branch. + +## Step 2 — Read the existing ledger + +```bash +head -120 /home/ubuntu/repos/mempalace_rust/port.txt +``` + +The bottom of `port.txt` lists the last synced commit (e.g. `94f1689`) and the +"Remaining gaps" section. Treat that section as authoritative — the last +session already decided what was deferred. + +## Step 3 — Build the diff plan against the ledger reference commit + +```bash +cd /home/ubuntu/repos/mempalace_upstream +LAST_SYNC=$(grep -oE '\b[0-9a-f]{7,40}\b' /home/ubuntu/repos/mempalace_rust/port.txt | tail -1) + +# What changed since last sync? (newest first) +git log --oneline --no-merges "$LAST_SYNC..HEAD" -- mempalace/ | head -100 + +# Read the canonical changelog for headline fixes/features +sed -n '1,120p' CHANGELOG.md +``` + +This produces the candidate list. Filter it with the priorities below. + +## Step 4 — Triage what to port (and what NOT to) + +Port in this order: + +1. **Correctness fixes that affect data integrity or query results.** These + land first because they protect users from silent data corruption. + Examples we have already ported in this skill's parent PR: + - `#1214` — KG rejects inverted intervals (`valid_to < valid_from`) + - `#1215` — `EntityRegistry.save()` atomic write + parent-dir fsync + - `#1243` — diary `agent_name` lowercased so reads are case-insensitive + - `#1164` — ISO-8601 validation at MCP boundary for `as_of`, `valid_from`, + `valid_to`, `ended` + - `#1314` — `tool_kg_add` forwards `valid_to`/`source_file`, + `tool_kg_invalidate` resolves `ended` to today + +2. **MCP boundary changes** (tool input schemas, new fields, validation). + These are user-visible and small; high signal, low blast radius. + +3. **CLI ergonomics changes** that don't require new dependencies. + +4. **Features behind feature flags** (LLM refine, new providers, …) — + defer if not user-requested, but record them in `port.txt` so the next + sync session can pick them up. + +5. **Docs / i18n / website / CHANGELOG-only** changes — almost never port + verbatim into the Rust repo; the Python README is the canonical product + doc. + +For each candidate commit, before porting, read the **commit body** — the +upstream maintainers (Arnold Wender, Igor Lins e Silva, …) write extensive +rationale that you can usually compress into the Rust comment block above the +fix. + +```bash +cd /home/ubuntu/repos/mempalace_upstream +git show -- mempalace/.py | head -120 +``` + +## Step 5 — Port the change to Rust + +Map Python module → Rust module: + +| Python (`mempalace/`) | Rust (`crates/core/src/`) | +| --- | --- | +| `mcp_server.py` | `mcp_server.rs` | +| `knowledge_graph.py` | `knowledge_graph.rs` | +| `entity_registry.py` | `entity_registry.rs` | +| `entity_detector.py` | `entity_detector.rs` | +| `config.py` | `config.rs` | +| `searcher.py` | `searcher.rs` | +| `layers.py` | `layers.rs` | +| `palace.py` | `palace.rs` | +| `palace_graph.py` | `palace_graph.rs` | +| `miner.py` | `miner.rs` | +| `convo_miner.py` | `convo_miner.rs` | +| `dialect.py` | `dialect.rs` | +| `normalize.py` | `normalize.rs` | +| `repair.py` | `repair.rs` | + +Rules of thumb: + +- **Comment the upstream issue number** (`#1214`, `#1243`, …) in the Rust + code where the fix lands. This makes the next sync session trivial. +- **Write a Rust regression test that mirrors the upstream test name.** + Upstream uses pytest, Rust uses `#[test]`; the function names should be + recognisably the same (e.g. `test_diary_read_case_insensitive_agent`). +- **Use `anyhow::bail!` for new validation errors** to match the existing + Rust error style. Map Python `ValueError(...)` to + `anyhow::bail!("{field}={raw:?} is not …")`. +- **For atomic file IO**, use `std::fs::OpenOptions` + `sync_all()` + + `std::fs::rename`. On `cfg(unix)`, also open the parent directory and + `sync_all()` it. See `crates/core/src/entity_registry.rs::save` for the + reference implementation. +- **For MCP tool schema changes**, both the `Deserialize` `struct Input` + AND the `serde_json::json!(...)` `properties` block in the tool + registration need updating. + +## Step 6 — Build, test, lint (gating) + +```bash +cd /home/ubuntu/repos/mempalace_rust + +# 1. Build everything +cargo build --workspace + +# 2. Run all tests +cargo test --workspace + +# 3. Clippy with warnings-as-errors (the repo standard) +cargo clippy --workspace --all-targets -- -D warnings +``` + +If `cargo build` fails because rustc complains about `feature edition2024`, +run `rustup update stable` once and retry — recent dependencies require +stable >= 1.85. + +If `cargo clippy` flags `redundant_guards` on a `Some(s) if s.is_empty()` +match arm, rewrite as `Some("")`. + +## Step 7 — Update `port.txt` + +After porting, append a new section to `port.txt`: + +- The new reference commit hash (current upstream HEAD). +- A bullet list of ported issue numbers / commits. +- The updated "Remaining gaps" — copy forward anything still deferred, + remove what got done. + +Keep the existing history in the file; this is an append-only ledger so +the next sync can audit decisions. + +## Step 8 — Open the PR + +```bash +cd /home/ubuntu/repos/mempalace_rust +git checkout -b devin/$(date +%s)-port-upstream-fixes +git add -A +git commit -m "fix: port upstream correctness fixes (#1164, #1214, #1215, #1243, #1314)" +git push -u origin HEAD +``` + +Use `git_pr(action="fetch_template")` then `git_pr(action="create")`. PR +description must include: + +- One section per ported upstream issue/commit, with the upstream link. +- The "Remaining gaps" delta vs. the previous `port.txt`. +- Test summary: tests added, `cargo test --workspace` pass count, + `cargo clippy --workspace --all-targets -- -D warnings` clean. + +## Step 9 — Verify CI + +```text +git(action="pr_checks", wait_mode="all") +``` + +If CI fails on something unrelated to the diff (e.g. a flaky test), confirm +by running the same test locally before claiming it is preexisting. + +## Common pitfalls + +- **Do NOT carry over Python `Optional[None]` semantics naively.** Rust + `Option<&str>` + a sanitizer that returns `Ok(None)` on `None` is the + canonical pattern; don't introduce magic `""` sentinels in storage. +- **Do NOT alter the SQLite schema lightly.** If an upstream fix adds a + column (e.g. `source_drawer_id`), either add a migration or scope the + port to only the parts the existing schema already supports, and record + the deferred schema change in `port.txt`. +- **Do NOT swallow upstream issue numbers.** Each ported bugfix gets its + number in a Rust comment so future syncs can grep for it. +- **Do NOT port docs/i18n** unless the user explicitly asks. The Python + README is the canonical product doc; mirroring it into Rust drifts fast. + +## Quick reference — find the last synced commit + +```bash +grep -oE '\b[0-9a-f]{7,40}\b' /home/ubuntu/repos/mempalace_rust/port.txt | tail -1 +``` + +That's the only piece of state the next session needs to start the loop +again from Step 1. diff --git a/crates/core/src/config.rs b/crates/core/src/config.rs index 5063e82..efb9a7a 100644 --- a/crates/core/src/config.rs +++ b/crates/core/src/config.rs @@ -7,6 +7,77 @@ use std::path::PathBuf; const DEFAULT_COLLECTION_NAME: &str = "mempalace_drawers"; +/// Validate an ISO-8601 date or canonical UTC datetime string at MCP boundary +/// (#1164 / `4d98b05`). +/// +/// Accepts `None` and empty strings as pass-through. +/// +/// Non-empty inputs must match one of these canonical forms: +/// * `YYYY-MM-DD` +/// * `YYYY-MM-DDTHH:MM:SSZ` +/// * `YYYY-MM-DDTHH:MM:SS+00:00` (normalized to `...Z` on return) +/// +/// Partial dates (e.g. `2026`, `2026-01`) and non-UTC datetimes are rejected +/// because KG queries compare temporal values as TEXT — mixed forms silently +/// return wrong results. +pub fn sanitize_iso_temporal( + value: Option<&str>, + field_name: &str, +) -> anyhow::Result> { + let raw = match value { + None => return Ok(None), + Some("") => return Ok(Some(String::new())), + Some(s) => s.trim().to_string(), + }; + + fn is_valid_date(value: &str) -> bool { + if value.len() != 10 { + return false; + } + let bytes = value.as_bytes(); + if bytes[4] != b'-' || bytes[7] != b'-' { + return false; + } + let year: i32 = match value[0..4].parse() { + Ok(v) => v, + Err(_) => return false, + }; + let month: u32 = match value[5..7].parse() { + Ok(v) => v, + Err(_) => return false, + }; + let day: u32 = match value[8..10].parse() { + Ok(v) => v, + Err(_) => return false, + }; + chrono::NaiveDate::from_ymd_opt(year, month, day).is_some() + } + + fn parse_canonical_utc(value: &str) -> Option { + let normalized = if let Some(stripped) = value.strip_suffix("+00:00") { + format!("{stripped}Z") + } else { + value.to_string() + }; + if normalized.len() != 20 || !normalized.ends_with('Z') { + return None; + } + let body = &normalized[..normalized.len() - 1]; + let dt = chrono::NaiveDateTime::parse_from_str(body, "%Y-%m-%dT%H:%M:%S").ok()?; + Some(format!("{}Z", dt.format("%Y-%m-%dT%H:%M:%S"))) + } + + if is_valid_date(&raw) { + return Ok(Some(raw)); + } + if let Some(canonical) = parse_canonical_utc(&raw) { + return Ok(Some(canonical)); + } + anyhow::bail!( + "{field_name}={raw:?} is not a valid ISO-8601 date or UTC datetime (expected YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ)" + ) +} + fn expand_path(path: &str) -> PathBuf { if path.starts_with("~/") { if let Ok(home) = std::env::var("HOME") { @@ -647,4 +718,58 @@ mod tests { std::env::remove_var("XDG_CONFIG_HOME"); } + + // #1164: ISO-8601 validation at MCP boundary so malformed dates fail fast + // instead of silently producing empty KG query results. + #[test] + fn test_sanitize_iso_temporal_accepts_valid_date() { + let out = super::sanitize_iso_temporal(Some("2026-05-11"), "valid_from").unwrap(); + assert_eq!(out.as_deref(), Some("2026-05-11")); + } + + #[test] + fn test_sanitize_iso_temporal_accepts_canonical_utc_datetime() { + let out = super::sanitize_iso_temporal(Some("2026-05-11T12:30:45Z"), "valid_from").unwrap(); + assert_eq!(out.as_deref(), Some("2026-05-11T12:30:45Z")); + } + + #[test] + fn test_sanitize_iso_temporal_normalizes_plus_offset() { + let out = super::sanitize_iso_temporal(Some("2026-05-11T00:00:00+00:00"), "ended").unwrap(); + assert_eq!(out.as_deref(), Some("2026-05-11T00:00:00Z")); + } + + #[test] + fn test_sanitize_iso_temporal_passes_through_none_and_empty() { + assert_eq!(super::sanitize_iso_temporal(None, "as_of").unwrap(), None); + assert_eq!( + super::sanitize_iso_temporal(Some(""), "as_of") + .unwrap() + .as_deref(), + Some("") + ); + } + + #[test] + fn test_sanitize_iso_temporal_rejects_partial_date() { + assert!(super::sanitize_iso_temporal(Some("2026"), "valid_from").is_err()); + assert!(super::sanitize_iso_temporal(Some("2026-05"), "valid_from").is_err()); + } + + #[test] + fn test_sanitize_iso_temporal_rejects_garbage() { + let err = super::sanitize_iso_temporal(Some("March 2026"), "as_of").unwrap_err(); + let msg = err.to_string(); + assert!(msg.contains("as_of="), "error must name the field: {msg}"); + assert!( + msg.contains("not a valid ISO-8601"), + "error must explain: {msg}" + ); + } + + #[test] + fn test_sanitize_iso_temporal_rejects_impossible_calendar_day() { + assert!(super::sanitize_iso_temporal(Some("2026-02-30"), "valid_from").is_err()); + assert!(super::sanitize_iso_temporal(Some("2026-13-01"), "valid_from").is_err()); + } } diff --git a/crates/core/src/entity_registry.rs b/crates/core/src/entity_registry.rs index 299cc63..d516456 100644 --- a/crates/core/src/entity_registry.rs +++ b/crates/core/src/entity_registry.rs @@ -2,6 +2,7 @@ use crate::entity_detector::detect_from_content; use regex::Regex; use serde::{Deserialize, Serialize}; use std::collections::HashMap; +use std::io::Write; use std::path::Path; const EMPTY_REJECTED: &[String] = &[]; @@ -241,7 +242,52 @@ impl EntityRegistry { std::fs::create_dir_all(parent)?; } let content = serde_json::to_string_pretty(&self.data)?; - std::fs::write(&self.path, content)?; + + // Atomic write (#1215): serialize to a sibling .tmp in the same dir + // (so rename stays on one filesystem), fsync, then rename over the + // target. A crash mid-write leaves the previous registry intact + // instead of a truncated or empty file. + let parent = self + .path + .parent() + .map(Path::to_path_buf) + .unwrap_or_else(|| Path::new(".").to_path_buf()); + let file_name = self + .path + .file_name() + .and_then(|s| s.to_str()) + .unwrap_or("entity_registry.json"); + let tmp_path = parent.join(format!("{file_name}.tmp")); + { + let mut tmp = std::fs::OpenOptions::new() + .write(true) + .create(true) + .truncate(true) + .open(&tmp_path)?; + tmp.write_all(content.as_bytes())?; + tmp.flush()?; + tmp.sync_all()?; + } + #[cfg(unix)] + { + use std::os::unix::fs::PermissionsExt; + let _ = std::fs::set_permissions(&tmp_path, std::fs::Permissions::from_mode(0o600)); + } + std::fs::rename(&tmp_path, &self.path)?; + + // fsync the parent directory so the rename is durable across crashes + // on ext4 and similar filesystems (#1215 follow-up 2e441d1). Without + // this, the kernel can ack the rename and a crash can revert to a + // state where the temp file is present and the target is at the old + // version. Windows and some special filesystems reject directory fds + // — they have different durability semantics on rename anyway, so + // ignore failures here. + #[cfg(unix)] + { + if let Ok(dir) = std::fs::File::open(&parent) { + let _ = dir.sync_all(); + } + } Ok(()) } @@ -1165,4 +1211,80 @@ mod tests { assert!(registry.people().contains_key("Riley")); assert_eq!(registry.people().get("Riley").unwrap().source, "learned"); } + + #[test] + fn test_save_leaves_no_temp_file_on_success() { + // #1215: atomic write must rename the .tmp into place; if the .tmp + // sticks around the next save round can race with itself. + let temp_dir = tempfile::TempDir::new().unwrap(); + let path = temp_dir.path().join("registry.json"); + let mut registry = EntityRegistry::load(&path).unwrap(); + registry + .data + .people + .insert("Alice".to_string(), test_entity_entry()); + registry.save().unwrap(); + + let tmp_path = temp_dir.path().join("registry.json.tmp"); + assert!( + !tmp_path.exists(), + "atomic save must clean up the sibling .tmp file" + ); + assert!(path.exists(), "the target registry.json must exist"); + + let loaded: serde_json::Value = + serde_json::from_str(&std::fs::read_to_string(&path).unwrap()).unwrap(); + assert!( + loaded.get("people").and_then(|v| v.get("Alice")).is_some(), + "saved registry must contain the entity we inserted" + ); + } + + #[test] + fn test_save_overwrites_previous_content_atomically() { + // #1215: rerunning save on the same path must publish the new content + // and leave no orphaned temp file behind. + let temp_dir = tempfile::TempDir::new().unwrap(); + let path = temp_dir.path().join("registry.json"); + let mut registry = EntityRegistry::load(&path).unwrap(); + registry.save().unwrap(); + registry + .data + .people + .insert("Bob".to_string(), test_entity_entry()); + registry.save().unwrap(); + + let tmp_path = temp_dir.path().join("registry.json.tmp"); + assert!(!tmp_path.exists()); + let loaded: serde_json::Value = + serde_json::from_str(&std::fs::read_to_string(&path).unwrap()).unwrap(); + assert!(loaded.get("people").and_then(|v| v.get("Bob")).is_some()); + } + + #[cfg(unix)] + #[test] + fn test_save_sets_owner_only_permissions() { + use std::os::unix::fs::PermissionsExt; + let temp_dir = tempfile::TempDir::new().unwrap(); + let path = temp_dir.path().join("registry.json"); + let registry = EntityRegistry::load(&path).unwrap(); + registry.save().unwrap(); + let mode = std::fs::metadata(&path).unwrap().permissions().mode() & 0o777; + assert_eq!( + mode, 0o600, + "atomic save should preserve 0o600 permissions on the final file" + ); + } + + fn test_entity_entry() -> EntityEntry { + EntityEntry { + source: "test".to_string(), + contexts: vec![], + aliases: vec![], + relationship: String::new(), + confidence: 0.9, + canonical: None, + seen_count: Some(1), + } + } } diff --git a/crates/core/src/knowledge_graph.rs b/crates/core/src/knowledge_graph.rs index 63ce836..47d1481 100644 --- a/crates/core/src/knowledge_graph.rs +++ b/crates/core/src/knowledge_graph.rs @@ -142,6 +142,17 @@ impl KnowledgeGraph { source_closet: Option<&str>, source_file: Option<&str>, ) -> anyhow::Result { + // Reject inverted intervals (#1214): a triple with valid_to < valid_from + // would never satisfy `valid_from <= as_of AND valid_to >= as_of`, so it + // would be invisible to every query — silently corrupt. Open intervals + // and point-in-time facts (valid_from == valid_to) remain accepted. + if let (Some(vf), Some(vt)) = (valid_from, valid_to) { + if vt < vf { + anyhow::bail!( + "valid_to={vt:?} is before valid_from={vf:?}; an inverted interval would be invisible to every KG query" + ); + } + } let sub_id = Self::entity_id(subject); let obj_id = Self::entity_id(object); let pred = predicate.to_lowercase().replace(' ', "_"); @@ -818,4 +829,60 @@ mod tests { .iter() .any(|triple| triple.object == "NewCo" && triple.current)); } + + #[test] + fn test_add_triple_rejects_inverted_interval() { + // #1214: a triple with valid_to < valid_from never satisfies the + // temporal filter, so the row would be invisible to every query. + // add_triple must reject it at write time instead of silently + // accepting an unqueryable fact. + let mut kg = KnowledgeGraph::open(std::path::Path::new(":memory:")).unwrap(); + let err = kg + .add_triple( + "Alice", + "works_at", + "Acme", + Some("2026-12-31"), + Some("2026-01-01"), + None, + None, + None, + ) + .expect_err("inverted interval must be rejected"); + let msg = err.to_string(); + assert!( + msg.contains("inverted interval"), + "error message should mention inverted interval, got: {msg}" + ); + } + + #[test] + fn test_add_triple_allows_point_in_time_and_open_intervals() { + // #1214: open intervals (only valid_from OR only valid_to) and + // same-value point-in-time facts must remain accepted; only strict + // inversion is rejected. + let mut kg = KnowledgeGraph::open(std::path::Path::new(":memory:")).unwrap(); + kg.add_triple( + "Alice", + "born_on", + "Earth", + Some("2026-05-11"), + Some("2026-05-11"), + None, + None, + None, + ) + .expect("point-in-time facts must be accepted"); + kg.add_triple( + "Bob", + "joined", + "Cohort A", + Some("2026-01-01"), + None, + None, + None, + None, + ) + .expect("open-ended intervals must be accepted"); + } } diff --git a/crates/core/src/mcp_server.rs b/crates/core/src/mcp_server.rs index 4f79716..eeed93c 100644 --- a/crates/core/src/mcp_server.rs +++ b/crates/core/src/mcp_server.rs @@ -307,8 +307,8 @@ fn make_tools() -> Vec { tool( "mempalace_kg_add", "KG Add", - "Add a fact to the knowledge graph. Subject → predicate → object with optional time window. E.g. ('Max', 'started_school', 'Year 7', valid_from='2026-09-01').", - serde_json::json!({ "type": "object", "properties": { "subject": { "type": "string", "description": "The entity doing/being something" }, "predicate": { "type": "string", "description": "The relationship type (e.g. 'loves', 'works_on', 'daughter_of')" }, "object": { "type": "string", "description": "The entity being connected to" }, "valid_from": { "type": "string", "description": "When this became true (YYYY-MM-DD, optional)" }, "source_closet": { "type": "string", "description": "Closet ID where this fact appears (optional)" } }, "required": ["subject", "predicate", "object"] }), + "Add a fact to the knowledge graph. Subject → predicate → object with optional time window. E.g. ('Max', 'started_school', 'Year 7', valid_from='2026-09-01'). Use valid_to to backfill an already-ended historical fact.", + serde_json::json!({ "type": "object", "properties": { "subject": { "type": "string", "description": "The entity doing/being something" }, "predicate": { "type": "string", "description": "The relationship type (e.g. 'loves', 'works_on', 'daughter_of')" }, "object": { "type": "string", "description": "The entity being connected to" }, "valid_from": { "type": "string", "description": "When this became true (YYYY-MM-DD, optional)" }, "valid_to": { "type": "string", "description": "When this stopped being true (YYYY-MM-DD, optional). Use this to backfill an already-ended fact in one call." }, "source_closet": { "type": "string", "description": "Closet ID where this fact appears (optional)" }, "source_file": { "type": "string", "description": "Source file path where this fact was extracted (optional)" } }, "required": ["subject", "predicate", "object"] }), ), tool( "mempalace_kg_invalidate", @@ -828,17 +828,21 @@ fn tool_kg_query(state: &AppState, args: JsonObject) -> Result, } let input: Input = parse_args(args)?; + // Validate ISO-8601 date at MCP boundary (#1164): malformed dates would + // silently produce empty result sets, indistinguishable from "no facts". + let as_of = crate::config::sanitize_iso_temporal(input.as_of.as_deref(), "as_of") + .map_err(|e| ErrorData::invalid_params(e.to_string(), None))?; let kg = crate::knowledge_graph::KnowledgeGraph::open(&kg_path(state)) .map_err(|e| internal_error_safe(&e))?; let facts = kg .query_entity( &input.entity, - input.as_of.as_deref(), + as_of.as_deref(), input.direction.as_deref().unwrap_or("both"), ) .map_err(|e| internal_error_safe(&e))?; ok_json( - serde_json::json!({ "entity": input.entity, "as_of": input.as_of, "facts": facts, "count": facts.len() }), + serde_json::json!({ "entity": input.entity, "as_of": as_of, "facts": facts, "count": facts.len() }), ) } @@ -850,9 +854,21 @@ fn tool_kg_add(state: &AppState, args: JsonObject) -> Result, + // Forwarded to the KG layer (#1314): callers need `valid_to` to + // backfill already-ended historical facts in a single call instead + // of doing add+invalidate, and `source_file` for adapter provenance. + valid_to: Option, source_closet: Option, + source_file: Option, } let input: Input = parse_args(args)?; + // Validate ISO-8601 dates at MCP boundary (#1164) so malformed dates fail + // fast with a clear error instead of producing silently-invisible triples. + let valid_from = + crate::config::sanitize_iso_temporal(input.valid_from.as_deref(), "valid_from") + .map_err(|e| ErrorData::invalid_params(e.to_string(), None))?; + let valid_to = crate::config::sanitize_iso_temporal(input.valid_to.as_deref(), "valid_to") + .map_err(|e| ErrorData::invalid_params(e.to_string(), None))?; let mut kg = crate::knowledge_graph::KnowledgeGraph::open(&kg_path(state)) .map_err(|e| internal_error_safe(&e))?; let triple_id = kg @@ -860,11 +876,11 @@ fn tool_kg_add(state: &AppState, args: JsonObject) -> Result Result, } let input: Input = parse_args(args)?; + // Validate ISO-8601 date at MCP boundary (#1164) before forwarding to KG. + let ended = crate::config::sanitize_iso_temporal(input.ended.as_deref(), "ended") + .map_err(|e| ErrorData::invalid_params(e.to_string(), None))?; + // Resolve omitted/empty `ended` to today's date at the MCP layer so the + // response reports the actual value persisted, not the literal sentinel + // string "today" (#1314). + let resolved_ended = ended + .as_deref() + .filter(|s| !s.is_empty()) + .map(str::to_string) + .unwrap_or_else(|| chrono::Utc::now().format("%Y-%m-%d").to_string()); let mut kg = crate::knowledge_graph::KnowledgeGraph::open(&kg_path(state)) .map_err(|e| internal_error_safe(&e))?; kg.invalidate( &input.subject, &input.predicate, &input.object, - input.ended.as_deref(), + Some(resolved_ended.as_str()), ) .map_err(|e| internal_error_safe(&e))?; ok_json( - serde_json::json!({ "success": true, "fact": format!("{} → {} → {}", input.subject, input.predicate, input.object), "ended": input.ended.unwrap_or_else(|| "today".to_string()) }), + serde_json::json!({ "success": true, "fact": format!("{} → {} → {}", input.subject, input.predicate, input.object), "ended": resolved_ended }), ) } @@ -1032,6 +1059,9 @@ fn tool_diary_read(state: &AppState, args: JsonObject) -> Result, } let input: Input = parse_args_with_integer_coercion(args, &["last_n"])?; + // Case-insensitive agent name lookup (#1243): diary_write stores the + // lowercased agent name, so reads must match against the same canonical form. + let agent_name = input.agent_name.to_lowercase(); let wing = input .wing .as_deref() @@ -1050,7 +1080,8 @@ fn tool_diary_read(state: &AppState, args: JsonObject) -> Result Result>(), "total": entries.len(), "showing": showing, @@ -1098,6 +1129,9 @@ fn tool_diary_write(state: &AppState, args: JsonObject) -> Result, } let input: Input = parse_args(args)?; + // Normalize agent name to lowercase so reads are case-insensitive (#1243): + // writing as "Claude" and reading as "claude" must resolve to the same agent. + let agent_name = input.agent_name.to_lowercase(); let now = chrono::Local::now(); let wing = input .wing @@ -1105,7 +1139,7 @@ fn tool_diary_write(state: &AppState, args: JsonObject) -> Result Result Result