Proposal: Write Gate server — schema + provenance enforcement for agent-authored MERGE/SET

## Problem

Once `MCP_READ_ONLY` is flipped to allow writes, `ai-toolkit` has no further constraint on what an agent puts into Memgraph. The current write-safety surface is a single regex (`is_write_query`) + the `MCP_READ_ONLY` boolean in [`integrations/mcp-memgraph/src/mcp_memgraph/servers/server.py`](https://github.com/memgraph/ai-toolkit/blob/main/integrations/mcp-memgraph/src/mcp_memgraph/servers/server.py). That's binary allow/deny on the verb-set.

In the **agentic Knowledge-Graph-Augmented Generation (KGAG)** case — autonomous agents writing to a graph-backed memory across many sessions, often multiple agents sharing a graph — two failure modes dominate:

1. **Schema drift.** Agents hallucinate labels and property names per call. The same concept ends up as `:Person`, `:person`, `:User`, `:User_profile`. Required properties get missed. [Issue #127](https://github.com/memgraph/ai-toolkit/issues/127) ("every relationship in KG extraction is `:DIRECTED`") is one symptom of this class.
2. **Absent provenance.** Writes carry no `source`, no `extraction_method`, no `confidence`. Downstream consumers can't distinguish a parsed API response from an LLM-extracted claim from an outright hallucination.

These compound. A graph of 10k nodes from 50 agent sessions becomes unqueryable in weeks — not because the data is wrong, but because nothing constrains *how* it was written. See [Context Rot](https://memgraph.com/blog/ai-context-rot) for the decay pattern this produces.

The toolkit already has schema *introspection* (`get_schema`, `get_constraint`, `get_index`). The missing piece is schema + provenance *enforcement* at write time.

## Proposal

Add a **Write Gate** server variant to `ai-toolkit`'s MCP plugin registry. Two invariants, enforced together:

1. **Schema validation** — every write is matched against a registered schema before MERGE
2. **Computed provenance** — every write stamps `source`, `extraction_method`, `confidence` (computed, not declared), and a gate version marker

The gate is neutral machinery. Consumers decide what their schema contains and what their confidence formula is; the gate enforces the shape and computes the outputs.

### Why a new server variant, not a flag on the existing server?

Valid question. The argument for a flag: simpler, smaller surface. The argument for a variant:

- **Tool surface isolation.** The default server exposes `run_query` — a Cypher pass-through. The Write Gate exposes typed tools (`write_node`, `write_relationship`) that *don't* allow raw Cypher writes. Agents using the gate should not have `run_query` in their tool list; that defeats the enforcement. Separate servers let the operator hand out the right tool surface per agent.
- **Independent configuration.** Schema registry source, formula provider, error behavior — these are gate-specific config. Bolting them onto the existing server's env var surface crowds it.
- **Backwards compatibility.** Existing users running the current server see no change. The gate is opt-in via `AVAILABLE_SERVERS`.

The plugin registry explicitly anticipates this. In [`servers/__init__.py`](https://github.com/memgraph/ai-toolkit/blob/main/integrations/mcp-memgraph/src/mcp_memgraph/servers/__init__.py):

```python
AVAILABLE_SERVERS: Dict[str, Dict[str, Any]] = {
    "server": { ... },
    "memgraph-experimental": { ... },
    # Future servers can be added here:
    # "hygm": {
    #     "module": "mcp_memgraph.servers.hygm",
    #     ...
    # },
}
```

The Write Gate registers as one more entry in that dict.

## Interface spec

Three MCP tools, following the toolkit's snake_case verb-object convention.

### `write_node`

```
Input:
  label: str                                    # Target node label
  merge_keys: dict[str, str|int|float|bool]     # Identity for MERGE (scalars only)
  properties: dict[str, Any]                    # Data (no protected fields allowed)
  source: str                                   # Provenance: where the claim came from
  extraction_method: str                        # Must be in FormulaProvider.allowed_extraction_methods()
  reliability: float = 0.5                      # Clamped to [0.0, 1.0] before formula

Output (success):
  {
    "status": "written",
    "label": str,                               # Canonical (may differ if remapped)
    "merge_keys": dict,
    "confidence": float,                        # In [0.0, 1.0]
    "write_gate_version": str,                  # Semver: "MAJOR.MINOR.PATCH"
    "remapped_from": str | null                 # Present iff schema remap occurred
  }

Output (error):
  {
    "status": "rejected",
    "error_code": str,             # See Error Codes table below
    "message": str,
    "details": dict                # Optional diagnostic info
  }
```

**Emitted Cypher (reference implementation):**
```cypher
MERGE (n:CanonicalLabel {merge_key1: $v1, merge_key2: $v2})
SET n += $properties,
    n.confidence = $computed_confidence,
    n.source = $source,
    n.extraction_method = $extraction_method,
    n.write_gate_version = $gate_version,
    n.last_updated = datetime()
// If remapped:
SET n._schema_remap_from = $original_label
```
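A minimal sketch of how a gate might assemble the statement above, assuming labels and merge-key names are validated against a conservative identifier pattern because Cypher cannot take them as parameters. The function name `build_write_node_query`, the `mk_` parameter prefix, and the identifier rule are hypothetical and not part of the proposal.

```python
import re
from typing import Any, Optional

# Assumed identifier rule: labels and property names are interpolated into the
# query text, so anything outside this pattern is refused up front.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_write_node_query(
    canonical_label: str,
    merge_keys: dict,
    properties: dict,
    source: str,
    extraction_method: str,
    computed_confidence: float,
    gate_version: str,
    original_label: Optional[str] = None,
) -> tuple:
    """Return (cypher, params) for one gated node write."""
    if not merge_keys:
        raise ValueError("merge_keys must not be empty")
    for name in (canonical_label, *merge_keys):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    # Only names are inlined; merge-key values travel as $mk_<key> parameters.
    key_clause = ", ".join(f"{k}: $mk_{k}" for k in merge_keys)
    params: dict = {f"mk_{k}": v for k, v in merge_keys.items()}
    params.update(
        properties=properties,
        computed_confidence=computed_confidence,
        source=source,
        extraction_method=extraction_method,
        gate_version=gate_version,
    )
    set_lines = [
        "n += $properties",
        "n.confidence = $computed_confidence",
        "n.source = $source",
        "n.extraction_method = $extraction_method",
        "n.write_gate_version = $gate_version",
        "n.last_updated = datetime()",
    ]
    if original_label is not None:
        # Remap breadcrumb, written only when the label was remapped.
        set_lines.append("n._schema_remap_from = $original_label")
        params["original_label"] = original_label

    cypher = (
        f"MERGE (n:{canonical_label} {{{key_clause}}})\n"
        "SET " + ",\n    ".join(set_lines)
    )
    return cypher, params
```

Because the gate-computed assignments come after `n += $properties`, they would win even if a protected key slipped through; the protected-fields check remains the primary guard.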

### `write_relationship`

```
Input:
  type: str                                     # Relationship type (validated against schema)
  from_label: str
  from_keys: dict[str, str|int|float|bool]      # Scalars only
  to_label: str
  to_keys: dict[str, str|int|float|bool]        # Scalars only
  properties: dict[str, Any] = {}               # No protected fields
  source: str
  extraction_method: str                        # Must be in FormulaProvider.allowed_extraction_methods()
  reliability: float = 0.5                      # Clamped to [0.0, 1.0] before formula
  endpoint_policy: str = "fail_if_missing"      # or "merge_endpoints"

Output: { "status": "written" | "rejected", ... }
```

**Endpoint resolution policy** (addresses the `:DIRECTED` everywhere pattern from #127):
- `fail_if_missing` (**default**): If either endpoint node doesn't exist, reject with `ENDPOINT_NOT_FOUND`. Agent must write the node first. Prevents silent creation of stub nodes.
- `merge_endpoints`: If either endpoint is missing, MERGE it by its keys with a `_stub=true` flag. Opt-in for ingestion-style workloads that have ordering constraints.
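
The two policies can be sketched as a pure decision function. This assumes an `exists` callback standing in for a graph lookup; `plan_endpoints`, the `stubs_to_create` key, and the intermediate `"ok"` status are hypothetical names used only for illustration.

```python
from typing import Callable


def plan_endpoints(
    endpoints: list,
    exists: Callable[[str, dict], bool],
    endpoint_policy: str = "fail_if_missing",
) -> dict:
    """Decide how to treat the two endpoints of a relationship write.

    `endpoints` is a list of (label, keys) pairs; `exists` is an assumed
    lookup that reports whether a node with that label and keys is present.
    """
    if endpoint_policy not in ("fail_if_missing", "merge_endpoints"):
        raise ValueError(f"unknown endpoint_policy: {endpoint_policy!r}")

    missing = [(label, keys) for label, keys in endpoints if not exists(label, keys)]

    if missing and endpoint_policy == "fail_if_missing":
        # Default: refuse rather than silently create stub nodes.
        return {
            "status": "rejected",
            "error_code": "ENDPOINT_NOT_FOUND",
            "message": "write the endpoint node(s) first",
            "details": {"missing": [{"label": l, "keys": k} for l, k in missing]},
        }

    # merge_endpoints: the caller would MERGE each missing node with _stub=true.
    return {"status": "ok", "stubs_to_create": missing}
```

Keeping the decision separate from the database call makes both branches easy to cover with an in-memory fake.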

### `refresh_schema_cache`

No arguments. Reloads the registered schema from its source without restarting the server. Returns the count of labels loaded, in the form `{"loaded": <N>}`.

## Schema registry interface

Pluggable. The gate ships one reference implementation (graph-backed `:Schema` nodes); consumers can provide others by implementing:

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass(frozen=True)
class SchemaEntry:
    label: str                                     # Canonical label
    required_properties: list[str] = field(default_factory=list)
    remaps_from: list[str] = field(default_factory=list)

class SchemaRegistry(Protocol):
    def get_entry(self, label: str) -> SchemaEntry | None: ...
    def all_canonical_labels(self) -> list[str]: ...
    def remap_target(self, label: str) -> str | None: ...
    def fallback_label(self) -> str | None: ...    # Used when policy=remap and no match
```

**Refresh semantics:** `refresh_schema_cache` loads a new snapshot then swaps the in-memory pointer atomically. In-flight writes complete against the pre-refresh snapshot.
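
One way to get these semantics, as a sketch: build the new snapshot completely, then publish it with a single reference assignment, so a write that already holds the old snapshot is unaffected. `SchemaCache` and its `loader` callback are hypothetical names, and the atomicity of the assignment relies on CPython behaviour.

```python
import threading
from typing import Callable, Mapping


class SchemaCache:
    """Holds one snapshot of the schema; refresh replaces it wholesale."""

    def __init__(self, loader: Callable[[], Mapping]):
        self._loader = loader
        self._refresh_lock = threading.Lock()  # serialises concurrent refreshes
        self._snapshot: Mapping = dict(loader())

    def snapshot(self) -> Mapping:
        # A write grabs this reference once and uses it until it finishes.
        return self._snapshot

    def refresh(self) -> int:
        with self._refresh_lock:
            fresh = dict(self._loader())  # build fully before publishing
            self._snapshot = fresh        # single reference assignment
        return len(fresh)
```

If the loader raises, the assignment never happens and the previous snapshot stays in place, which is where a `SCHEMA_SOURCE_UNAVAILABLE` rejection could be surfaced.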

**Unknown-label policy:** controlled by `WRITE_GATE_UNKNOWN_LABEL_POLICY` env var, values `remap` (default) or `reject`.
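
A sketch of label resolution under both policies, using a small in-memory registry. It assumes registered remaps apply in either mode and that remap mode with no fallback configured behaves like a rejection; neither detail is fixed by this proposal. `InMemoryRegistry`, `resolve_label`, and the `Entity` fallback label are hypothetical.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SchemaEntry:
    label: str
    required_properties: list = field(default_factory=list)
    remaps_from: list = field(default_factory=list)


class InMemoryRegistry:
    """Test-style registry that satisfies the SchemaRegistry protocol."""

    def __init__(self, entries, fallback: Optional[str] = None):
        self._entries = {e.label: e for e in entries}
        self._remaps = {alias: e.label for e in entries for alias in e.remaps_from}
        self._fallback = fallback

    def get_entry(self, label):
        return self._entries.get(label)

    def all_canonical_labels(self):
        return list(self._entries)

    def remap_target(self, label):
        return self._remaps.get(label)

    def fallback_label(self):
        return self._fallback


def resolve_label(registry, label: str, policy: Optional[str] = None):
    """Return (canonical_label, remapped_from); raise LookupError if unknown."""
    policy = policy or os.environ.get("WRITE_GATE_UNKNOWN_LABEL_POLICY", "remap")
    if registry.get_entry(label) is not None:
        return label, None                       # already canonical
    target = registry.remap_target(label)
    if target is not None:
        return target, label                     # registered remap
    if policy == "remap" and registry.fallback_label() is not None:
        return registry.fallback_label(), label  # unknown label, fallback
    raise LookupError("SCHEMA_UNKNOWN_LABEL")
```

The second element of the returned pair is what would populate `remapped_from` in the response and `_schema_remap_from` on the node.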

Graph-backed reference implementation reads:
```cypher
MATCH (s:Schema {allowed: true})
RETURN s.name AS label,
       s.required_properties AS required_properties,
       s.absorbs AS remaps_from
```
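
As an illustration of consuming that query, the rows could be turned into `SchemaEntry` objects as below. This assumes the driver yields one mapping per row keyed by the `AS` aliases and that unset list properties come back as null; `entries_from_rows` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class SchemaEntry:
    label: str
    required_properties: list = field(default_factory=list)
    remaps_from: list = field(default_factory=list)


def entries_from_rows(rows: Iterable[Mapping[str, Any]]) -> dict:
    """Build a {canonical_label: SchemaEntry} snapshot from query rows."""
    entries = {}
    for row in rows:
        entries[row["label"]] = SchemaEntry(
            label=row["label"],
            # Null or absent list properties are treated as empty lists.
            required_properties=list(row.get("required_properties") or []),
            remaps_from=list(row.get("remaps_from") or []),
        )
    return entries
```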

## Confidence formula provider

Also pluggable. The formula provider is responsible for two things: computing confidence and declaring which `extraction_method` values it accepts.

```python
from typing import Protocol

class FormulaProvider(Protocol):
    def allowed_extraction_methods(self) -> set[str]: ...
    def compute(self, reliability: float, extraction_method: str) -> float: ...
```

The gate ships a **deliberately simple default** so the primitive doesn't dictate ideology:

```python
class DefaultFormulaProvider:
    WEIGHTS = {
        "api":    1.0,   # Live API / CLI / deterministic machine output
        "parsed": 0.85,  # Structured doc (YAML, JSON, HCL, Markdown frontmatter)
        "manual": 0.75,  # Human explicitly stated — capped below verified sources
        "llm":    0.60,  # LLM-extracted from unstructured text
    }
    def allowed_extraction_methods(self) -> set[str]:
        return set(self.WEIGHTS.keys())
    def compute(self, reliability: float, extraction_method: str) -> float:
        return reliability * self.WEIGHTS[extraction_method]
```

**Gate-enforced contract around the provider:**
- `reliability` is clamped to `[0.0, 1.0]` before the provider is called
- `extraction_method` must be in `provider.allowed_extraction_methods()` or the gate rejects with `INVALID_EXTRACTION_METHOD`
- The provider's returned value must be in `[0.0, 1.0]`; out-of-range returns reject with `FORMULA_INVALID_OUTPUT`
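
The three rules above could wrap any provider roughly as follows. `GateRejection` and `gated_confidence` are hypothetical names; the provider class is copied from the default shown earlier so the sketch runs on its own.

```python
class GateRejection(Exception):
    """Carries one of the error codes from the Error Codes table."""

    def __init__(self, error_code: str, message: str):
        super().__init__(message)
        self.error_code = error_code


class DefaultFormulaProvider:
    WEIGHTS = {"api": 1.0, "parsed": 0.85, "manual": 0.75, "llm": 0.60}

    def allowed_extraction_methods(self) -> set:
        return set(self.WEIGHTS.keys())

    def compute(self, reliability: float, extraction_method: str) -> float:
        return reliability * self.WEIGHTS[extraction_method]


def gated_confidence(provider, reliability: float, extraction_method: str) -> float:
    # 1. Clamp reliability before the provider ever sees it.
    clamped = min(1.0, max(0.0, float(reliability)))

    # 2. Reject extraction methods the provider does not declare.
    if extraction_method not in provider.allowed_extraction_methods():
        raise GateRejection(
            "INVALID_EXTRACTION_METHOD",
            f"{extraction_method!r} is not accepted by this provider",
        )

    # 3. Validate the provider's output range (NaN fails the comparison too).
    value = provider.compute(clamped, extraction_method)
    if not isinstance(value, (int, float)) or not (0.0 <= value <= 1.0):
        raise GateRejection("FORMULA_INVALID_OUTPUT", f"provider returned {value!r}")
    return float(value)
```

With the default weights, `gated_confidence(DefaultFormulaProvider(), 0.9, "manual")` gives approximately 0.675, matching row 1 of the test matrix.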

Consumers who want richer models plug in their own provider. See "Appendix: Example formula providers" for patterns other implementations have used.

**Protected fields** — the gate refuses writes where `properties` contains any of: `confidence`, `write_gate_version`, `source`, `extraction_method`, `last_updated`. These are gate-computed; agents declare inputs, gate produces outputs.
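
A minimal sketch of that check, assuming it runs before any Cypher is built. `PROTECTED_FIELDS` mirrors the list above; `find_protected_fields` is a hypothetical helper name.

```python
PROTECTED_FIELDS = frozenset(
    {"confidence", "write_gate_version", "source", "extraction_method", "last_updated"}
)


def find_protected_fields(properties: dict) -> list:
    """Return the gate-computed field names an agent tried to set, sorted."""
    return sorted(PROTECTED_FIELDS.intersection(properties))
```

A non-empty result would translate into a `SCHEMA_PROTECTED_FIELD` rejection, with the offending names in `details`.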

## Error codes

| Code | Meaning |
|---|---|
| `SCHEMA_UNKNOWN_LABEL` | Label or relationship type not in registry and no remap target (reject-mode only) |
| `SCHEMA_MISSING_REQUIRED_PROPERTY` | Required property absent from `properties` + `merge_keys` |
| `SCHEMA_PROTECTED_FIELD` | Agent attempted to set a gate-computed field |
| `SCHEMA_SOURCE_UNAVAILABLE` | Registry load failed (graph unreachable, file missing, etc.) |
| `SCHEMA_TYPE_MISMATCH` | Property type doesn't match declared schema type |
| `ENDPOINT_NOT_FOUND` | `write_relationship` called with `fail_if_missing` and endpoint absent |
| `INVALID_EXTRACTION_METHOD` | Value not in `FormulaProvider.allowed_extraction_methods()` |
| `FORMULA_INVALID_OUTPUT` | Provider returned a value outside `[0.0, 1.0]` |
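
To show how these codes might surface in the rejection payload from the interface spec, here is a sketch with an enum and a required-property check. `ErrorCode`, `rejected`, and `check_required` are hypothetical names, and the check treats a property as present whenever its key appears, regardless of value.

```python
from enum import Enum
from typing import Optional


class ErrorCode(str, Enum):
    SCHEMA_UNKNOWN_LABEL = "SCHEMA_UNKNOWN_LABEL"
    SCHEMA_MISSING_REQUIRED_PROPERTY = "SCHEMA_MISSING_REQUIRED_PROPERTY"
    SCHEMA_PROTECTED_FIELD = "SCHEMA_PROTECTED_FIELD"
    SCHEMA_SOURCE_UNAVAILABLE = "SCHEMA_SOURCE_UNAVAILABLE"
    SCHEMA_TYPE_MISMATCH = "SCHEMA_TYPE_MISMATCH"
    ENDPOINT_NOT_FOUND = "ENDPOINT_NOT_FOUND"
    INVALID_EXTRACTION_METHOD = "INVALID_EXTRACTION_METHOD"
    FORMULA_INVALID_OUTPUT = "FORMULA_INVALID_OUTPUT"


def rejected(code: ErrorCode, message: str, details: Optional[dict] = None) -> dict:
    """Build the error output shape used by write_node / write_relationship."""
    return {
        "status": "rejected",
        "error_code": code.value,
        "message": message,
        "details": details or {},
    }


def check_required(required: list, merge_keys: dict, properties: dict) -> Optional[dict]:
    """Return a rejection if a required property is absent, else None."""
    present = set(merge_keys) | set(properties)  # properties + merge_keys
    missing = [name for name in required if name not in present]
    if missing:
        return rejected(
            ErrorCode.SCHEMA_MISSING_REQUIRED_PROPERTY,
            "required properties are missing",
            {"missing": missing},
        )
    return None
```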

## Acceptance criteria

v1 is complete when:

- [ ] New entry in `AVAILABLE_SERVERS` pointing to `servers/write_gate.py`
- [ ] `write_node`, `write_relationship`, `refresh_schema_cache` tools registered
- [ ] Graph-backed schema registry reference implementation + one fake/in-memory implementation for tests
- [ ] Default confidence formula provider, with clamp + range validation + protected-fields check
- [ ] `_schema_remap_from` breadcrumb written on remap (remap-mode is default via `WRITE_GATE_UNKNOWN_LABEL_POLICY`)
- [ ] `endpoint_policy=fail_if_missing` is the default for `write_relationship` (covered by test row #8)
- [ ] Every code in the Error Codes table has a test that returns it
- [ ] Test matrix (below) passes, including the #127 regression (row #7)

## Test matrix

Minimum set, covering the invariants:

| # | Input | Expected |
|---|---|---|
| 1 | `write_node(label="Person", merge_keys={"name":"Alice"}, properties={"age": 30}, source="test", extraction_method="manual", reliability=0.9)` | `status=written`, `confidence=0.675` (0.9 × 0.75), `write_gate_version` set |
| 2 | Same, with `properties={"confidence": 1.0}` | `SCHEMA_PROTECTED_FIELD` |
| 3 | `write_node(label=":person", ...)` with schema registering `:Person` as canonical for `:person` | `status=written`, `label="Person"`, `remapped_from=":person"`, `_schema_remap_from` set on node |
| 4 | `write_node(label="ZZZNonexistent", ...)` in remap-mode | `status=written`, remapped to fallback (configurable); `_schema_remap_from="ZZZNonexistent"` |
| 5 | Same, in reject-mode | `SCHEMA_UNKNOWN_LABEL` |
| 6 | `write_node(label="Person", merge_keys={}, properties={})` where schema requires `name` | `SCHEMA_MISSING_REQUIRED_PROPERTY`, details lists `name` |
| 7 | `write_relationship(type="DIRECTED", ...)` where `:DIRECTED` is not a registered type | `SCHEMA_UNKNOWN_LABEL` (regression for #127) |
| 8 | `write_relationship` with missing endpoint, `endpoint_policy=fail_if_missing` | `ENDPOINT_NOT_FOUND` |
| 9 | Same, with `endpoint_policy=merge_endpoints` | `status=written`, endpoint created with `_stub=true` |
| 10 | `refresh_schema_cache()` after adding `:Event` to registry | `{"loaded": <N+1>}`, subsequent `write_node(label="Event", ...)` succeeds |

## Follow-up (out of scope for v1)

Conflict detection (comparing existing vs incoming confidence to flag silent overwrites) and dedup / fuzzy entity resolution are the natural Phase 2 and Phase 3 additions; I'm willing to author both as follow-up issues once v1 lands. Cross-label entity identity, temporal decay in the comparator, and a MAGE in-process deployment variant are further-out options that should be discussed if there's community pull.

## Related art and standards

- **[W3C PROV-O: The PROV Ontology](https://www.w3.org/TR/prov-o/)** — the canonical vocabulary for provenance on the web. Recommended as the reference point for any provenance-field naming choices an implementation makes.
- **[Cognee](https://github.com/topoteretes/cognee)** — application-layer AI memory with knowledge-engine self-improvement. Complementary: a Memgraph-native Write Gate is the missing *substrate* for patterns like Cognee's.
- **Issue [#127](https://github.com/memgraph/ai-toolkit/issues/127)** — symptom of schema drift; the Write Gate prevents the class.

## Appendix: Example formula providers (informational)

The v1 default is intentionally simple. The following grading systems could slot in as alternative `FormulaProvider` implementations without changing the gate's interface:

- **[Admiralty Code](https://en.wikipedia.org/wiki/Admiralty_code)** — NATO AJP-2.1 two-dimensional grading (source reliability A-F × information credibility 1-6), widely used in Cyber Threat Intelligence
- **Flat declaration** — agent declares a 0-1 float, no transformation; simplest option for trusted agent pipelines
- **[STIX 2.1 Confidence Scales](https://docs.oasis-open.org/cti/stix/v2.1/os/stix-v2.1-os.html)** — OASIS-standardized confidence values with mappings across several qualitative scales (DNI, Admiralty, WEP)
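
As the simplest illustration, the flat-declaration option could look like the sketch below. It assumes the provider accepts the same four `extraction_method` labels as the default and merely records them; that choice, and the class name `FlatFormulaProvider`, are hypothetical.

```python
class FlatFormulaProvider:
    """Passes the agent-declared reliability through unchanged."""

    # Assumption: same method vocabulary as the default provider.
    METHODS = frozenset({"api", "parsed", "manual", "llm"})

    def allowed_extraction_methods(self) -> set:
        return set(self.METHODS)

    def compute(self, reliability: float, extraction_method: str) -> float:
        # No weighting: the gate has already clamped reliability to [0.0, 1.0].
        return reliability
```

The gate's clamp and range validation still apply around it, so the interface and error codes stay the same.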

## Reference implementation

I have a working implementation of this pattern and will extract the v1 subset from it. Happy to share it offline with maintainers if useful during review.

