Trust-minimized indexer: glue mina-light-node (canonicity oracle) + mina-verify gate #23

@dkijania

Summary

Define and pursue a trust-minimized indexer: combine the existing per-block verifier (mina-verify via --verify-block-exe) with an o1-labs/mina-light-node to remove trust in the block source, and (later) add output proofs to remove trust in the serving process. This issue records the trust model, the current ingestion architecture, and the integration design for the light node.

Trust decomposed (the four properties)

"Trustless" is really four separate guarantees. Today only one is covered:

| Property | Question | Covered by | Status |
|---|---|---|---|
| Validity | Is each block internally valid (proof correct)? | mina-verify (--verify-block-exe), fail-closed | ✅ today |
| Canonicity | Is this the block consensus selected at this height? | indexer's own witness-tree rule today; should be the light node | ⚠️ trusts data source |
| Completeness / DA | Am I seeing all canonical blocks (no withholding)? | following the network = light node | ❌ today (trusts GCS bucket) |
| Response faithfulness | Does a query answer reflect what was ingested? | output proofs (Merkle inclusion), see # | ❌ today |

Key insight: a valid block proof proves validity, not canonicity. Two competing forks can both carry valid SNARK proofs; the tie-break is Ouroboros Samasika consensus (length → min_window_density / chain strength → VRF/hash). So mina-verify alone, over blocks from an untrusted bucket, can ingest valid-but-non-canonical or incomplete data. The light node is what closes canonicity + completeness.

Framing: verify-block = trustless in · light node = trustless canonical selection · output proofs = trustless out. All three = end-to-end trustless. The first two = trust-minimized inputs, still-trusted serving.

Current ingestion architecture (as-is)

The timer branch in run_indexer fires every min(fetch_delay, recovery_delay) seconds and does, in order:

  1. fetch_new_blocks → runs EXE <network> <best_tip+1> <blocks_dir> (server.rs:918). The exe is block-pull / mesa-pull.
  2. recover_missing_blocks → for each dangling branch root, runs EXE <network> <root_height−1> <blocks_dir> to fill gaps (server.rs:968).
  3. reconcile_blocks_dir → ingest any on-disk block the fs-watcher missed.

block-pull.sh lists the public GCS bucket mina_network_block_data for prefix=<net>-<height>- and downloads every block at each height in [h, h+WINDOW), forks included. Files are named <net>-<height>-<hash>.json and are atomically renamed into blocks_dir.
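The file-naming convention above can be turned back into structured data on the ingest side. The following is a minimal sketch, not the indexer's actual parser: the type and function names (`BlockFileName`, `parse_block_file_name`) are hypothetical, and it assumes the state hash contains no `-` (true of base58 hashes).

```rust
/// Parsed form of a block file name such as `mainnet-123-3NKabc.json`.
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, PartialEq, Eq)]
pub struct BlockFileName {
    pub network: String,
    pub height: u64,
    pub state_hash: String,
}

/// Parse `<net>-<height>-<hash>.json`.
/// Splits from the right so a network name that itself contains '-' still parses.
pub fn parse_block_file_name(name: &str) -> Option<BlockFileName> {
    let stem = name.strip_suffix(".json")?;
    let mut parts = stem.rsplitn(3, '-');
    let state_hash = parts.next()?;
    let height = parts.next()?.parse::<u64>().ok()?;
    let network = parts.next()?;
    if network.is_empty() || state_hash.is_empty() {
        return None;
    }
    Some(BlockFileName {
        network: network.to_string(),
        height,
        state_hash: state_hash.to_string(),
    })
}
```

Rejecting anything that does not match the pattern (for example a partially written temp file) keeps the watcher from ingesting files that the atomic rename has not yet published.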

Then: fs-watcher / reconcile → verify_block gate (if enabled) → block_pipeline → witness tree. Canonicity is computed by the indexer itself from the witness tree (depth-k=290 confirmation, canonical_threshold). So today:

  • The bucket decides what blocks exist (DA trust).
  • mina-verify decides validity.
  • The indexer decides canonicity — but only among the blocks the bucket chose to serve. Withhold the winning fork → the indexer picks a wrong canonical chain.
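To make the depth-k rule concrete, here is a rough sketch of how a confirmation walk over a witness tree could look. It is an assumption-laden illustration, not the indexer's code: `Node`, `confirmed_canonical`, and the exact meaning of "k blocks below the tip" are hypothetical, and the real canonical_threshold logic may differ.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a witness-tree entry (hypothetical).
pub struct Node {
    pub parent: String,
    pub height: u64,
}

/// Walk parent links from `best_tip` and return the state hashes that sit
/// at least `k` blocks below the tip height, ordered from newest to oldest.
/// The walk stops at the first missing parent or non-decreasing height.
pub fn confirmed_canonical(tree: &HashMap<String, Node>, best_tip: &str, k: u64) -> Vec<String> {
    let tip_height = match tree.get(best_tip) {
        Some(node) => node.height,
        None => return Vec::new(),
    };
    let mut out = Vec::new();
    let mut cursor = best_tip.to_string();
    while let Some(node) = tree.get(&cursor) {
        if tip_height.saturating_sub(node.height) >= k {
            out.push(cursor.clone());
        }
        match tree.get(&node.parent) {
            Some(parent) if parent.height < node.height => cursor = node.parent.clone(),
            _ => break,
        }
    }
    out
}
```

The sketch also shows the weakness described above: the result depends entirely on which blocks are in `tree`. If the source withholds the winning fork, `best_tip` is chosen from the wrong set and the walk confirms the wrong chain.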

Where the light node glues in

The light node verifies protocol-state proofs and applies consensus selection, so it independently knows the canonical best chain and finality (below tip−k). Two integration shapes:

Option A — light node as the canonical block source (pull)

Replace/clone block-pull with light-node-pull <network> <height> <dir> that returns only the canonical, verified block(s) at a height (no forks). Cleanest conceptually — the indexer ingests a single canonical chain. Problem: Mina light nodes don't retain deep history, so this can't serve backfill/history bytes on its own.

Option B — bucket serves bytes, light node attests canonicity (recommended)

Separate data availability from canonical selection:

  • Bytes: keep block-pull (bucket has full history, cheap).
  • Canonical truth: the light node exposes the verified best chain as a (height → canonical state_hash) map (and/or an is_canonical(state_hash) → bool + attestation). Anchored by the protocol-state SNARK + consensus.
  • Glue: add a canonicity oracle to the ingest path — same external-process pattern as verify-block. The indexer reconciles its witness-tree canonicity against the oracle: a fork whose state_hash ≠ the light node's canonical hash at that height is marked non-canonical (or quarantined), regardless of what the bucket served.
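The reconciliation step in Option B can be sketched as a pure function over the oracle's (height → canonical state_hash) map. This is a minimal illustration under stated assumptions: `Verdict` and `classify` are hypothetical names, and how the map is obtained from the light node is left open because that contract is not yet specified.

```rust
use std::collections::HashMap;

/// Outcome of checking one ingested block against the oracle (hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    /// The oracle's canonical hash at this height matches the block.
    Canonical,
    /// The oracle names a different block at this height.
    NonCanonical,
    /// The oracle has no entry for this height; the caller decides whether
    /// to quarantine (fail-closed) or wait.
    Unattested,
}

/// `oracle` is the light node's (height -> canonical state_hash) map.
pub fn classify(oracle: &HashMap<u64, String>, height: u64, state_hash: &str) -> Verdict {
    match oracle.get(&height) {
        Some(canonical) if canonical.as_str() == state_hash => Verdict::Canonical,
        Some(_) => Verdict::NonCanonical,
        None => Verdict::Unattested,
    }
}
```

Keeping `Unattested` distinct from `NonCanonical` matters for history: heights the light node no longer retains are not disproved, they are simply outside what the oracle can speak to.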

This matches the repo's established sidecar / thin-exe contract philosophy (--verify-block-exe is already an external service; the light node is another), and keeps it fail-closed.
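A fail-closed external-process gate could look roughly like the sketch below. The argument layout (`<network> <state_hash>`) and the convention that exit code 0 means "canonical" are assumptions made for illustration; the actual oracle contract is still to be specified, and this does not describe the existing --verify-block-exe interface.

```rust
use std::io;
use std::process::{Command, ExitStatus};

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Accept,
    Reject,
}

/// Fail-closed mapping: only a clean exit (status 0) accepts. A non-zero
/// exit, death by signal, or a failure to spawn the process all reject.
pub fn decide(outcome: io::Result<ExitStatus>) -> Decision {
    match outcome {
        Ok(status) if status.success() => Decision::Accept,
        _ => Decision::Reject,
    }
}

/// Run a hypothetical oracle executable as `<exe> <network> <state_hash>`.
pub fn ask_oracle(exe: &str, network: &str, state_hash: &str) -> Decision {
    decide(Command::new(exe).arg(network).arg(state_hash).status())
}
```

With this shape, a missing or crashed sidecar can never be mistaken for a positive attestation, which is the property the fail-closed requirement is after.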

How the light node decides + conveys "not canonical"

  • Decides: verifies each candidate block's blockchain SNARK (validity), then among valid tips applies Samasika selection (length → density/min_window_density → VRF/state-hash) to compute the best chain. A block not on that chain is non-canonical. Below tip−k it's final (won't reorg); above, tentative.
  • Conveys: the indexer asks either "canonical state_hash at height h?" (drives fork choice) or "is <state_hash> canonical/final?" (yes/no + optional proof). Align the finality boundary with the indexer's existing k=290.
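The final/tentative split can be expressed as a small function of the light node's tip height. This is a sketch only: `Finality` and `finality` are hypothetical names, and treating a block exactly k below the tip as final is an assumption about where the boundary sits.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Finality {
    /// At least `k` blocks below the tip; not expected to reorg.
    Final,
    /// Within `k` blocks of the tip; may still be replaced.
    Tentative,
}

/// Finality boundary used throughout this issue.
pub const K: u64 = 290;

/// Classify a block on the best chain by its depth below the tip.
/// Returns `None` when the block is above the tip the oracle knows about.
pub fn finality(tip_height: u64, block_height: u64, k: u64) -> Option<Finality> {
    if block_height > tip_height {
        return None;
    }
    if tip_height - block_height >= k {
        Some(Finality::Final)
    } else {
        Some(Finality::Tentative)
    }
}
```

Using the same `k` on both sides keeps the oracle's "final" answers aligned with the depth at which the indexer already stops expecting reorgs.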

Hard parts / caveats (call out explicitly)

  1. Reorgs. When the light node's best chain changes, the indexer must re-derive canonicity for affected heights. The witness tree already reorgs internally; the new work is keeping its canonical view synced to the oracle, including rollbacks.
  2. History / DA gap. The light node anchors the tip; historical trustlessness comes from a contiguous, gap-free, individually-verified chain whose parent-hash links chain back to a light-node-verified tip. Enforce contiguity; that's what extends the tip's trust into history the light node no longer retains.
  3. Confirm what mina-light-node actually proves. This whole design assumes it verifies the protocol-state proof and applies consensus selection. If it only follows a trusted peer's tip, the canonicity/completeness guarantees collapse to "trust that peer." Action item: document the light node's exact guarantees before claiming trustless.
  4. What you ultimately trust even at best: soundness of Mina's recursive SNARK + consensus rules + the verifier/light-node binaries — plus the indexer's serving code until output proofs land. "Trust the math, not the operator" — name it precisely.
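The contiguity requirement in item 2 can be sketched as a check over block headers ordered from the trusted tip downward. The `Header` type and `first_break` function are hypothetical; the sketch assumes each header has already passed the per-block validity gate and that height increases by exactly one per block.

```rust
/// Minimal header view needed for the link check (hypothetical).
pub struct Header {
    pub state_hash: String,
    pub parent_hash: String,
    pub height: u64,
}

/// `chain` is ordered from the tip downward. Returns `None` when the chain
/// starts at `trusted_tip_hash` and every header links to the next one by
/// parent hash with height decreasing by exactly one. Otherwise returns the
/// index of the first header that breaks the chain (0 for an empty chain or
/// a tip mismatch).
pub fn first_break(trusted_tip_hash: &str, chain: &[Header]) -> Option<usize> {
    let first = match chain.first() {
        Some(header) => header,
        None => return Some(0),
    };
    if first.state_hash.as_str() != trusted_tip_hash {
        return Some(0);
    }
    for (i, pair) in chain.windows(2).enumerate() {
        let (child, parent) = (&pair[0], &pair[1]);
        let linked = child.parent_hash == parent.state_hash
            && child.height == parent.height + 1;
        if !linked {
            return Some(i + 1);
        }
    }
    None
}
```

Everything before the returned index is anchored to the light-node-verified tip; everything from that index on is not, and should be treated as untrusted until the gap is filled.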

Action items

  • Confirm and document o1-labs/mina-light-node's verification guarantees (proof + consensus selection, or partial?).
  • Decide Option A vs B (recommend B: bucket bytes + light-node canonicity oracle).
  • Spec the oracle contract (is_canonical / canonical_at_height) as an external sidecar mirroring --verify-block-exe.
  • Design reorg reconciliation between the witness tree and the oracle.
  • Enforce parent-hash contiguity as the historical-integrity link.
  • (Separate) output proofs for trustless responses — see the verify-out / Merkle-proof issue.

Labels: enhancement (New feature or request)
