Trust-minimized indexer: glue mina-light-node (canonicity oracle) + mina-verify gate #23

## Summary

Define and pursue a **trust-minimized indexer**: combine the existing per-block verifier (`mina-verify` via `--verify-block-exe`) with an `o1-labs/mina-light-node` instance to remove trust in the *block source*, and (later) add output proofs to remove trust in the *serving process*. This issue records the trust model, the current ingestion architecture, and the integration design for the light node.

## Trust decomposed (the four properties)

"Trustless" is really four separate guarantees. Today only one is covered:

| Property | Question | Covered by | Status |
|---|---|---|---|
| **Validity** | Is each block internally valid (proof correct)? | `mina-verify` (`--verify-block-exe`), fail-closed | ✅ today |
| **Canonicity** | Is this the block consensus *selected* at this height? | indexer's own witness-tree rule today; should be the **light node** | ⚠️ trusts data source |
| **Completeness / DA** | Am I seeing *all* canonical blocks (no withholding)? | following the network = **light node** | ❌ today (trusts GCS bucket) |
| **Response faithfulness** | Does a query *answer* reflect what was ingested? | output proofs (Merkle inclusion) — see #<verify-out> | ❌ today |

**Key insight:** a valid block proof proves *validity, not canonicity*. Two competing forks can both carry valid SNARK proofs; the tie-break is Ouroboros Samasika consensus (length → `min_window_density` / chain strength → VRF/hash). So `mina-verify` alone, over blocks from an untrusted bucket, can ingest *valid-but-non-canonical* or *incomplete* data. The light node is what closes canonicity + completeness.

Framing: **verify-block = trustless *in* · light node = trustless *canonical selection* · output proofs = trustless *out*.** All three = end-to-end trustless. The first two = trust-minimized inputs, still-trusted serving.

## Current ingestion architecture (as-is)

The timer branch in `run_indexer` fires every `min(fetch_delay, recovery_delay)` seconds and does, in order:
1. `fetch_new_blocks` → runs `EXE <network> <best_tip+1> <blocks_dir>` (`server.rs:918`). The exe is `block-pull` / `mesa-pull`.
2. `recover_missing_blocks` → for each **dangling branch** root, runs `EXE <network> <root_height−1> <blocks_dir>` to fill gaps (`server.rs:968`).
3. `reconcile_blocks_dir` → ingest any on-disk block the fs-watcher missed.
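
Steps 1 and 2 share one invocation shape, `EXE <network> <height> <blocks_dir>`, and differ only in the height they request. A minimal sketch of that shape follows. The helper names (`pull_args`, `fetch_height`, `recovery_height`, `run_pull`) and the `u32` height type are assumptions for illustration, not the code in `server.rs`.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Build the argument list for `EXE <network> <height> <blocks_dir>`.
/// Hypothetical helper; the real call sites live in `server.rs`.
fn pull_args(network: &str, height: u32, blocks_dir: &Path) -> Vec<String> {
    vec![
        network.to_string(),
        height.to_string(),
        blocks_dir.display().to_string(),
    ]
}

/// Height requested by `fetch_new_blocks`: one past the current best tip.
fn fetch_height(best_tip: u32) -> u32 {
    best_tip + 1
}

/// Height requested by `recover_missing_blocks` for a dangling-branch root:
/// the height of the root's missing parent. `None` at height 0 (no parent).
fn recovery_height(root_height: u32) -> Option<u32> {
    root_height.checked_sub(1)
}

/// Run the pull executable and report its exit status to the caller.
fn run_pull(
    exe: &Path,
    network: &str,
    height: u32,
    blocks_dir: &Path,
) -> io::Result<ExitStatus> {
    Command::new(exe)
        .args(pull_args(network, height, blocks_dir))
        .status()
}
```

Keeping the argument construction separate from the process spawn makes the contract testable without a real `block-pull` binary on the path.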

`block-pull.sh` lists the **public GCS bucket** `mina_network_block_data` for `prefix=<net>-<height>-` and downloads **every** block at each height in `[h, h+WINDOW)` — *forks included* — named `<net>-<height>-<hash>.json`, atomic-renamed into `blocks_dir`.
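
The naming convention above can be parsed mechanically. The sketch below is illustrative only: `BlockFileName`, `parse_block_file_name`, and `listing_prefix` are hypothetical names, and it assumes state hashes never contain `-`.

```rust
/// Parsed form of a `<net>-<height>-<hash>.json` block file name.
#[derive(Debug, PartialEq, Eq)]
struct BlockFileName {
    network: String,
    height: u32,
    state_hash: String,
}

/// Parse a block file name. Splitting from the right means a network name
/// that itself contains '-' would still parse (assumption: the state hash
/// never contains '-').
fn parse_block_file_name(name: &str) -> Option<BlockFileName> {
    let stem = name.strip_suffix(".json")?;
    let mut parts = stem.rsplitn(3, '-');
    let state_hash = parts.next()?;
    let height = parts.next()?.parse::<u32>().ok()?;
    let network = parts.next()?;
    if network.is_empty() || state_hash.is_empty() {
        return None;
    }
    Some(BlockFileName {
        network: network.to_string(),
        height,
        state_hash: state_hash.to_string(),
    })
}

/// The bucket listing prefix for one height: `<net>-<height>-`.
fn listing_prefix(network: &str, height: u32) -> String {
    format!("{network}-{height}-")
}
```

Because several files can share one `<net>-<height>-` prefix, a caller should expect more than one parsed block per height; that is the fork set the witness tree later chooses from.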

Then: fs-watcher / reconcile → **`verify_block` gate** (if enabled) → `block_pipeline` → witness tree. **Canonicity is computed by the indexer itself** from the witness tree (depth-`k`=290 confirmation, `canonical_threshold`). So today:
- The **bucket** decides what blocks exist (DA trust).
- `mina-verify` decides validity.
- The **indexer** decides canonicity — but only among the blocks the bucket chose to serve. Withhold the winning fork → the indexer picks a wrong canonical chain.
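
The withholding failure can be made concrete with a toy model. This is a sketch under stated assumptions: `Tip`, `best_served_tip`, and `is_confirmed` are hypothetical, "highest tip wins" stands in for the witness tree's real rule, and the depth check is only one plausible reading of `canonical_threshold`.

```rust
/// Confirmation depth quoted in this issue (`k` = 290).
const K: u32 = 290;

/// Minimal stand-in for a chain tip known to the indexer.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Tip {
    state_hash: String,
    height: u32,
}

/// Pick the highest tip among the blocks that were actually served.
/// The indexer cannot prefer a fork it never received.
fn best_served_tip(served: &[Tip]) -> Option<&Tip> {
    served.iter().max_by_key(|t| t.height)
}

/// A block at `height` counts as confirmed once the best tip sits at least
/// `k` blocks above it (assumed reading of the depth-`k` rule).
fn is_confirmed(height: u32, best_tip_height: u32, k: u32) -> bool {
    best_tip_height.saturating_sub(height) >= k
}
```

If the bucket serves both forks, `best_served_tip` returns the winner; if it serves only the losing fork, the same function returns the loser, and after `K` more blocks `is_confirmed` reports it as settled. Nothing in the local logic can detect the omission.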

## Where the light node glues in

The light node verifies protocol-state proofs and applies consensus selection, so it independently knows the **canonical best chain** and **finality** (below tip−k). Two integration shapes:

### Option A — light node as the canonical block *source* (pull)
Replace/clone `block-pull` with `light-node-pull <network> <height> <dir>` that returns only the **canonical, verified** block(s) at a height (no forks). Cleanest conceptually — the indexer ingests a single canonical chain. **Problem:** Mina light nodes don't retain deep history, so this can't serve backfill/history bytes on its own.

### Option B — bucket serves bytes, light node *attests canonicity* (recommended)
Separate **data availability** from **canonical selection**:
- **Bytes:** keep `block-pull` (bucket has full history, cheap).
- **Canonical truth:** the light node exposes the verified best chain as a `(height → canonical state_hash)` map (and/or an `is_canonical(state_hash) → bool + attestation` query). Anchored by the protocol-state SNARK + consensus.
- **Glue:** add a canonicity oracle to the ingest path — same external-process pattern as `verify-block`. The indexer reconciles its witness-tree canonicity against the oracle: a fork whose state_hash ≠ the light node's canonical hash at that height is marked non-canonical (or quarantined), regardless of what the bucket served.

This matches the repo's established **sidecar / thin-exe contract** philosophy (`--verify-block-exe` is already an external service; the light node is another), and keeps it fail-closed.
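
One possible shape for the Option B glue is sketched below. The trait, the `Verdict` enum, and the in-memory `MapOracle` are all hypothetical; the real light node interface is still to be specified (see the action items), and a production oracle would sit behind an external process rather than a `HashMap`.

```rust
use std::collections::HashMap;

/// Outcome of checking one ingested block against the oracle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Canonical,
    NonCanonical,
    /// The oracle has no answer for this height (for example, history it
    /// no longer retains). Fail-closed callers must not treat this as canonical.
    Unknown,
}

/// Hypothetical oracle contract: canonical state hash at a height, if known.
trait CanonicityOracle {
    fn canonical_at_height(&self, height: u32) -> Option<String>;
}

/// In-memory stand-in for a light-node-backed oracle, for illustration only.
struct MapOracle(HashMap<u32, String>);

impl CanonicityOracle for MapOracle {
    fn canonical_at_height(&self, height: u32) -> Option<String> {
        self.0.get(&height).cloned()
    }
}

/// Classify a block by comparing its state hash with the oracle's answer,
/// regardless of which forks the bucket happened to serve.
fn classify(oracle: &dyn CanonicityOracle, height: u32, state_hash: &str) -> Verdict {
    match oracle.canonical_at_height(height) {
        Some(canonical) if canonical == state_hash => Verdict::Canonical,
        Some(_) => Verdict::NonCanonical,
        None => Verdict::Unknown,
    }
}
```

Returning a distinct `Unknown` keeps the fail-closed decision in the caller: the ingest path can quarantine such blocks instead of silently falling back to the witness tree's own choice.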

### How the light node decides + conveys "not canonical"
- **Decides:** verifies each candidate block's blockchain SNARK (validity), then among valid tips applies Samasika selection (length → density/`min_window_density` → VRF/state-hash) to compute the best chain. A block not on that chain is non-canonical. Below tip−k it's *final* (won't reorg); above, tentative.
- **Conveys:** the indexer asks either "canonical state_hash at height h?" (drives fork choice) or "is <state_hash> canonical/final?" (yes/no + optional proof). Align the finality boundary with the indexer's existing `k`=290.
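
The selection order described above (length, then density, then VRF/state-hash) can be written as a lexicographic comparison. This is a deliberately simplified sketch: the field names and the bytewise VRF comparison are assumptions, and the actual Samasika rule has more cases than this ordering (for example, it treats short-range and long-range forks differently), so it should be checked against the protocol specification before use.

```rust
use std::cmp::Ordering;

/// Hypothetical summary of a candidate tip whose proof already verified.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TipSummary {
    blockchain_length: u32,
    min_window_density: u32,
    /// Assumed to be comparable bytewise for tie-breaking.
    vrf_output: Vec<u8>,
    state_hash: String,
}

/// Simplified ordering taken from the text: longer chain first, then
/// higher `min_window_density`, then VRF output, then state hash.
/// `Ordering::Greater` means `a` is preferred over `b`.
fn compare_tips(a: &TipSummary, b: &TipSummary) -> Ordering {
    a.blockchain_length
        .cmp(&b.blockchain_length)
        .then_with(|| a.min_window_density.cmp(&b.min_window_density))
        .then_with(|| a.vrf_output.cmp(&b.vrf_output))
        .then_with(|| a.state_hash.cmp(&b.state_hash))
}

/// Choose the preferred tip among candidates that all passed validity checks.
fn select_best(tips: &[TipSummary]) -> Option<&TipSummary> {
    tips.iter().max_by(|a, b| compare_tips(a, b))
}
```

Validity is a precondition here, not part of the comparison: both inputs can carry correct proofs and still only one of them is selected.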

## Hard parts / caveats (call out explicitly)
1. **Reorgs.** When the light node's best chain changes, the indexer must re-derive canonicity for affected heights. The witness tree already reorgs internally; the new work is keeping its canonical view *synced to the oracle*, including rollbacks.
2. **History / DA gap.** The light node anchors the *tip*; historical trustlessness comes from a **contiguous, gap-free, individually-verified chain** whose parent-hash links chain back to a light-node-verified tip. Enforce contiguity; that's what extends the tip's trust into history the light node no longer retains.
3. **Confirm what `mina-light-node` actually proves.** This whole design assumes it verifies the protocol-state proof *and* applies consensus selection. If it only follows a trusted peer's tip, the canonicity/completeness guarantees collapse to "trust that peer." **Action item: document the light node's exact guarantees before claiming trustless.**
4. **What you ultimately trust** even at best: soundness of Mina's recursive SNARK + consensus rules + the verifier/light-node binaries — *plus* the indexer's serving code until output proofs land. "Trust the math, not the operator" — name it precisely.
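
The contiguity requirement in item 2 can be sketched as a linear scan. The types and error variants below are hypothetical, and the check assumes every block in the slice has already passed `mina-verify` individually, so that each `state_hash` is trustworthy for the block it labels.

```rust
/// Hypothetical minimal header view used for link checking.
#[derive(Debug, Clone)]
struct Header {
    height: u32,
    state_hash: String,
    parent_hash: String,
}

/// Reasons a stored chain segment fails the contiguity check.
#[derive(Debug, PartialEq, Eq)]
enum ContiguityError {
    Empty,
    /// The newest block is not the tip the light node verified.
    TipMismatch,
    /// Heights do not increase by exactly one at this child height.
    HeightGap { at_height: u32 },
    /// The child's parent hash does not match its predecessor's state hash.
    BrokenLink { at_height: u32 },
}

/// Check that `chain` (ordered oldest to newest) ends at the verified tip
/// and that every block links to its predecessor with no gaps.
fn check_contiguity(chain: &[Header], verified_tip_hash: &str) -> Result<(), ContiguityError> {
    let tip = chain.last().ok_or(ContiguityError::Empty)?;
    if tip.state_hash != verified_tip_hash {
        return Err(ContiguityError::TipMismatch);
    }
    for pair in chain.windows(2) {
        let (parent, child) = (&pair[0], &pair[1]);
        if parent.height.checked_add(1) != Some(child.height) {
            return Err(ContiguityError::HeightGap { at_height: child.height });
        }
        if child.parent_hash != parent.state_hash {
            return Err(ContiguityError::BrokenLink { at_height: child.height });
        }
    }
    Ok(())
}
```

A segment that passes this check inherits the tip's trust all the way down to its oldest block; a single gap or mismatched link cuts everything below it off from the light-node anchor.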

## Action items
- [ ] Confirm and document `o1-labs/mina-light-node`'s verification guarantees (proof + consensus selection, or partial?).
- [ ] Decide Option A vs B (recommend **B**: bucket bytes + light-node canonicity oracle).
- [ ] Spec the oracle contract (`is_canonical` / `canonical_at_height`) as an external sidecar mirroring `--verify-block-exe`.
- [ ] Design reorg reconciliation between the witness tree and the oracle.
- [ ] Enforce parent-hash contiguity as the historical-integrity link.
- [ ] (Separate) output proofs for trustless *responses* — see the verify-out / Merkle-proof issue.

## Related
- `mina-verify` / `--verify-block-exe` trustless gate (already shipped).
- Verify-out / Merkle-inclusion proofs discussion (output side of trustlessness).
- #19 (testing) — a trustless setup needs negative tests proving non-canonical/invalid blocks are rejected.

