
feat: weaver — multi-model ensemble for higher-quality answers #158

Open · yudduy wants to merge 1 commit into Layr-Labs:master from yudduy:feat/weaver

Conversation


@yudduy yudduy commented May 13, 2026

Stanford Weaver paper: https://scalingintelligence.stanford.edu/pubs/weaver.pdf

Small models as a hive mind — d-inference is already a fleet of them, so ensembles felt like a natural fit. Opens the door to other ensemble strategies later.

What it does

New "Deep" / "Deep+" chat mode. Runs N candidate generations in parallel, scores them with verifier models, returns the winner + trace via SSE. Provider ↔ coordinator protocol unchanged.

API: "weaver": {"mode": "deep" | "deep_plus"} on /v1/chat/completions.
UI: Fast / Deep / Deep+ buttons in the composer.

Changes

  • Coordinator: orchestrator, verifier selection by family diversity, per-call billing
  • Catalog: Family + CanVerify on SupportedModel (memory + postgres)
  • Provider: reasoning parser opt-in by family (Qwen2.5 correctness fix)
  • UI: mode selector, live candidate strip, trace modal, new proxy routes

Known follow-ups

  • Billing reserve → refund is two non-atomic writes
  • No per-model SupportsWeaver flag
  • Confidence = 0 when only one candidate finishes
  • No dedicated E2E-encryption + weaver test

Tested

  • go test green across api, store, cmd/coordinator
  • Ran Deep and Deep+ end-to-end locally

Adds a coordinator-side orchestration mode that runs multiple candidate
generations in parallel and scores them with verifier models, returning
the highest-scoring response with a confidence metric. Inspired by the
Stanford Weaver paper (https://scalingintelligence.stanford.edu/pubs/weaver.pdf).

Coordinator:
- /v1/chat/completions accepts {"weaver": {"mode": "deep"|"deep_plus"}}
- Streams typed SSE events (weaver_init, candidate_delta, verifier_score,
  weaver_final) with the full trace
- Verifier selection picks diverse model families from the catalog, falls
  back to self-verification with a warning if no eligible verifiers exist
- Adds Family + CanVerify to SupportedModel (memory + postgres backends)
- Billing charges candidate and verifier calls independently

Provider:
- Reasoning parser opt-in by family (Gemma4, DeepSeek R1, Qwen3/QwQ);
  Qwen2.5 no longer mis-applies the qwen3 parser

Console UI:
- Fast / Deep / Deep+ mode selector in ChatInput
- Live candidate strip + tabbed trace modal
- New auth-checked proxy routes for device/provider/stats endpoints

Known follow-ups (not blocking):
- Billing reserve→refund is two non-atomic writes (no two-phase commit)
- No model-level SupportsWeaver flag (any catalog model is eligible)
- Confidence metric is 0 when only one candidate succeeds
- E2E + weaver path lacks a dedicated test

vercel Bot commented May 13, 2026

@yudduy is attempting to deploy a commit to the EigenLabs Team on Vercel.

A member of the Team first needs to authorize it.
