feat: weaver — multi-model ensemble for higher-quality answers by yudduy · Pull Request #158 · Layr-Labs/d-inference

yudduy · 2026-05-13T23:53:11Z

Stanford Weaver paper: https://scalingintelligence.stanford.edu/pubs/weaver.pdf

Small models as a hive mind — d-inference is already a fleet of them, so ensembles felt like a natural fit. Opens the door to other ensemble strategies later.

What it does

New "Deep" / "Deep+" chat mode. Runs N candidate generations in parallel, scores them with verifier models, returns the winner + trace via SSE. Provider ↔ coordinator protocol unchanged.

API: "weaver": {"mode": "deep" | "deep_plus"} on /v1/chat/completions.
UI: Fast / Deep / Deep+ buttons in the composer.
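For illustration, a request body carrying the new field could be built like this. Only the `weaver` object and its two mode values come from this PR; the `model`, `messages`, and `stream` fields assume the usual OpenAI-style chat completions shape, and `NewWeaverRequest` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message and ChatRequest assume an OpenAI-style request shape.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// WeaverOptions mirrors the "weaver" object named in the PR description.
type WeaverOptions struct {
	Mode string `json:"mode"` // "deep" or "deep_plus"
}

type ChatRequest struct {
	Model    string         `json:"model"`
	Messages []Message      `json:"messages"`
	Stream   bool           `json:"stream"`
	Weaver   *WeaverOptions `json:"weaver,omitempty"`
}

// NewWeaverRequest (hypothetical) validates the mode and marshals a
// single-turn request. Stream is set because results arrive via SSE.
func NewWeaverRequest(model, prompt, mode string) ([]byte, error) {
	if mode != "deep" && mode != "deep_plus" {
		return nil, fmt.Errorf("unknown weaver mode %q", mode)
	}
	req := ChatRequest{
		Model:    model,
		Messages: []Message{{Role: "user", Content: prompt}},
		Stream:   true,
		Weaver:   &WeaverOptions{Mode: mode},
	}
	return json.Marshal(req)
}
```

Leaving `Weaver` nil omits the field entirely, which would correspond to the existing Fast path under these assumptions.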

Changes

Coordinator: orchestrator, verifier selection by family diversity, per-call billing
Catalog: Family + CanVerify on SupportedModel (memory + postgres)
Provider: reasoning parser opt-in by family (Qwen2.5 correctness fix)
UI: mode selector, live candidate strip, trace modal, new proxy routes
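Verifier selection by family diversity can be sketched as below. `Family` and `CanVerify` are the fields this PR adds to `SupportedModel`; the `ID` field, the `PickVerifiers` name, the one-verifier-per-family rule, and skipping the candidate's own family are assumptions made for the example, not the coordinator's actual logic.

```go
package main

// SupportedModel carries only the fields needed for this sketch.
type SupportedModel struct {
	ID        string // hypothetical identifier field
	Family    string
	CanVerify bool
}

// PickVerifiers returns up to k verifier-capable models, at most one per
// family, skipping the candidate's own family. If none qualify it falls back
// to the candidate model itself and reports selfVerify = true so the caller
// can surface a warning.
func PickVerifiers(catalog []SupportedModel, candidate SupportedModel, k int) (verifiers []SupportedModel, selfVerify bool) {
	seen := map[string]bool{candidate.Family: true}
	for _, m := range catalog {
		if len(verifiers) == k {
			break
		}
		if !m.CanVerify || seen[m.Family] {
			continue
		}
		seen[m.Family] = true
		verifiers = append(verifiers, m)
	}
	if len(verifiers) == 0 {
		return []SupportedModel{candidate}, true
	}
	return verifiers, false
}
```

Catalog order decides which model represents a family here; a real selector would likely rank within a family as well.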

Known follow-ups

Billing reserve → refund is two non-atomic writes
No per-model SupportsWeaver flag
Confidence = 0 when only one candidate finishes
No dedicated E2E-encryption + weaver test

Tested

go test green across api, store, cmd/coordinator
Ran Deep and Deep+ end-to-end locally

Adds a coordinator-side orchestration mode that runs multiple candidate generations in parallel and scores them with verifier models, returning the highest-scoring response with a confidence metric. Inspired by the Stanford Weaver paper (https://scalingintelligence.stanford.edu/pubs/weaver.pdf).

Coordinator:
- /v1/chat/completions accepts {"weaver": {"mode": "deep"|"deep_plus"}}
- Streams typed SSE events (weaver_init, candidate_delta, verifier_score, weaver_final) with the full trace
- Verifier selection picks diverse model families from the catalog, falls back to self-verification with a warning if no eligible verifiers exist
- Adds Family + CanVerify to SupportedModel (memory + postgres backends)
- Billing charges candidate and verifier calls independently

Provider:
- Reasoning parser opt-in by family (Gemma4, DeepSeek R1, Qwen3/QwQ); Qwen2.5 no longer mis-applies the qwen3 parser

Console UI:
- Fast / Deep / Deep+ mode selector in ChatInput
- Live candidate strip + tabbed trace modal
- New auth-checked proxy routes for device/provider/stats endpoints

Known follow-ups (not blocking):
- Billing reserve→refund is two non-atomic writes (no two-phase commit)
- No model-level SupportsWeaver flag (any catalog model is eligible)
- Confidence metric is 0 when only one candidate succeeds
- E2E + weaver path lacks a dedicated test

vercel · 2026-05-13T23:53:16Z

@yudduy is attempting to deploy a commit to the EigenLabs Team on Vercel.

A member of the Team first needs to authorize it.

yudduy wants to merge 1 commit into Layr-Labs:master from yudduy:feat/weaver
