Skip to content

Commit 8330cb5

Browse files
unamedkrclaude
andcommitted
feat(wasm): SmolLM2-135M default (fast) + Llama 1B option (quality)
1B model causes 15-30s+ prefill hang in WASM — unusable as default. SmolLM2-135M: 135MB download, <2s prefill, ~10-20 tok/s in WASM. Quality is basic but responsive — proper demo experience. Llama 3.2 1B Instruct kept as "Quality" option for users willing to wait for the larger model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d389891 commit 8330cb5

1 file changed

Lines changed: 16 additions & 3 deletions

File tree

wasm/index.html

Lines changed: 16 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -174,10 +174,15 @@ <h2>Run an <span>LLM</span> in your browser</h2>
174174
<p class="subtitle">No install. No API key. No server.</p>
175175

176176
<div class="model-cards" id="modelCards">
177-
<div class="model-card recommended" id="card-llama" onclick="loadDemoModel('llama-3.2-1b')">
177+
<div class="model-card recommended" id="card-smol" onclick="loadDemoModel('smollm2-135m')">
178+
<div class="name">SmolLM2 135M</div>
179+
<div class="meta" id="meta-smol">~135 MB &middot; Fast response</div>
180+
<span class="tag">Fast</span>
181+
</div>
182+
<div class="model-card" id="card-llama" onclick="loadDemoModel('llama-3.2-1b')">
178183
<div class="name">Llama 3.2 1B Instruct</div>
179-
<div class="meta" id="meta-llama">~770 MB &middot; Verified quality</div>
180-
<span class="tag">Recommended</span>
184+
<div class="meta" id="meta-llama">~770 MB &middot; Better quality</div>
185+
<span class="tag blue">Quality</span>
181186
</div>
182187
</div>
183188

@@ -218,6 +223,14 @@ <h2>Run an <span>LLM</span> in your browser</h2>
218223
let activeModelId = null;
219224

220225
const MODELS = {
226+
'smollm2-135m': {
227+
url: 'https://huggingface.co/Felladrin/gguf-Q8_0-SmolLM2-135M-Instruct/resolve/main/smollm2-135m-instruct-q8_0.gguf',
228+
name: 'SmolLM2 135M',
229+
size: 135,
230+
cacheKey: 'smollm2-135m-q8',
231+
chatTemplate: (t) => t, // SmolLM2 works best with plain text prompts
232+
cardId: 'card-smol', metaId: 'meta-smol',
233+
},
221234
'llama-3.2-1b': {
222235
url: 'https://huggingface.co/hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF/resolve/main/llama-3.2-1b-instruct-q4_k_m.gguf',
223236
name: 'Llama 3.2 1B Instruct',

0 commit comments

Comments
 (0)