Add SDPA attention fallback and design docs by Lebhoryi · Pull Request #231 · GeeeekExplorer/nano-vllm

Lebhoryi · 2026-05-12T08:03:24Z

Summary

Fix model initialization for configs that do not expose dtype, falling back to torch_dtype or the current PyTorch default dtype.
Add a selectable attention backend with PyTorch SDPA fallback when flash-attn is unavailable, controlled by NANOVLLM_ATTENTION_BACKEND.
Add overview and detailed design docs covering scheduling, KV cache, prefix cache, attention context, tensor parallelism, and execution flow.

Test plan

python -m py_compile nanovllm/engine/model_runner.py nanovllm/layers/attention.py

Lebhoryi added 3 commits May 12, 2026 15:22

fix(model): handle missing config dtype

d6e5659

Qwen3Config may not expose dtype, so fall back to torch_dtype or the current default dtype before initializing model weights and sizing KV cache.
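
The fallback order described in this commit can be sketched as below. This is a minimal illustration, not the code from the PR: the helper name `resolve_dtype` is hypothetical, and plain strings stand in for real `torch.dtype` values so the snippet runs without PyTorch installed.

```python
from types import SimpleNamespace


def resolve_dtype(hf_config, default_dtype):
    """Pick a dtype using the order: config.dtype, config.torch_dtype, default.

    `resolve_dtype` is a hypothetical helper name. In a real model runner the
    `default_dtype` argument would presumably be torch.get_default_dtype().
    """
    for attr in ("dtype", "torch_dtype"):
        # getattr with a default covers configs that lack the attribute entirely.
        value = getattr(hf_config, attr, None)
        if value is not None:
            return value
    return default_dtype


# Stand-in config objects; a real one would be a Hugging Face config instance.
new_style = SimpleNamespace(dtype="bfloat16", torch_dtype="float16")
old_style = SimpleNamespace(torch_dtype="float16")
bare = SimpleNamespace()

chosen = [resolve_dtype(cfg, "float32") for cfg in (new_style, old_style, bare)]
print(chosen)
```

Resolving the dtype once, before weights are created, means the same value can be reused when computing the per-block byte size of the KV cache, so the two cannot disagree.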

feat(attention): add SDPA backend fallback

2d9b51b

Allow running without flash-attn and document backend selection via NANOVLLM_ATTENTION_BACKEND.
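
One way such a backend switch could work is sketched below. Only the environment variable name `NANOVLLM_ATTENTION_BACKEND` comes from this PR; the accepted values (`auto`, `flash`, `sdpa`), the function name `select_attention_backend`, and the error behavior are assumptions made for illustration.

```python
import importlib.util
import os

# Hypothetical value names; the PR does not list the accepted values here.
VALID_BACKENDS = ("auto", "flash", "sdpa")


def select_attention_backend(env=None, flash_available=None):
    """Return "flash" or "sdpa" based on the environment and installed packages.

    Both parameters exist only to make the sketch easy to test; by default the
    function reads os.environ and probes for the flash_attn package.
    """
    env = os.environ if env is None else env
    if flash_available is None:
        # find_spec returns None when the top-level package cannot be found.
        flash_available = importlib.util.find_spec("flash_attn") is not None

    choice = env.get("NANOVLLM_ATTENTION_BACKEND", "auto").strip().lower()
    if choice not in VALID_BACKENDS:
        raise ValueError(f"unknown attention backend: {choice!r}")
    if choice == "flash" and not flash_available:
        raise RuntimeError("flash backend requested but flash_attn is not installed")
    if choice == "auto":
        # Prefer flash-attn when present, otherwise fall back to PyTorch SDPA.
        return "flash" if flash_available else "sdpa"
    return choice


print(select_attention_backend({"NANOVLLM_ATTENTION_BACKEND": "sdpa"}, flash_available=True))
```

Under these assumptions, the `sdpa` path would dispatch to `torch.nn.functional.scaled_dot_product_attention`, while an explicit `flash` request fails loudly instead of silently degrading.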

docs: add Nano-vLLM design docs

d41fbe2

Document the engine architecture, scheduling flow, KV cache lifecycle, prefix cache behavior, attention context, and execution pipeline.

Lebhoryi wants to merge 3 commits into GeeeekExplorer:main from Lebhoryi:ccy_dev
