Pinned

  1. vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 57.2k stars · 9.9k forks

  2. llm-compressor Public

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python · 1.9k stars · 221 forks

  3. recipes Public

    Common recipes to run vLLM

    124 stars · 35 forks

Repositories

Showing 10 of 22 repositories
  • vllm-spyre Public

    Community maintained hardware plugin for vLLM on Spyre

    Python · 32 stars · Apache-2.0 · 21 forks · 5 issues · 17 pull requests · Updated Sep 5, 2025
  • recipes Public

    Common recipes to run vLLM

    124 stars · Apache-2.0 · 35 forks · 3 issues · 4 pull requests · Updated Sep 4, 2025
  • vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 57,234 stars · Apache-2.0 · 9,912 forks · 1,806 issues (28 issues need help) · 1,118 pull requests · Updated Sep 5, 2025
  • vllm-xpu-kernels Public

    The vLLM XPU kernels for Intel GPU

    Python · 6 stars · Apache-2.0 · 11 forks · 0 issues · 8 pull requests · Updated Sep 5, 2025
  • semantic-router Public

    Intelligent Mixture-of-Models Router for Efficient LLM Inference

    Python · 646 stars · Apache-2.0 · 52 forks · 28 issues · 3 pull requests · Updated Sep 5, 2025
  • vllm-ascend Public

    Community maintained hardware plugin for vLLM on Ascend

    Python · 1,073 stars · Apache-2.0 · 408 forks · 397 issues (7 issues need help) · 144 pull requests · Updated Sep 4, 2025
  • aibrix Public

    Cost-efficient and pluggable Infrastructure components for GenAI inference

    Go · 4,193 stars · Apache-2.0 · 452 forks · 219 issues (22 issues need help) · 21 pull requests · Updated Sep 5, 2025
  • speculators Public

    A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

    Python · 40 stars · Apache-2.0 · 7 forks · 12 issues (2 issues need help) · 18 pull requests · Updated Sep 4, 2025
  • llm-compressor Public

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python · 1,908 stars · Apache-2.0 · 221 forks · 50 issues (7 issues need help) · 41 pull requests · Updated Sep 4, 2025
  • ci-infra Public

    This repo hosts code for vLLM CI & Performance Benchmark infrastructure.

    HCL · 18 stars · 36 forks · 0 issues · 9 pull requests · Updated Sep 4, 2025