@llm-d-incubation

llm-d incubation

Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework

Popular repositories

  1. llm-d-infra (Public)

    llm-d Helm charts and deployment examples

    Go Template 50 55

  2. llm-d-modelservice (Public)

    Helm charts for deploying models with llm-d

    Go Template 28 51

  3. llm-d-fast-model-actuation (Public)

    Kubernetes controllers for fast model actuation using vLLM sleep/wake and launcher-based model swapping

    Go 9 11

  4. batch-gateway (Public)

    The batch gateway is an llm-d implementation of the OpenAI batch inference API

    Go 5 9

  5. ig-wva (Public)

    The Workload Variant Autoscaler is a service that computes the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives

    Jupyter Notebook 2 1

  6. llm-d-ci (Public)

    Shell 2 2

Repositories

Showing 9 of 9 repositories
