fmwork engines/backends #30

@WarningRan

Description

Current status

fmwork currently supports only a limited set of engines. Many potentially relevant engines are missing: some were previously integrated but later dropped, and others are newer alternatives with promising performance.

  1. Tier 1 (High Priority)
  • vLLM
  • torch+transformers (true baseline)
  • TRT-LLM (NVIDIA's, main competitor)
  2. Tier 2 (Medium Priority)
  • SGLang
  • any other inference backend (w/ server-mode) that could compete w/ vLLM
  3. Tier 3
  • optimum-habana (Gaudi2/3)
  • TGI/TGIS
  • llama.cpp
  • LMDeploy
  • any other backend that might implement a few optimization techniques that we should have a look at
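
As a rough illustration only, the tier list above could be captured as data so that benchmark tooling can iterate over engines in priority order. This is a minimal sketch, not an existing fmwork API: the names `Tier`, `Engine`, `ENGINES`, and `engines_by_tier` are hypothetical, and only the concretely named engines from the list are included (the open-ended "any other backend" entries are left out).

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Priority tiers as listed in this issue (hypothetical names)."""
    T1 = 1  # High Priority
    T2 = 2  # Medium Priority
    T3 = 3  # no priority label given in the issue


@dataclass(frozen=True)
class Engine:
    """One candidate engine/backend; `note` carries the issue's remark."""
    name: str
    tier: Tier
    note: str = ""


# Engines named in the issue, in the order they appear there.
ENGINES = [
    Engine("vLLM", Tier.T1),
    Engine("torch+transformers", Tier.T1, "true baseline"),
    Engine("TRT-LLM", Tier.T1, "NVIDIA's, main competitor"),
    Engine("SGLang", Tier.T2),
    Engine("optimum-habana", Tier.T3, "Gaudi2/3"),
    Engine("TGI/TGIS", Tier.T3),
    Engine("llama.cpp", Tier.T3),
    Engine("LMDeploy", Tier.T3),
]


def engines_by_tier(engines):
    """Group engine names by tier, preserving the listed order."""
    grouped = {tier: [] for tier in Tier}
    for engine in engines:
        grouped[engine.tier].append(engine.name)
    return grouped


if __name__ == "__main__":
    for tier, names in engines_by_tier(ENGINES).items():
        print(f"Tier {int(tier)}: {', '.join(names)}")
```

Keeping the list as plain data would let the tiers be reordered or extended after the initial investigation without touching any integration code.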

Plan to be initiated

  • This issue serves as a tracking placeholder. A detailed implementation plan will be developed after an initial investigation of each engine's integration requirements and its compatibility with our framework.
