fmwork engines/backends #30

## Current status

We currently support only a limited set of engines. Many potentially relevant engines are missing: some were previously integrated but later dropped, and others are newer alternatives with promising performance.

1. Tier 1 (High Priority)

- vLLM
- torch+transformers (true baseline)
- TRT-LLM (NVIDIA's engine; main competitor)

2. Tier 2 (Medium Priority)

- SGLang
- any other inference backend (with a server mode) that could compete with vLLM

3. Tier 3

- optimum-habana (Gaudi2/3)
- TGI/TGIS
- llama.cpp
- LMDeploy
- any other backend that implements optimization techniques worth examining
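
For tracking purposes, the tier list above could be kept as data. The following is a minimal sketch only: `Tier`, `Engine`, `CANDIDATES`, and `by_tier` are hypothetical names for illustration and are not part of fmwork. The open-ended "any other backend" entries are left out because they name no specific engine.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Priority tiers as numbered in this issue (hypothetical type)."""
    TIER_1 = 1  # high priority
    TIER_2 = 2  # medium priority
    TIER_3 = 3  # no priority label given in the issue


@dataclass(frozen=True)
class Engine:
    """One candidate engine; `note` repeats the parenthetical from the list."""
    name: str
    tier: Tier
    note: str = ""


# Candidates copied from the list above, in the same order.
CANDIDATES = [
    Engine("vLLM", Tier.TIER_1),
    Engine("torch+transformers", Tier.TIER_1, "true baseline"),
    Engine("TRT-LLM", Tier.TIER_1, "NVIDIA; main competitor"),
    Engine("SGLang", Tier.TIER_2),
    Engine("optimum-habana", Tier.TIER_3, "Gaudi2/3"),
    Engine("TGI/TGIS", Tier.TIER_3),
    Engine("llama.cpp", Tier.TIER_3),
    Engine("LMDeploy", Tier.TIER_3),
]


def by_tier(engines):
    """Group engine names by tier, preserving list order within each tier."""
    grouped = {tier: [] for tier in Tier}
    for engine in engines:
        grouped[engine.tier].append(engine.name)
    return grouped


if __name__ == "__main__":
    for tier, names in by_tier(CANDIDATES).items():
        print(f"Tier {int(tier)}: {', '.join(names)}")
```

Keeping the list as data would make it easy to attach an integration status to each entry once the per-engine investigation starts.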

## Plan to be initiated

- This issue serves as a tracking placeholder. A detailed implementation plan will be developed after an initial investigation of each engine's integration requirements and its compatibility with our framework.
