LLM Inference Frameworks Benchmark

About

This project benchmarks the performance of several popular LLM inference frameworks. I am currently planning to test vLLM, TensorRT-LLM, FasterTransformer, ONNX Runtime, and DeepSpeed. The goal is to determine the best inference framework for a given hardware setup and LLM. The comparison will be based on multiple evaluation criteria, including performance metrics (e.g., throughput, latency, and scalability), while also considering factors such as hardware utilization efficiency, model support, ease of use, hardware/software flexibility, optimization features, and deployment complexity.

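As a concrete illustration of the core measurement, here is a minimal sketch of how throughput and latency might be collected for one framework, using vLLM as the example. The model name, prompt workload, and sampling settings below are placeholders for illustration, not the project's actual benchmark configuration.

```python
import time

from vllm import LLM, SamplingParams

# Hypothetical workload: a small batch of identical prompts.
prompts = ["Explain attention in one sentence."] * 32
sampling = SamplingParams(temperature=0.0, max_tokens=128)

# Placeholder model; the real benchmark would sweep over target LLMs.
llm = LLM(model="facebook/opt-125m")

start = time.perf_counter()
outputs = llm.generate(prompts, sampling)
elapsed = time.perf_counter() - start

# Count generated tokens across all requests to derive throughput.
generated_tokens = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"Batch latency: {elapsed:.2f} s")
print(f"Throughput:    {generated_tokens / elapsed:.1f} tokens/s")
```

A comparable harness for each framework under the same workload and hardware would let the raw numbers be compared side by side.
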
Test Specs

This project will be run solely on Kaggle (unless problems arise); the hardware and software specifications are documented in my Kaggle Specs Tester Notebook.

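For reference, a small sketch of how the runtime environment could be queried from inside a notebook (the actual contents of the Specs Tester Notebook are not reproduced here):

```python
import platform

import torch

# Record the software and GPU environment the benchmarks run under.
print(f"Python:  {platform.python_version()}")
print(f"PyTorch: {torch.__version__}")
if torch.cuda.is_available():
    print(f"GPU:     {torch.cuda.get_device_name(0)}")
    print(f"CUDA:    {torch.version.cuda}")
```
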
More to come...
