Llama benchmark #112

gxsoar · 2023-11-25T13:14:14Z

Llama Benchmark

Use PyTorch with TorchDynamo to run end-to-end Vicuna inference.

Environments

OS: Ubuntu 22.04.1 LTS
CPU: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz
GPU: NVIDIA GeForce RTX 3090
CUDA: 12.0
Python: 3.9
PyTorch: 2.0.0+cu118
Anaconda: Miniconda3

Benchmark Time

CPU time per round of inference:
PyTorch average time per round of inference: 982.4393878173828 ms
PyTorch with TorchDynamo average time per round of inference: 977.5693103027344 ms
GPU time per round of inference:
PyTorch average time per round of inference: 25.33698874791463 ms
PyTorch with TorchDynamo average time per round of inference: 19.13074951807658 ms
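The benchmark script itself is not shown in this description, so the following is only a minimal sketch of how an "average time per round" figure like the ones above could be collected, and how the relative speedup follows from the reported numbers. The helper names (`average_round_time_ms`, `speedup`), the warm-up count, and the number of rounds are assumptions for illustration, not taken from the PR.

```python
import time
from statistics import mean


def average_round_time_ms(run_round, rounds=10, warmup=3, sync=None):
    """Return the mean wall-clock time of run_round() in milliseconds.

    run_round: zero-argument callable performing one round of inference
               (hypothetical; stands in for the real model call).
    warmup:    untimed calls made first, so one-off costs such as
               compilation do not skew the average.
    sync:      optional zero-argument callable invoked before the clock
               is read, for backends that execute asynchronously.
    """
    for _ in range(warmup):
        run_round()
    if sync is not None:
        sync()

    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        run_round()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def speedup(baseline_ms, optimized_ms):
    """Ratio of baseline latency to optimized latency (>1 means faster)."""
    return baseline_ms / optimized_ms


# Speedups implied by the averages reported in this benchmark.
gpu_speedup = speedup(25.33698874791463, 19.13074951807658)
cpu_speedup = speedup(982.4393878173828, 977.5693103027344)
print(f"GPU speedup: {gpu_speedup:.2f}x, CPU speedup: {cpu_speedup:.3f}x")
```

When timing on the GPU, passing `sync=torch.cuda.synchronize` matters because CUDA kernels are launched asynchronously; without it the timer may stop before the work has finished. From the reported averages, TorchDynamo gives roughly a 1.32x speedup on the GPU and a negligible change (about 0.5%) on the CPU.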

notion-workspace · 2023-11-27T06:26:50Z

Pre-Task: Initial LLaMA Benchmark

gxsoar added 2 commits November 25, 2023 21:03

add vicuna test files

01e4ecb

fix file name

c651778

gxsoar added 2 commits November 28, 2023 05:56

format code and add READEME

746b0ac

change file name

c9a2fda

gxsoar changed the title ~~Vicuna-7b test~~ Llama benchmark Nov 28, 2023

update llama READE

4a01a82
