-
Notifications
You must be signed in to change notification settings - Fork 31
adding the NIM LLM TCO calculator tool #389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
adding the NIM LLM TCO calculator tool #389
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of small comments and/or suggestions. Some questions regarding file placement and Google access. Other than that, looks good and nothing major.
"```\n", | ||
"export NGC_API_KEY=<YOUR_NGC_API_KEY> \n", | ||
"\n", | ||
"# Choose a container name for bookkeeping\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by bookkeeping
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Terminology from the NIM documentation: https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html
My interpretation is that it's the same as "logging".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrote this portion and removed the comment. I think the name must be compliant with NGC and not just for bookkeeping
], | ||
"source": [ | ||
"%%writefile benchmark.sh\n", | ||
"declare -A useCases\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend adding missing shebang
#!/usr/bin/env bash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. Added.
" local INPUT_SEQUENCE_STD=0\n", | ||
" local OUTPUT_SEQUENCE_LENGTH=$outputLength\n", | ||
" local CONCURRENCY=$concurrency\n", | ||
" local MODEL=meta/llama-3.1-8b-instruct\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MODEL
and --tokenizer
are hard coded in the command here, however the user has the ability to choose the container name above, which could affect this. Recommend exposing the model/tokenizer as external env vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to top of file
export MODEL=meta/llama-3.1-8b-instruct # NGC model name
export TOKENIZER_PATH=meta-llama/Meta-Llama-3-8B-Instruct # Either a HF model or path to a local folder containing the tokenizer
however user will still have to manually define this within the file, as model & tokenizer use slightly different model naming (ie. HF vs. NGC)
"source": [ | ||
"## Exporting data to excel format\n", | ||
"\n", | ||
"We next export the benchmarking data to a TCO-tool compatible format, which comprises both metadata fields as well as performance metric fields." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be TCO-tool
or TCO-calculator
compatible format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
"root_dir = \"./artifacts\"\n", | ||
"directory_prefix = \"meta_llama-3.1-8b-instruct-openai-chat-concurrency\" # Change this to fit the actual model deployed\n", | ||
"\n", | ||
"ISL_OSL_list = [\"200_5\", \"200_200\", \"1000_200\", \"200_1000\"]\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend full caps: ISL_OSL_LIST
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
"directory_prefix = \"meta_llama-3.1-8b-instruct-openai-chat-concurrency\" # Change this to fit the actual model deployed\n", | ||
"\n", | ||
"ISL_OSL_list = [\"200_5\", \"200_200\", \"1000_200\", \"200_1000\"]\n", | ||
"concurrencies = [1, 2, 5, 10, 50, 100, 250]\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider capitalizing CONCURRENCIES
to match ISL_OSL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
"df = pd.DataFrame(columns=gen_AI_perf_field)\n", | ||
"\n", | ||
"for con in concurrencies:\n", | ||
" for ISL_OSL in ISL_OSL_list:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend lowercase for local loop variables: for isl_osl in ISL_OSL_LIST:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
"concurrencies = [1, 2, 5, 10, 50, 100, 250]\n", | ||
"df = pd.DataFrame(columns=gen_AI_perf_field)\n", | ||
"\n", | ||
"for con in concurrencies:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider rename for concurrency in concurrencies:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
Co-authored-by: Anthony Casagrande <[email protected]>
Co-authored-by: Anthony Casagrande <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that @ajcasagrande did a great job reviewing this and I agree with his comments.
My concern is protecting this. I would like to see the contents of this migrated to a CI test to ensure we know asap if something breaks.
@ganeshku1 can we prioritize that?
Restructured and revised as suggested. Please review again. |
@ajcasagrande @debermudez do you mind reviewing this again? (The TCO spreadsheet calculator link is TBD but rest is good to go) |
@ganeshku1 what is the priority around ensuring the examples in here are covered in the CI? |
Since we're currently focused on AIPerf, this isn't a high priority right now. I don't anticipate any new features coming into GAP, so I’d prefer the team not invest significant effort here at the moment. Lets get this initial doc in and we will have to port over this to AIPerf, we will add CI as part of that porting work. |
This tool allow exporting data from genAI-perf to the NIM LLM spreadsheet TCO calculator tool