adding the NIM LLM TCO calculator tool #389

Open

vinhngx wants to merge 5 commits into triton-inference-server:main from vinhngx:vinhn-TCO-calculator

vinhngx commented May 21, 2025

This tool allows exporting data from GenAI-Perf to the NIM LLM TCO calculator spreadsheet.


          adding the NIM LLM TCO calculator tool

293e517

ajcasagrande requested review from debermudez and ajcasagrande

May 21, 2025 16:03

ajcasagrande reviewed

Contributor

ajcasagrande left a comment

Lots of small comments and/or suggestions. Some questions regarding file placement and Google access. Other than that, looks good and nothing major.

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "```\n",
+                  "export NGC_API_KEY=<YOUR_NGC_API_KEY> \n",
+                  "\n",
+                  "# Choose a container name for bookkeeping\n",

Contributor

ajcasagrande May 21, 2025

What do you mean by bookkeeping here?

Author

vinhngx May 22, 2025

Terminology from the NIM documentation: https://docs.nvidia.com/nim/large-language-models/latest/getting-started.html

My interpretation is that it's the same as "logging".

Author

vinhngx May 22, 2025

Rewrote this portion and removed the comment. I think the name must be compliant with NGC; it is not just for bookkeeping.

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb

+                 ],
+                 "source": [
+                  "%%writefile benchmark.sh\n",
+                  "declare -A useCases\n",

Contributor

ajcasagrande May 21, 2025

Recommend adding missing shebang

#!/usr/bin/env bash

Author

vinhngx May 22, 2025

agreed. Added.

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "        local INPUT_SEQUENCE_STD=0\n",
+                  "        local OUTPUT_SEQUENCE_LENGTH=$outputLength\n",
+                  "        local CONCURRENCY=$concurrency\n",
+                  "        local MODEL=meta/llama-3.1-8b-instruct\n",

Contributor

ajcasagrande May 21, 2025

MODEL and --tokenizer are hard coded in the command here, however the user has the ability to choose the container name above, which could affect this. Recommend exposing the model/tokenizer as external env vars.

Author

vinhngx May 22, 2025 (edited)

moved to top of file

export MODEL=meta/llama-3.1-8b-instruct # NGC model name
export TOKENIZER_PATH=meta-llama/Meta-Llama-3-8B-Instruct # Either a HF model or path to a local folder containing the tokenizer

However, the user will still have to define these manually within the file, because the model and the tokenizer use slightly different naming (i.e. HF vs. NGC).
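A minimal Python sketch of the same idea, for the notebook side: read `MODEL` and `TOKENIZER_PATH` from the environment and fall back to the values quoted above. The helper name `load_benchmark_config` is hypothetical, and the rule used to derive the artifact directory prefix (replace `/` with `_`, append `-openai-chat-concurrency`) is an assumption inferred from the `directory_prefix` string in the diff, not confirmed GenAI-Perf behavior.

```python
import os

# Defaults mirror the values quoted in this thread; override them via the environment.
DEFAULT_MODEL = "meta/llama-3.1-8b-instruct"  # NGC model name
DEFAULT_TOKENIZER = "meta-llama/Meta-Llama-3-8B-Instruct"  # HF model or local path


def load_benchmark_config(env=None):
    """Return model settings read from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    model = env.get("MODEL", DEFAULT_MODEL)
    tokenizer = env.get("TOKENIZER_PATH", DEFAULT_TOKENIZER)
    # Assumption: the artifact directory prefix is the model name with "/"
    # replaced by "_", followed by "-openai-chat-concurrency".
    directory_prefix = model.replace("/", "_") + "-openai-chat-concurrency"
    return {
        "model": model,
        "tokenizer": tokenizer,
        "directory_prefix": directory_prefix,
    }


if __name__ == "__main__":
    print(load_benchmark_config())
```

Deriving the prefix from `MODEL` would leave one value fewer to edit by hand, but the tokenizer name still has to be set separately because the HF and NGC names differ.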

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                 "source": [
+                  "## Exporting data to excel format\n",
+                  "\n",
+                  "We next export the benchmarking data to a TCO-tool compatible format, which comprises both metadata fields as well as performance metric fields."

Contributor

ajcasagrande May 21, 2025

should this be TCO-tool or TCO-calculator compatible format?

Author

vinhngx May 22, 2025

agreed

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "root_dir = \"./artifacts\"\n",
+                  "directory_prefix = \"meta_llama-3.1-8b-instruct-openai-chat-concurrency\" # Change this to fit the actual model deployed\n",
+                  "\n",
+                  "ISL_OSL_list = [\"200_5\", \"200_200\", \"1000_200\", \"200_1000\"]\n",

Contributor

ajcasagrande May 21, 2025

recommend full caps: ISL_OSL_LIST

Author

vinhngx May 22, 2025

agreed

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "directory_prefix = \"meta_llama-3.1-8b-instruct-openai-chat-concurrency\" # Change this to fit the actual model deployed\n",
+                  "\n",
+                  "ISL_OSL_list = [\"200_5\", \"200_200\", \"1000_200\", \"200_1000\"]\n",
+                  "concurrencies = [1, 2, 5, 10, 50, 100, 250]\n",

Contributor

ajcasagrande May 21, 2025

consider capitalizing CONCURRENCIES to match ISL_OSL

Author

vinhngx May 22, 2025

agreed

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "df = pd.DataFrame(columns=gen_AI_perf_field)\n",
+                  "\n",
+                  "for con in concurrencies:\n",
+                  "    for ISL_OSL in ISL_OSL_list:\n",

Contributor

ajcasagrande May 21, 2025

recommend lowercase for local loop variables: for isl_osl in ISL_OSL_LIST:

Author

vinhngx May 22, 2025

agreed

genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb Outdated

+                  "concurrencies = [1, 2, 5, 10, 50, 100, 250]\n",
+                  "df = pd.DataFrame(columns=gen_AI_perf_field)\n",
+                  "\n",
+                  "for con in concurrencies:\n",

Contributor

ajcasagrande May 21, 2025

Consider renaming the loop variable: for concurrency in concurrencies:

Author

vinhngx May 22, 2025

agreed
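Taken together, the naming suggestions in this review (full caps for the module-level lists, lowercase descriptive names for loop variables) could look roughly like the sketch below. The helper `enumerate_runs` and the dictionary keys are hypothetical, and reading each `ISL_OSL_LIST` entry as `<input length>_<output length>` is an assumption based on the variable name; the notebook's actual DataFrame code is not reproduced here.

```python
from itertools import product

# Module-level constants in full caps, as suggested in the review.
ISL_OSL_LIST = ["200_5", "200_200", "1000_200", "200_1000"]
CONCURRENCIES = [1, 2, 5, 10, 50, 100, 250]


def enumerate_runs(isl_osl_list=ISL_OSL_LIST, concurrencies=CONCURRENCIES):
    """Yield one dict per (concurrency, ISL/OSL) pair (hypothetical helper)."""
    # product() iterates in the same order as the nested loops in the diff:
    # outer loop over concurrencies, inner loop over ISL/OSL pairs.
    for concurrency, isl_osl in product(concurrencies, isl_osl_list):
        # Assumption: each entry encodes "<input length>_<output length>".
        input_length, output_length = (int(part) for part in isl_osl.split("_"))
        yield {
            "concurrency": concurrency,
            "input_length": input_length,
            "output_length": output_length,
        }


runs = list(enumerate_runs())
```

Each yielded dict can then be extended with the metrics loaded for that run before it is appended to the export table.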

vinhngx and others added 2 commits

May 22, 2025 10:14


          Update genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb

1b9fc8f

Co-authored-by: Anthony Casagrande <[email protected]>


          Update genai-perf/genai_perf/TCO_calculator/TCO_calculator.ipynb

f9b898e

Co-authored-by: Anthony Casagrande <[email protected]>

debermudez reviewed

Contributor

debermudez left a comment

I think that @ajcasagrande did a great job reviewing this and I agree with his comments.
My concern is protecting this. I would like to see the contents of this migrated to a CI test to ensure we know asap if something breaks.
@ganeshku1 can we prioritize that?

vinhngx added 2 commits

May 22, 2025 01:51


          fix after review

7a88035


          restructure folder

8c3a43a

Author

vinhngx commented May 22, 2025

Restructured and revised as suggested. Please review again.

Author

vinhngx commented Jun 3, 2025

@ajcasagrande @debermudez do you mind reviewing this again? (The TCO spreadsheet calculator link is TBD, but the rest is good to go.)

Contributor

debermudez commented Jun 3, 2025

@ganeshku1 what is the priority around ensuring the examples in here are covered in the CI?

Collaborator

ganeshku1 commented Jun 3, 2025 (edited)

I think that @ajcasagrande did a great job reviewing this and I agree with his comments. My concern is protecting this. I would like to see the contents of this migrated to a CI test to ensure we know asap if something breaks. @ganeshku1 can we prioritize that?

Since we're currently focused on AIPerf, this isn't a high priority right now. I don't anticipate any new features coming into GAP, so I’d prefer the team not invest significant effort here at the moment.

Let's get this initial doc in. We will have to port it over to AIPerf, and we will add CI as part of that porting work.
