Closed
35 commits
7bb4042
feat: exporters support overrides; bugfix: auto-export on slurm; todo…
ka00ri Oct 2, 2025
8c58db6
bugfix: add remote_local storage type to avoid bypass and log correct…
ka00ri Oct 2, 2025
e587078
Merge remote-tracking branch 'origin/main' into mboubdir/exporters
ka00ri Oct 2, 2025
1510bc7
Merge remote-tracking branch 'origin/main' into mboubdir/exporters
ka00ri Oct 2, 2025
e77374f
feat(exporter): unify artifacts logging for mlflow/wandb
ka00ri Oct 2, 2025
2080e1f
feat(exporter): unify artifacts logging for mlflow/wandb
ka00ri Oct 2, 2025
34ff640
fix(exporter): allow logging all artifacts excluding dirs (cache, ..)
ka00ri Oct 2, 2025
a5af5ad
fix: support spreadsheet_id for gsheets
ka00ri Oct 2, 2025
99c9f62
Merge remote-tracking branch 'origin/main' into mboubdir/exporters
ka00ri Oct 7, 2025
0766832
fix(exporters): log user config without keys, adapt tests to new feat…
ka00ri Oct 7, 2025
92639ab
chore(exoporters): add auto-export config example following new setup
ka00ri Oct 7, 2025
b52f06b
chore(exporters): update docs
ka00ri Oct 7, 2025
447db2e
Merge branch 'main' into mboubdir/exporters
ka00ri Oct 8, 2025
44ae3e8
chore: merge and resolve conflicts
ka00ri Oct 9, 2025
49d3f32
Merge branch 'main' into mboubdir/exporters
ka00ri Oct 13, 2025
f2b56dc
fix(exporters): fix local export of remote job; optimize ssh connecti…
ka00ri Oct 13, 2025
8f0dac2
fix(exporters): safe logging to mlflow
ka00ri Oct 13, 2025
de3ca8a
chore(cli): add debugging helper cmd
ka00ri Oct 14, 2025
cf97c9a
chore(cli): add docstring to debug
ka00ri Oct 14, 2025
cde5b6c
Merge remote-tracking branch 'origin/main' into mboubdir/exporters
ka00ri Oct 14, 2025
ee6768e
chore(lint): debug cli test
ka00ri Oct 14, 2025
30b2472
Merge branch 'main' into mboubdir/exporters
ka00ri Oct 17, 2025
1c55e2a
chore(cli): rename debug to info, display key files
ka00ri Oct 17, 2025
3b8b0a9
Merge branch 'main' into mboubdir/exporters
ka00ri Oct 17, 2025
e98f556
fix(cli): adapt tests and improve cmd info
ka00ri Oct 17, 2025
7214869
chore: linting
ka00ri Oct 17, 2025
c7b018c
further docs fixes
pablo-garay Oct 17, 2025
2a97eec
further docs fixes
pablo-garay Oct 17, 2025
80319a1
further docs fixes 2
pablo-garay Oct 17, 2025
3fd2f27
further docs fixes 3
pablo-garay Oct 17, 2025
145c7f2
chore(docs): update docs for info cmd
ka00ri Oct 17, 2025
47cf7f6
chore(docs): update new docs for info cmd
ka00ri Oct 17, 2025
b02667e
Update tutorial.md
ka00ri Oct 17, 2025
a424298
chore(docs): Revert archive tutorial.md to main
ka00ri Oct 17, 2025
d8e83c8
docs: test documentation build
ka00ri Oct 17, 2025
5 changes: 5 additions & 0 deletions docs/broken_links_false_positives.json
@@ -0,0 +1,5 @@
{"filename": "deployment/bring-your-own-endpoint/hosted-services.md", "lineno": 158, "uri": "https://platform.openai.com/docs/models", "reason": "OpenAI platform uses bot protection that returns 403 for automated link checkers, but the link is valid"}
{"filename": "nemo-fw/evaluation-hf.md", "lineno": 26, "uri": "https://github.com/nvidia-nemo/export-deploy?tab=readme-ov-file#install-tensorrt-llm-vllm-or-trt-onnx-backend", "reason": "Valid section anchor in Export-Deploy README that link checker may not find due to GitHub's dynamic anchor generation"}
{"filename": "nemo-fw/evaluation-doc.md", "lineno": 154, "uri": "https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#L235", "reason": "Internal repo link that will be valid once changes are merged to main branch"}
{"filename": "nemo-fw/evaluation-doc.md", "lineno": 154, "uri": "https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#L270", "reason": "Internal repo link that will be valid once changes are merged to main branch"}

2 changes: 1 addition & 1 deletion docs/get-started/quickstart/index.md
@@ -17,7 +17,7 @@ All paths require:
| List benchmarks | `nemo-evaluator-launcher ls tasks` |
| Run evaluation | `nemo-evaluator-launcher run --config-dir packages/nemo-evaluator-launcher/examples --config-name <config>` |
| Check status | `nemo-evaluator-launcher status <invocation_id>` |
| Debug job | `nemo-evaluator-launcher debug <invocation_id>` |
| Job info | `nemo-evaluator-launcher info <invocation_id>` |
| Export results | `nemo-evaluator-launcher export <invocation_id> --dest local --format json` |
| Dry run | Add `--dry-run` to any run command |
| Test with limited samples | Add `-o +config.params.limit_samples=3` |
122 changes: 60 additions & 62 deletions docs/libraries/nemo-evaluator-launcher/cli.md
@@ -21,6 +21,8 @@ nemo-evaluator-launcher --version # Show version information
- Run evaluations with specified configuration
* - `status`
- Check status of jobs or invocations
* - `info`
- Show detailed information for one or more jobs
* - `kill`
- Kill a job or invocation
* - `ls`
@@ -29,8 +31,6 @@ nemo-evaluator-launcher --version # Show version information
- Export evaluation results to various destinations
* - `version`
- Show version information
* - `debug`
- Show detailed job(s) information
```

## run - Run Evaluations
@@ -156,6 +156,62 @@ abc12345.1 | success | container124 | <output_dir>/task2/...
]
```

## info - Job Information and Navigation

Display detailed job information: metadata, configuration, and the paths to logs and artifacts, with a short description of each key result file. The command can also copy logs and artifacts to a local directory, for both local and remote jobs.

### Basic Usage
```bash
# Show job info for one or more IDs (job or invocation)
nemo-evaluator-launcher info <job_or_invocation_id>
nemo-evaluator-launcher info <inv1> <inv2>
```

### Show Configuration
```bash
nemo-evaluator-launcher info <id> --config
```

### Show Paths
```bash
# Show artifact locations
nemo-evaluator-launcher info <id> --artifacts
# Show log locations
nemo-evaluator-launcher info <id> --logs
```

### Copy Files Locally
```bash
# Copy logs (defaults to current dir if no path provided)
nemo-evaluator-launcher info <id> --copy-logs [DIR]

# Copy artifacts (defaults to current dir if no path provided)
nemo-evaluator-launcher info <id> --copy-artifacts [DIR]
```
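
The copy flags above can be combined in a small wrapper when several jobs or invocations need to be collected at once. The following is a minimal sketch, not part of the launcher: `collect_results` and the `LAUNCHER` variable are hypothetical names introduced for this example, and only the `info` subcommand with `--copy-logs [DIR]` and `--copy-artifacts [DIR]` comes from the documentation above. Whether the launcher creates a missing destination directory is not stated here, so the sketch creates it first.

```bash
#!/usr/bin/env bash
# Sketch: copy logs and artifacts for several IDs into per-ID directories.
# LAUNCHER and collect_results are hypothetical names used for illustration;
# override LAUNCHER to point at a different executable if needed.
LAUNCHER="${LAUNCHER:-nemo-evaluator-launcher}"

collect_results() {
  local dest_root="$1"
  shift
  local id
  for id in "$@"; do
    # One directory per ID keeps files from different runs apart.
    mkdir -p "$dest_root/$id/logs" "$dest_root/$id/artifacts" || return 1
    # Stop at the first failing copy so a partial result is not mistaken
    # for a complete one.
    "$LAUNCHER" info "$id" --copy-logs "$dest_root/$id/logs" || return 1
    "$LAUNCHER" info "$id" --copy-artifacts "$dest_root/$id/artifacts" || return 1
  done
}
```

After sourcing the snippet, a call such as `collect_results ./results <inv1> <inv2>` would place each ID's files under `./results/<id>/logs` and `./results/<id>/artifacts`, assuming the launcher writes into the directory it is given.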

### Example (Slurm)
```text
nemo-evaluator-launcher info <inv_id>

Job <inv_id>.0
├── Executor: slurm
├── Created: <timestamp>
├── Task: <task_name>
├── Artifacts: user@host:/shared/.../4245adf6071cd199/task_name/artifacts (remote)
│ └── Key files:
│ ├── results.yml - Benchmark scores, task results and resolved run configuration.
│ ├── eval_factory_metrics.json - Response + runtime stats (latency, tokens count, memory)
│ ├── metrics.json - Harness/benchmark metric and configuration
│ ├── report.html - Request-Response Pairs samples in HTML format (if enabled)
│ ├── report.json - Report data in json format, if enabled
├── Logs: user@host:/shared/.../4245adf6071cd199/task_name/logs (remote)
│ └── Key files:
│ ├── client-{SLURM_JOB_ID}.out - Evaluation container/process output
│ ├── slurm-{SLURM_JOB_ID}.out - SLURM scheduler stdout/stderr (batch submission, export steps).
│ ├── server-{SLURM_JOB_ID}.out - Model server logs when a deployment is used.
├── Slurm Job ID: <SLURM_JOB_ID>
```
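
Remote locations in the output above use the `user@host:/path` form, which is also the form that `scp` and `rsync` accept. For custom tooling that needs the host and the directory separately, the string can be split with plain parameter expansion. This is only a sketch: `split_remote_path`, `REMOTE_HOST`, and `REMOTE_PATH` are hypothetical names, and it expects the bare location without the trailing `(remote)` marker.

```bash
#!/usr/bin/env bash
# Sketch: split "user@host:/path" into its host and path parts.
# split_remote_path, REMOTE_HOST and REMOTE_PATH are hypothetical names.
split_remote_path() {
  local remote="$1"
  case "$remote" in
    *:*) ;;            # contains a colon, so treat it as a remote location
    *) return 1 ;;     # no colon: a local path, nothing to split
  esac
  REMOTE_HOST="${remote%%:*}"   # everything before the first ':'
  REMOTE_PATH="${remote#*:}"    # everything after the first ':'
}
```

For example, after `split_remote_path "user@host:/shared/results/logs"`, `REMOTE_HOST` holds `user@host` and `REMOTE_PATH` holds `/shared/results/logs`. For simply fetching files, `--copy-logs` and `--copy-artifacts` remain the documented route.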

## kill - Kill Jobs

Stop running evaluations.
@@ -303,64 +359,6 @@ nemo-evaluator-launcher version
nemo-evaluator-launcher --version
```

## debug - Job Information and Debugging helper functionalities

Display detailed job information including metadata, configuration, and locations of logs and artifacts. The debug command is useful for troubleshooting job issues, inspecting configurations, and retrieving artifacts from both local and remote jobs.

### Basic Usage

```bash
# Show job metadata and information for a single or multiple jobs
nemo-evaluator-launcher debug <invocation_id>

nemo-evaluator-launcher debug <invocation_id1> <invocation_id2>
```

### Show Configuration

```bash
# Display the job configuration in YAML format
nemo-evaluator-launcher debug <invocation_id> --config
```

### Show Paths

```bash
# Show only artifact locations
nemo-evaluator-launcher debug <invocation_id> --artifacts

# Show only log locations
nemo-evaluator-launcher debug <invocation_id> --logs
```

For remote jobs (Slurm), paths are shown in the format `user@host:/path`.

### Copy Files Locally

```bash
# Copy logs to local directory (works for both local and remote jobs)
nemo-evaluator-launcher debug <invocation_id> --copy-logs [destination_dir]

# Copy artifacts to local directory (works for both local and remote jobs)
nemo-evaluator-launcher debug <invocation_id> --copy-artifacts [destination_dir]

# If no destination is specified, defaults to current directory
nemo-evaluator-launcher debug <invocation_id> --copy-logs
```


### Debug example for a slurm job

```bash
# Shows remote paths and Slurm job ID
nemo-evaluator-launcher debug abc12345
# Output includes:
# ├── Artifacts: user@host:/shared/results/artifacts (remote)
# ├── Logs: user@host:/shared/results/logs (remote)
# ├── Slurm Job ID: 12345678

```

## Environment Variables

The CLI respects environment variables for logging and task-specific authentication:
@@ -474,9 +472,9 @@ nemo-evaluator-launcher run --config-dir packages/nemo-evaluator-launcher/exampl
```bash
# Command-specific help
nemo-evaluator-launcher run --help
nemo-evaluator-launcher export --help
nemo-evaluator-launcher info --help
nemo-evaluator-launcher ls --help
nemo-evaluator-launcher debug --help
nemo-evaluator-launcher export --help

# General help
nemo-evaluator-launcher --help
2 changes: 1 addition & 1 deletion docs/nemo-fw/evaluation-doc.md
@@ -151,7 +151,7 @@ This section explains how to run evaluations with NeMo Run. For detailed informa

The [evaluation_with_nemo_run.py](https://github.com/NVIDIA-NeMo/Evaluator/blob/main/scripts/evaluation_with_nemo_run.py) script serves as a reference for launching evaluations with NeMo Run. It demonstrates how to use NeMo Run with both local executors (your local workstation) and Slurm executors (clusters). In this setup, the deploy and evaluate processes are launched as two separate NeMo Run jobs. The evaluate method waits until the PyTriton server is accessible and the model is deployed before starting the evaluations.

> **Note:** Please make sure to update HF_TOKEN in the NeMo Run script's [local_executor env_vars](https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#l267) with your HF_TOKEN if using local executor or in the [slurm_executor's env_vars](https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#l232) if using slurm_executor.
> **Note:** Set `HF_TOKEN` to your own token in the NeMo Run script: in the [local_executor env_vars](https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#L270) if you use the local executor, or in the [slurm_executor env_vars](https://github.com/nvidia-nemo/evaluator/blob/main/scripts/evaluation_with_nemo_run.py#L235) if you use the Slurm executor.

### Run Locally with NeMo Run

4 changes: 2 additions & 2 deletions docs/nemo-fw/evaluation-hf.md
@@ -23,14 +23,14 @@ python \
--use_vllm_backend
```

The `--model_path` can refer to either a local checkpoint path or a Hugging Face model ID, as shown in the example above. In the example above, checkpoint deployment uses the `vLLM` backend. To enable accelerated inference, install `vLLM` in your environment. To install `vLLM` inside the NeMo Framework container, follow the steps below as shared in [Export-Deploy's README](https://github.com/nvidia-nemo/export-deploy?tab=readme-ov-file#install-tensorrt-llm-vllm-or-trt-onnx-backend:~:text=cd%20/opt/export%2ddeploy%0auv%20sync%20%2d%2dinexact%20%2d%2dlink%2dmode%20symlink%20%2d%2dlocked%20%2d%2dextra%20vllm%20%24(cat%20/opt/uv_args.txt)):
The `--model_path` can refer to either a local checkpoint path or a Hugging Face model ID, as in the example above, where checkpoint deployment uses the `vLLM` backend. To enable accelerated inference, install `vLLM` in your environment. To install `vLLM` inside the NeMo Framework container, follow the steps below, taken from [Export-Deploy's README](https://github.com/nvidia-nemo/export-deploy?tab=readme-ov-file#install-tensorrt-llm-vllm-or-trt-onnx-backend):

```shell
cd /opt/Export-Deploy
uv sync --inexact --link-mode symlink --locked --extra vllm $(cat /opt/uv_args.txt)
```

To install `vLLM` outside of the NeMo Framework container, follow the steps mentioned [here](https://github.com/nvidia-nemo/export-deploy?tab=readme-ov-file#install-tensorrt-llm-vllm-or-trt-onnx-backend:~:text=install%20transformerengine%20%2b%20vllm).
To install `vLLM` outside of the NeMo Framework container, follow the [installation steps in the Export-Deploy README](https://github.com/nvidia-nemo/export-deploy?tab=readme-ov-file#install-tensorrt-llm-vllm-or-trt-onnx-backend).

If you prefer to evaluate the Automodel checkpoint without using the `vLLM` backend, remove the `--use_vllm_backend` flag from the command above.
