Energy Benchmarks
Energy is currently measured in-program with zeus. Because of how zeus and our energy measurement implementation work, CPU measurements require root privileges. GPU measurements need no additional privileges.
A measurement window is opened before the generation process for a given model and closed after it. The model receives a dataset and generates output for each data point iteratively. Because the measurement window covers only the generation phase, dataset loading is excluded from the measurements.
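The window idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not qtransform's implementation and not the zeus API: `FakeEnergyCounter` and `measurement_window` are hypothetical names, and the joule values are made up. The point is that work done before the window opens (here, a pretend dataset load) does not show up in the result.

```python
import time
from contextlib import contextmanager


class FakeEnergyCounter:
    """Hypothetical stand-in for a cumulative hardware energy counter (joules)."""

    def __init__(self):
        self.joules = 0.0

    def consume(self, joules):
        # Simulate the device drawing energy.
        self.joules += joules

    def read(self):
        return self.joules


@contextmanager
def measurement_window(counter, result):
    """Record energy and time spent between entering and leaving the block."""
    start_joules = counter.read()
    start_time = time.perf_counter()
    try:
        yield result
    finally:
        result["energy_j"] = counter.read() - start_joules
        result["time_s"] = time.perf_counter() - start_time


counter = FakeEnergyCounter()
counter.consume(5.0)  # pretend dataset loading: happens outside the window
dataset = ["a", "b", "c"]  # made-up data points

result = {}
with measurement_window(counter, result):
    for point in dataset:
        counter.consume(2.0)  # pretend each generation step costs 2 J
```

After the block runs, `result["energy_j"]` holds only the energy drawn inside the window; the 5 J spent on loading is not included.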
Example without any special configurations:
qtransform run=energy model=EB_gpt_small_l1_h2 dataset=tsV2 tokenizer=tsV2 run.max_iters=1 wandb.enabled=False
Example using run configurations for idle time and the maximum number of iterations:
qtransform run=energy model=EB_gpt_small_l1_h2 dataset=tsV2 tokenizer=tsV2 run.max_iters=10 run.idle_time=60 wandb.enabled=False
idle_time: 0
# iterations to generate with given dataset to "preheat" the running device
preheat:
  max_iters: 0
# generation configuration
max_new_tokens: 512
temperature: 0.7
top_k: 200
max_iters: ???
out:
  path: null # path where the results should be saved to
Most run options are already explained on the qtransform page. Check there for more info.
idle_time defaults to 0. When idle_time is greater than 0, the energy consumed while the program idles is measured. This idle measurement happens before any generation is done.
The energy measured during idle time is intended to show the spike from idling to preheating/generation, which is useful for visualization.
With preheating, generation with the given dataset and model is run before the actual energy measurement of the generation phase. This serves to warm up caches.
preheat.max_iters defaults to 0. Preheating only occurs if this number is greater than 0.
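The order of the phases described above (idle, then preheat, then measured generation) can be sketched as follows. This is a simplified illustration, not qtransform's code: `run_phases`, `generate`, and `sleep` are hypothetical names, and it assumes that both iteration limits count data points starting from the beginning of the dataset, which the text above does not confirm.

```python
import time


def run_phases(data, generate, idle_time=0, preheat_iters=0, max_iters=1,
               sleep=time.sleep):
    """Return the sequence of phases that would run, in order."""
    phases = []
    if idle_time > 0:
        # Idle phase: nothing is generated, the device just sits idle.
        sleep(idle_time)
        phases.append("idle")
    # Preheat phase: generate without measuring, to warm up caches.
    for _, point in zip(range(preheat_iters), data):
        generate(point)
        phases.append("preheat")
    # Measured phase: this is where the measurement window would be open.
    for _, point in zip(range(max_iters), data):
        generate(point)
        phases.append("measure")
    return phases


# Made-up data points and a no-op generator; sleep is stubbed out so the
# example returns immediately.
phases = run_phases(
    ["x", "y", "z"],
    generate=lambda point: None,
    idle_time=60,
    preheat_iters=2,
    max_iters=3,
    sleep=lambda seconds: None,
)
```

With both options left at their default of 0, only the measured phase runs.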
out.path is the path where the results are saved. If it is not specified, the results are printed to the console instead.
The first time the energy command is executed with a path specified, it generates a run.txt, which keeps track of the current run (a run being one execution of the energy command). For each run, a new folder is created, named with an incrementing number that starts at 1.
Example structure of how the results are stored with run.out.path="/home/user/qtransform/energy_measurements":
qtransform/
├── energy_measurements
│ ├── 1
│ │ ├── energy_verbose.csv
│ │ └── run_cfg.txt
│ ├── 2
│ │ ├── energy_verbose.csv
│ │ └── run_cfg.txt
│ ├── n
│ │ ├── ...
│ │ └── ...
│
...
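One way such run bookkeeping could work is sketched below. It is an assumption-laden illustration, not the actual qtransform code: the helper name `next_run_dir` is hypothetical, and the sketch assumes run.txt contains just the number of the most recent run, which the description above does not specify.

```python
import tempfile
from pathlib import Path


def next_run_dir(out_path):
    """Create and return the folder for the next run under out_path."""
    out = Path(out_path)
    out.mkdir(parents=True, exist_ok=True)
    counter_file = out / "run.txt"
    # Assumed format: run.txt holds the last run number as plain text.
    last_run = int(counter_file.read_text().strip()) if counter_file.exists() else 0
    current_run = last_run + 1
    run_dir = out / str(current_run)
    run_dir.mkdir()
    counter_file.write_text(str(current_run))
    return run_dir


# Try it in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "energy_measurements"
    first = next_run_dir(base)
    second = next_run_dir(base)
    names = (first.name, second.name)
    counter_text = (base / "run.txt").read_text()
```

Keeping the counter in a file rather than scanning existing folders means the numbering continues even if older run folders are deleted.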
energy_verbose.csv stores, for each iteration over the given dataset, the measured energy in joules, the elapsed time in seconds, and the start and end timestamps for that data point. The .csv file has the following header:
| time(s) | cpu_energy(J) | gpu_energy(J) | start | end |
|---|---|---|---|---|
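A results file with this header could be summarized with the standard library alone. The sketch below assumes the file is comma-delimited and that the timestamps are plain numbers; neither is confirmed above. The rows in `SAMPLE` are invented purely for illustration and are not real measurements, and `summarize` is a hypothetical helper name.

```python
import csv
import io

# Invented rows for illustration only; not real measurement results.
SAMPLE = """time(s),cpu_energy(J),gpu_energy(J),start,end
2.0,30.0,150.0,100.0,102.0
4.0,50.0,310.0,102.0,106.0
"""


def summarize(csv_text):
    """Total up time and energy over all iterations in the CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total_time = sum(float(row["time(s)"]) for row in rows)
    cpu_joules = sum(float(row["cpu_energy(J)"]) for row in rows)
    gpu_joules = sum(float(row["gpu_energy(J)"]) for row in rows)
    total_joules = cpu_joules + gpu_joules
    return {
        "iterations": len(rows),
        "time_s": total_time,
        "cpu_j": cpu_joules,
        "gpu_j": gpu_joules,
        # Average power in watts is energy divided by time.
        "avg_power_w": total_joules / total_time if total_time else 0.0,
    }


summary = summarize(SAMPLE)
```

To read a real file, the text could be loaded with `Path(...).read_text()` and passed to `summarize`, adjusting the delimiter if the actual format differs.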
run_cfg.txt stores the run configuration used for that run.