chapters/metrics_collection/LevelZero.md (+10 -10)
@@ -1,4 +1,4 @@
-# Metrics Collectoin with oneAPI Level Zero (Level Zero) Metric API
+# Metrics Collection with oneAPI Level Zero (Level Zero) Metric API
 ## Overview
 oneAPI Level Zero runtime provides a set of API to collect hardware metrics for Intel(R) Processor Graphics.
@@ -40,7 +40,7 @@ ze_result_t status = ZE_RESULT_SUCCESS;
 status = zeInit(ZE_INIT_FLAG_GPU_ONLY);
 assert(status == ZE_RESULT_SUCCESS);
 ```
-Every group and seprate metric contain a list of properties that helps to determine their purpose (like name or description) as well as to grab some additional information that should be used while collecton (e.g. metric type or domain ID).
+Every group and separate metric contain a list of properties that helps to determine their purpose (like name or description) as well as to grab some additional information that should be used while collecting (e.g. metric type or domain ID).
 ```cpp
 uint32_t group_count = 0;
 status = zetMetricGroupGet(device, &group_count, nullptr);
-Process of metrics collection in continuous mode assumes that there is an infinite loop in a seprate thread, where one asks for collected samples periodically, read the data for a chunk of samples and store them into some memory or file (one sample contains all the metics from a metric group).
+The process of metrics collection in continuous mode assumes that there is an infinite loop in a separate thread, where one asks for collected samples periodically, read the data for a chunk of samples and store them into some memory or file (one sample contains all the metrics from a metric group).

-First one should choose a metric group to collect, that can be done e.g. while enumeration phase (see above). To activate it, one should call `zetContextActivateMetricGroups` function that will set up the hardware:
+First one should choose a metric group to collect, that can be done e.g. while in the enumeration phase (see above). To activate it, one should call `zetContextActivateMetricGroups` function that will set up the hardware:
 ```cpp
 status = zetContextActivateMetricGroups(context, device, 1, &group);
 status = zeEventCreate(event_pool, &event_desc, &event);
 assert(status == ZE_RESULT_SUCCESS);
 ```
-Metric tracer object performs the management of metrics collection process. Before its creation one should set target sampling interval (time on GPU in nanoseconds between two reports) and number of reports to collect in one chunk:
+The metric tracer object performs the management of metrics collection process. Before its creation one should set target sampling interval (time on GPU in nanoseconds between two reports) and number of reports to collect in one chunk:
 After the metrics will be collected and calculated, one can find previously set marker values in `RenderStreamMarker` metric for two reports - the first one will be collected before the target activity, and the second one - after.

-Each of these reports contains `QueryBeginTime` value that shows the time in nanosends when the report measurement is started. So to get activity execution time, one should compute the difference between these two values.
+Each of these reports contains `QueryBeginTime` value that shows the time in nanoseconds when the report measurement is started. So to get activity execution time, one should compute the difference between these two values.
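
The difference described here can be sketched as follows, assuming the calculated reports have already been reduced to pairs of `RenderStreamMarker` and `QueryBeginTime` values. The `MarkedReport` structure and the `FindActivityDurationNs` helper are hypothetical names introduced only for illustration; they are not part of the Level Zero API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reduced view of one calculated report (not a Level Zero type):
// only the two metrics needed for timing are kept.
struct MarkedReport {
  uint64_t marker;               // value of the RenderStreamMarker metric
  uint64_t query_begin_time_ns;  // value of the QueryBeginTime metric
};

// Finds the first report carrying begin_marker and the first later report
// carrying end_marker, then stores the QueryBeginTime difference in
// *duration_ns. Returns false if no such pair of reports is found.
bool FindActivityDurationNs(const std::vector<MarkedReport>& reports,
                            uint64_t begin_marker, uint64_t end_marker,
                            uint64_t* duration_ns) {
  assert(duration_ns != nullptr);
  bool begin_found = false;
  uint64_t begin_time_ns = 0;
  for (const MarkedReport& report : reports) {
    if (!begin_found) {
      if (report.marker == begin_marker) {
        begin_found = true;
        begin_time_ns = report.query_begin_time_ns;
      }
    } else if (report.marker == end_marker) {
      if (report.query_begin_time_ns < begin_time_ns) {
        return false;  // timestamps are out of order, nothing sensible to report
      }
      *duration_ns = report.query_begin_time_ns - begin_time_ns;
      return true;
    }
  }
  return false;
}
```

The helper takes two marker values so that it works both when the same value is used before and after the activity and when the two values differ.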

 ### Query-Based Collection
 An alternate way to collect hardware metrics is query-based mode that allows one to map collected results directly to device activity, e.g. kernel execution or memory transfer.
 status = zeEventPoolCreate(context, &event_pool_desc, 0, nullptr, &event_pool);
 assert(status == ZE_RESULT_SUCCESS);
 ```
-This time one may have to deal with multiple metric queries at a time (e.g. if multiple kernels are submitted to execution simultaniously), so the size for both pools may be greater than one.
+This time one may have to deal with multiple metric queries at a time (e.g. if multiple kernels are submitted to execution simultaneously), so the size for both pools may be greater than one.

 The process of metric query submission looks like this:
 status = zetContextActivateMetricGroups(context, device, 0, nullptr);
 assert(status == ZE_RESULT_SUCCESS);
 ```
-Each calculated quety metric report contains the field `GpuTime` that provides the time spent on GPU in nanoseconds - so it can be used to measure kernel/transfer execution time on the device.
+Each calculated query metric report contains the field `GpuTime` that provides the time spent on GPU in nanoseconds - so it can be used to measure kernel/transfer execution time on the device.

 ### Calculation
 Metric reports collected at the previous stages are in raw format and should be post-calculated to become user-readable. To perform calculations one should call `zetMetricGroupCalculateMetricValues` function for target metric group.
 Starting from version 1.1, Level Zero provides this value as `timerResolution` field of `ze_device_properties_t` structure in cycles per second. Also it can be retrieved with the help of Intel(R) Metrics Discovery Application Programming Interface as part of device information as `GpuTimestampFrequency` symbol (look into "Device Information" section from [here](./MetricsDiscoveryAPI.md) for details).
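
As a minimal sketch of how such a frequency could be applied, the helper below turns a number of GPU timer cycles into nanoseconds. It assumes the value is expressed in cycles per second, as stated for `timerResolution` above; `CyclesToNs` and `kNsInSec` are hypothetical names, not part of Level Zero:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kNsInSec = 1000000000ull;

// Hypothetical helper: converts GPU timer cycles into nanoseconds,
// assuming cycles_per_second is the timer frequency described above.
uint64_t CyclesToNs(uint64_t cycles, uint64_t cycles_per_second) {
  assert(cycles_per_second > 0);
  // Handle whole seconds and the remainder separately, so the multiplication
  // by 1e9 is applied only to a value smaller than the frequency. This stays
  // within 64 bits for frequencies below roughly 18 GHz.
  uint64_t seconds = cycles / cycles_per_second;
  uint64_t remainder = cycles % cycles_per_second;
  return seconds * kNsInSec + remainder * kNsInSec / cycles_per_second;
}
```

Multiplying `cycles` by 1e9 directly would be shorter, but a raw 64-bit counter can overflow that product, which is why the sketch splits the value first.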

 #### OpenCL(TM) Kernels
-Common stragety of metrics to kernel mapping for OpenCL(TM) kernels may be the following:
+A common strategy of metrics to kernel mapping for OpenCL(TM) kernels may be the following:
 1. Collect kernel timestamps based on [OpenCL(TM) device activity tracing](../device_activity_tracing/OpenCL.md) mechanism;
 2. Convert device timestamps into host timestamps with the help of `clGetDeviceAndHostTimer` function (Time Correlation section [here](../device_activity_tracing/OpenCL.md));
 3. Convert host timestamps into Level Zero kernel timestamps with the help of `zeDeviceGetGlobalTimestamps` function (Time Correlation section [here](../device_activity_tracing/LevelZero.md));
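
Steps 2 and 3 both come down to shifting a timestamp from one clock to another using a pair of values captured at the same moment. The sketch below shows only that arithmetic, under two assumptions: all timestamps are already expressed in nanoseconds, and both clocks advance at the same rate. `SyncPoint` and `ConvertTimestamp` are hypothetical names; in practice the pairs would be obtained from `clGetDeviceAndHostTimer` and `zeDeviceGetGlobalTimestamps`:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of timestamps (in nanoseconds) captured at the same
// moment on two different clocks.
struct SyncPoint {
  uint64_t source_ns;  // value read from the clock being converted from
  uint64_t target_ns;  // value read from the clock being converted to
};

// Shifts ts_ns from the source clock to the target clock by preserving its
// distance to the sync point (assumes both clocks tick at the same rate).
uint64_t ConvertTimestamp(uint64_t ts_ns, const SyncPoint& sync) {
  if (ts_ns >= sync.source_ns) {
    return sync.target_ns + (ts_ns - sync.source_ns);
  }
  uint64_t delta = sync.source_ns - ts_ns;
  assert(sync.target_ns >= delta);  // result must not go below zero
  return sync.target_ns - delta;
}
```

Chaining two conversions, first with an OpenCL(TM) device-to-host sync point and then with a host-to-Level Zero one, gives a kernel timestamp in the Level Zero time domain.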