Description
Extension Type
New Feature
Feature Description
Hi,
With this issue, I wanted to suggest the addition of a device-side timer in SYCL. This can be incredibly useful for profiling to diagnose load balancing and starvation issues. Currently, to my knowledge, no such thing is available natively in SYCL.
Ideally I would propose the following:
- sycl::khr::device_globaltimer() -> f64 returns the current timer in nanoseconds (to follow the footsteps of the %%globaltimer special register in PTX), but the choice of the actual unit is open to discussions.
- sycl::khr::device_clock() -> u64 returns the current number of clock cycles (%%clock in PTX)
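To make the load-balancing use case concrete, here is a minimal host-only sketch of how such timestamps might be consumed. Everything in it is an assumption for illustration: `mock::device_globaltimer()` is a stand-in built on `std::chrono::steady_clock`, not the proposed SYCL function, and `load_imbalance` is a hypothetical helper. On a device, each work-group would presumably record the timer at entry and exit and write the difference to a buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

namespace mock {
// Host stand-in for the proposed nanosecond timer (hypothetical, not the SYCL API).
inline double device_globaltimer() {
    using namespace std::chrono;
    return static_cast<double>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}
} // namespace mock

// Ratio of the slowest worker's elapsed time to the mean elapsed time.
// 1.0 means perfectly balanced; larger values mean some workers wait on others.
inline double load_imbalance(const std::vector<double>& elapsed_ns) {
    assert(!elapsed_ns.empty());
    double sum = 0.0;
    double slowest = 0.0;
    for (double e : elapsed_ns) {
        sum += e;
        slowest = std::max(slowest, e);
    }
    const double mean = sum / static_cast<double>(elapsed_ns.size());
    return mean > 0.0 ? slowest / mean : 0.0;
}
```

Each entry of `elapsed_ns` would be the difference of two timer reads taken by one worker, for example `t1 - t0` with `t0 = mock::device_globaltimer()` before the work and `t1` after it.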
In terms of implementation, I already have a draft supporting both Acpp and intel/llvm on Nvidia & host. For Intel, @TApplencourt pointed me to some snippets suggesting that it is feasible quite directly. The only annoying case is AMD GPUs, where we must extrapolate the number of seconds from the clock frequency without direct access to a "true" timer.
Here is a header-only implementation (Host+CUDA) that I am already using inside Shamrock (although the names aren't aligned with what I'm proposing here).
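The extrapolation mentioned for AMD GPUs amounts to dividing a cycle count by the clock frequency. A minimal sketch, assuming the frequency is known and constant over the measured interval (which may not hold under dynamic frequency scaling); `cycles_to_ns` is a hypothetical helper name, not part of any existing API:

```cpp
#include <cassert>
#include <cstdint>

// Convert a cycle-count difference to nanoseconds.
// Assumes clock_hz (cycles per second) did not change during the interval.
inline double cycles_to_ns(std::uint64_t cycles, double clock_hz) {
    assert(clock_hz > 0.0);
    return static_cast<double>(cycles) * 1e9 / clock_hz;
}
```

For instance, 1000 cycles at 1 GHz correspond to 1000 ns. The result is only as accurate as the frequency estimate, which is why a true device timer is preferable where one exists.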
Related Functionality in C++
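std::chrono::high_resolution_clock::now() is the closest equivalent for the nanosecond timer. The cycle counter has no standard C++ counterpart; on x86-64 the closest thing is the rdtsc instruction, which reads the timestamp counter: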
#include <chrono>
#include <cstdint>
#include <iostream>

// GNU C++ x86_64 (thanks wikipedia)
// rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX.
extern inline uint64_t rdtsc() {
    uint64_t a, d;
    __asm__ volatile ("rdtsc" : "=a" (a), "=d" (d));
    return (d << 32) | a;
}

int main() {
    using namespace std::chrono;
    // Host analogue of the proposed nanosecond timer
    auto device_globaltimer = duration_cast<nanoseconds>(high_resolution_clock::now().time_since_epoch()).count();
    // Host analogue of the proposed cycle counter
    auto device_clock = rdtsc();
    std::cout << device_globaltimer << " " << device_clock << std::endl;
}

Related Functionality in Other Languages
OpenGL: https://registry.khronos.org/OpenGL/extensions/ARB/ARB_shader_clock.txt
Vulkan: https://docs.vulkan.org/refpages/latest/refpages/source/VK_KHR_shader_clock.html
Related SYCL Extensions
None to my knowledge