Add flag ROC_SIGNAL_POOL_SIZE_PROFILE and update default value #71
The change is two-fold:
Add a flag to configure the pool size used when profiling, so that the two values (profiling vs. non-profiling) can be configured independently.
Reduce the default size of the profiling pool from 4096, which was too large: the kernel only provides 4094 events for dGPUs, and using two command queues in OpenCL results in the bug described here:
OpenCL hot loop (100% one thread) when using two command queues with profiling ROCR-Runtime#186
The rationale for the new default of 1000: it allows up to 4 queues (4 × 1000 = 4000 signals) to fit within the 4094 events provided by the kernel, leaving a margin of 94.
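
The sizing argument can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not code from this change: the helper names are hypothetical, and it assumes that each profiling queue consumes one full pool and that every signal in a pool is backed by one kernel event.

```python
# Number of events the kernel provides for dGPUs, as stated in the description.
KERNEL_EVENT_LIMIT = 4094


def queues_that_fit(pool_size, event_limit=KERNEL_EVENT_LIMIT):
    """Return how many full pools of `pool_size` signals fit in the event limit.

    Assumption (not confirmed by the source): one pool per profiling queue,
    one kernel event per signal.
    """
    return event_limit // pool_size


def spare_events(pool_size, num_queues, event_limit=KERNEL_EVENT_LIMIT):
    """Return the events left over after `num_queues` pools are allocated.

    A negative result means the pools ask for more events than exist.
    """
    return event_limit - pool_size * num_queues


if __name__ == "__main__":
    for size in (4096, 1000):
        print(size, queues_that_fit(size), spare_events(size, 4))
```

Under these assumptions, a pool of 4096 does not fit even once within 4094 events, while a pool of 1000 fits four times with 94 events to spare.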