I'm trying to execute a JAX executable manually using the PJRT C API (CPU plugin). In general, I'm following the steps given in this discussion and was able to replicate everything up until the execution step.
I assume I'm doing something wrong while handing over the execute_arguments, but I'd really like to understand how XLA expects the arguments.
Error Context
The function I'm testing is:
import jax.numpy as jnp

def dummy(x, y):
    return jnp.dot(x, y) / 2.0
Input: two (2x2) float32 tensors
Output: one (2x2) float32 tensor
Therefore, I build two input buffers and fill them with dummy inputs ((1., 2., 3., 4.) and (2., 3., 4., 5.)). I also provide memory for the output PJRT_Buffer*.
I validated the args parameter at the final function call, as handed over in:
At this point, the execution crashes inside the .so plugin. It does not return a PJRT_Error; instead, AddressSanitizer reports:
AddressSanitizer:DEADLYSIGNAL
=================================================================
==4121==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x75842c8da013 bp 0x7fff828d5ff0 sp 0x7fff828d5fe0 T0)
==4121==The signal is caused by a READ memory access.
==4121==Hint: address points to the zero page.
#0 0x75842c8da013 (<unknown module>)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>)
==4121==ABORTING
Main questions
What is the expected layout of the args vector passed into CPU kernels?
Should there be three pointers? If so, what is the meaning of the third one?
Is my construction of argument_lists/output_lists incorrect?
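To make the three-pointer question concrete, here is a minimal sketch of the layout these questions assume: two input arguments followed by one output argument, each described by a data pointer and a byte size. MockKernelArg, MockCallFrame, MockDotKernel and RunMockDot are hypothetical stand-ins, not the real XLA types, and the sketch mirrors only the num_args/args part of the call frame. Whether the real CPU kernel uses exactly this ordering is the assumption being asked about.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a kernel argument: NOT the real XLA definition.
struct MockKernelArg {
  void* data;        // pointer to the flattened buffer
  std::size_t size;  // size of the buffer in bytes
};

// Hypothetical stand-in for the call frame (workgroup fields omitted).
struct MockCallFrame {
  std::size_t num_args;
  const MockKernelArg* args;
};

// Hypothetical kernel computing out = dot(x, y) / 2 on row-major 2x2 float32.
// Assumes args[0] and args[1] are the inputs and args[2] is the output.
int MockDotKernel(const MockCallFrame* frame) {
  if (frame == nullptr || frame->num_args != 3) return 1;
  for (std::size_t i = 0; i < frame->num_args; ++i) {
    // A null data pointer here is the kind of bug that shows up as a
    // near-zero-address SEGV once the kernel dereferences it.
    if (frame->args[i].data == nullptr) return 2;
    if (frame->args[i].size != 4 * sizeof(float)) return 3;
  }
  const float* x = static_cast<const float*>(frame->args[0].data);
  const float* y = static_cast<const float*>(frame->args[1].data);
  float* out = static_cast<float*>(frame->args[2].data);
  for (int r = 0; r < 2; ++r) {
    for (int c = 0; c < 2; ++c) {
      float acc = 0.0f;
      for (int k = 0; k < 2; ++k) acc += x[r * 2 + k] * y[k * 2 + c];
      out[r * 2 + c] = acc / 2.0f;
    }
  }
  return 0;
}

// Builds the three-argument frame with the dummy inputs from the post.
std::array<float, 4> RunMockDot() {
  std::array<float, 4> x = {1.f, 2.f, 3.f, 4.f};
  std::array<float, 4> y = {2.f, 3.f, 4.f, 5.f};
  std::array<float, 4> out = {};
  MockKernelArg args[3] = {
      {x.data(), x.size() * sizeof(float)},
      {y.data(), y.size() * sizeof(float)},
      {out.data(), out.size() * sizeof(float)},
  };
  MockCallFrame frame = {3, args};
  int status = MockDotKernel(&frame);
  assert(status == 0);
  (void)status;
  return out;
}
```

With the dummy inputs above, RunMockDot() yields (5, 6.5, 11, 14.5), which is also the result the real executable should produce once the crash is resolved.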
Any insight would be greatly appreciated, as I've been digging into this with gdb for quite a while.
Thanks in advance! If you need any more information, I'll happily provide more context.
Hello everyone,
After extensive gdb debugging, I traced the crash to the following line from xla/xla/backends/cpu/runtime/kernel.h:
I validated the args parameter at the final function call, as handed over in:
XLA_CPU_KernelCallFrame call_frame = {&num_workgroups, &workgroup_id, args.size(), args.data()};
args.size() is 3
args.data() are the flattened input buffers (correct).
dot()-function
From my understanding, the third pointer is the output pointer. At this point I'm unsure whether:
Environment
bazel build //xla/pjrt/c:pjrt_api_c_cpu_plugin.so -c dbg
inside of the xla root repository
How I generated the executable:
How I run it in C++:
After loading the plugin and api and creating an api client, I deserialize and load the executable as follows:
Then, after getting the execution device, I get the input buffers as follows, doing the same for my x and y input:
Then, after repeating this for the y-buffer, I finally call the Execute-function:
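A minimal sketch of the nested pointer layout that argument_lists and output_lists are usually expected to have for a single device, assuming the outer index is the device and the inner index is the argument or output. MockBuffer, MockExecuteArgs, MockExecute and RunSingleDeviceLayout are hypothetical stand-ins; the field types are modeled on PJRT_LoadedExecutable_Execute_Args as declared in pjrt_c_api.h, but this is not the real struct and should be checked against the header in use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical opaque stand-in for PJRT_Buffer.
struct MockBuffer {
  int id;
};

// Hypothetical mirror of only the pointer-related execute fields.
struct MockExecuteArgs {
  MockBuffer* const* const* argument_lists;  // [num_devices][num_args]
  std::size_t num_devices;
  std::size_t num_args;
  MockBuffer** const* output_lists;          // [num_devices][num_outputs]
};

// Hypothetical runtime: writes one output buffer pointer per device into the
// caller-allocated inner output array.
void MockExecute(const MockExecuteArgs* args, MockBuffer* result) {
  for (std::size_t d = 0; d < args->num_devices; ++d) {
    assert(args->output_lists[d] != nullptr);  // inner array must exist
    args->output_lists[d][0] = result;
  }
}

// Builds the layout for one device, two inputs (x, y) and one output,
// then returns the id of the buffer written into the output slot.
int RunSingleDeviceLayout() {
  MockBuffer x{0};
  MockBuffer y{1};
  MockBuffer result{42};

  // Inner arrays: one entry per argument / per output on device 0.
  MockBuffer* device0_args[] = {&x, &y};
  MockBuffer* device0_outputs[] = {nullptr};  // filled in by the runtime

  // Outer arrays: one entry per device.
  MockBuffer* const* argument_lists[] = {device0_args};
  MockBuffer** output_lists[] = {device0_outputs};

  MockExecuteArgs args{};
  args.argument_lists = argument_lists;
  args.num_devices = 1;
  args.num_args = 2;
  args.output_lists = output_lists;

  MockExecute(&args, &result);

  // The second argument on device 0 is still y.
  assert(args.argument_lists[0][1]->id == 1);
  return device0_outputs[0]->id;
}
```

The point of the sketch is that both lists need two levels of caller-owned storage: passing a flat array of buffers where an array of per-device arrays is expected would make the runtime dereference a buffer pointer as if it were an inner array.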