reboost should be fast #27
Comments
@willquinn I made some tests regarding our sorting issues.
E.g. for the time grouping I was considering exploiting the fact that steps within a track should be naturally time sorted. @gipert what do you think? Here is the code:

    import numpy as np
    import awkward as ak
    import numba

    # generate some random "times" and "idxs"
    idx = np.sort(np.random.randint(0, 10**3, size=10**7))
    time = np.random.random(size=10**7)
    arr = ak.Array({"idx": idx, "time": time})
    arr = ak.unflatten(arr, ak.run_lengths(arr.idx))

First I tried a basic ak.sort:

    %%time
    # try a standard ak.sort
    sort = ak.flatten(ak.sort(arr, axis=-1)).to_numpy()

    CPU times: user 971 ms, sys: 104 ms, total: 1.07 s
    Wall time: 1.07 s

Then an argsort:

    %%time
    sort = arr[ak.argsort(arr.time, axis=-1)]

    CPU times: user 795 ms, sys: 88.1 ms, total: 883 ms
    Wall time: 880 ms

Then a lexsort:

    %%time
    # sort first by idx then by time
    sort = np.lexsort((time, idx))

    CPU times: user 3.35 s, sys: 16 ms, total: 3.36 s
    Wall time: 3.36 s

Finally I wrote a simple numba jit function:

    from numba.typed import List

    @numba.njit
    def sort_subarrays(arr_list):
        # sort each sub-array independently
        for i in range(len(arr_list)):
            arr_list[i] = np.sort(arr_list[i])
        return arr_list

But it requires converting the argument to a numba list, which is very slow:

    %%time
    arr_l = List(arr.time.to_list())

    CPU times: user 6.15 s, sys: 184 ms, total: 6.33 s
    Wall time: 6.33 s

Finally, I compile the function (run once), then run it again:

    %%time
    a = sort_subarrays(arr_l)

    CPU times: user 777 ms, sys: 4.13 ms, total: 781 ms
    Wall time: 775 ms
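One way the "steps within a track are already time sorted" idea could be exploited is to check the ordering first and only sort when the check fails. The following is a minimal sketch with plain NumPy, not something taken from reboost: the function names `is_time_sorted_within_groups` and `time_order` are made up for illustration, and it assumes `idx` is already sorted so that equal values are contiguous, as in the test above.

```python
import numpy as np


def is_time_sorted_within_groups(idx, time):
    """Return True if `time` is non-decreasing inside every run of equal `idx`."""
    idx = np.asarray(idx)
    time = np.asarray(time)
    if idx.size < 2:
        return True
    same_group = idx[1:] == idx[:-1]
    non_decreasing = time[1:] >= time[:-1]
    # a decrease in time only matters if both entries belong to the same group
    return bool(np.all(non_decreasing | ~same_group))


def time_order(idx, time):
    """Indices ordering the data by (idx, time); assumes `idx` is already sorted."""
    if is_time_sorted_within_groups(idx, time):
        # nothing to do, skip the expensive sort
        return np.arange(len(idx))
    return np.lexsort((time, idx))


idx = np.array([0, 0, 0, 1, 1, 2])
sorted_time = np.array([0.1, 0.2, 0.9, 0.0, 0.5, 0.3])
shuffled_time = np.array([0.9, 0.1, 0.2, 0.5, 0.0, 0.3])

print(is_time_sorted_within_groups(idx, sorted_time))
print(time_order(idx, shuffled_time))
```

The check is a single vectorised pass over the data, so it should be cheap compared to any of the sorts timed above, but whether it pays off depends on how often the input really is pre-sorted.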
This is certainly not a good idea, maybe there is something useful here: https://awkward-array.org/doc/main/user-guide/how-to-use-in-numba-intro.html

Probably this is more helpful: https://awkward-array.org/doc/main/user-guide/how-to-use-in-numba-features.html#casting-one-dimensional-arrays-as-numpy
This compiles and runs, for example:
(just to illustrate that we can call np functions directly on ak arrays; I did not bother to implement the whole sorting right now)
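As a separate illustration of how the slow conversion to a numba typed `List` might be avoided, here is a minimal sketch that works on the flat `time` array plus group offsets instead of a list of sub-arrays. It is an assumption of mine, not the snippet referred to above: `group_offsets` and `sort_segments` are hypothetical names, `idx` is assumed to be sorted already, and the code falls back to plain Python if numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback so the sketch still runs without numba
    def njit(func):
        return func


def group_offsets(idx):
    """Start/stop offsets of each run of equal values in a sorted `idx`."""
    idx = np.asarray(idx)
    boundaries = np.flatnonzero(idx[1:] != idx[:-1]) + 1
    return np.concatenate(([0], boundaries, [idx.size])).astype(np.int64)


@njit
def sort_segments(values, offsets):
    """Sort `values` independently inside each [offsets[i], offsets[i+1]) segment."""
    out = np.empty_like(values)
    for i in range(len(offsets) - 1):
        start, stop = offsets[i], offsets[i + 1]
        out[start:stop] = np.sort(values[start:stop])
    return out


idx = np.array([0, 0, 0, 1, 1, 2])
time = np.array([0.9, 0.1, 0.2, 0.5, 0.0, 0.3])

offsets = group_offsets(idx)
print(offsets)
print(sort_segments(time, offsets))
```

Both inputs are ordinary NumPy arrays, so no per-call conversion is needed. I have not timed this against the `ak.sort` and `ak.argsort` results above, so it may well not be faster.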

I doubt we will get an improvement over
We should discuss checks on reboost performance, identify the limiting factors etc.
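As a starting point for identifying the limiting factors, a processing function could be run under the standard-library profiler. This is only a sketch: `profile_call` is a hypothetical helper and `toy_workload` is a stand-in for whatever reboost function we want to inspect, not an actual reboost API.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run `func` under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # show the `top` entries with the largest cumulative time
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def toy_workload(n):
    """Hypothetical stand-in for a real processing step."""
    return sum(sorted(range(n, 0, -1)))


result, report = profile_call(toy_workload, 1000)
print(report)
```

The report lists the most expensive calls by cumulative time, which should make it easier to see whether sorting, grouping, or I/O dominates.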