Investigate data processing performance #16
Can we have a flamegraph or profile for a common load, to see what could be improved? If most of the work is spent on base64 decoding and decompression, then it makes little sense to parallelize the rest of parsing. We could still improve the single-threaded case so that the design stays simpler, and use parallelism for parsing multiple levels rather than for speeding up a single one.
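The "parallelism across levels" idea can be sketched with scoped threads from std, one thread per level. This is only an illustration, not the project's actual API: `process_level` here is a hypothetical stand-in for the real single-threaded decode + parse pipeline, and a real implementation would likely use a thread pool rather than spawning per level.

```rust
use std::thread;

// Hypothetical stand-in for the single-threaded pipeline
// (base64 decode + decompress + parse); here it just returns the
// input length so the sketch is self-contained.
fn process_level(raw: &str) -> usize {
    raw.len()
}

// Process several levels concurrently, one scoped thread per level,
// keeping the per-level pipeline itself single-threaded.
fn process_levels(raws: &[&str]) -> Vec<usize> {
    thread::scope(|scope| {
        let handles: Vec<_> = raws
            .iter()
            .map(|raw| scope.spawn(move || process_level(raw)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Results come back in input order because we collect all handles first and then join them in order, so parallelism does not change observable output.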
Approach tried: #17 was meant to reduce the cost of float parsing, which accounts for ~10% of that flamegraph. It failed: std is already very fast for the few-digit case, other crates (specifically lexical) didn't really help here, and I doubt the situation can be improved further without hacking a custom approach that only considers a couple of digits, or uses fixed point.
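For reference, the "couple of digits / fixed point" idea could look something like the sketch below: parse values with at most two fractional digits into an integer scaled by 100, and return `None` (falling back to std's full float parsing) for anything else. The function name and the two-digit scale are illustrative assumptions, not something from the codebase.

```rust
// Hypothetical fast path: parse strings like "12.34" into fixed-point
// (value * 100). Returns None for shapes it doesn't handle, so the
// caller can fall back to str::parse::<f64>().
fn parse_fixed2(s: &str) -> Option<i64> {
    let neg = s.starts_with('-');
    let s = s.strip_prefix('-').unwrap_or(s);
    let (int_part, frac_part) = s.split_once('.').unwrap_or((s, ""));
    if frac_part.len() > 2 || int_part.is_empty() {
        return None; // unusual shape: fall back to full float parsing
    }
    let int: i64 = int_part.parse().ok()?;
    let frac: i64 = if frac_part.is_empty() {
        0
    } else {
        frac_part.parse().ok()?
    };
    // One fractional digit means tenths, so scale it up to hundredths.
    let scale = if frac_part.len() == 1 { 10 } else { 1 };
    let magnitude = int * 100 + frac * scale;
    Some(if neg { -magnitude } else { magnitude })
}
```

Whether this beats std in practice is exactly what the flamegraph would have to show; the point of the sketch is just that skipping general float handling is the only remaining lever.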
Other note: using the miniz_oxide backend for flate2, and enabling LTO, didn't change performance in any impactful way either. Since base64 decoding also has an insignificant (&lt; 1%) impact on performance, we are pretty much left with optimizing the peek parts of the parser.
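For anyone wanting to reproduce this, the configuration tried amounts to a Cargo features/profile change along these lines (`rust_backend` is flate2's feature name for the miniz_oxide backend; exact version number is illustrative):

```toml
# Select the pure-Rust miniz_oxide backend for flate2.
[dependencies]
flate2 = { version = "1", default-features = false, features = ["rust_backend"] }

# Enable link-time optimization for release builds.
[profile.release]
lto = true
```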
Other ideas left to try:
Right now, level data processing as implemented in #15 is single-threaded. Back in GDCF it was parallelized, but there are some things to consider: