
Investigate data processing performance #16

Open
stadust opened this issue Jul 30, 2020 · 5 comments
stadust (Owner) commented Jul 30, 2020

Right now, level data processing as implemented in #15 is single-threaded. Back in GDCF it was parallelized, but there are some things to consider:

  • In dash-rs, base64 decoding takes around 20% to 30% of processing time. According to this, which mgostIH pointed out, we can only roughly halve processing time using parallelization
  • In GDCF the use case was different: GDCF stored the already-decoded level string in a database, so actual parsing of the string accounted for nearly 100% of processing time. In dash-rs, the intended use case is to parse the level string once and then store it in a completely different data format using serde
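One way to sanity-check the "only roughly halve" claim is Amdahl's law, under the assumption (mine, not from the linked source) that the base64 step stays serial at the measured 20-30% while the rest of the processing is parallelized:

```rust
// Amdahl's law: overall speedup when a `serial` fraction of the work
// cannot be parallelized and the remainder is split across `threads`.
fn amdahl_speedup(serial: f64, threads: f64) -> f64 {
    1.0 / (serial + (1.0 - serial) / threads)
}

fn main() {
    // With base64 at ~25% of total time (midpoint of the 20-30%
    // measurement) kept serial, four threads give roughly a 2x speedup,
    // i.e. processing time is approximately halved.
    println!("4 threads: {:.2}x", amdahl_speedup(0.25, 4.0));
    // Even with an effectively unlimited thread count, the speedup is
    // capped at 1/serial = 4x.
    println!("limit: {:.2}x", amdahl_speedup(0.25, 1e9));
}
```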
mgostIH (Collaborator) commented Jul 31, 2020

Can we have a flamegraph or profile for a common load, to see what could be improved? If most of the work is spent on base64 and decompression, then it makes little sense to parallelize the rest of the parsing. We could still improve the single-threaded case so that the design stays simpler, and use parallelism for parsing multiple levels rather than for speeding up a single one.

stadust (Owner, Author) commented Jul 31, 2020

Alright, here we go: I made it process ocular miracle 100 times.

[flamegraph]

mgostIH changed the title from "Parallelize level data processing" to "Investigate data processing performance" on Aug 1, 2020
mgostIH (Collaborator) commented Aug 1, 2020

Approach tried: #17 was meant to decrease the cost of float parsing, which accounts for ~10% of that flamegraph. However, it failed: std is already very fast for the few-digit case, and other crates (specifically lexical) didn't really help here. I doubt the situation can be improved further without hacking a custom approach that would only consider a couple of digits or use fixed point.
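For the record, the "couple of digits / fixed point" idea could look roughly like this (a hypothetical sketch, not what #17 did): accumulate the integer and fractional digits as integers and combine them at the end. It only handles plain `-?digits[.digits]` input with at most 19 digits per part, and may round differently from std in edge cases.

```rust
// Hypothetical fixed-point-style float parser for short decimal strings.
// No exponents, no infinities; returns None on any unexpected byte.
fn parse_fixed(s: &str) -> Option<f64> {
    let (neg, s) = match s.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, s),
    };
    let (mut int_part, mut frac_part, mut frac_scale) = (0u64, 0u64, 1u64);
    let (mut seen_dot, mut seen_digit) = (false, false);
    for b in s.bytes() {
        match b {
            b'0'..=b'9' => {
                seen_digit = true;
                let d = u64::from(b - b'0');
                if seen_dot {
                    // Accumulate the fraction as an integer plus a scale.
                    frac_part = frac_part * 10 + d;
                    frac_scale *= 10;
                } else {
                    int_part = int_part * 10 + d;
                }
            }
            b'.' if !seen_dot => seen_dot = true,
            _ => return None,
        }
    }
    if !seen_digit {
        return None;
    }
    let value = int_part as f64 + frac_part as f64 / frac_scale as f64;
    Some(if neg { -value } else { value })
}

fn main() {
    assert_eq!(parse_fixed("12.5"), Some(12.5));
    assert_eq!(parse_fixed("-3.25"), Some(-3.25));
    assert_eq!(parse_fixed("1.2.3"), None);
}
```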

mgostIH (Collaborator) commented Aug 1, 2020

Other note: using the miniz_oxide backend for flate2 and enabling LTO didn't change performance in any impactful way either, so we are pretty much left with optimizing the peek parts of the parser; base64 decoding, too, has an insignificant (< 1%) impact on performance.
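For context on why base64 barely registers: the step is just 6-bit table lookups and shifts per character. A minimal URL-safe decoder is sketched below, for illustration only; a real implementation would use an established crate. Padding '=' is ignored.

```rust
// Minimal URL-safe base64 decoder (illustrative sketch, not production code).
fn b64_decode(input: &str) -> Option<Vec<u8>> {
    fn val(b: u8) -> Option<u32> {
        match b {
            b'A'..=b'Z' => Some(u32::from(b - b'A')),
            b'a'..=b'z' => Some(u32::from(b - b'a') + 26),
            b'0'..=b'9' => Some(u32::from(b - b'0') + 52),
            b'-' => Some(62), // URL-safe alphabet; standard base64 uses '+'
            b'_' => Some(63), // ...and '/'
            _ => None,
        }
    }
    let mut out = Vec::new();
    let (mut buf, mut bits) = (0u32, 0u32);
    for &b in input.as_bytes() {
        if b == b'=' {
            break;
        }
        // Accumulate 6 bits per character; keep at most 12 meaningful bits.
        buf = ((buf << 6) | val(b)?) & 0xFFF;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
        }
    }
    Some(out)
}

fn main() {
    assert_eq!(b64_decode("aGVsbG8="), Some(b"hello".to_vec()));
    assert_eq!(b64_decode("not base64!"), None);
}
```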

stadust (Owner, Author) commented Aug 1, 2020

Other ideas left to try:

  • Eliminate lookahead in the parser. Right now this is used in three places: the MapAccess and SeqAccess impls, and deserialize_option. In the former two, it is only used for error reporting (to attach the value that failed parsing to an Error::Custom variant constructed via serde::ser::Error::custom). We can ensure that said function is never called by eliminating usages of deserialize_with and instead moving the processing to the HasRobtopFormat impls (note that deserialize_with/with is fine, as long as custom isn't used to construct errors). Then we can eliminate this peek_token() call. Preliminary benchmarks show a 5%-7% speed-up from this. The lookahead in deserialize_option might be harder to eliminate, but if we manage to get rid of it, we would no longer need Peekable/peek_token
  • Reuse deserializers; we are currently constructing one for each and every level object (which amounts to 500 000 deserializers for ocular miracle)
  • Make the deserializer work directly on &[u8] instead of &str. This would get rid of the UTF-8 check in read_to_string after decompressing the level data (but could introduce new UTF-8 checks, depending on how much we deserialize to &str/String anyway)
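The last bullet could be sketched as follows (hypothetical field layout, delimiter, and helper names; the actual dash-rs deserializer is more involved): drive everything over `&[u8]` and validate UTF-8 only at the moment a field is actually borrowed as `&str`.

```rust
// Split the raw payload on a byte delimiter; no UTF-8 validation here.
fn nth_field(data: &[u8], n: usize) -> Option<&[u8]> {
    data.split(|&b| b == b':').nth(n)
}

// The only UTF-8 check: per borrowed field, instead of one up-front
// read_to_string pass over the whole decompressed buffer.
fn field_str(data: &[u8], n: usize) -> Option<&str> {
    std::str::from_utf8(nth_field(data, n)?).ok()
}

fn main() {
    let raw: &[u8] = b"1:some level:2:a description";
    assert_eq!(field_str(raw, 1), Some("some level"));
    // Invalid UTF-8 in an untouched field costs nothing...
    let with_junk: &[u8] = b"1:ok:2:\xFF\xFF";
    assert_eq!(field_str(with_junk, 1), Some("ok"));
    // ...and only fails when that field itself is requested.
    assert_eq!(field_str(with_junk, 3), None);
}
```

The trade-off noted in the bullet shows up here directly: every `field_str` call pays a small validation cost, so the win depends on how many fields are actually materialized as strings.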
