Investigate data processing performance #16
Can we have a flamegraph or profile for a common load, to see what could be improved? If most of the work is spent on base64 decoding and decompression, then it makes little sense to parallelize the rest of parsing. We could still improve the single-threaded case so that the design stays simpler, and use parallelism for parsing multiple levels rather than for speeding up a single one.
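The "parallelism across levels" idea can be sketched with scoped threads from std, one thread per level. This is only an illustration, not the project's actual API: `process_level` here is a hypothetical stand-in for the real single-threaded decode + parse pipeline, and a real implementation would likely use a thread pool rather than spawning per level.

```rust
use std::thread;

// Hypothetical stand-in for the single-threaded pipeline
// (base64 decode + decompress + parse); here it just returns the
// input length so the sketch is self-contained.
fn process_level(raw: &str) -> usize {
    raw.len()
}

// Process several levels concurrently, one scoped thread per level,
// keeping the per-level pipeline itself single-threaded.
fn process_levels(raws: &[&str]) -> Vec<usize> {
    thread::scope(|scope| {
        let handles: Vec<_> = raws
            .iter()
            .map(|raw| scope.spawn(move || process_level(raw)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Results come back in input order because we collect all handles first and then join them in order, so parallelism does not change observable output.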
Approach tried: #17 was meant to reduce the cost of float parsing, which accounts for ~10% of that flamegraph. It failed: std is already very fast for the few-digit case, other crates (specifically lexical) didn't really help here, and I doubt the situation can be improved further without hacking a custom approach that only considers a couple of digits, or uses fixed point.
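For reference, the "couple of digits / fixed point" idea could look something like the sketch below: parse values with at most two fractional digits into an integer scaled by 100, and return `None` (falling back to std's full float parsing) for anything else. The function name and the two-digit scale are illustrative assumptions, not something from the codebase.

```rust
// Hypothetical fast path: parse strings like "12.34" into fixed-point
// (value * 100). Returns None for shapes it doesn't handle, so the
// caller can fall back to str::parse::<f64>().
fn parse_fixed2(s: &str) -> Option<i64> {
    let neg = s.starts_with('-');
    let s = s.strip_prefix('-').unwrap_or(s);
    let (int_part, frac_part) = s.split_once('.').unwrap_or((s, ""));
    if frac_part.len() > 2 || int_part.is_empty() {
        return None; // unusual shape: fall back to full float parsing
    }
    let int: i64 = int_part.parse().ok()?;
    let frac: i64 = if frac_part.is_empty() {
        0
    } else {
        frac_part.parse().ok()?
    };
    // One fractional digit means tenths, so scale it up to hundredths.
    let scale = if frac_part.len() == 1 { 10 } else { 1 };
    let magnitude = int * 100 + frac * scale;
    Some(if neg { -magnitude } else { magnitude })
}
```

Whether this beats std in practice is exactly what the flamegraph would have to show; the point of the sketch is just that skipping general float handling is the only remaining lever.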
Other note: using the miniz_oxide backend for flate2, and enabling LTO, didn't change performance in any impactful way either. Since base64 decoding also has an insignificant (&lt; 1%) impact on performance, we are pretty much left with optimizing the peek parts of the parser.
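For anyone wanting to reproduce this, the configuration tried amounts to a Cargo features/profile change along these lines (`rust_backend` is flate2's feature name for the miniz_oxide backend; exact version number is illustrative):

```toml
# Select the pure-Rust miniz_oxide backend for flate2.
[dependencies]
flate2 = { version = "1", default-features = false, features = ["rust_backend"] }

# Enable link-time optimization for release builds.
[profile.release]
lto = true
```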
Other ideas left to try:
Right now, level data processing as implemented in #15 is single-threaded. Back in GDCF it was parallelized, but there are some things to consider: