This PR is an ambitious attempt at improving the already great performance of ZPure execution, with benchmarks covering environment access, modification, logging and error handling.
Main changes
1. Create a `Runner` class, separate heap and stack variables and reuse `Stack` and `ChunkBuilder`:

First, I want to apologise for doing this as I'm certain it'll make reviewing the changes much harder, but I think it's probably for the best in the longer run.

This optimization allows us to reuse the `Runner` class via `ThreadLocal` and avoid allocations whenever we execute a new ZPure. While this might not provide much benefit for long-running `ZPure`s, it makes a big difference for short-lived ones. In order to make this reentrant-safe, if the current thread is already running another ZPure, then we create a new `Runner` instance.

I've also removed the `failed` variable in favour of throwing a stackless error and catching it when we don't have any error handlers in the Stack. This works on the assumption that in most cases, a ZPure will complete successfully.

2. Rely on vals / vars instead of Stack for environment / logs
Since ZPure is purely synchronous, we don't need a Stack to store modifications to the environment / logs. Instead, we can store the old value in a local val and restore it once that branch has finished execution. This saves allocations and also improves performance, since we don't need to use `peek()` to access the environment or write to logs.

3. Don't start a fresh log segment when `keepLogOnError = true`
We only need to start a fresh log segment when we need to separate them (i.e., when `keepLogOnError = false`). Otherwise, we can just continue writing in the existing one.

4. Reduce reliance on `Stack.push` / `Stack.pop` by adding manual handling of `Log` and `Environment` after a `FlatMap`
Writing to logs and accessing the environment are very common operations when composing ZPures. We optimize for this by adding special handling after a `FlatMap` that doesn't require pushing to / popping from the Stack.

5. Custom implementation of `Stack`:

This was one of the first optimizations I did as part of this work, but in hindsight it might be a bit unnecessary given that we've reduced reliance on `Stack` due to the points above. I decided to keep it though, because it still provides some benefit:

The `Stack` from ZIO is already fast enough, but forces GC of values each time they're "popped". This makes sense in ZIO because the Stack might be long-lived, but for `ZPure` the stack entries are going to be GC'd automatically when the runloop finishes. Also, due to the internal structure of the Stack, at most we'll have 13 non-GC'd objects at any time, which is a small price to pay for not having to GC on every `pop`. In addition, ZIO's Stack doesn't provide a `clear` method, which we need to make (1) work.

Benchmarks
I've only included benchmarking results for the newly added benchmarks, since the existing ones, which only benchmarked `ZPure.succeed`, haven't changed.

TLDR

The new changes provide a much bigger benefit whenever error handling is involved (i.e., Fold). Also, smaller `ZPure`s see a bigger improvement, since the cost of creating a new `Runner` is already amortized in longer-running `ZPure`s.
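To make the `Runner` amortization point concrete, here is a minimal sketch of the reuse strategy from point (1). It is not the code in this PR: `RunnerPool`, `withRunner`, `inUse`, `scratch` and `reset` are hypothetical names, and the `Runner` here is only a stand-in holding some reusable scratch state.

```scala
object RunnerPool {
  // Hypothetical, simplified stand-in for the PR's Runner: just reusable mutable state.
  final class Runner {
    var inUse: Boolean = false
    val scratch = new java.util.ArrayDeque[AnyRef]()
    def reset(): Unit = scratch.clear()
  }

  // One cached Runner per thread, created lazily on first access.
  private val cached = new ThreadLocal[Runner] {
    override def initialValue(): Runner = new Runner
  }

  def withRunner[A](f: Runner => A): A = {
    val shared = cached.get()
    if (shared.inUse) {
      // Reentrant call on the same thread: the cached Runner is busy,
      // so fall back to allocating a fresh one.
      f(new Runner)
    } else {
      shared.inUse = true
      try f(shared)
      finally {
        // Clear leftover state so the next run starts clean.
        shared.reset()
        shared.inUse = false
      }
    }
  }
}
```

In this sketch, back-to-back calls on the same thread get the same `Runner`, so a short-lived run pays no allocation for it, while a nested call made from inside `f` gets a fresh instance.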
Full results
series/2.x:
PR:
Special thanks to @ghostdogpr for writing the benchmarks and providing insights on realistic production use-cases of ZPure!