GC string views on hash join build side #16463
Benchmark results:
It seems to slow down SF 1 and SF 10 a bit :/
At SF=100, this PR is 10% faster:
Perhaps a threshold is needed?
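To make the threshold idea concrete, here is a minimal sketch of what such a check could look like. It is not code from this PR: `ViewStats`, `should_gc`, `referenced_bytes`, and both constants are hypothetical names and values chosen for illustration. The only fact it relies on is that Arrow's view layout stores strings of 12 bytes or fewer inline, so they never pin a data buffer.

```rust
/// Hypothetical summary of a string-view array's memory use.
/// (Illustrative only; not a DataFusion or arrow-rs type.)
#[derive(Debug, Clone, Copy)]
pub struct ViewStats {
    /// Total bytes kept alive by the array's data buffers.
    pub buffer_bytes: usize,
    /// Bytes actually referenced by non-inlined views.
    pub referenced_bytes: usize,
}

/// Strings up to this length are stored inline in the view itself.
pub const INLINE_LEN: usize = 12;

/// Assumed tuning knobs; real values would have to come from benchmarks.
pub const MIN_BUFFER_BYTES: usize = 1024 * 1024; // skip GC below 1 MiB
pub const MIN_WASTE_RATIO: f64 = 0.5; // GC only if at least half is unused

/// Sum the bytes that non-inlined strings reference in data buffers.
pub fn referenced_bytes(lengths: &[usize]) -> usize {
    lengths.iter().filter(|&&len| len > INLINE_LEN).sum()
}

/// Decide whether compacting the buffers is likely worth the copy.
pub fn should_gc(stats: ViewStats) -> bool {
    // Small inputs: the copy costs more than the memory it would free.
    if stats.buffer_bytes < MIN_BUFFER_BYTES {
        return false;
    }
    let wasted = stats.buffer_bytes.saturating_sub(stats.referenced_bytes);
    (wasted as f64) / (stats.buffer_bytes as f64) >= MIN_WASTE_RATIO
}
```

With a check like this, small build sides (as in SF 1 and SF 10) would skip the copy entirely, while large, sparsely referenced buffers would still be compacted.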
Thank you! This is great. I have some minor suggestions:
I will do both of these things later today. I am concerned about the performance impact on smaller-scale workloads. I suspect many DataFusion users are not doing such large joins.
We already have quite a few implementations of GC-ing arrays. I wonder whether performance for smaller tables could be improved by the heuristic used here:
Which issue does this PR close?
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
No.