[SPARK-54116][SQL] Add off-heap mode support for LongHashedRelation #52817
Conversation
…nSuite
### What changes were proposed in this pull request?
Add the following code to `HashedRelationSuite`, so that the test suite covers the `HashedRelation#close` API.
```scala
protected override def afterEach(): Unit = {
  super.afterEach()
  assert(umm.executionMemoryUsed === 0)
}
```
### Why are the changes needed?
Doing this will:
1. Ensure `HashedRelation#close` is called in test code, to lower the memory footprint and avoid memory leaks when running tests.
2. Ensure implementations of `HashedRelation#close` free the allocated memory blocks correctly.
It's a standalone effort to improve test quality, but also a prerequisite task for #52817.
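For illustration, here is a hypothetical test shaped so that the `afterEach` check above passes. `mm` is assumed to be a `TaskMemoryManager` backed by the suite's `umm`; the row setup is a made-up example, not code from the PR:
```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{BoundReference, UnsafeProjection}
import org.apache.spark.sql.types.{IntegerType, LongType}

test("LongHashedRelation frees memory on close") {
  val schema = Seq(
    BoundReference(0, LongType, nullable = false),
    BoundReference(1, IntegerType, nullable = false))
  val toUnsafe = UnsafeProjection.create(schema)
  // UnsafeProjection reuses its output row, so copy each projected row.
  val rows = (0 until 100).map(i => toUnsafe(InternalRow(i.toLong, i)).copy())
  val key = Seq(BoundReference(0, LongType, nullable = false))
  val relation = LongHashedRelation(rows.iterator, key, 10, mm)
  try {
    assert(relation.getValue(0L) != null)
  } finally {
    // Without this, umm.executionMemoryUsed stays non-zero and afterEach fails.
    relation.close()
  }
}
```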
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
It's a test PR.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52830 from zhztheplayer/wip-54132.
Authored-by: Hongze Zhang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan @yaooqinn @dongjoon-hyun @HyukjinKwon @viirya Would you kindly help review this PR? It's for making Spark SQL work more smoothly with 3rd-party off-heap-based operators / expressions. Thanks!
I wonder if the benchmark can be done at least. I worked on similar changes to implement off-heap stuff, and realised that it isn't necessarily fast. E.g., sometimes the JIT turns out to be quite a bit smarter than using direct off-heap memory.
@HyukjinKwon Thank you for the quick response. I benchmarked using the existing `HashedRelationMetricsBenchmark`, at 500K rows and 10M rows, comparing before the change, after the change (on-heap), and after the change (off-heap).
Yes, and regarding the benchmark results, I do think the new approach came out a bit slower (although the numbers are close). Do you have any suggestions?
I actually didn't expect the off-heap relation to be faster; it's more about making sure Spark can work with a relatively small heap under off-heap memory mode, so we can prevent heap OOMs.
```scala
val got = acquireMemory(size)
if (got < size) {
  freeMemory(got)
  throw QueryExecutionErrors.cannotAcquireMemoryToBuildLongHashedRelationError(size, got)
}
```
We can remove this error class.
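For readers without the inline diff context, the snippet above is an acquire-or-fail guard around memory allocation. A minimal self-contained sketch of the same pattern, with all names invented for illustration:
```scala
// Illustrative only: `acquire` may grant fewer bytes than requested; the
// partial grant must be returned before failing, to keep accounting balanced.
final class MemoryGuard(acquire: Long => Long, free: Long => Unit) {
  def ensureAcquire(size: Long): Unit = {
    val got = acquire(size)
    if (got < size) {
      free(got) // hand back the partial grant
      throw new OutOfMemoryError(s"requested $size bytes, granted only $got")
    }
  }
}
```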
It's interesting that, after the change, the off-heap relation looks meaningfully faster than the on-heap relation in the benchmark. I will update the benchmark results inline.
Just a note to @zhztheplayer: from my understanding, this feature needs enough time for extensive testing over various workloads. Given that, I'd like to recommend re-targeting this to Apache Spark 4.2.0 only. For 4.2.0, we can get more chances to test this in
@dongjoon-hyun According to the developer document mentioned in the issue, I will leave the target version empty. Thank you for helping with planning the target version. 4.2 looks totally fine to me.
How is this implemented in this PR? I don't see any branching code regarding different joins.
Hi @cloud-fan, thanks for having a look. This is a bit subtle due to how the task memory manager is obtained when a hashed relation is created in Spark code.

**For SHJ**

The build goes through sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ShuffledHashJoinExec.scala, lines 108 to 116 at 722bcc0 (`buildHashedRelation`). As seen there, the tmm is taken from the current `TaskContext`, so the relation follows the configured memory mode.

**For BHJ (Driver)**

The code goes through this path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala, line 197 at 722bcc0, then sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala, lines 1157 to 1166 at 722bcc0, where no tmm is passed for creating the hashed relation. In this case, a temporary on-heap tmm is created and used: HashedRelation.scala, lines 142 to 150 at 722bcc0.

**For BHJ (Executor)**

Similar to the driver side, the deserialization code also uses a temporary on-heap tmm: HashedRelation.scala, lines 401 to 407 at 722bcc0.
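For reference, a paraphrased sketch of that on-heap fallback, simplified from the linked lines in `HashedRelation.scala` (not verbatim):
```scala
import org.apache.spark.SparkConf
import org.apache.spark.internal.config.MEMORY_OFFHEAP_ENABLED
import org.apache.spark.memory.{TaskMemoryManager, UnifiedMemoryManager}

// When no TaskMemoryManager is passed in (the BHJ paths above), a throwaway
// on-heap manager is created, so those relations always live on the heap.
val mm = Option(taskMemoryManager).getOrElse {
  new TaskMemoryManager(
    new UnifiedMemoryManager(
      new SparkConf().set(MEMORY_OFFHEAP_ENABLED.key, "false"),
      Long.MaxValue,
      Long.MaxValue / 2,
      1),
    0)
}
```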
The CI failure (hung forever after switching to on-heap) is being addressed by another PR, #53065, which this one depends on.
### What changes were proposed in this pull request?
The PR adds off-heap memory mode support for LongHashedRelation.
The PR only affects ShuffledHashJoin. In BroadcastHashJoin, the hashed relations are not closed explicitly but are managed by GC, so allocating them off-heap will require a different approach; see the sketch of the SHJ lifecycle below.
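A paraphrased sketch of why SHJ can use the task's memory manager safely, simplified from `ShuffledHashJoinExec#buildHashedRelation` (not verbatim):
```scala
// The relation is built with the task's own TaskMemoryManager, which honors
// spark.memory.offHeap.enabled, and is closed deterministically at task end.
val context = TaskContext.get()
val relation = HashedRelation(
  iter, buildBoundKeys, taskMemoryManager = context.taskMemoryManager())
context.addTaskCompletionListener[Unit](_ => relation.close())
relation
```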
### Why are the changes needed?
So that Spark can avoid heap OOMs when a user sets `spark.memory.offHeap.enabled=true` and configures the JVM with a comparatively small heap size.
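An illustrative setup for that scenario; the sizes below are made-up examples, not recommendations from the PR:
```scala
import org.apache.spark.sql.SparkSession

// Off-heap execution memory enabled, paired with a comparatively small JVM
// heap (e.g. launched with --driver-memory 2g / --executor-memory 2g).
val spark = SparkSession.builder()
  .appName("offheap-long-hashed-relation-demo")
  .config("spark.memory.offHeap.enabled", "true") // enable off-heap mode
  .config("spark.memory.offHeap.size", "4g")      // off-heap pool size
  .getOrCreate()
```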
### Does this PR introduce any user-facing change?
By design, when `spark.memory.offHeap.enabled=true` is set:
### How was this patch tested?
WIP
### Was this patch authored or co-authored using generative AI tooling?
No.