Commit c9180e3

updated README.md
1 parent 7e9aa21 commit c9180e3

1 file changed: +11 -1 lines changed

Diff for: code/bonus_chapters/mappartitions/README.md

@@ -159,7 +159,17 @@ Note that you may perform final reduction by `RDD.reduce()` as well:
 
 
 NOTE: data can be huge, but for understanding
-the `mapPartitions()` we use a very small data set.
+the `mapPartitions()` we used a very small data set.
+
+# Is `RDD.mapPartitions()` Scalable?
+`RDD.mapPartitions()` is scalable, since we return a single element
+from each source RDD partition (which may comprise many elements). Even if
+the source RDD has a large number of partitions, this does not cause a
+problem. You need to make sure that your custom function is not a bottleneck.
+For example, if the source RDD has 100,000 partitions, then the target RDD will
+have 100,000 elements, so applying a final reduction to the target RDD is
+cheap. Again, make sure that your custom function is simple and
+efficient.
 
 
 # Questions/Comments
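
The scalability argument in the added section (one output element per source partition, followed by a small final reduction) can be illustrated with a minimal pure-Python sketch. It does not use Spark: `simulate_map_partitions`, `min_max_count`, `merge`, and the sample data are all hypothetical names and values chosen for illustration, with a list of lists standing in for the partitions of an RDD.

```python
from functools import reduce
from itertools import chain


def min_max_count(partition):
    """Hypothetical custom function: consume one partition's iterator and
    yield a single (min, max, count) tuple, or nothing if it is empty."""
    local_min = local_max = None
    count = 0
    for x in partition:
        if count == 0:
            local_min = local_max = x
        else:
            local_min = min(local_min, x)
            local_max = max(local_max, x)
        count += 1
    if count > 0:
        yield (local_min, local_max, count)


def merge(a, b):
    """Combine two per-partition summaries into one."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2])


def simulate_map_partitions(partitions, func):
    """Pure-Python stand-in for RDD.mapPartitions(): apply func to each
    partition's iterator and concatenate whatever each call yields."""
    return list(chain.from_iterable(func(iter(p)) for p in partitions))


# Hypothetical data: 4 partitions, one of them empty.
partitions = [[10, 3, 7], [42, 8], [], [5, 99, 1, 6]]

# At most one summary per partition, no matter how many elements it holds.
summaries = simulate_map_partitions(partitions, min_max_count)

# The final reduction only touches the per-partition summaries.
final = reduce(merge, summaries)
print(summaries)
print(final)
```

Assuming a real RDD named `rdd`, the same two functions would plug into `rdd.mapPartitions(min_max_count).reduce(merge)`. The custom function makes a single pass over its partition and keeps only three values in memory, which is the kind of simple and efficient function the section asks for.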
