# ./pyspark
Python 2.6.9 (unknown, Sep 9 2014, 15:05:12)
...
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/
Using Python version 2.6.9 (unknown, Sep 9 2014 15:05:12)
SparkContext available as sc, SQLContext available as sqlCtx.
>>> a = [('k1', 'v1'), ('k2', 'v2')]
>>> a
[('k1', 'v1'), ('k2', 'v2')]
>>> b = [('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5')]
>>> b
[('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5')]
>>> rdd1 = sc.parallelize(a)
>>> rdd1.collect()
[('k1', 'v1'), ('k2', 'v2')]
>>> rdd2 = sc.parallelize(b)
>>> rdd2.collect()
[('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5')]
>>> rdd3 = rdd1.cartesian(rdd2)
>>> rdd3.collect()
[
(('k1', 'v1'), ('k3', 'v3')),
(('k1', 'v1'), ('k4', 'v4')),
(('k1', 'v1'), ('k5', 'v5')),
(('k2', 'v2'), ('k3', 'v3')),
(('k2', 'v2'), ('k4', 'v4')),
(('k2', 'v2'), ('k5', 'v5'))
]
>>>
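
For comparison, the pairing that cartesian() produced in the session above can be reproduced without Spark. The following is a minimal sketch, assuming plain Python lists stand in for rdd1 and rdd2 and using itertools.product from the standard library. It is not how Spark computes the result: Spark builds the pairs partition by partition, so on a real cluster the order returned by collect() may differ from the order shown here. The name pairs is only illustrative.

```python
from itertools import product

# Same input data as the session above (plain lists, not RDDs)
a = [('k1', 'v1'), ('k2', 'v2')]
b = [('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5')]

# Pair every element of a with every element of b.
# The result has len(a) * len(b) elements.
pairs = list(product(a, b))

for pair in pairs:
    print(pair)

print(len(pairs))
```

Because the result size is the product of the two input sizes, cartesian() on two large RDDs grows quickly; the two-by-three example here yields six pairs.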