
Basic RDD Transformations and Actions


Intro

A simple cheat sheet of basic Spark RDD transformations and actions.

  • Basic transformations on an RDD containing {1,2,3,3}

| Function name | Example | Result |
|---|---|---|
| map() | rdd.map(x => x + 1) | {2,3,4,4} |
| flatMap() | rdd.flatMap(x => x.to(3)) | {1,2,3,2,3,3,3} |
| filter() | rdd.filter(x => x != 1) | {2,3,3} |
| distinct() | rdd.distinct() | {1,2,3} |
| sample() | rdd.sample(false, 0.5) | Nondeterministic |
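
The single-RDD transformations above can be tried end to end with a small local program. This is only a sketch: it assumes spark-core is on the classpath, and the object name `BasicTransformations`, the app name, and the `local[2]` master are illustrative choices rather than anything from the original cheat sheet.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BasicTransformations {
  def main(args: Array[String]): Unit = {
    // Assumption: run in-process with a local master; adjust for a real cluster.
    val conf = new SparkConf().setAppName("basic-transformations").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(1, 2, 3, 3))

      // Transformations are lazy; collect() triggers the job and returns an Array.
      println("map: " + rdd.map(x => x + 1).collect().mkString(","))
      println("flatMap: " + rdd.flatMap(x => x.to(3)).collect().mkString(","))
      println("filter: " + rdd.filter(x => x != 1).collect().mkString(","))

      // distinct() shuffles data, so sort locally for a stable display order.
      println("distinct: " + rdd.distinct().collect().sorted.mkString(","))

      // sample() is random unless a seed is passed; 0.5 is an expected
      // fraction, not an exact size, so the output varies between runs.
      val sampled = rdd.sample(withReplacement = false, fraction = 0.5)
      println("sample: " + sampled.collect().mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Passing a third `seed` argument to `sample()` makes the result repeatable, which is handy when checking output by hand.
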

  • Basic two-RDD transformations on RDDs containing {1,2,3} and {3,4,5}

| Function name | Example | Result |
|---|---|---|
| union() | rdd.union(other) | {1,2,3,3,4,5} |
| intersection() | rdd.intersection(other) | {3} |
| subtract() | rdd.subtract(other) | {1,2} |
| cartesian() | rdd.cartesian(other) | {(1,3),(1,4)…(3,5)} |
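
A similar sketch for the two-RDD transformations, under the same assumptions (spark-core on the classpath, a local master). The object name `TwoRddTransformations` is hypothetical, and results are sorted on the driver only to make the printed order stable.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TwoRddTransformations {
  def main(args: Array[String]): Unit = {
    // Assumption: run in-process with a local master.
    val conf = new SparkConf().setAppName("two-rdd-transformations").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(1, 2, 3))
      val other = sc.parallelize(Seq(3, 4, 5))

      // union() keeps duplicates: 3 appears once per input RDD.
      println("union: " + rdd.union(other).collect().sorted.mkString(","))

      // intersection() and subtract() shuffle, so their output order is not guaranteed.
      println("intersection: " + rdd.intersection(other).collect().sorted.mkString(","))
      println("subtract: " + rdd.subtract(other).collect().sorted.mkString(","))

      // cartesian() pairs every element of rdd with every element of other.
      val pairs = rdd.cartesian(other)
      println("cartesian count: " + pairs.count())
      println("cartesian: " + pairs.collect().sorted.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Note that `cartesian()` grows with the product of the two input sizes, so it is best kept to small RDDs.
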

  • Basic actions on an RDD containing {1,2,3,3}

| Function name | Example | Result |
|---|---|---|
| collect() | rdd.collect() | {1,2,3,3} |
| count() | rdd.count() | 4 |
| countByValue() | rdd.countByValue() | {(1,1),(2,1),(3,2)} |
| take() | rdd.take(2) | {1,2} |
| top() | rdd.top(2) | {3,3} |
| takeOrdered() | rdd.takeOrdered(2)(myOrdering) | {3,3} (with a descending myOrdering) |
| takeSample() | rdd.takeSample(false, 1) | Nondeterministic |
| reduce() | rdd.reduce((x, y) => x + y) | 9 |
| fold() | rdd.fold(0)((x, y) => x + y) | 9 |
| foreach() | rdd.foreach(func) | Nothing (returns Unit) |
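
The actions can be exercised the same way. In this sketch `myOrdering` is assumed to be a descending ordering, since that is what yields {3,3} from `takeOrdered(2)`; the default ascending ordering would give {1,2} instead. The object name `BasicActions` and the accumulator used to make `foreach()` observable are illustrative choices, and spark-core is again assumed to be on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BasicActions {
  def main(args: Array[String]): Unit = {
    // Assumption: run in-process with a local master.
    val conf = new SparkConf().setAppName("basic-actions").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(1, 2, 3, 3))

      println("collect: " + rdd.collect().mkString(","))
      println("count: " + rdd.count())

      // countByValue() returns a Map of value -> occurrences on the driver.
      println("countByValue: " + rdd.countByValue().toSeq.sorted.mkString(","))

      println("take: " + rdd.take(2).mkString(","))
      println("top: " + rdd.top(2).mkString(","))

      // Hypothetical ordering: descending, so takeOrdered returns the largest values.
      val myOrdering: Ordering[Int] = Ordering[Int].reverse
      println("takeOrdered: " + rdd.takeOrdered(2)(myOrdering).mkString(","))

      // takeSample() is random unless a seed is supplied.
      println("takeSample: " + rdd.takeSample(withReplacement = false, num = 1).mkString(","))

      println("reduce: " + rdd.reduce((x, y) => x + y))
      println("fold: " + rdd.fold(0)((x, y) => x + y))

      // foreach() returns Unit; an accumulator makes its side effect visible on the driver.
      val sum = sc.longAccumulator("sum")
      rdd.foreach(x => sum.add(x))
      println("foreach sum: " + sum.value)
    } finally {
      sc.stop()
    }
  }
}
```

`collect()` and `countByValue()` bring the full result back to the driver, so on large RDDs `take()` or `takeSample()` are the safer way to peek at the data.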