
Apache Spark - RDD advanced concepts

Hosted By
PawelWlodarski

Details

Workshop plan:

• A small example using real data

• Usage of DoubleRDDFunctions

• Usage of PairRDDFunctions

• Understanding joins in depth and how to use the broadcast mechanism to reduce the amount of data shuffled

• Understanding how functions/lambdas are serialized between the driver and the workers

• How to configure partitions

• More advanced usage of accumulators
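
To give a feel for the broadcast item in the plan, here is a minimal plain-Python sketch of why a broadcast (map-side) join moves less data than a shuffle join. It is only a simulation under stated assumptions, not Spark code and not taken from the workshop materials: the function names, the sample data, and the "records moved" counter are all hypothetical, and partitions are modeled as ordinary lists.

```python
from collections import defaultdict

def shuffle_join(left_parts, right_parts, num_partitions):
    """Simulated shuffle join: every record of BOTH sides is routed by key hash."""
    buckets = [([], []) for _ in range(num_partitions)]
    moved = 0  # hypothetical counter of records sent over the "network"
    for part in left_parts:
        for key, value in part:
            buckets[hash(key) % num_partitions][0].append((key, value))
            moved += 1
    for part in right_parts:
        for key, value in part:
            buckets[hash(key) % num_partitions][1].append((key, value))
            moved += 1
    joined = []
    for left, right in buckets:
        lookup = defaultdict(list)
        for key, value in right:
            lookup[key].append(value)
        for key, value in left:
            for other in lookup.get(key, ()):
                joined.append((key, (value, other)))
    return joined, moved

def broadcast_join(large_parts, small_records):
    """Simulated broadcast join: only the small side is copied, once per partition."""
    lookup = defaultdict(list)
    for key, value in small_records:
        lookup[key].append(value)
    moved = len(small_records) * len(large_parts)  # the large side stays in place
    joined = []
    for part in large_parts:
        for key, value in part:
            for other in lookup.get(key, ()):
                joined.append((key, (value, other)))
    return joined, moved

# Hypothetical sample data: 12 "orders" in 2 partitions, 2 "countries" on the small side.
orders = [(i % 3, f"order-{i}") for i in range(12)]
orders_parts = [orders[:6], orders[6:]]
countries = [(0, "PL"), (1, "DE")]

shuffled, shuffle_moved = shuffle_join(orders_parts, [countries], num_partitions=2)
broadcasted, broadcast_moved = broadcast_join(orders_parts, countries)
print(shuffle_moved, broadcast_moved)
```

Both functions produce the same joined pairs; the difference is only in how many records have to travel. In real Spark the small side would typically be shipped once per executor rather than once per partition, so the counter here is a rough illustration rather than a measurement.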

Exercise materials: https://pawelwlodarski.gitbooks.io/workshops/content/spark_-_more_advanced_operations.html

Language: Polish

Java User Group Łódź