Apache Spark - RDD advanced concepts

Hosted By
PawelWlodarski

Details
Workshop plan:
• Small example of real data usage
• Usage of DoubleRDDFunctions
• Usage of PairRDDFunctions
• Understand joins in depth and how to use the broadcast mechanism to reduce the amount of data shuffled
• Understand the function/lambda serialization mechanism between the driver and the workers
• How to configure partitions
• More advanced usage of accumulators
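
To give a flavour of the broadcast topic in the plan above, here is a minimal sketch of the idea behind a map-side (broadcast) join. It is a plain-Python simulation, not Spark code and not taken from the workshop materials. The dataset, the names `order_partitions`, `customer_names` and `join_partition` are all hypothetical. In Spark itself the small table would typically be shipped to the workers with `SparkContext.broadcast` and the join done inside something like `mapPartitions`.

```python
# Plain-Python simulation of a map-side (broadcast) join; no Spark required.
# All names and data below are hypothetical, for illustration only.

# Large dataset, already split into partitions as it would be on the workers.
order_partitions = [
    [(1, 20.0), (2, 5.5)],
    [(1, 7.25), (3, 12.0)],
]

# Small lookup table; every "worker" gets its own full copy of it.
customer_names = {1: "Ala", 2: "Ola"}


def join_partition(partition, lookup):
    """Join one partition against the local copy of the small table."""
    # Inner-join semantics: rows whose key is missing from the lookup are dropped.
    return [(key, (amount, lookup[key])) for key, amount in partition if key in lookup]


# Each partition is joined independently, so no rows of the large dataset
# have to be moved between partitions (no shuffle).
joined = [
    row
    for partition in order_partitions
    for row in join_partition(partition, customer_names)
]
print(joined)
```

The trade-off is that the small table must fit in memory on every worker. In exchange, the large dataset stays where it is instead of being repartitioned by key.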
Exercise materials: https://pawelwlodarski.gitbooks.io/workshops/content/spark_-_more_advanced_operations.html
Language: Polish

Java User Group Łódź