Getting Started with PySpark
Kaushik Amaravadi will be presenting on PySpark.
We will start with a brief introduction to Spark and its components, and to how Python can be used in Spark to process large datasets. We will then look at the data flow and architecture of Spark and at how PySpark is built on top of the Spark Java API. After a short Q&A session, we will see how to set up Spark with Jupyter. We will then jump into the basics of PySpark using RDDs and datasets, covering basic operations such as map, join, reduce, and filter, along with some data exploration and visualization on a dataset.
Kaushik Amaravadi works as a Big Data Engineer at Yotabites. He loves to learn new data science techniques and technologies, and likes to explore the relationships between numbers and translate them into stories.