This month, Alex Hagerman will help us get started with Apache Spark by walking through a simple analysis of Stack Overflow data with PySpark. During our exploration we will talk about:
- What Apache Spark is
- The different components of Spark
- When you might want to use Spark
- When you might not want to use Spark
For this session we will be focusing on PySpark DataFrames, RDDs, and Spark SQL.
Each meeting will generally consist of:
• Food & chit-chat
• A presentation and discussion on a Python-related topic. This may sometimes include hands-on opportunities, so bring a laptop if possible.
• An opportunity for "lightning talks" and general discussion of cool or interesting things happening in the Python world.