Spark 2.0 (pySpark) made easy - Hands on Code
Details
Hello again!
We are announcing the next meetup on Wed, April 19th.
This time we will dive into Spark 2.0, specifically:
- Loading data from Azure Storage vs Azure Data Lake Store: which works best
- DataFrame functions vs Spark SQL
- Data formats: CSV vs PARQUET vs JSON vs ORC
- Persistence of Hive tables using Azure SQL as the metastore
- Example of an ETL process with a public dataset
Bring your own laptop and use your Azure account to follow along with the example and get the most out of it.
See you there!

