Spark 2.0 (pySpark) made easy - Hands on Code

Hosted By
Pablo M. and Ranga V.

Details
Hello again!
We are announcing the next meetup on Wed, April 19th.
This time we are going to be diving into Spark 2.0, specifically:
- Loading data from Azure Storage and Azure Data Lake Store: which works best
- DataFrame functions vs Spark SQL
- Data formats: CSV vs Parquet vs JSON vs ORC
- Persistence of Hive tables using Azure SQL as the metastore
- Example of an ETL process with a public dataset
You can bring your own laptop and use your Azure account to follow along with the example and get the most out of it.
See you there!

Data & AI - Microsoft
Microsoft Las Colinas
7100 State Highway 161 · Irving, TX