Spark 2.0 (pySpark) made easy - Hands on Code

Hosted By
Pablo M. and Ranga V.
Details

Hello again!

We are announcing the next meetup on Wed, April 19th.

This time we are going to dive into Spark 2.0, specifically:

  1. Loading data from Azure Storage and Azure Data Lake Store: which works best

  2. DataFrame functions vs Spark SQL

  3. Data formats: CSV vs Parquet vs JSON vs ORC

  4. Persistence of Hive tables using Azure SQL as the metastore

  5. Example of an ETL process with a public dataset

You can bring your own laptop and use your Azure account to follow along with the example and get the most out of it.

See you there!

Data & AI - Microsoft
Microsoft Las Colinas
7100 State Highway 161 · Irving, TX