Name: Spark 2.0 (pySpark) made easy - Hands on Code
Start: 2017-04-19T18:00:00-05:00
End: 2017-04-19T20:00:00-05:00
Location: Microsoft Las Colinas

Hello again!

We are announcing the next meetup on Wed, April 19th.

This time we will dive into Spark 2.0, specifically:

1) Loading data from Azure Storage and Azure Data Lake Store: which is best

2) DataFrame functions vs Spark SQL

3) Data formats: CSV vs Parquet vs JSON vs ORC

4) Persistence of Hive tables using Azure SQL as the metastore

5) Example of an ETL process with a public dataset

You can bring your own laptop and use your Azure account to follow along with the example and get the most out of it.

See you there!

Pablo Marin

Ranga Vadlamudi

Data & AI - Microsoft

MSFT Data & AI DFW