ETL to ML: Use Apache Spark as an End-to-End Tool for Advanced Analytics


Details
Miklos Christine from Databricks will be presenting. There will be pizza & drinks beforehand, and we'll start the presentation at 6pm.
Apache Spark is a powerful tool for today's big data problems. It can be used for data ingestion and processing, and it can also run machine learning algorithms on the same datasets. In this talk, we'll give an overview of Spark's capabilities from a data engineer's perspective, focusing on getting data into an efficient processing format. Finally, we'll walk through the steps needed to run machine learning algorithms on that dataset.
Within these topics, we will cover Spark APIs, common technical issues that users run into, and best practices for running Spark pipelines.
