Introduction to Apache Spark with Databricks


Details
Hey all,
Houston Data Science is very excited to host Databricks, the company founded by the creators of Apache Spark, one of the best and fastest-growing big data processing engines today.
Meetup Summary:
Databricks (https://databricks.com/), the company founded by the creators of Apache Spark (http://spark.apache.org/), is providing a speaker: Don Hilborn (https://www.linkedin.com/in/don-hilborn-92503165), Solutions Architect. Don will cover the capabilities of Spark, the open-source distributed compute engine with components for machine learning, SQL, R, and streaming.
The talk will introduce Spark, explain where it fits in the big data ecosystem, walk through use cases, and review some of the enhancements in Spark 2.0, including Structured Streaming and Tungsten (phase 2) memory management.
Agenda:
6:30 - 7:00: Networking and food
7:00 - 8:00: Introduction to Spark
8:00 - 8:30: Questions and answers
Venue:
Station Houston (http://stationhouston.org/) offers workspace, mentorship, and guidance to local start-ups, and this meetup will be one of the first at its new location.
Resources:
Intro to Apache Spark (quick overview) (https://player.oreilly.com/videos/9781491919729)
Intro to Apache Spark (in-depth free online training) (https://www.edx.org/course/introduction-apache-spark-uc-berkeleyx-cs105x)
Intro to Apache Spark on Databricks (https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/346304/2168141618055043/484361/latest.html)
Real-life use case | Capital One using Spark for fraud prevention (https://www.youtube.com/watch?v=q5HFMVoN_rc)
Find Station Houston: http://stationhouston.org/find-us
The building locks its doors at 6 p.m., so look for Ted Petrou at the entrance to let you in.
