Intro to Spark


Details
Event Tag: Data Science – See our “About Us” section for more information on tags.
Philip Best, founder of the Nashville R Users group and Director of Analytic Products at HCA, will share what he has learned while working with Spark Streaming on Cloudera. This will be a hands-on deep dive into some concepts around Spark and how to get meaningful data out of a Spark install. If you’d like to follow along, download the Cloudera single-node pseudo-cluster (available in several virtual machine formats, such as VirtualBox). No prior experience with Hadoop stacks is expected, though it would be quite helpful.
You can find the Cloudera quickstart VM here: http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms/cdh-5-3-x.html
(link updated since new release)
Wifi should be available onsite; however, please download the VM in advance if you plan to follow along, as it is several GB in size.
Parking: Park near the Ezell building on the Belmont Blvd side of campus and walk in toward the center of campus. Just past the bell tower, the Swang building will be on your right. If you enter through the rear doors that face Allen Arena, you will be at the CCT offices.
