Use Spark to solve a real-world problem - hands-on workshop


Details
This session will highlight how simple it is to develop powerful data analytics using Spark and Clojure. Spark is a big data processing tool written in Scala that leverages in-memory processing and data locality to speed up MapReduce-style jobs, while also exposing a rich set of data operations beyond map and reduce. In addition to introducing Spark through its Scala API, we will show how to use Spark from Clojure. The functional programming concepts used for data processing in Spark translate cleanly to Clojure, a modern Lisp dialect hosted on, and freely interoperating with, the Java Virtual Machine. Notably, Clojure supports a style of exploratory, interactive development at the REPL (Read-Eval-Print Loop), which, in combination with Sparkling, a Clojure API for Spark, allows complex data analytics to be built up and tested quickly, piece by piece. Once an analytic pipeline has been prototyped at the REPL, the same code can be compiled and submitted to a Spark cluster, which shortens development time because code from the prototyping phase is reused in production.
We will lead an open session to work through a real-world problem using Spark through Scala or Clojure, whichever you prefer. Bring your laptop and be ready to work; we will guide you. Drop in for a fun, informal workshop.
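
As a small taste of the functional style described above, here is a minimal word-count sketch in Scala. It is not workshop material: it uses plain Scala collections as a stand-in for a Spark dataset so that it runs without a cluster or any Spark dependency, and the object name, sample lines, and `wordCounts` helper are all hypothetical names chosen for illustration.

```scala
// Plain Scala collections stand in for a Spark dataset here (an assumption
// made for illustration), so this runs without Spark on the classpath.
object WordCountSketch {
  // Hypothetical sample input; a real job would read lines from storage.
  val lines: Seq[String] = Seq(
    "spark makes big data simple",
    "clojure makes spark fun"
  )

  // Split lines into words, pair each word with 1, then sum the 1s per word.
  def wordCounts(input: Seq[String]): Map[String, Int] =
    input
      .flatMap(_.split("\\s+"))   // one element per whitespace-separated token
      .filter(_.nonEmpty)         // drop empty tokens from blank or padded lines
      .map(word => (word, 1))     // the "map" step: emit (word, 1) pairs
      .groupBy(_._1)              // collect the pairs for each distinct word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // the "reduce" step

  def main(args: Array[String]): Unit =
    wordCounts(lines).toSeq.sortBy(_._1).foreach {
      case (word, count) => println(s"$word: $count")
    }
}
```

On a Spark cluster the same pipeline shape would typically be written with `flatMap`, `map`, and `reduceByKey` on an RDD, which is why a pipeline prototyped on small local data can carry over to a cluster job with few changes.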
