Two Spark Talks! EclairJS and Spark In Practice


Details
Happy New Year, everybody! As it has been a while since the last meetup, we're going to hit the ground running in 2016 with our first meetup of the year, featuring two 25-minute talks. First up, we'll have Bill Reed talking about EclairJS:
EclairJS lets developers write Node.js and JavaScript programs that take advantage of Apache Spark's scalable processing of streaming data, SQL, machine learning and graph data. The open-source EclairJS project is made up of two components, EclairJS Node and EclairJS Nashorn. EclairJS Node provides Node.js applications with a Spark API through an npm-installable client, so that Node.js applications can run remotely from the Spark engine. EclairJS Nashorn implements JavaScript support in Spark and provides a framework that supports various applications, including EclairJS Node, a REPL and Jupyter Notebooks. During the presentation we will demonstrate EclairJS and describe several application use cases, potential limitations, and future plans for the codebase.
Second, I'll be giving a talk on Spark In Practice. I'll be talking about what happens when you move away from word count and attempt to run real jobs on large amounts of data. I'll show you some of the pitfalls you may encounter and how to deal with the messier world outside of the example datasets.
Looking forward to seeing everybody!
(18:30 doors for 19:00 start!)
