Scaling Spark


Details
With the constant increase in data volumes come many scaling challenges. Hadoop has been the front-runner for the last decade and has given us many tools to lean on for "Big Data" processing. Spark is the new kid on the block, and it is quickly proliferating throughout the ecosystem. What's not to like about Spark? It's fast, scalable, and can process both streaming and batch workloads. Coupled with Amazon Web Services, it makes a very potent combination.
Learn firsthand what it takes to deploy Spark and YARN on AWS, and how to use Spot Instances to take advantage of the cloud's elasticity and reduce your operating costs.
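The announcement does not say how the cluster is deployed; as a hedged illustration only, Amazon EMR is one common route to running Spark on YARN with Spot capacity. The sketch below uses boto3, and every concrete value (region, instance types, bid price, key pair name) is an assumption for demonstration, not a detail from the talk.

```python
# Sketch: launching a Spark-on-YARN cluster on Amazon EMR with Spot core nodes.
# All concrete values (region, instance types, bid price, key pair) are
# illustrative assumptions, not recommendations from the presentation.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="spark-spot-demo",
    ReleaseLabel="emr-5.30.0",           # EMR release that bundles Spark on YARN
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {   # Master node kept On-Demand for stability
                "Name": "master",
                "InstanceRole": "MASTER",
                "Market": "ON_DEMAND",
                "InstanceType": "m4.large",
                "InstanceCount": 1,
            },
            {   # Core nodes on Spot to cut cost; BidPrice is in USD per hour
                "Name": "core-spot",
                "InstanceRole": "CORE",
                "Market": "SPOT",
                "BidPrice": "0.10",
                "InstanceType": "m4.xlarge",
                "InstanceCount": 4,
            },
        ],
        "Ec2KeyName": "my-key-pair",     # hypothetical key pair name
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)

print("Cluster ID:", response["JobFlowId"])
```

Keeping the master On-Demand while bidding for Spot core nodes is one way to trade a little reliability risk on workers for a large reduction in hourly cost.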
Presenter: Alex Rovner
Alex Rovner is the Director of Data Engineering at Magnetic. He has a long history of using Hadoop and other open-source projects within its ecosystem. Alex is very bullish on Spark and believes that it will replace Hadoop MapReduce as the de facto data processing engine in the very near future.
Doors open at 6:30pm. Presentation kicks off at 7pm.