Scaling Spark

Details

The constant increase in data volumes brings many scaling challenges. Hadoop has been the front-runner for the last decade and has given us many tools to lean on for “Big Data” processing. Spark is the new kid on the block, and it’s quickly spreading through the ecosystem. What’s not to like about Spark? It’s fast, scalable, and can process both streaming and batch workloads. Coupled with Amazon Web Services, it can be a very potent combination.

Learn firsthand what it takes to deploy Spark and YARN on AWS, and how to use Spot Instances to take advantage of the cloud’s elasticity and reduce your operating costs.

Presenter: Alex Rovner

Alex Rovner is the Director of Data Engineering at Magnetic. He has a long history of using Hadoop and other open source projects within its ecosystem. Alex is very bullish on Spark and believes that it will replace Hadoop MapReduce as the de facto data processing engine in the very near future.

Doors open at 6:30pm. Presentation kicks off at 7pm.

New York Hadoop User group
ThoughtWorks
99 Madison Ave., 15th Floor · New York, NY