
Lessons learned in building a Spark distribution

Hosted by Félix-Étienne Trépanier

Details

Apache Spark is a next-generation tool that makes ground-breaking improvements over MapReduce on Hadoop. It lets developers be more productive and use their resources more efficiently. Less well known is that the authors of Mesos introduced Spark in their paper as a framework "to validate their hypothesis".

We built a distribution of Spark at Typesafe, exploiting the synergy with Mesos. We will show how the combination is ready for multi-tenant, heterogeneous cluster environments. We'll see how the two fit into a larger picture, alongside YARN or containers. And we'll have a look at the enterprise use cases that drove some of our choices. Finally, we'll mention the bumps in the road, and the improvements we would like to see to make the Spark and Mesos combination even more versatile and powerful.

---

About the Speaker

François Garillot joined Typesafe in 2012 after an early stint in research, where he spoke frequently at international conferences. He is now working in Typesafe's Spark team, leveraging his Scala knowledge to improve Spark's support for scalable machine learning and data science applications.

Based in Lausanne, he speaks at Swiss conferences and Scala user groups in Lyon and Paris. He recently spoke at Strata Hadoop Barcelona on how to make your next big data hackathon successful.

Lambda Montreal
Wajam
4115 St-Laurent (suite #300) · Montréal, QC