Apache Spark Meetup at Cloudera


Details
We are organizing this event jointly with the Cloudera Tech Meetup (https://www.meetup.com/Cloudera-Tech-Meetup-Budapest/).
Schedule:
18:00 Doors Open, Sandwiches & Refreshments
18:30 Opening notes and announcements
18:40 Fault-Tolerance in Apache Spark (and where it has failed) - Imran Rashid
19:10 Q&A
19:20 Integrating Spark with H2o.ai - Cagdas Yetkin
19:40 Q&A
19:50 Open Discussion, Beers, Networking
Talks:
Fault-Tolerance in Apache Spark (and where it has failed) - Imran Rashid
Apache Spark is a distributed computing platform built in Scala with fault tolerance in mind. It has been tested rigorously and deployed in production at many companies for years. And yet fault-tolerance issues still surface. How did these faults slip through?
The talk will focus on the challenges of making Spark work on large clusters where hardware failure is a fact of life. We'll talk about Spark's design, the adjustments that have been necessary, and the bugs that have been found. We'll focus on Spark but will also discuss the design of distributed systems in general.
Bio:
Imran Rashid is a committer and PMC member of Apache Spark, and a developer at Cloudera. He has used Hadoop for the past 8 years and has been closely involved in Spark since before it became an Apache project. Imran used to write Spark programs for machine learning on terabytes of data -- now he spends his time hunting down bugs that surface when Spark is pushed to thousands of nodes.
Integrating Spark with H2o.ai - Cagdas Yetkin, Datapao
Please note that this is an English-speaking event.
