Why is My Spark Job Failing? by Sandy Ryza of Cloudera


Details
Abstract:
You are not a bad person. But your Apache Spark job is failing. It is running out of memory. It is stalled. It is complaining that no executors have registered, spitting out "Filesystem closed" exceptions with lines upon lines of $anon$1's, or being consumed by a swarm of locusts the likes of which have not been seen since Moses crossed the Red Sea. Or it is completing, but taking 20 times longer than it reasonably should. Why? In this talk, you'll learn the internals of Spark jobs, the root causes of these ailments, and tuning strategies for avoiding them.
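
As a taste of the kind of tuning knobs the talk covers, here is a minimal Scala sketch of the executor-memory settings that often lie behind out-of-memory failures on Spark-on-YARN. The app name and values are hypothetical illustrations, not recommendations:

    import org.apache.spark.{SparkConf, SparkContext}

    // Values are illustrative; the right numbers depend on your
    // cluster's container sizes and workload.
    val conf = new SparkConf()
      .setAppName("MemoryTuningSketch")                 // hypothetical app name
      .set("spark.executor.memory", "4g")               // JVM heap per executor
      .set("spark.yarn.executor.memoryOverhead", "512") // off-heap headroom (MB) YARN
                                                        // adds to each executor container
    val sc = new SparkContext(conf)

Too small an overhead is a common reason YARN kills executors whose heaps look perfectly healthy.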
Bio:
Sandy Ryza is a data scientist at Cloudera, an Apache Hadoop committer, and an Apache Spark contributor. Sandy is also a co-author of Advanced Analytics with Spark.
Parking: Garage parking is available under the building, and validation will be provided by Playtika! Please do NOT park in Reserved spots. Here is a picture of a car entering our garage: https://www.google.com/maps/@34.026649,-118.476098,3a,75y,135.22h,90t/data=!3m4!1e1!3m2!1suj6GRVik1nvVm6Q_va0WLQ!2e0
