Shuffling Spark with Kafka, Standalone Spark approach

Name: Shuffling Spark with Kafka, Standalone Spark approach
Start: 2016-04-05T18:00:00+03:00
End: 2016-04-05T20:15:00+03:00
Location: Taboola Offices Rooftop

Hosted by Tal S. and Ruthy G.

Israel Spark Meetup

Details

A joint meetup between Israel Spark Meetup and HadoopIsrael Meetup

18:00 - 18:30 - Mingling

18:30 - 19:15 - David Gruzman - “Kafka architecture, place of Kafka Streaming and usage of Kafka as Spark's shuffle engine”

We will get into Kafka architecture and try to understand together what Kafka Streaming is and when it should be used.

In addition, we will share our experience of using Kafka to accelerate our Spark application. I will also say a few words about the system itself, where this acceleration was used.

19:15 - 19:30 - Break

19:30 - 20:15 - Alon Torres (DevOps Engineer, Totango) and Romi Kuntsman (Senior Big Data Engineer, Totango) - “Standalone Spark for Stability and Performance”

After initially trying AWS EMR and YARN with lackluster results, we decided to move to a manually fine-tuned Spark Standalone setup on AWS EC2.
We'll share our experience with controlling Spark components separately, using Chef, autoscaling groups, log integration, and more.
Since moving to this architecture, the days of cluster instability are long gone, and our server utilization is great.
