Shuffling Spark with Kafka, Standalone Spark approach

Name: Shuffling Spark with Kafka, Standalone Spark approach
Start: 2016-04-05T18:00:00+03:00
End: 2016-04-05T21:00:00+03:00
Location: Taboola offices

Hosted by David G.

HadoopIsrael

Details

A joint meetup between Israel Spark Meetup and HadoopIsrael Meetup

18:00 - 18:30 - Mingling

18:30 - 19:15 - David Gruzman - “Kafka architecture, place of Kafka Streaming and usage of Kafka as Spark's shuffle engine”

We will dig into Kafka's architecture and work out together what Kafka streaming is and when it should be used.
In addition, we will share our experience of using Kafka to accelerate our Spark application. I will also say a few words about the system in which this acceleration was used.

19:15 - 19:30 - Break

19:30 - 20:15 - Alon Torres - DevOps Engineer, Totango & Romi Kuntsman - Senior Big Data Engineer, Totango - “Standalone Spark for Stability and Performance”

After initially trying AWS EMR and YARN with lackluster results, we decided to move to a manually fine-tuned Spark Standalone setup over AWS EC2.
We'll share our experience with controlling Spark components separately, using Chef, autoscaling groups, log integration, and more.
Since moving to this architecture, the days of cluster instability are long gone, and our server utilization is great.
