
Pre-shuffling Spark Jobs and Queueing Theory

Hosted By
Miri

Details

To connect to the meetup Zoom, please register here: https://www.meetup.com/Scalegineering/events/277827852/

Hi again!

We gathered some interesting talks you may like!

Super duper Avishai Ish Shalom offered to talk about Queueing Theory, highly technical material that is unfortunately not as widely known as it should be. And grande grande Lior Chaga will talk about how Taboola leverages Apache Cassandra to pre-shuffle data for Apache Spark jobs.

It will be in Hebrew this time.

Schedule:

17:00 - 17:50 - Pre-shuffling Data for Spark jobs with Cassandra

17:50 - 18:40 - Queue Theory 101

Please note:

  • The link to the event will be visible only to people who RSVP.
  • The online meetup will be held on May 9, 17:00, IL Time Zone, GMT +2:00

Talks

Pre-shuffling Data for Spark jobs with Cassandra - Lior Chaga, Taboola

Taboola is the world's largest discovery platform, serving recommendations to over 1.5B unique users each month. This, in turn, results in over 500k requests/sec hitting our servers and >100TB of daily data, which we collect in our backend data center.

At the backend, we use Apache Spark to combine this data into complete and consistent data sets, which consist of over 3B page views per day and are used by our billing, reporting, analysis and deep learning processes.
In this session we will see how we use Apache Cassandra to avoid expensive shuffle operations with Apache Spark, how we protect our system from severe skewness issues, and discuss some operational considerations concerning our Cassandra cluster.

Lior is a Big Data Engineer in the Infrastructure Group at Taboola.

Queue Theory 101 - Avishai Ish Shalom, ScyllaDB

Queueing Theory is perhaps one of the most important mathematical theories in systems design and analysis, yet few engineers learn it. This talk teaches the basics of queueing theory and explores the ramifications of queue behavior on system performance and resiliency. It aims to give you practical skills that can be applied to better build and tune your systems.

"In a world where anything has an API, everything is a software problem": this insight has guided Avishai Ish-Shalom throughout his diverse career working on improving the complex socio-technical systems that create and operate modern software, and promoting the use of mathematics in system design and operations. Over 15 years in various software fields and capacities, Avishai has served as Engineer in Residence at Aleph VC and engineering manager at Wix.com, co-founded Fewbytes, and consulted for many other companies on software operations, reliability, design and culture. Currently Avishai is a Developer Advocate for ScyllaDB (The boring database ;-)

Meetups @ Taboola IL