
Data Engineering: Bulk Up Your Data Eng Skills using Airflow, Spark & DASK

Hosted by Tair K.

Details

// Please note: this meetup is hybrid. It will be held both online (the talks will be streamed via Zoom) and in person (Bitan 27, Namal TLV).
---
Welcome engineers and technologists in the TLV area. Whether you are a hands-on software engineer, data engineer, or just a person interested in advanced technologies, you will enjoy this meetup. Hosted at Wix's TLV offices.
We’ll cover topics in data science and data engineering: improving your knowledge of a common tool such as Airflow, deepening your knowledge of big data engines like Spark, and getting familiar with Dask. This meetup focuses on helping you improve your dev velocity. We will share must-know practices, pitfalls, optimizations, and tuning tips, and introduce tools that are gaining momentum in the DS world.

// Agenda

  • 17:00 - 17:30 - Gathering
  • 17:30 - 18:00 - Apache Airflow - Improve DAG authoring skills: Tips, Tricks and More! by Elad Kalif
  • 18:00 - 18:30 - Apache Spark Optimization Techniques and Tuning by Almog Gelber
  • 18:30 - 18:45 - Break
  • 18:45 - 19:15 - Not Only Spark! Introducing Dask - A Pythonic Big Data Framework for Data Science by Itamar Faran
  • 19:15 - 19:45 - Q&A

---

// Airflow - Improve DAG authoring skills: Tips, Tricks and More! / Speaker: Elad Kalif

Has a broken DAG ever surprised you? How about a non-templated Jinja format? Join me to learn about these and other crucial Airflow practices and usages. You will learn about the features, the must-know practices, and the common pitfalls of working with Apache Airflow.

// bio
Elad has been a data engineer @Wix for 3 years and is part of the infrastructure team, which enables solutions that serve a wide range of developers. He is an open source advocate, and an Apache Airflow committer and PMC member.

social:
https://github.com/eladkal
https://twitter.com/eladkal
https://stackoverflow.com/users/14624409/elad-kalif
https://www.linkedin.com/in/elad-kalif-811b4887/

// Apache Spark Optimization Techniques and Tuning / Speaker: Almog Gelber

This session will cover the common bottlenecks and pain points in building a Spark pipeline, along with ways to fix them and make the application more efficient.

// bio
Almog has been a big data engineer @Wix for 2 years as part of the infrastructure team.
Almog is the Apache Spark tech lead, responsible for promoting tools and infrastructure that make Spark more accessible, as well as for optimizing Spark jobs across the data organization.

social:
https://www.linkedin.com/in/almog-gelber-7a6ab493/

// Not Only Spark! Introducing Dask - A Pythonic Big Data Framework for Data Science / Speaker: Itamar Faran

While Spark is the state-of-the-art technology for huge out-of-memory data, its infrastructure overhead may sometimes be “not worth it” for data science projects. This talk introduces Dask, a lightweight, Pythonic framework for out-of-memory dataframes, built on NumPy and pandas, that integrates with the Python data science ecosystem.

// bio
Itamar is a Data Scientist at Vesttoo. He is experienced in integrating big data tools into data science projects.

social:
https://www.linkedin.com/in/itamar-faran-748054149/

COVID-19 safety measures
Event will be indoors
Meetups at Wix
Namal Tel Aviv · Tel Aviv, id