
Data Engineering at Scale

Hosted By
Itai Y. and Meital B.

Details

Data Engineering at Scale: Techniques for Efficient Data Processing, Automation, and Spark Application Management

17:30-18:00 - Mingling and food
18:00-18:05 - Opening notes
18:05-18:35 - Mastering Partitioning for High-Volume Data Processing - Yulia Antonovsky @ Akamai
18:35-19:05 - Solving Data Engineers Velocity - Wix’s Data Warehouse Automation - Maayan Gad @ Wix
19:05-19:35 - Lessons Learnt from Running Thousands of On-demand Spark Applications - Ada Sharoni @ Hunters

*********************** Note: ***********************
- The event is open for everyone (regardless of gender)
- All sessions will be delivered in Hebrew
*****************************************************

Title: Mastering Partitioning for High-Volume Data Processing
Abstract:
Our cloud-based ingest pipeline processes over 10 Gb of security event data per second, which demands high-performance processing and analysis. To achieve this, we've implemented efficient partitioning using Java and Spark applications running on AKS and leveraging Kafka. This allows us to provide real-time analytics within two minutes, as well as hourly heavy batch processing for deeper analysis. During this talk, we will cover how we use Kafka to scale our Spark application on K8s, partitioning strategies for high-volume data processing, and how partitioning helps avoid storage throttling issues.
Bio: Yulia Antonovsky is a Senior Software Engineer II at Akamai.
With more than 15 years of experience in the tech industry, Yulia has been focusing on Data Engineering for the past 6 years.
She holds a BSc in Computer and Information Sciences from the Technion (Israel Institute of Technology).

Title: Solving Data Engineers Velocity - Wix’s Data Warehouse Automation
Abstract:
Who likes to maintain long SQL files? No one.
Column renames, additions, deletions, changes in the KPIs your tables are based on, or the date your table should start from: all are annoying, time-consuming tasks your data engineers waste their time on.
In this talk we’ll uncover how we solved this problem at Wix, allowing our data engineers to easily create Data Warehouse grade tables using configuration files, and even transfer some of the business-related-only parts to the data analysts.
Bio: Maayan Gad has been a Senior Big Data Engineer at Wix for the last 4 years. She works on the Automation team in the Data Engineering guild, leading the data warehouse automation development.

Title: Lessons Learnt from Running Thousands of On-demand Spark Applications
Abstract:
Imagine you had to manage thousands of Spark applications that are automatically spinning up on-demand upon every customer interaction.
Our unique constraints at Hunters have led us to adopt an architecture and concepts that we believe many other companies will find useful.
In this lecture we will share our solutions and insights in running many lightweight, cheap Spark applications on Kubernetes that can easily survive frequent restarts and smartly share resources on Spot EC2 instances.
Bio: Ada Sharoni is a Software Engineering Architect at Hunters. She started as an algorithm developer in Signal Processing and for the last 8 years has been a backend developer, specializing in the fields of Big Data and ML.

Women in Big Data Israel
Totseret ha-Arets St 8 · Tel Aviv-Yafo