Data Engineering Brick by Brick: Exploring DBT, Databricks, and AutoLoader

Hosted By
Ori T. and 2 others

Details

Join our data engineering meetup to explore scalable pipelines, efficient workflows, and powerful data transformations with DBT and Databricks.

Agenda:
18:00 - 18:30 - Mingling, drinks, and snacks
18:30 - 19:00 - DBT as a Self-Service for Data Lake Gold Layer - Tal Peretz, Data Engineer at Riskified
19:00 - 19:45 - Let’s See It Live: Unlocking the Magic of Databricks - Ofer Ohana, Solution Architect at Databricks
19:45 - 20:15 - Leveraging Databricks AutoLoader: Better Visibility of CloudTrail Logs - Yoni Eilon, Data Engineer at Riskified
20:15 - More drinks and mingling

The talks will be delivered in Hebrew.

------------
// DBT as a Self-Service for Data Lake Gold Layer - Tal Peretz, Data Engineer at Riskified

As part of our Medallion architecture, we set out to build a self-service tool, complete with deployment and testing pipelines, that lets our different departments generate Gold layer tables tailored to their specific needs. To do so, we chose DBT (Data Build Tool), a powerful tool for managing data transformations in modern data architectures.

In this talk, we’ll show how, by using DBT with Databricks, you can easily create and manage data pipelines, transformations, and workflows in a scalable and efficient way. By the end of the talk, you will know how to quickly and easily build a high-quality Gold layer that is optimized for performance and accuracy, giving you the confidence to make informed decisions based on reliable data.

About the speaker:
Tal Peretz is a Data Engineer at Riskified. His expertise lies in developing data pipelines, building scalable infrastructure, practicing data science, and implementing data quality frameworks, enabling Riskified to make data-driven decisions and prevent fraud effectively.
As a Harry Potter and data engineering fan, Tal combines his love for technology and entertainment, appreciates the magic of storytelling, and strives to contribute to the data engineering field.

// Let’s See It Live: Unlocking the Magic of Databricks - Ofer Ohana, Solution Architect at Databricks

Join us for a unique session where we’ll dive into the enhanced capabilities of Delta Lake, explore various machine-learning workflows, and give a live demonstration of Databricks in action. We'll discuss some of the new features and updates from Databricks, and preview Databricks’ recent announcements from the Data+AI Summit, including LakehouseIQ and Lakehouse Federation.

About the speaker:
Ofer Ohana is an accomplished Solution Architect at Databricks, where he specializes in helping organizations unlock the potential of big data and analytics. With an extensive background in data engineering, analytics, and cloud computing, Ofer is adept at designing and implementing data solutions that are scalable, efficient, and insightful.

As a Solution Architect, Ofer works closely with clients to understand their unique data challenges and goals. He leverages Databricks’ Unified Data Analytics Platform, incorporating best practices in data processing and machine learning to craft solutions that empower data-driven decision-making.

Ofer is also an active participant in the data community, sharing his knowledge and insights through speaking engagements, workshops, and meetups. His commitment to continuous learning and passion for all things data make him a valued asset to teams and projects.

// Leveraging Databricks AutoLoader: Better Visibility of CloudTrail Logs - Yoni Eilon, Data Engineer at Riskified

S3 logs generated by AWS CloudTrail provide organizations with essential visibility into user activity and resource utilization within their AWS infrastructure.

However, working with raw CloudTrail logs can be challenging due to their size, complexity, and the need for optimal storage and query performance.
Our SecOps team had 180TB of these logs in an S3 bucket, which took forever to query, so they came to us looking for a better solution.

In this talk, we will discuss our journey to find the best solution for this problem, and why we ended up using Databricks AutoLoader, an automatic and scalable data ingestion mechanism, to do it. We will talk about the various approaches we tried, AutoLoader's advantages and features, and the lessons we learned along the way.

About the speaker:
Yoni is an accomplished DBA and Data Engineer with a wealth of experience across various database systems and platforms. His expertise spans from relational databases to NoSQL and cloud-based solutions, as well as DWH systems and Spark.

When he's not helping Riskified manage its Data Platform, Yoni enjoys spending time with his wife and 3 kids, playing basketball and cooking for friends and family.

COVID-19 safety measures

Event will be indoors
Meetups at Riskified
Sderot Sha'ul HaMelech 37 · Tel Aviv-Jaffa