Details

Join us on March 17th from 6:00pm for a Data Streaming meetup hosted by Snowflake and supported by Dremio and Ryft!

📍Venue:
Snowflake
One Crown Place, 5th & 6th floors, London EC2A 4EF, U.K.

Please bring your photo ID and register with your details for security purposes.

🗓 Agenda:

  • 6:00pm – 6:30pm: Food/Drinks and Networking
  • 6:30pm – 7:00pm: Celeste Horgan, Developer Advocate, Snowflake
  • 7:00pm – 7:30pm: Yuval Yogev, CTO @ Ryft
  • 7:30pm – 8:00pm: Will Martin, EMEA Evangelist, Dremio
  • 8:00pm: Q&A and Networking

💡Speaker One:
Celeste Horgan, Developer Advocate, Snowflake

Title of Talk:
I have no idea what I’m doing, or: building an Apache Spark to Apache Iceberg pipeline from scratch for data ingestion

Abstract:
Transitioning to Apache Iceberg requires a shift in how we think about data storage and ingestion. Enter: Apache Spark… maybe? Exactly how is an open question. In this talk we’ll go through the trial-and-error process of ingesting data into an Apache Iceberg table using Apache Spark. We’ll learn about file formats for Apache Iceberg tables, when to partition (and what the consequences are), common integration mistakes (as made by me in the learning process), what Apache Spark does best, and when a different tool might work better. Most importantly, we’ll learn how to learn a new piece of software fast, using AI to help rather than hinder.

Bio:
Celeste Horgan is a Sr. OSS Advocate at Snowflake. She got her start in open source at the Linux Foundation, where she supported the Kubernetes project’s documentation. From there, she went on to work at Aiven on open source data platforms, and now continues that work evangelizing data systems for Snowflake. Her work on inclusive language has been featured in the New York Times, and she lives in London.

💡Speaker Two:
Yuval Yogev, CTO @ Ryft

Title of Talk:
Implementing Intelligent Snapshot Management

Abstract:
Apache Iceberg snapshots enable time travel and rollback, but they are not free. What do you do when you can only afford to keep a few thousand of them?

With streaming ingestion, frequent commits, and compaction, tables can accumulate thousands of snapshots per day. Retention quickly becomes expensive without actually preserving useful restore points.
This session dives into how we implemented intelligent snapshot management. We present time-aware retention models that preserve what matters: high-resolution snapshots for recent history and calendar-aligned restore points for long-term recovery. Instead of treating snapshots as temporary logs or hoarding them indefinitely, we apply backup patterns from databases and filesystems, leveraging Iceberg’s native snapshot and tagging semantics to make retention predictable and operationally sustainable.
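The time-aware retention model the abstract describes (full resolution for recent history, calendar-aligned restore points further back) can be sketched in plain Python. This is an illustrative sketch only, not Ryft’s implementation: the function name, the specific windows (24 hours of every snapshot, 30 days of daily points, monthly beyond that), and the timestamp-only view of a snapshot are all assumptions; in a real Iceberg deployment the selected timestamps would map onto tagged snapshots and the rest onto snapshot expiration.

```python
from datetime import datetime, timedelta

def select_restore_points(snapshots, now, keep_hours=24, keep_days=30):
    """Pick snapshot timestamps to retain under a time-aware policy:
    every snapshot from the last `keep_hours` hours, the latest snapshot
    of each day within `keep_days` days, and the latest snapshot of each
    month beyond that. (Hypothetical helper for illustration.)"""
    keep = set()
    daily = {}    # date -> latest snapshot timestamp that day
    monthly = {}  # (year, month) -> latest snapshot timestamp that month
    for ts in snapshots:
        age = now - ts
        if age <= timedelta(hours=keep_hours):
            keep.add(ts)  # high-resolution recent history: keep everything
        elif age <= timedelta(days=keep_days):
            d = ts.date()
            if d not in daily or ts > daily[d]:
                daily[d] = ts  # one calendar-aligned restore point per day
        else:
            m = (ts.year, ts.month)
            if m not in monthly or ts > monthly[m]:
                monthly[m] = ts  # one restore point per month long-term
    keep.update(daily.values())
    keep.update(monthly.values())
    return sorted(keep)
```

With hourly commits, 48 hours of snapshots collapse to the last 25 hourly snapshots plus one daily restore point per older day, which is the shape of saving the talk argues for: retention cost bounded while useful restore points survive.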

Bio:
I’m passionate about building high-throughput distributed systems and making complex data platforms simple, resilient, and scalable. Today, I’m the Co-Founder and CTO of Ryft, focused on next-generation data infrastructure. Before that, I spent several years as Chief Architect at Sygnia, helping companies strengthen their cyber resilience through scalable platforms and fast data pipelines.

💡Speaker Three:
Will Martin, EMEA Evangelist, Dremio

Title of Talk:
From Stream to Table: Building Kafka-to-Iceberg Pipelines

Abstract:
While Kafka excels at streaming data, the real challenge lies in making that data analytically useful without sacrificing consistency or performance. This talk explores why Apache Iceberg has emerged as the ideal streaming destination, offering ACID transactions, schema evolution, and time travel capabilities that traditional data lakes can't match. Learn about some foundational tools that enable streaming pipelines and why they all converge on this industry-standard table format.

Bio:
Will Martin is the EMEA Evangelist at Dremio. Starting out in statistical analysis for particle physics, he has since worked across multiple industries, including banking, shipping logistics, entertainment, healthcare, defence, and customer 360. With experience as a data engineer, solutions architect, and software developer, he has spent 15+ years collaborating with companies and organisations across EMEA and APAC, including roles at CERN, Deloitte, and Tamr.

***
If you are interested in hosting or speaking at a meetup, please email community@confluent.io
