Details

## Open Lakehouse & AI Data Infrastructure Meetup – NYC

📅 Tuesday, February 17
6:00 PM – 8:30 PM EST
📍 Registration is accepted only through the Luma link:
https://luma.com/gmc5k8gr

📌 New York, New York
We’re bringing together the Apache Iceberg, Lance, and Apache DataFusion communities in NYC for an evening of deep technical discussions around open lakehouse architectures and modern data infrastructure—hosted at Cloudflare’s NYC office.
This meetup is a great opportunity to learn from industry experts, connect with fellow data engineers and AI practitioners, and explore how open technologies are shaping the future of analytics and AI.

***

## 🤝 Hosted By

Cloudera | LanceDB | Cloudflare

***

## 🎤 Agenda

### 🕕 6:00 – 6:30 PM

Registration & Networking

***

### 🗣️ Talk 1: Apache Iceberg – Spec Evolution (v1 to v4) and How Cloudera’s Data Platform Supports It

Speaker: Dipankar Mazumdar, Director – Developers, Cloudera
This session explores the evolution of Apache Iceberg, the challenges each specification version aimed to solve, and what’s coming next. After a brief overview of v1 and v2, we’ll dive deep into v3 and the upcoming v4+ work, including:

  • Lineage
  • Deletion vectors
  • Metadata redesign
  • File format APIs
  • Why these changes matter for large-scale lakehouse pipelines

The talk will also cover how Cloudera’s data platform has supported Iceberg’s core capabilities from its early days.

***

### 🗣️ Talk 2: Multimodal AI Lakehouse with Lance & LanceDB

Speaker: Chang She, Co-Founder & CEO, LanceDB
Modern AI applications demand seamless access to text, images, embeddings, and other complex data types, but existing lakehouse solutions often force teams into closed systems—re-introducing silos.
In this talk, you’ll learn about:

  • Lance, a next-generation columnar data format optimized for AI
  • LanceDB, a multimodal lakehouse built on top of Lance
  • Unified vector, full-text, and SQL search
  • Flexible schema evolution across the multimodal AI lifecycle

See how companies like Midjourney, WorldLabs, and Runway are building open, scalable, production-grade AI systems.

***

### 🗣️ Talk 3: Cloudflare’s Data Platform with Apache Iceberg & DataFusion

Speaker: Jonathan Chen, Software Engineer, Cloudflare
An introduction to Cloudflare’s new data platform, built on Apache Iceberg and Apache DataFusion, including:

  • R2 Data Catalog
  • R2 SQL
  • Pipelines

We’ll walk through the architecture and show how Cloudflare enables SQL analytics directly on object storage, allowing users to query continuously ingested data without managing separate compute or storage systems.

***

### 🤝 8:00 – 8:30 PM

Networking & Conversations

***

## ⚠️ Important Note on Registration

Please ensure that the name used for registration exactly matches the full name on your government-issued ID. This is required for building security access.

***

## 🎯 Who Should Attend?

  • Data engineers & architects
  • AI/ML practitioners
  • Open-source contributors
  • Developers building lakehouse or analytics platforms

Looking forward to an evening of learning, sharing, and networking with the NYC data community!

## Sponsors

  • Cloudera: We deliver an enterprise data cloud for any data from the Edge to AI.
  • Pivotal: Hosting
  • Microsoft: Hosting
  • IBM: Hosting, food, giveaways
