Wed, Jul 15 · 5:00 PM PDT
Join us for an exciting evening of deep dives into next-generation data infrastructure, open table formats, and architectural frameworks purpose-built for production-grade AI applications and autonomous agents.
Whether you are looking to unify multimodal data silos or scale complex agentic reasoning workloads over enterprise data systems, this meetup brings together top engineering minds from Cloudera, LanceDB, Google Cloud, and PuppyGraph to share live demos, architecture breakdowns, and real-world production insights.
🗓️ Event Details:
Date: Wednesday, July 15, 2026
Time: 5:00 PM to 8:30 PM PDT
Format: In-Person with Socialive Broadcast
Location: Cloudera San Jose Office, 6220 America Center Dr, 5th Floor, San Jose, CA 95002
⏰ Agenda
5:00 PM – 5:30 PM: Registration & Settling In
5:30 PM – 6:00 PM: Talk 1: Building the Multimodal Lakehouse for AI with LanceDB
6:00 PM – 6:30 PM: Talk 2: Putting Agents in your Data Platforms - Are we Ready?
6:30 PM – 7:00 PM: Talk 3: Agent Context at Scale: Graph + SQL on Apache Iceberg
7:00 PM – 7:30 PM: Talk 4: Architecting the AI-Native, Cross-Cloud Lakehouse
7:30 PM – 8:30 PM: Networking & Snacks 🍕✨
📚 Session Breakdowns & Speakers
#### Talk 1: Building the Multimodal Lakehouse for AI with LanceDB
The next wave of AI applications demands seamless, scalable access to text, images, embeddings, and other complex modalities, but current lakehouse solutions still force teams into closed systems for vector search, full-text search, or feature engineering, reintroducing data silos. In this talk, we introduce Lance, a next-generation columnar data format optimized for AI, and LanceDB, the multimodal lakehouse built on top of it. Together, they provide low-latency access; unified vector, full-text, and SQL search; and flexible schema evolution across the entire multimodal AI lifecycle. From application serving to feature engineering and large-scale training, learn how teams build open, performant, production-grade multimodal systems at scale.
Speakers: ChanChan Mao (DevRel @ LanceDB) & Lu Qiu (Database Engineer @ LanceDB)
#### Talk 2: Putting Agents in your Data Platforms - Are we Ready? (with Apache Iceberg & Cloudera AI)
Data platforms traditionally use deterministic pipelines for predictable query patterns, but agentic AI introduces an execution model in which agents dynamically explore data systems by probing schemas, issuing iterative queries, validating hypotheses, and refining their approach based on intermediate results. This session will cover the architectural primitives required to manage these unpredictable workloads; the core building blocks for isolation, context, governance, and auditability; and how Apache Iceberg's snapshot-based storage and branching semantics support robust agentic workflows on enterprise data platforms.
Speaker: Dipankar Mazumdar (Director-Developers @ Cloudera)
#### Talk 3: Agent Context at Scale: Graph + SQL on Apache Iceberg
Agentic systems place new demands on data infrastructure: scalability, performance, and guardrails to keep agents grounded in accurate context. At the same time, they push natural language interfaces beyond text-to-SQL, freeing retrieval to use the right tool for the right job. In this talk, we introduce a pluggable text-to-insight framework built on Apache Iceberg that runs both SQL and Cypher over the same underlying data, giving agents richer context for better reasoning without duplication or new silos. We’ll end with a live proof-of-concept demo showing it in action.
Speaker: Jaz Ku (Solution Architect @ PuppyGraph)
#### Talk 4: Architecting the AI-Native, Cross-Cloud Lakehouse
Adopting open table formats like Apache Iceberg has historically meant navigating a trade-off between true open interoperability and the operational ease of a fully managed platform. In this session, we’ll explore how to architect a borderless, cross-cloud data foundation built for the agentic era. We will dive into how Google Cloud’s Lakehouse architecture leverages the open Iceberg REST Catalog to provide a unified metadata layer across any compatible engine, such as BigQuery, Managed Spark, or Trino. Finally, we’ll demonstrate how to pair this open foundation with Google Cloud’s Knowledge Catalog to transform passive Iceberg metadata into an active semantic knowledge engine for AI agents.
Speaker: Vinod Ramachandran (Google)
🎟️ RSVP & Important Notes
Space is limited at the Cloudera San Jose Office.
Registration through Meetup does not guarantee admission. Please register through the official Luma page (https://luma.com/n8aycq3j) for consideration and event updates.
We look forward to seeing you there! 🙌