Bangalore Streams In-Person meetup - May 2024


Details
Please note: Seats are limited, so we can only accommodate people who register at https://forms.gle/W5VQzvm4rQvwiRup9. Only those who complete the form and receive a confirmation email will be able to attend.
---
Hello Bengaluru
Platformatory is excited to host a special edition of the Bengaluru Streams meetup on May 4, right after Kafka Summit. Join us for an evening of exciting talks from WarpStream, Redpanda, RisingWave and Onehouse. This is a great opportunity to meet, talk to and learn from some of the newest innovators in the data infra space.
📅 When: May 4, 2024, 4:00 pm - 8:00 pm
📌 Where: Rakuten India Enterprise Private Limited, Bengaluru
🗣Talks:
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Speaker: Richard Artoul, Co-Founder @ WarpStream Labs
About the talk: Separation of compute and storage has become the de facto standard in the data industry for batch processing.
The addition of tiered storage to open source Apache Kafka is the first step in bringing true separation of compute and storage to the streaming world.
In this talk, we'll discuss in technical detail how to take the concept of tiered storage to its logical extreme by building an Apache Kafka protocol compatible system that has zero local disks.
Eliminating all local disks in the system requires not only separating storage from compute, but also separating data from metadata. This is a monumental task that requires reimagining Kafka's architecture from the ground up, but the benefits are worth it.
Real-Time Predictions with Machine Learning & Redpanda Streaming Data Transforms
Speaker: Christina Lin, Developer Advocate, Redpanda
About the talk: In this session, we'll address how to simplify data structures in AI applications, emphasizing the importance of not overcomplicating data architecture while constructing stateless pipelines for real-time analytics.
We'll cover the creation of an efficient data platform using Redpanda data transforms powered by WebAssembly (WASM), particularly tailored for dynamic industries (we will use food delivery as an example). We'll show how to simplify your data stack and demonstrate with a lab how complex data structures often hinder the agility and performance of AI systems. The lab will focus on stateless pipelines, where each data item is processed independently, and will showcase how to build scalable and robust AI applications without the burden of cumbersome data frameworks. Attendees will see how Redpanda's integration facilitates seamless real-time data processing and instant transformations that are crucial to responsive and accurate AI-driven predictions.
We will cover:
• Streamlined data ingestion and transformation
• Real-time machine learning
• Simplified infrastructure setup
Participants will learn how to avoid common pitfalls associated with complex data structures and data stacks, and will gain insights into creating more effective, agile, and responsive applications – especially for AI.
Stream Processing in SQL: One Approach
Speaker: Noel Kwan, Software Engineer at RisingWave Labs
About the talk: Join us for an exploration of real-time streaming data processing, where we’ll delve into RisingWave’s Stream Processing Model and interact with data streams using SQL.
In this session, we will begin by demonstrating the difference between batch and streaming data processing. We will then cover the internals of RisingWave’s architecture, such as decoupled compute and storage, and discuss how each RisingWave service operates.
We will further dive into stateful and stateless streaming computations, examining aspects like the internal state of stateful computations and how it can be observed from RisingWave.
We will also explore RisingWave’s handling of batch queries and discuss the serving scenarios in which RisingWave excels.
Next, we will cover data delivery and ingestion with external systems like Kafka, seamlessly integrating them to showcase how different systems can collaborate to provide various features.
Finally, we will review a simple dashboard application for ride-hailing data and demonstrate these concepts.
Unlocking Seamless Streaming Ingestion with Apache Hudi and Kafka
Speakers: Sagar Sumit & Vinish Reddy, Software Engineers at Onehouse
About the talk: Join us as we explore Apache Hudi and its transformative utility, Hudi Streamer, for seamless data ingestion from various sources, including Apache Kafka. Learn how Hudi Streamer simplifies workflows with pluggable interfaces for extraction, key generation, and schema provision. We’ll also showcase real-time database replication using CDC, bridging the Confluent Cloud platform and the Onehouse managed lakehouse. Throughout the session, we’ll highlight the synergy between Hudi and Kafka, empowering organizations to streamline data workflows and drive innovation.
