Bangalore Streams In-Person meetup - March 2024
Details
Hello Bengaluru👋
We're excited to continue our new series of events in Bengaluru focused on data streaming and adjacent technologies. Our objective is to share knowledge and provide a platform for thought leadership around:
- Event Streaming Technologies (Apache Kafka and more)
- Event Driven Architecture
- Stream Processing
- Streaming Databases
- Real-time analytics
- Data Mesh
...and more.
We're hosting our next in-person event on March 2. Join us for exciting discussions in the streaming world with opportunities to network with peers and leaders in the industry.
📅 When: March 2, 2024, 10:00 am - 2:00 pm
📌Where: Nasdaq Corporate Solutions Pvt. Ltd
🗺Directions: https://maps.app.goo.gl/FW61fHVeDEjBVyko6
(Affluence, No. 72/1, St. Mark's Road, Bangalore 560001)
Thank you to our sponsor, Confluent, for providing F&B for the meetup.
🕒 Schedule :
10:00 am - 10:20 am: Welcome & registrations
10:30 am - 11:15 am: `Real time Analytics with Apache Kafka and Apache Druid` by Tijo Thomas, Solutions Architect at Imply
11:20 am - 12:00 pm: `Real time data lake for CDC with Apache Paimon and Flink` by Avinash Upadhyaya, Platform Engineer at Platformatory
12:00 pm - 12:15 pm: Networking break
12:15 pm - 1:00 pm: `Time Series Text Indexing in Apache Pinot` by Atri Sharma, Senior Principal Engineer at Atlassian
1:00 pm - 2:00 pm: Lunch & Networking
🗣Talks:
Real time Analytics with Apache Kafka and Apache Druid
Speaker: Tijo Thomas, Manager, Solutions Architect @ Imply
About the talk: In the digital era, the ability to process and analyze data in real-time is becoming increasingly vital for organizations aiming to stay competitive and make informed decisions swiftly. Kafka, a distributed event streaming platform, has emerged as a powerful tool for businesses looking to capitalize on real-time data analytics. This presentation will guide you through the essentials of generating and managing large volumes of events with Kafka, and how to leverage these capabilities for real-time analytics using Druid.
We will explore the architectural underpinnings and internals of Druid that facilitate the construction of robust real-time analytics applications.
The discussion will extend to practical strategies for navigating the complexities of data streaming, focusing on how to effectively utilize Kafka alongside other cutting-edge tools to build scalable, efficient, and high-performing real-time analytics solutions.
Real time data lake for CDC with Apache Paimon and Flink
Speaker: Avinash Upadhyaya, Platform Engineer at Platformatory
About the talk: Apache Paimon is a streaming data lake platform with high-speed data ingestion, changelog tracking and efficient real-time analytics. In this discussion, we aim to explain how Paimon addresses the problem of bringing change data capture (CDC) data into the data lake. This covers the whole process: ingesting CDC data, updating parts of the data, and reading the changelog as a stream. Paimon simplifies working with CDC data and integrates closely with Flink.
Dealing with CDC in the Data Lake poses challenges such as syncing CDC data with schema changes, using a partial-update merge engine, and tracking changes in the data stream.
To sum up, we'll provide an overview and discuss what's on the horizon for Paimon.
Time Series Text Indexing in Apache Pinot
Speaker: Atri Sharma, Senior Principal Engineer at Atlassian
About the talk: Time series engines such as Apache Pinot are great at streaming aggregations, but text search is a different beast. This talk will focus on the native text index and engine built for Apache Pinot, which works well for time series engines and maintains the invariant that the latest data is the most valuable.
