Building a Real-Time Analytics Platform From Open Source Components


Details
Note: Please fill out this simple form before attending the meeting. It speeds up check-in at the front door and helps us organize this and upcoming events. Thank you!
Abstract:
This presentation provides a technical overview of a real-time analytics architecture built with open-source components, including Apache Pinot, Flink, and Kafka. We will explore the challenges and best practices of deploying a robust streaming data pipeline that uses Apache Flink for real-time processing and transformation and Apache Pinot for implementing external APIs. The proposed architecture is intended to complement Cloudera Streaming Analytics (CSA) in an open form factor that is convenient for prototyping. Key topics will include strategies for infrastructure and deployment automation, as well as managing dependencies in air-gapped environments. Attendees will gain insight into the operational considerations of managing Kubernetes operator-based deployments for both CSA and the open-source stack, along with practical advice for building secure real-time analytics platforms.
Speaker Bio:
Ryan Hill is a Solutions Architect at Cloudera, focused on high-performance computing and computational science. He is a polyglot software engineer with experience in energy, biotechnology, financial services, and geospatial analytics. He lives in Colorado, where he enjoys hiking, snowboarding, fly fishing, and photography.
Timing Details:
7:00-7:30 Social Session
7:30-7:45 Announcements & Introductions
7:45-ish Presentation

Every 3rd Thursday of the month until December 31, 2025