This event was canceled

Apache Spark Streaming with Apache NiFi and Apache Kafka

Hosted By
Future of D.

Details

Agenda

6:30 PM – 7:00 PM: Food, drinks, mingling

7:00 PM – 7:45 PM: Lecture & demo

7:45 PM – 8:00 PM: Q&A

Apache Spark is a unified framework for big data analytics. It provides a single integrated API that developers, data scientists, and analysts can use for tasks that previously required separate processing engines, such as batch analytics, stream processing, and statistical modeling. Spark supports a wide range of popular languages, including Python, R, Scala, SQL, and Java. It can read from diverse data sources and scale to thousands of nodes.

We will demo a Spark Streaming application that ingests simulated financial transactions from NiFi via a Kafka message broker. During the lecture we will cover Spark Streaming, NiFi, and Kafka in detail.

• Apache Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams.

• Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data.

• Apache Kafka is a high-throughput distributed publish-subscribe messaging system.

Robert Hryniewicz has over 15 years of experience working on machine learning, AI, robotics, cloud products, and more. He has been a principal consultant at TiVo, CTO at a Singularity Labs company, and a senior engineer at Cisco, NASA, Concurrent, and others. Robert has been developing in Apache Spark since 2014. As a consultant he developed several products, including a graph analytics platform and multiple machine learning and IoT prototypes. Robert’s interests range from distributed systems to advanced analytics, deep learning, NLP, general AI, robotics, DNA sequencing, personalized medicine, vertical farms, and blockchain-related technologies. He comes up with his best ideas when hiking in Yosemite and other Northern California parks.

Future of Data: Silicon Valley

Hortonworks HQ
5470 Great America Parkway · Santa Clara, CA