Intro to predictive data analysis using Apache Kafka, Spark, Zeppelin on JVM

Hosted By
Pooja M. and 3 others

Details

Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events.

Predictive analysis is increasingly a necessary skill for data analysts. There are many ways to do it, and the marketplace is crowded with offerings, solutions, and alternative methods, so choosing a practical, intuitive, and effective platform as a starting point is not easy.

A viable alternative, and a good way to start, is to use Apache Kafka and Spark in conjunction with Apache Zeppelin. Together, this stack provides a powerful analytics engine for responsive large-scale data processing and computation, along with an interactive notebook for exploring the results. We will use linear regression techniques to illustrate predictive analysis.
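
To make the linear regression idea concrete, here is a minimal sketch of ordinary least squares on a single variable, written in plain Java with no Spark dependency. It is not the presenters' code and not Spark's own regression API; the class name `SimpleLinearRegression`, its methods, and the sample numbers are all hypothetical and invented purely for illustration.

```java
import java.util.Locale;

// A plain-Java sketch of one-variable least-squares regression.
// All names and data here are hypothetical; this is not the Spark API.
class SimpleLinearRegression {

    // Fits y = slope * x + intercept and returns {slope, intercept}.
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sum of cross-deviations (sxy) and squared x-deviations (sxx).
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }

    // Applies a fitted model {slope, intercept} to a new x value.
    static double predict(double[] model, double x) {
        return model[0] * x + model[1];
    }

    public static void main(String[] args) {
        // Made-up sample: wind speed (km/h) against flight delay (minutes).
        double[] wind = {5, 10, 15, 20, 25};
        double[] delay = {4, 9, 13, 22, 27};
        double[] model = fit(wind, delay);
        System.out.printf(Locale.ROOT, "slope=%.3f intercept=%.3f%n", model[0], model[1]);
        System.out.printf(Locale.ROOT, "predicted delay at 30 km/h: %.1f min%n",
                predict(model, 30));
    }
}
```

On a real cluster the same fit would typically be delegated to Spark so the sums are computed in parallel across partitions; the arithmetic is the same.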

At this meetup, you will learn

  • How Kafka can act as an event source that collects weather data from, say, IoT sensors.
  • How Spark can be called from a few simple Java applications: Word Count and basic functional operations such as map, reduce, and fold.
  • How the Apache Zeppelin `Notebook` lets you interact with in-memory Resilient Distributed Datasets (RDDs), which provide parallelized predictive analysis on single and multiple raw datasets (e.g. how flight delay data may correlate with weather datasets).
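
The Word Count and map/reduce/fold ideas in the list above can be previewed with Java 8 streams alone. The sketch below deliberately swaps Spark for the standard `java.util.stream` API so that it runs without a cluster. The class name `WordCountSketch`, its methods, and the sample lines are hypothetical, and this is not the code that will be shown at the meetup.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// A Java 8 streams sketch of word count and map/reduce/fold.
// Hypothetical names; uses java.util.stream, not Spark.
class WordCountSketch {

    // Splits each line into lowercase words, then counts occurrences per word.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase(Locale.ROOT).split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    // map: word -> length; reduce with an identity value of 0 acts as a fold.
    static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        // Made-up input lines for illustration.
        List<String> lines = Arrays.asList(
                "rain delays flights",
                "snow delays flights more");
        wordCount(lines).forEach((word, count) -> System.out.println(word + " " + count));
        System.out.println("total characters: "
                + totalLength(Arrays.asList("rain", "snow", "wind")));
    }
}
```

Spark's Java API expresses a similar pipeline over distributed data, so the shape of the code (split, group, count) carries over even though the types differ.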

Pre-requisites

Since we will be showing off Java code, you should already be comfortable with Java 8.

Get a head start on IBM Analytic services and get a free IBM Cloud account via: https://ibm.biz/BdzgmP

Please bring your Meetup RSVP confirmation and ID to check in.

Agenda:

  • 6:30pm - doors open, check-in, and get some food
  • 6:45pm - round-robin introductions & get to know each other
  • 7:00-8:30pm - presentation
  • 8:30-8:45pm - Q&A
  • 9:00pm - venue closes

About Grant

Grant Steinfeld (@gsteinfeld) is the IBM Developer Advocate for Blockchain, Java, and NodeJS. Grant is an accomplished and innovative senior software architect and engineer with a reputation for delivering client-focused solutions. He is a problem solver and team mentor with the ability to work with and manage development teams. He is able to interface with senior management and product teams in order to translate business requirements and challenges into project plans and solutions.

About Pratik

Pratik Patel is a Java Champion and developer advocate at IBM. He wrote the first book on 'enterprise Java' in 1996, "Java Database Programming with JDBC." He is an all-around software and hardware enthusiast with experience in the healthcare, telecom, financial services, and startup sectors. He helps organize the Atlanta Java User Group and the North Atlanta JavaScript meetup, and is a frequent speaker at tech events and a master builder of nachos.

IBM Developer New York
NYC Blockchain Center
54 W 21st St · New York, NY