Intro to predictive data analysis using Apache Kafka, Spark, Zeppelin on JVM


Details
Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modeling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.
Predictive analysis is increasingly a necessary skill for data analysts. There are many ways to approach it, and the marketplace is crowded with offerings, solutions, and alternative methods, so choosing a practical, intuitive, and effective platform as a starting point is not easy.
A good way to start is to use Apache Kafka and Apache Spark in conjunction with Apache Zeppelin. Together, this stack provides a powerful analytics engine for responsive, large-scale data processing and computation, along with an interactive notebook for exploring the results. We will use linear regression techniques to illustrate predictive analysis.
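As a rough illustration of the linear regression idea mentioned above, here is a minimal sketch of an ordinary least-squares fit of a straight line in plain Java 8. It is not the code presented at the meetup and does not use Spark; the class name `LinearRegressionSketch`, the method names, and the sample numbers are all hypothetical, chosen only to show the arithmetic.

```java
import java.util.Arrays;

// Hypothetical sketch: ordinary least squares for one input variable.
// Not Spark code; it only illustrates the underlying calculation.
class LinearRegressionSketch {

    /** Returns {slope, intercept} of the least-squares line y = slope * x + intercept. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired observations");
        }
        double meanX = Arrays.stream(x).average().getAsDouble();
        double meanY = Arrays.stream(y).average().getAsDouble();

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }

    /** Predicts y for a new x using a previously fitted {slope, intercept}. */
    static double predict(double[] model, double x) {
        return model[0] * x + model[1];
    }

    public static void main(String[] args) {
        // Made-up observations, for illustration only.
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};
        double[] model = fit(x, y);
        System.out.println("slope=" + model[0] + " intercept=" + model[1]);
        System.out.println("prediction at x=5: " + predict(model, 5));
    }
}
```

On a real dataset the same fit would typically be delegated to a library that distributes the computation; the sketch only shows what "fitting a line" means.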
At this meetup, you will learn:
- How Kafka can act as an event source that collects weather data from, say, IoT sensors.
- How Spark can be called from a few simple Java applications: Word Count, and basic functional operations such as map, reduce, and fold.
- How an Apache Zeppelin notebook lets you interact with in-memory Resilient Distributed Datasets (RDDs), which provide parallelized predictive analysis on single and multiple raw datasets (e.g. how flight delay data may correlate with weather datasets).
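For readers who want a feel for the Word Count and map/reduce style mentioned in the list above, here is a small sketch using only Java 8 streams, so it runs without a Spark cluster. It is an assumption-laden stand-in, not the talk's code: the class `WordCountSketch` and its methods are hypothetical. Spark's Java RDD API offers operations with a similar shape (for example `flatMap`, `mapToPair`, and `reduceByKey`), but this example does not call them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch using java.util.stream, not Spark.
class WordCountSketch {

    /** Splits each line into lowercase words and counts occurrences of each word. */
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // one line -> many words (the "flatMap" step)
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                // split can yield an empty leading token; drop it
                .filter(word -> !word.isEmpty())
                // group identical words and count them (the "reduce by key" step)
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    /** Maps each word to its length, then folds the lengths into one sum starting from 0. */
    static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        // Made-up input, for illustration only.
        List<String> lines = Arrays.asList("rain rain sun", "Sun wind");
        System.out.println(wordCount(lines));
        System.out.println(totalLength(Arrays.asList("rain", "sun")));
    }
}
```

The `reduce(0, Integer::sum)` call plays the role of a fold: it starts from an initial value and combines elements one at a time.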
Pre-requisites
Since we will be showing off Java code, you should already be comfortable with Java 8.
Get a head start on IBM Analytic services and get a free IBM Cloud account via: https://ibm.biz/BdzgmP
Please bring your Meetup RSVP Confirmation & ID to check-in
Agenda:
- 6:30 pm - doors open, check-in, and get some food
- 6:45 pm - round-robin introductions & get to know each other
- 7:00-8:30 pm - presentation
- 8:30-8:45 pm - Q&A
- 9:00 pm - venue closes
About the speakers
Grant Steinfeld (@gsteinfeld) is the IBM Developer Advocate for Blockchain, Java, and NodeJS. Grant is an accomplished and innovative senior software architect and engineer with a reputation for delivering client-focused solutions. He is a problem solver and team mentor with the ability to work with and manage development teams. He is able to interface with senior management and product teams in order to translate business requirements and challenges into project plans and solutions.
Pratik Patel is a Java Champion and developer advocate at IBM. He wrote the first book on 'enterprise Java' in 1996, "Java Database Programming with JDBC." He is an all-around software and hardware enthusiast with experience in the healthcare, telecom, financial services, and startup sectors. He helps organize the Atlanta Java User Group and the North Atlanta JavaScript meetup, is a frequent speaker at tech events, and is a master builder of nachos.
