Modeling Heating Problems using Zeppelin, Spark, R and System ML


Details
Grab your scarf and mittens and join us for an informative talk about heating. Heating complaints are among the top five complaints during the winter in most big US cities.
Using heating complaints as an example, in this meetup we shall demonstrate how Spark, Zeppelin, R, Spark SQL, Spark MLlib, and System ML can be used on publicly available data sources to form a seamless data science pipeline and build models that predict heating complaints.
We'll show how individual data sources can be explored, curated and prepared, and merged, and how machine learning models can then be developed on the resulting data sets. We shall use a Zeppelin notebook as the tool for step-by-step, interactive data exploration, data preparation, and data modeling, with Spark as the back-end cluster computing framework. We'll also showcase how data sets can be created and shared across Spark SQL, Spark MLlib, SparkR, R, and IBM System ML through Zeppelin.
We shall also show how visualization libraries in R can be used from Zeppelin to interactively visualize the source and predicted data.
Speaker: Sourav Mazumder, Big Data Architect and Spark Advisor
Sourav has 19 years of IT experience and 7 years in Big Data. He is part of IBM Analytics Stampede and the Spark Technology Center, and has experience in architecting high-throughput scalable data applications, real-time analytics, and petabyte-scale database systems using the concepts of distributed computing, performance modeling, and Big Data technologies. For over 5 years, Sourav has been influencing key decision makers in Fortune 500 companies to explore and institutionalize various Big Data technologies. He regularly speaks at Big Data conferences and meetups, and is co-chair of the Big Data Applications for Enterprise, Industry and Business track at IEEE BigDataService 2016.
