Spark working with a Cloud IDE: Notebook/Shiny Apps


Details
Abstract:-
The Problem: Energy inefficiency within public/private buildings in the City of New York.
The Goal: Take meter (sensor) data and address the inefficiencies through better insights.
The Solution: Visualization and reporting through the Shiny App to understand past and present usage patterns, then compare those patterns to gain insights into, and predictions of, energy usage.
Spark's DataFrames and RDDs will be used in concert with the pandas library to clean, model, and prepare data for the R Shiny App. The aim of this meetup discussion is to show the capabilities of Spark when used with DSX and RStudio/Shiny to create visualizations and reports that give insights to the end user.
We will present several modeling and ML techniques in this notebook: linear regression, K-Means clustering for identifying inefficient buildings, and (statistical) classification modeling, followed by a confusion matrix (error matrix).
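For readers unfamiliar with the last item, here is a minimal sketch of how a confusion matrix summarizes a classifier's results. It is not taken from the talk's notebook: the labels, the sample data, and the `confusion_matrix` helper are hypothetical and use only the Python standard library.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical building labels, invented for illustration only.
labels = ["efficient", "inefficient"]
actual = ["efficient", "inefficient", "inefficient", "efficient", "inefficient"]
predicted = ["efficient", "inefficient", "efficient", "efficient", "inefficient"]

matrix = confusion_matrix(actual, predicted, labels)

# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy)
```

Off-diagonal cells show which classes the model confuses, which a single accuracy figure hides.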
Bio:-
Thomas Liakos has been an Open Source Systems Engineer for 11 years and has 8 years of experience in cloud and hybrid environments. Prior to IBM, Thomas was Sr. Systems Architect at Gem.co and DevOps / Systems Engineer, Cloud Operations at CrowdStrike. Thomas has expertise in Spark, Python, Systems and Configuration Management, Architecture, Data Warehousing, and Data Engineering.
Parking & Other Info
• Enter from Hannum Ave into the parking structure for the 200-300 buildings.
• Once you park, exit the parking structure on P4 near the elevators.
• Head across the courtyard, veering slightly to the right, and you will see the 200 building. If you find the 300 or 100 buildings, you are in the wrong place. Take the elevator to the second floor.
• Parking is free in the lot. After 6:30 PM the gates are open, so you can simply drive out.