Energy Usage Insights with Hadoop and HBase
Oren Benjamin, Opower
Scott Kuehn, Opower
Opower software has already helped millions of people collectively save over 2 terawatt-hours of energy and $300 million in energy bills.
Opower's expanding energy usage dataset includes meter and smart meter data streams from around the globe. The engineering team is using Hadoop, HBase, and Kiji to build a scalable energy usage insight engine capable of ingesting, maintaining, and analyzing these data streams to provide actionable insights that help people manage their energy use. In this talk, we'll cover the system architecture, the data schema, and the generation of energy usage insights with MapReduce. We'll also discuss how Hadoop and HBase integrate with Opower's substantial relational-database-backed software.
Oren is a Senior Software Engineer working on the Opower data platform. He led the development of the Opower social product, using Hadoop and Hive along with Opower's nationwide dataset to make a home energy use comparison freely available to anyone in the US, while encouraging users to save energy via social interaction and game mechanics. Prior to Opower, he led engineering for AddThis analytics, where he designed and implemented the analytics API, share counter, and the open source stream-lib for real-time data stream summarization of web traffic from AddThis' 1B unique monthly users.
Scott is a Data Architect at Opower. As a member of the Data Services Team, Scott creates Hadoop and HBase services that provide a scalable core to Opower's various products. Scott's career has mirrored his passion for working with data and distributed systems. As a research consultant at the University of Washington, he created the first high-throughput DNA analysis pipeline capable of processing billions of genomic data reads, resulting in publications of multiple novel insights into the nature of genetic diseases. More recently, Scott was a Lead Associate at Booz Allen, where he developed a commercial cyber data analytics platform that used Hadoop, Storm, and Accumulo to successfully detect malicious cyber activity from streaming log data.
Cloudera Support for Accumulo
Joey Echeverria, Cloudera
Apache Accumulo is a vibrant and important member of the Apache Hadoop ecosystem. Come hear how Cloudera, the Platform for Big Data, is building out support for Accumulo in its distribution and ensuring tight integration with the rest of the ecosystem.
Joey Echeverria is a Principal Solutions Architect at Cloudera, where he oversees the technical success of projects in the Federal and Mid-Atlantic regions. Joey also works directly with customers to deploy production Hadoop clusters and solve a diverse range of business and technical problems. Joey joined Cloudera from the NSA, where he worked on data mining, network security, and clustered data processing using Hadoop. Prior to working full time for the NSA, Joey attended Carnegie Mellon University, where he earned an M.S. and a B.S. in Electrical and Computer Engineering.
6:30-7:15 - Snacks and Networking
7:15-7:20 - Announcements
7:20-8:05 - Oren Benjamin and Scott Kuehn
8:05-8:15 - Break
8:15-9:00 - Joey Echeverria