August 19, 2014 · 6:00 PM
RVA Data Hackers meets monthly to hear, present and discuss topics in machine learning & big data. Come on down and help us build Richmond's Data Hacking community.
This month, Robert Fehrmann presents: Building a fully functional Hadoop Cluster in 1 hour for less than $1.
Hadoop is one of the hottest trends in data management. Even a small Hadoop cluster can manage tens of terabytes of data and provide a SQL-based user interface that lets you analyze billions of rows in seconds. But how do you get started?
This talk is about building a fully functional Hadoop cluster from start to finish using Cloudera’s Hadoop Distribution (CDH). We will use the cloud-based Infrastructure as a Service (IaaS) provider DigitalOcean to build a 4-node cluster, then create a 100-million-row test table and run some queries against it, all for less than $1.
Robert Fehrmann is a 20-year veteran of data management. He received his master’s degree in computer science from "Technische Universitaet Braunschweig" in Germany. Robert is a member of the snagajob engineering team, helping snagajob implement a polyglot platform that uses best-of-breed tools for data management.