What You'll Learn
This talk serves as an extremely basic introduction to retrieving data from Hadoop, with a special focus on HBase (the Hadoop Database). Using Pig, non-developers can gather data from the seemingly unwieldy HBase, freeing them to analyze that data with their tool of choice (e.g., Excel, Python). Note: there’s no getting around coding, but if you can write an Excel formula you can write Pig.
We will begin with a quick-and-dirty overview of Hadoop and NoSQL databases, with a specific focus on HBase because its flexible schema presents a challenge to Pig beginners. You will learn how to use some short, very kludgy Python in the form of a user-defined function (UDF) to get around this problem. By the end of this talk, individuals new to code should feel emboldened to start retrieving data from Hadoop.
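To give a flavor of what such a UDF might look like, here is a minimal sketch, not the code from the talk. It assumes a whole HBase column family has been loaded into Pig as a map (for example with HBaseStorage and a 'cf:*' column spec), so each row can carry a different set of columns. The file name hbase_udfs.py, the function name map_to_bag, and the table and column family names in the comments are all hypothetical.

```python
# Hypothetical file: hbase_udfs.py
# Assumed Pig usage (table and column family names are made up):
#   REGISTER 'hbase_udfs.py' USING jython AS hb;
#   raw  = LOAD 'hbase://my_table'
#          USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:*', '-loadKey true')
#          AS (rowkey:chararray, cols:map[]);
#   flat = FOREACH raw GENERATE rowkey, FLATTEN(hb.map_to_bag(cols));

try:
    # Pig's Jython runtime is expected to supply this decorator.
    outputSchema
except NameError:
    # Fallback so the file also runs under plain Python for local testing.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("columns:bag{t:tuple(qualifier:chararray, value:chararray)}")
def map_to_bag(column_map):
    """Turn a map of {column qualifier: value} into a list of (qualifier, value) pairs."""
    if column_map is None:
        # A row with no columns in this family yields an empty bag.
        return []
    # Sort by qualifier so the output order is predictable.
    return [(str(key), str(value)) for key, value in sorted(column_map.items())]
```

The idea is that one row per (row key, qualifier, value) is far easier to dump to a delimited file and open in Excel than a map whose keys change from row to row.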
Gus Cavanaugh is a recovering business major who once sought recognition as a thought leader without realizing how Orwellian it sounded to lead someone’s thoughts. To that end, he honed his ability to make copies, rush around the office, tie a full Windsor knot, and nod his head in vigorous agreement whenever his seniors sought buy-in around a new strategy of leveraging core competencies. While he has barely risen from his nadir, he no longer views the PowerPoint deck as the ultimate expression of business acumen and realizes it can be fun to actually build something rather than just speculate.
6:30 - Food and beverages
7:00 - Intro and announcements
7:15 - Talk
8:30 - Head to Tonic for drinks