Hands On: Introduction to the Hadoop Ecosystem

Hosted By
Bob Wakefield
Details

I've already spoken at length about what Hadoop is and what it can do. You can find that talk here: https://www.youtube.com/watch?v=fbf7jFWGzYs

We'll spend a little bit of time talking about exactly what Hadoop is and what you can do with it, and then we're going to crack open our laptops and go to work.

We'll be coming at it from the perspective of an analyst and not an administrator or data engineer. I'll show you how to interact with the Hadoop distributed file system both through the command line and through a GUI.
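To give a feel for the command-line side before class, here is a minimal sketch of the kind of `hdfs dfs` commands an analyst typically runs. It only builds and prints the commands, so it runs without Hadoop installed. The helper `hdfs_cmd`, the file `sales.csv`, and the directory `/user/analyst/demo` are made up for illustration and are not part of the class material.

```python
import shlex


def hdfs_cmd(action, *args):
    """Build the argument list for one `hdfs dfs` file system command."""
    return ["hdfs", "dfs", f"-{action}", *args]


# Hypothetical paths, chosen only for illustration.
local_file = "sales.csv"
hdfs_dir = "/user/analyst/demo"

commands = [
    hdfs_cmd("mkdir", "-p", hdfs_dir),            # create a directory in HDFS
    hdfs_cmd("put", local_file, hdfs_dir),        # copy a local file into HDFS
    hdfs_cmd("ls", hdfs_dir),                     # list the directory contents
    hdfs_cmd("cat", f"{hdfs_dir}/{local_file}"),  # print the file's contents
]

# Print each command as it would be typed at a shell prompt.
for cmd in commands:
    print(shlex.join(cmd))
```

On a machine where the `hdfs` client is on the path, such as an SSH session into the sandbox, each list could be passed to `subprocess.run(cmd, check=True)` to actually execute it.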

Once you have the hang of that, we'll spend some time playing with some of the analyst tools in the ecosystem.

Hadoop is a HUGE topic, so we'll be hitting things at high speed from 10,000 ft. Doors open at 6:30 and class will start at 6:45. We'll go till 8 or till everybody gets bored and wants to go home, whichever comes first.

You'll need the following software installed for this class:

VirtualBox (I use VirtualBox. You're welcome to use any VM software, but I may not be able to troubleshoot it if you run into problems.)

https://www.virtualbox.org/wiki/Downloads

WinSCP (You can use any SFTP client, but the same warning as above applies.)

https://winscp.net/eng/docs/guide_install

The Hortonworks Sandbox image that matches your VM software

https://hortonworks.com/downloads/#sandbox

Any SSH client, but I recommend PuTTY:

http://www.putty.org/

A Microsoft Azure cloud account

https://azure.microsoft.com/en-us/free/?v=17.15

KC Data Professionals
Kansas City Public Library Plaza Branch
4801 Main Street · Kansas City, MO