Hands-on Programming using Hadoop, MapReduce and Hive

Hi everyone,

When we sent out the member survey a few months ago, the most frequent request was for a hands-on meetup where folks could code and deploy some of the Big Data techniques we have heard about in our past meetups. I am happy to announce that the next two meetups will be focused on this goal.

In the spirit of Wisconsin's favorite sport, we will be using a large dataset of NFL data from Jesse Anderson (http://www.jesse-anderson.com/2013/01/nfl-play-by-play-analysis/). Jesse took NFL data + arrest data + weather data and combined it (using MapReduce) and made it queryable (using Hive). Here is another link about this project:
http://techcrunch.com/2013/08/04/how-data-changes-preconceptions-about-nfl-football-the-weather-and-the-parallel-universe/
We will go over the concepts of MapReduce and Hive, and show examples of putting data in HDFS, running the MapReduce job, running a SQL query through HUE, etc.
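
For anyone who wants a feel for the MapReduce flow before class, here is a minimal local sketch of the map, shuffle, and reduce steps as a plain shell pipeline. It does not use Hadoop and it is not Jesse's code. The sample records, the two-column `team,play_type` layout, and the function names are all made up for illustration, and the real nfldata files are laid out differently.

```shell
#!/bin/sh
# Hypothetical sample records in a made-up "team,play_type" layout.
# This is NOT the column layout of the real nfldata play-by-play files.
sample_plays() {
  printf '%s\n' \
    'GB,PASS' \
    'GB,RUN' \
    'CHI,PASS' \
    'GB,PASS'
}

# Map: emit one "key<TAB>1" pair per input record, keyed by play type.
map_phase() {
  awk -F',' '{ print $2 "\t" 1 }'
}

# Shuffle: sorting groups identical keys together, which is roughly
# what the framework does between the map and reduce steps.
shuffle_phase() {
  LC_ALL=C sort
}

# Reduce: input arrives sorted by key, so add up the counts for each
# run of identical keys and print a total when the key changes.
reduce_phase() {
  awk -F'\t' '
    $1 != key { if (NR > 1) print key "\t" total; key = $1; total = 0 }
              { total += $2 }
    END       { if (NR > 0) print key "\t" total }'
}

sample_plays | map_phase | shuffle_phase | reduce_phase
```

On the sample records above, this prints one line per play type with its count. In class the same idea runs as a real MapReduce job over data stored in HDFS.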

Before class if possible:

Downloading the class materials ahead of time will make sure we can get right to the Hadoop!
• Please download the Cloudera Quickstart VM - available in VMware, KVM, and VirtualBox formats: http://www.cloudera.com/content/support/en/downloads/download-components/download-products.html
• For help and additional information on the quickstart VM: http://www.cloudera.com/content/cloudera-content/cloudera-docs/DemoVMs/Cloudera-QuickStart-VM/cloudera_quickstart_vm.html
Run the VM. Feel free to check out Cloudera Manager and HUE from the splash screen.

Then you'll clone Jesse Anderson's NFL project. Open up a terminal window:
[cloudera@localhost ~]$ cd workspace/
[cloudera@localhost workspace]$ git clone https://github.com/eljefe6a/nfldata

Bring your laptop to class!

I hope everyone can make it. I'm excited!

Thanks,
Pitt Fagan


  • Michael

    Did you get a chance to post the slides? Great event. Looking forward to the next one.

    October 23, 2013

  • Goutham C.

    Thanks Ryan, it was a very technical and informative hands-on programming session. Looking forward to more of these in the coming months. Thanks.

    October 23, 2013

  • Bob C.

    A great hands-on intro that really helps me to move ahead in my own personal learning while avoiding the learning curve of doing the basic tool "stuff".

    October 22, 2013

  • Mona J.

    This meetup was pretty cool and I am happy that I attended it. Madison Public Library is definitely a good place for the meetups. The pizza was delicious as well. We definitely need more hands-on sessions like this in the future. Maybe every two weeks? Thanks a ton, Ryan and everyone who organized it.

    October 22, 2013

  • Hani E.

    Hey Pitt and Mathew,
    thanks for organizing this. I am looking forward to it.
    I have a question about the memory requirements of the virtual machine: 4 GB of RAM. My MacBook Pro has a total of 4 GB of RAM. Even if I shut down everything else while running the VM, the OS still needs some memory to use. So do you think it will work if I allocate, say, 3 to 3.5 GB to the VM and leave 0.5 GB for Mac OS?
    Or do I need a different machine?
    Again thanks for organizing this.
    Hani

    October 17, 2013

    • Ryan B.

      Awesome, Hani - looks good. Thanks for letting us know! Don't forget to clone the project too!

      October 20, 2013

    • Mona J.

      @Ryan: How do you open the .qcow2 file? I opened it in Ubuntu 13.04 using the following command in a terminal: kvm -hda cloudera-quickstart-vm[masked]-kvm.qcow2 -net nic -net user -m 512 (Could the -m 512 memory setting be the reason it's very slow?)

      October 22, 2013

  • Andy R.

    The download is taking a really long time... does anyone know of a mirror or torrent for the quickstart vm?

    October 22, 2013

    • Pitt F.

      I am bringing the VMs on a portable hard drive so they will be available offline. I do not know of another location to download them from.

      October 22, 2013

    • Michael J.

      Mine took about 20 minutes to download.

      October 22, 2013

  • Michael J.

    It's working now.....

    October 22, 2013

  • Shankar M.

    I am unable to download the VM.

    It is taking me to the following page that shows 404 Not Found.
    http://www.cloudera.com/system/...

    October 22, 2013

    • Michael J.

      Just saw your response Matt, thanks!

      October 22, 2013

  • Michael J.

    Ryan, I had the same problem when trying to download the VMware image. It takes you to the Cloudera Connect terms and conditions, and when you click ACCEPT, it takes you to the 404 Not Found page that Shankar indicated below. Thoughts?

    October 22, 2013

  • Hemachandra

    My VM runs well on 2.5 GB of RAM. Shrinking the RAM allocation is also an option.

    October 18, 2013

  • Goutham C.

    Thanks Pitt!

    October 16, 2013

Our Sponsors

  • Cloudera

    Cloudera is the general sponsor of Big Data Madison.
