
Hadoop / Apache Pig and Hive Workshop: SQL-Like Languages for Big Data

Many thanks to Akamai for hosting the event! Doors open at 9:00, and the event starts at 9:30 and runs until about 12:00 or 12:30; the room is available until 1:00. Akamai is also providing coffee and breakfast snacks. Many thanks again! For parking, please note that MIT has a few lots that should be accessible, E51 and Hayward Street among others. Capacity is much higher with theater-style seating than with a classroom table setup, so we have opened this event up theater-style. This may be a slight inconvenience during the programming parts, though there may be some empty seats to allow for elbow room, and some folks may just be following along.


Diana Mears of Cloudera will be giving a short 2-3 hour tutorial on Pig and Hive, which are SQL-like languages for Hadoop and Big Data. Many thanks to Diana for taking the time on a Saturday morning to help us learn about these languages and their capabilities. The workshop will be part lecture, part hands-on tutorial. Diana has forwarded the training VM file, and I have uploaded it to a 4shared account:

http://www.4shared.com/zip/EG-iulyF/Cloudera_training_VM_17vmwarev.html

Please note that this account has a 100GB bandwidth allotment, and the file is 1.8GB. The upload took over 45 minutes (I am not sure where the bottleneck was), so I am looking to set up another file sharing site; Dropbox, for whatever reason, did not like the zip. If you know of any free alternatives, please let me know. Perhaps we can get a few mirrors up for people who are interested in downloading this VM for the event. Also, Diana mentioned in her email that:

Participants will need to have laptops with either VMWare Player (if they're using PCs) or VMWare Fusion (if they're using Macs) installed. VMWare Player is a free download from vmware.com; there is a free 30-day trial version of VMWare Fusion available, again from vmware.com.




Cloudera is hosting a Hadoop Developer class in Boston on June 26th – 29th and would like to offer the workshop participants a 10% discount on the class with the coupon code: BPA_10

http://university.cloudera.com/training/event/apache_hadoop/developer/new_horizons_-_boston/2012-06-26/253.html

Cloudera has given similar talks recently at Strata. Here are some very detailed docs from those sessions.

Introduction To Hadoop:
http://strataconf.com/strata2012/public/schedule/detail/22360

Developing Applications for Hadoop:

http://strataconf.com/strata2012/public/schedule/detail/22665


If you have any Hive or Pig experience, and would like to demo a short example, then please let me know.


  • A former member

    Great introduction to Hadoop/Hive. Would be nice to follow-up with more in-depth sessions in a smaller setting.

    June 21, 2012

  • Paul C.

    The VM that was provided in advance made the 3+ hour presentation a lot more valuable and interactive for all of us. Diana's presentation and delivery were also top notch. Thanks to all who organized this!

    June 19, 2012

  • Edmund W.

    This was a great meetup. It really clarified some things for me that I had been struggling with. Thanks to the organizers and to Cloudera for making it happen! Only one suggestion: because we were trying to squeeze a 5-hour talk into 3 hours, things went very quickly. Having the files pre-written (nyc.txt and emp.txt) would have made following the examples easier.

    June 18, 2012

  • Joseph K.

    The talk was very good. Unfortunately, it seemed like many of the people were more concerned with eating than paying attention. I was also shocked how many people hadn't installed the required software. Also, some of the questions were not thought out at all. One person asked "What does the word replicate mean?", which is a question as easily answered by a dictionary as by a human. These kinds of people are inconsiderate and waste hundreds of other people's time.

    June 18, 2012

  • Liying C.

    Excellent trainer! The introduction to Hadoop was fairly extensive, and the practice was to the point, albeit brief due to time constraints.

    June 17, 2012

  • Loren B

    I enjoyed this meetup; the slides were helpful on a conceptual level. We probably went into as much depth as you could on such a tight schedule, but definitely only scratched the surface with regard to the technology.

    June 17, 2012

  • Larry

    One of the best meetups ever!

    June 17, 2012

  • A former member

    great presentation

    June 17, 2012

  • Chandra P.

    The instructor presentation was very good.

    June 17, 2012

  • Jeff M.

    The course and instructor were very good and I got the very basics about Pig and Hive.
    However, I think it focused too long on MapReduce and not enough on Pig/Hive itself. Also, while the instructor was excellent, I think she tolerated off-topic questions (like what if a block is split between key and value) too much, forcing her to rush at the end. Finally, it would have been good to have data sets (like nyc) pre-loaded so we could more easily follow along.

    June 17, 2012

  • Peter G.

    I hoped for a bit more focus on Hive/Pig vs. Hadoop and MapReduce basics. Since the former are higher level abstractions, avoiding long description of Hadoop internals would have been justified for this meeting.

    June 17, 2012

  • A former member

    Nice summary intro to Hadoop. The Pig/Hive overview was fast, but that was expected given a short-duration meeting. Ask presenters to keep notes higher up on the physical slides when in non-theater-style rooms so that we can see the words at the bottom of the slide. And I'll move up closer to the front next time too. More plugs for laptops to charge would help, but for a short-duration meeting, that was OK at this one.

    June 17, 2012

  • Jim T.

    Great presenter, good topic, good venue. Thanks to Akamai and Cloudera for sponsoring the event

    June 17, 2012

  • Alex H.

    Excellent overview of Hadoop core concepts and Hive/Pig frameworks.

    June 17, 2012

  • Manish s.

    very good. Learnt a lot of things. Thanks to all

    June 17, 2012

  • Luiz F.

    Should have been split into two smaller workshops: one for Hadoop, one for Hive/Pig. Content was great, though. Presenter did a beautiful job.

    June 17, 2012

  • Gary L.

    Excellent. Very well presented overview of Hadoop, Hive, and a little Pig. Long Q&A with clear explanations. A great jump start on using Hadoop. Thanks Diana, Akamai, and John.

    June 16, 2012

  • John V.

    Many thanks to Diana & Cloudera, Rosie & Akamai, and everyone who attended and asked great questions to further help clarify Hive/Pig. Will start work on follow-up events so as to dig in deeper.

    June 16, 2012

  • Vinod K.

    It was a great express tour of Hadoop, Hive, and Pig.

    June 16, 2012

  • Vikas E.

    Koh - If you're unable to update VMPlayer, you can change the Guest OS to Linux > Other Linux 2.6.x Kernel under Virtual Machine Settings > Options.

    June 16, 2012

  • Ngiap K.

    I have VMPlayer 2.5.5 build[masked] installed on Windows XP Service Pack 3. When I tried to open the VM, I received the following message:
    Guest Operating System 'centos' is not supported
    Please select a guest operating system from the General page on the Options tab of the Virtual Machine Setting.

    I do not have VM

    June 16, 2012

  • Noah W.

    Never mind, I guessed it. For those interested, it's 'training', without the single quotes.

    June 16, 2012

  • Noah W.

    What is the root account/pwd for the VM?

    June 16, 2012

  • A former member

    Jeff, run the VM Player (assuming PC) and "Open a Virtual Machine" and then point it to the .vmx file in the \Cloudera_training_VM_1.7.vmwarevm folder and you should be off and running.

    June 15, 2012

  • Jeff M.

    Basic question: I downloaded the VM and I downloaded the zip file with the demo. Now what do I do to make it all work?

    June 15, 2012

  • A former member

    Thank you. The link didn't work for me at first, that's why I ventured. It works now, thank you.

    June 10, 2012

  • John V.

    In the event description there is a link to a VM that Diana of Cloudera has provided for this workshop. No Java is needed.

    June 10, 2012

  • A former member

    Also, I downloaded this VM: https://ccp.cloudera.com/display/SUPPORT/Cloudera's+Hadoop+Demo+VM . Is that good for what we need?

    June 10, 2012

  • A former member

    Are there any prerequisites for this tutorial? Should I have Java Development experience? I know that was something they bring up for the regular courses.

    June 10, 2012
