Intro to Big Data with Hadoop Workshop

Hosted By
Marck V.

Details

To conclude Big Data Week, we are holding a hands-on introductory Hadoop workshop on Saturday, April 27th. You've heard and read about Hadoop everywhere; now come learn what it is and how to use it. By the end of the workshop, you will have gained a solid understanding of the Hadoop ecosystem, set up a Hadoop cluster, and run several different types of queries on the data.

The price per attendee is $150.

To maximize value for attendees, this workshop is limited to the first 20 people who RSVP.

What to Bring:

Your laptop
Printed copy of your ticket

What You Will Learn:

What Hadoop is and how it works
How to run a MapReduce script
Use cases where Hadoop should be used
How to use Pig, Hive, and Mahout

Agenda

Introduction to Hadoop

HDFS & MapReduce
Hadoop History, Adoption, & Maturity
Hadoop Distributions

The Hadoop Analytics Ecosystem

Pig - high-level data-flow language and execution framework for parallel computation.
Hive - data warehouse infrastructure that provides data summarization and ad hoc querying.
Mahout - machine learning and data mining library.

Setting Up and Running Hadoop

Setting up clusters
Running Hadoop on Amazon EC2
Hadoop Streaming - R, Python, Shell

Instructor

Marck Vaisman, Owner & Principal Data Scientist, DataXtract LLC

Marck is a co-founder of Data Community DC, runs the Statistical Programming DC Meetup group, and is the owner of the data science consulting company DataXtract. He has an MBA from Vanderbilt and an MS in Mechanical Engineering from Boston University.

Data Community DC (DC2)
Metro Offices
1250 Connecticut Ave., NW, Suite 200 · Washington, DC