
Mad-Railers/Big Data Madison - MapReduce Edition!

Hosted By
Thomas C. M.

Details

Matthew Rathbone (https://www.meetup.com/Mad-Railers/members/11183728/) will cover the general format of a MapReduce job, how to build one in Ruby, and how to run it on Amazon EMR. He might even run one live.

MapReduce is the programming model Google introduced for processing very large datasets in parallel across clusters of machines, for example to build the index behind its search results.

Proposed agenda:

  • General structure of a distributed MapReduce framework
  • What is a mapper?
  • What is a partitioner?
  • What is a reducer?
  • How does data flow from one to the other?
  • How do I write this in Ruby (or Python, or Bash, or even Lisp)?
  • LIVE DEMO 1
  • Quick chat about more sophisticated use cases
  • Man, that's slow. What higher-level frameworks are there? (Hive, Pig, Scoobi)
  • LIVE DEMO 2
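
To give a feel for the mapper, partitioner, and reducer questions on the agenda, here is a minimal word-count sketch in Ruby. It is not the code from the talk, and it runs in a single process rather than on a cluster or on EMR. The names map_line, partition, reduce_counts, and run_job are hypothetical, chosen only for illustration, and the byte-sum partitioner is an assumption made to keep results repeatable.

```ruby
# Mapper: emit a [word, 1] pair for every word in one line of text.
def map_line(line)
  line.downcase.scan(/[a-z']+/).map { |word| [word, 1] }
end

# Partitioner: decide which reducer receives a given key.
# Assumption: a simple byte sum, so the same key always lands in the same bucket.
def partition(key, num_reducers)
  key.bytes.sum % num_reducers
end

# Reducer: combine every value seen for one key into a single result.
def reduce_counts(word, counts)
  [word, counts.sum]
end

# Drive the three phases in order: map, shuffle (partition + group), reduce.
def run_job(lines, num_reducers = 2)
  # Map phase: every input line becomes zero or more key/value pairs.
  pairs = lines.flat_map { |line| map_line(line) }

  # Shuffle phase: route each pair to the bucket its key is assigned to.
  buckets = pairs.group_by { |word, _count| partition(word, num_reducers) }

  # Reduce phase: within each bucket, group values by key and reduce them.
  reduced = buckets.values.flat_map do |bucket|
    bucket.group_by(&:first).sort.map do |word, key_values|
      reduce_counts(word, key_values.map(&:last))
    end
  end

  reduced.to_h
end
```

For example, run_job(["the quick fox", "the lazy dog"])["the"] evaluates to 2. In a real framework the mappers and reducers would run on different machines and the shuffle would move data over the network; here the buckets only stand in for that routing step.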

This will be a joint meetup with Mad-Railers (https://www.meetup.com/Mad-Railers/). Enjoy!

Big Data Madison