Invented by Google, and made popular by Hadoop, MapReduce is a practical programming model for gobbling up huge volumes of unstructured data distributed across clusters of commodity hardware. Oh, and you can do it all in Java!
Trivial? No, but Sandy Ryza from Cloudera offered to map this model to things we can relate to, and reduce the complexity to concepts we can understand :-)
Learn the basics of MapReduce, a programming abstraction that allows for parallel processing of massive datasets without having to handle distribution and fault tolerance yourself. We'll talk at a high level about how it works and delve into what kinds of applications it's suited for, basic examples, and best practices. We'll also cover some of the details of writing MapReduce programs on Hadoop, the open source implementation, which is written in Java and programmed in Java.
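To give a taste of the model before the talk, here is a minimal sketch of the classic word count in plain Java. It is not Hadoop code and not taken from the talk: the class name `WordCountSketch` and its `map`, `shuffle`, `reduce`, and `wordCount` methods are made up for illustration, and everything runs in one process, so none of the distribution or fault tolerance a real cluster provides is shown. It assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative, single-process sketch of the MapReduce model (not Hadoop's API).
public class WordCountSketch {

    // Map phase: turn one line of input into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group every emitted value under its key.
    // A real framework does this across machines; here it is a sorted in-memory map.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single result.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the quick brown fox", "the lazy dog", "The fox");
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(wordCount(lines));
    }
}
```

The point of the split is that `map` looks at one line at a time and `reduce` looks at one key at a time, so a framework is free to run many of each in parallel on different machines.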
As always, the venue, food, drinks, and giveaways will be provided by our sponsors.
About Sandy Ryza
Sandy Ryza graduated with honors from Brown University in 2012, completing his undergraduate thesis on combinatorial optimization on Hadoop. He recently joined the Resource Management and Scheduling team at Cloudera.