
Machine Learning on BigData w. Map Reduce

Course objectives:
Participants will learn to adapt and execute machine learning algorithms in the map reduce framework.  Participants should finish the class able to author their own machine learning algorithms for map reduce and to run them on Amazon Web Services.


Participants will learn to write mappers and reducers in Python for “hadoop-streaming”.  For most of the class we will use “mrjob”, an open-source framework developed at Yelp.  With mrjob, class members program mappers and reducers in Python, and the framework then submits the job to run locally without Hadoop, on Amazon Web Services, or on a private Hadoop cluster.  This simplifies the programming tasks.
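To make the mapper/reducer idea concrete, here is a minimal sketch of a word count written in the hadoop-streaming style, with the shuffle step simulated by a local sort. It uses only the standard library and is not mrjob code; the names mapper, reducer and run_local are illustrative assumptions, and the class materials may structure things differently. It is written so that it should run under Python 2.7 as well as newer versions.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Sum the counts for each word; the input must be sorted by key."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    print(run_local(["map reduce map", "reduce map"]))
```

The sort between the two steps stands in for the shuffle that Hadoop performs between mappers and reducers, which is why the reducer can assume that all pairs for one key arrive together.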

Registration

Registration covers the cost of all 4 sessions.  If you register at least 5 days before the class, the price is $325; if you register in the last 5 days, the price is $375.  You can register by credit card at http://machinelearningbigdata.eventbrite.com, or pay by check or cash at the first class meeting.  You can also use PayPal (mike at mbowles dot com).

 

Webcast

The class will be delivered by webcast, since several people usually want to attend remotely.  To take the class by webcast, you'll need to register at http://machinelearningbigdata.eventbrite.com at least 24 hours before class starts.

Schedule

Here's a schedule to give an idea of what we intend to cover.  We can modify the schedule to match class interests - replace one of the algorithms with another or cover more algorithms at less depth etc.  We'll discuss the topics at the first class meeting.

 

Week/Date      Topic

Week 1         Implementing Algorithms on Big Data - mrjob installation
               MapReduce, Hadoop Streaming, Mahout, Amazon (AWS, EMR)

Week 2         Clustering
               k-means, Canopy Clustering

Week 3         Supervised Learning
               EM algo for mixture model, using canopy for speedup

Week 4         Other ML Tasks
               Regularized Regression - glmnet algo for elasticnet
               SVM - Pegasos algo for two-class and one-class, extensions
               Recommender Engine - Matrix Factorization by Gradient Descent

Other topics   Decision Trees - Google PLANET, Text Mining, Ensemble Methods
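
As an illustration of what adapting an algorithm to map reduce can look like, here is a minimal sketch of a single k-means iteration split into a map step and a reduce step. It is plain Python run in one process, not the Mahout or mrjob implementation and not necessarily the version covered in class; the function names nearest, kmeans_map and kmeans_reduce are hypothetical.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    def sq_dist(i):
        return sum((p - c) ** 2 for p, c in zip(point, centroids[i]))
    return min(range(len(centroids)), key=sq_dist)


def kmeans_map(points, centroids):
    """Map step: emit (cluster index, (point, 1)) for every point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def kmeans_reduce(pairs):
    """Reduce step: average the points assigned to each cluster index."""
    sums, counts = {}, defaultdict(int)
    for key, (point, n) in pairs:
        if key not in sums:
            sums[key] = list(point)
        else:
            sums[key] = [s + p for s, p in zip(sums[key], point)]
        counts[key] += n
    return dict((k, [s / float(counts[k]) for s in sums[k]]) for k in sums)


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    centroids = [(1.0, 1.0), (9.0, 9.0)]
    print(kmeans_reduce(kmeans_map(points, centroids)))
```

One call produces the updated centroids for one iteration; a full run would repeat the map and reduce steps until the centroids stop moving. In this sketch a centroid with no assigned points simply drops out of the result, which a real job would need to handle.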

Prerequisites:
-Facility with undergrad-level math and stats (vector calculus, density functions, etc.)
-Comfort programming in basic Python (version 2.6 or 2.7, NOT version 3)
-You'll also need to develop some familiarity with NumPy (the "random" family of functions, matrix(), array())
-Install mrjob and boto (both are Python packages)
-Familiarity with basic machine learning


Your organizer's refund policy for Machine Learning on BigData w. Map Reduce

Refunds offered if:

  • the Meetup is cancelled
  • you cancel at least 5 days before the Meetup

Payments you make go to the organizer, not to Meetup. You must make refund requests to the organizer.
