
Hadoop Based Machine Learning

  • Nov 6, 2013 · 7:00 PM

The course is limited to the first 42 people who sign up each week.

Austin ACM SIGKDD 

Austin ACM SIGKDD is offering a two-semester course on Hadoop Based Machine Learning. Participants who complete the course will receive an official ACM certificate. A separate certificate (Part 1 and Part 2) will be offered for each semester. You do not have to be a member of ACM or SIGKDD to take the course, and there is no cost. The fall course is now closed to new participants; it remains open only to those who attended the first four meetings of the fall semester.

The course will meet every Wednesday evening from 7:00 pm to 8:30 pm at PayPal during the fall and spring semesters. The specific dates are listed below. The location is PayPal, 7700 W. Parmer Lane, Austin, Texas 78717, Building D, Conference Room. Bring a picture ID to get into the building.

The course will cover Hadoop-based machine learning with a three-pronged approach. First, part of the course will be taught from the book “Data-Intensive Text Processing with MapReduce” by Jimmy Lin and Chris Dyer; the Cloud9 MapReduce library that Jimmy Lin wrote for the book will also be reviewed. Second, once a month a session will be devoted to a machine learning technique implemented in Mahout using MapReduce. Third, bi-monthly sessions will review the latest research papers on machine learning techniques using MapReduce.

Prerequisites: The course will cover the mathematics of machine learning, so an understanding of linear algebra, probability, statistics, and optimization will be useful. All coding examples will be in Java.

Required Text: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer. The book is available for free at the URL below.

http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf

Recommended Text: Hadoop: The Definitive Guide by Tom White

Grading: Attendance at 70% of the sessions each semester is required. The end-of-semester exam is a take-home, multiple-choice exam with 20 questions.

Fall Semester

Session,  Date,  Source,  Chapters,  Topic

1,  09/04/2013,  Book,  Ch. 1 & 2,  Map Reduce Basics

2,  09/11/2013,  Book,  Ch. 1 & 2,  Map Reduce Basics

3,  09/25/2013,  Book,  3.1, 3.2,  MR Algorithm Design - Aggregation

4,  10/02/2013,  Mahout,  Mahout math and collections

5,  10/09/2013,  Book,  3.3, 3.4,  MR Algorithm Design – Counting & Sorting

6,  10/16/2013,  Book,  3.5, 3.6,  MR Algorithm Design - Joins

7,  10/23/2013,  Papers,  QR Factorization

8,  10/30/2013,  Mahout,  Classifier Naive Bayes

9,  11/06/2013,  Book,  4.1, 4.2,  Inverted Indexing

10,  11/13/2013,  Book,  4.3, 4.4,  Inverted Indexing

11,  11/20/2013,  Book,  4.5, 4.6, 4.7,  Index Compression

12,  12/04/2013,  Papers,  Singular Value Decomposition

13,  12/11/2013,  Mahout,  Singular Value Decomposition

14,  12/18/2013,  Book,  5.1,  Graphs

15,  01/08/2014,  Book,  5.2,  Graphs – Parallel Breadth-First Search

Spring Semester

Session,  Date,  Source,  Chapters,  Topic

1,  01/15/2014,  Book,  5.3,  Graphs – PageRank

2,  01/22/2014,  Book,  5.4, 5.5,  Graphs - Issues

3,  01/29/2014,  Book,  6.1,  Expectation Maximization

4,  02/05/2014,  Mahout,  Clustering – Spectral Clustering

5,  02/12/2014,  Book,  6.2,  Hidden Markov Models

6,  02/19/2014,  Book,  6.3,  EM in MapReduce

7,  02/26/2014,  Papers,  Decision Trees

8,  03/05/2014,  Mahout,  Decision Trees - Random Forest

9,  03/19/2014,  Book,  6.4,  Case Study

10,  03/26/2014,  Book,  6.5, 6.6,  EM Like Algorithms

11,  04/02/2014,  Book,  Ch. 7,  Closing Remarks

12,  04/09/2014,  Mahout,  Hidden Markov Models

13,  04/16/2014,  Mahout,  Clustering - Canopy Clustering

14,  04/23/2014,  Papers,  Bag of Little Bootstraps

15,  04/30/2014,  Papers,  Stochastic Subgradient Descent



Our Sponsors

  • Visa

    Meeting space + pizza for the "ML with Python" course.

  • HomeAway

    Proud sustaining sponsor of Austin ACM KDD

  • Actian

    ACTIAN ANALYTICS PLATFORM ARCHITECTURE — Open, Fast and Enterprise-Grade

  • Cloudera

    Gold Pledge Sponsor for the Large Scale Machine Learning Workshop

  • AWS

    Platinum Pledge Sponsor for the Large Scale Machine Learning Workshop

  • Association for Computing Machinery

    Parent Organization

  • ACM SIGKDD

    We are the local Austin chapter of ACM SIGKDD
