Advanced Hadoop Based Machine Learning

The course is limited to the first 42 people who sign up each week.
Austin ACM SIGKDD Advanced Hadoop Based Machine Learning
Austin ACM SIGKDD is offering a two-semester course on Hadoop Based Machine Learning. Participants who complete the course will receive an official ACM certificate. A separate certificate is offered for each semester: Hadoop Based Machine Learning for the fall and Advanced Hadoop Based Machine Learning for the spring. You do not have to be a member of ACM or SIGKDD to take the course, and there is no cost. The fall course is now closed to new participants; it is open only to those who attended the first four meetings of the fall semester.
The course will meet every Wednesday evening from 7:00 pm to 8:30 pm at PayPal for both the fall and spring semesters. The specific dates are listed below. The location is PayPal, 7700 W. Parmer Lane, Austin, Texas 78717, Building D, Conference Room. Bring a picture ID to get into the building.
The course will cover Hadoop based machine learning with a three-pronged approach. The first prong will be taught from the book “Data-Intensive Text Processing with MapReduce” by Jimmy Lin and Chris Dyer; the Cloud9 MapReduce library that Jimmy Lin wrote for the book will also be reviewed. For the second prong, one session a month will be devoted to a machine learning technique implemented in Mahout using MapReduce. The last prong will be bi-monthly reviews of the latest research papers on machine learning techniques using MapReduce.
Prerequisites: The course will cover the mathematics of machine learning, so an understanding of linear algebra, probability, statistics, and optimization will be useful. All the coding examples will be in Java.
Required Text: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer. The book is available for free at the URL below.
http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf
Recommended Text: Hadoop: The Definitive Guide by Tom White
Grading: Attendance at 70% of the sessions each semester, plus an end-of-semester take-home exam of 20 multiple-choice questions.
Fall Semester
Session, Date, Source, Chapters, Topic
1, 09/04/2013, Book, Ch. 1 & 2, MapReduce Basics
2, 09/11/2013, Book, Ch. 1 & 2, MapReduce Basics
3, 09/25/2013, Book, 3.1 & 3.2, MR Algorithm Design - Aggregation
4, 10/02/2013, Mahout, -, Mahout math and collections
5, 10/09/2013, Book, 3.3 & 3.4, MR Algorithm Design - Counting & Sorting
6, 10/16/2013, Book, 3.5 & 3.6, MR Algorithm Design - Joins
7, 10/23/2013, Papers, -, QR Factorization
8, 10/30/2013, Mahout, -, Classifier - Naive Bayes
9, 11/06/2013, Book, 4.1-4.7, Inverted Indexing
10, 11/13/2013, Slides, -, Singular Value Decomposition
11, 11/20/2013, Video, -, Singular Value Decomposition
12, 12/04/2013, Papers, -, Singular Value Decomposition
13, 12/11/2013, Mahout, -, Singular Value Decomposition
14, 12/18/2013, Slides, -, Latent Semantic Indexing
Spring Semester
Session, Date, Source, Chapters, Topic
1, 01/15/2014, Book, 5.1, Graphs
2, 01/22/2014, Book, 5.2, Graphs - Parallel Breadth-First Search
3, 01/29/2014, Book, 5.3, Graphs - PageRank
4, 02/05/2014, Book, 5.4, Graphs - Issues
5, 02/12/2014, Book, -, Graphs
6, 02/19/2014, Book, 6.1, Expectation Maximization
7, 03/12/2014, Mahout, -, Clustering - Spectral Clustering
8, 03/19/2014, Book, 6.2, Hidden Markov Models
9, 03/26/2014, Book, 6.3, EM in MapReduce
10, 04/02/2014, Papers, -, Decision Trees
11, 04/09/2014, Mahout, -, Decision Trees - Random Forest
12, 04/16/2014, Book, 6.4, Case Study
13, 04/23/2014, Book, 6.5 & 6.6, EM-Like Algorithms
14, 04/30/2014, Book, Ch. 7, Closing Remarks
15, 05/07/2014, Mahout, -, Hidden Markov Models
16, 05/14/2014, Mahout, -, Clustering - Canopy Clustering
17, 05/21/2014, Papers, -, Bag of Little Bootstraps
18, 05/28/2014, Papers, -, Stochastic Subgradient Optimization
