Hadoop Based Machine Learning


Austin ACM SIGKDD


Austin ACM SIGKDD is offering a two-semester course on Hadoop Based Machine Learning. Participants will receive an official ACM certificate for completing the course. A separate certificate (part 1 and part 2) will be offered for each semester. You do not have to be a member of ACM or SIGKDD to take the course. There is no cost for the course.

The course will meet every Wednesday evening from 7:00 pm to 8:30 pm at Paypal for the fall and spring semesters. The specific dates are below. The location is Paypal, 7700 W. Palmer Lane, Austin, Texas 78717, Building D, Conference Room. Bring a picture ID to get into the building.

The course will cover Hadoop-based machine learning with a three-pronged approach. The first prong will be taught from the book “Data-Intensive Text Processing with MapReduce” by Jimmy Lin and Chris Dyer; the cloud9 map-reduce library written by Jimmy Lin for the book will also be reviewed. For the second prong, one session a month will be devoted to a machine learning technique implemented in Mahout using map-reduce. The third prong will be bi-monthly reviews of the latest research papers on machine learning techniques using map-reduce.

Prerequisites: The course will cover the mathematics of machine learning. An understanding of linear algebra, probability, statistics, and optimization will be useful. All the coding examples will be in Java.

Required Text: Data-Intensive Text Processing with MapReduce by Jimmy Lin and Chris Dyer. The book is available for free at the URL below.

http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf

Recommended Text: Hadoop: The Definitive Guide by Tom White

Grading: Attendance at 70% of the sessions each semester, plus an end-of-semester take-home exam of 20 multiple-choice questions.

Fall Semester

Session, Date, Source, Chapters, Topic

1, 09/04/2013, Book, Ch. 1 & 2, Map Reduce Basics

2, 09/11/2013, Book, Ch. 1 & 2, Map Reduce Basics

3, 09/18/2013, Book, 3.1, 3.2, MR Algorithm Design - Aggregation

4, 09/25/2013, Mahout, Mahout math and collections

5, 10/02/2013, Book, 3.3, 3.4, MR Algorithm Design – Counting & Sorting

6, 10/09/2013, Book, 3.5, 3.6, MR Algorithm Design - Joins

7, 10/16/2013, Papers, QR Factorization

8, 10/23/2013, Mahout, Classifier Naive Bayes

9, 10/30/2013, Book, 4.1, 4.2, Inverted Indexing

10, 11/06/2013, Book, 4.3, 4.4, Inverted Indexing

11, 11/13/2013, Book, 4.5, 4.6, 4.7, Index Compression

12, 11/20/2013, Papers, Singular Value Decomposition

13, 12/04/2013, Mahout, Singular Value Decomposition

14, 12/11/2013, Book, 5.1, Graphs

15, 12/18/2013, Book, 5.2, Graphs – Parallel Breadth-First Search

Spring Semester

Session, Date, Source, Chapters, Topic

1, 01/08/2014, Book, 5.3, Graphs – PageRank

2, 01/15/2014, Book, 5.4, 5.5, Graphs - Issues

3, 01/22/2014, Book, 6.1, Expectation Maximization

4, 01/29/2014, Mahout, Clustering – Spectral Clustering

5, 02/05/2014, Book, 6.2, Hidden Markov Models

6, 02/12/2014, Book, 6.3, EM in MapReduce

7, 02/19/2014, Papers, Decision Trees

8, 02/26/2014, Mahout, Decision Trees - Random Forest

9, 03/05/2014, Book, 6.4, Case Study

10, 03/19/2014, Book, 6.5, 6.6, EM Like Algorithms

11, 03/26/2014, Book, Ch. 7, Closing Remarks

12, 04/02/2014, Mahout, Hidden Markov Models

13, 04/09/2014, Mahout, Clustering - Canopy Clustering

14, 04/16/2014, Papers, Bag of Little Bootstraps

15, 04/23/2014, Wrapup, Certificate Awards

Austin ACM SIGKDD - Austin's Big Data Machine Learning Group