Big Data by Hadoop (beginner level, 17 weeks, online)--H001


Details
Instructors: SupStat Data Scientist Team
Date: April 1, 2014 to July 29, 2014
This 17-week class accepts new enrollment in the first week of every month. You may enroll on April 1, May 1, and so on.
Venue: Online course through NYC Data Science Academy (http://nycdatascience.com/course/hadoop-data-analytic-platform/)
Cost: $2900
For group (5 or more people) and enterprise pricing, please email vivian.zhang@supstat.com
To reserve your seat, we prefer that you pay by PayPal to stoneapple@gmail.com (the SupStat Inc business account) and pay only $1 on meetup.com, since Meetup charges a 15% transaction fee.
Refund Policy:
Students may receive a full tuition refund at any time before the course starts and up to the completion of the first class day. After that, and up to the completion of the second class, students will receive 50% of paid tuition. No refund will be given after the completion of the second class.
Online Class Information:
NYC Data Science Academy offers this course online only. Online instructors will provide coursework, class videos, PowerPoint lectures, and homework assignments every week.
Use our online platform to study with other students in the class, or reach out to the instructor for extra help. Just make sure you finish your homework on time!
Course Overview:
NYC Data Science Academy is pleased to offer our introductory Hadoop class: a 17-week online course that will bring your basic understanding of Hadoop to a professional level. The course, Hadoop Data Analytic Platform, aims to provide a firm foundation for the start of a professional career using Hadoop.
Potential students should be familiar with Java and basic Linux commands.
What is Hadoop:
Hadoop is an open source framework for storing and processing large data sets in parallel across clusters of computers. Built on the MapReduce programming model, which originated at Google, and the Hadoop Distributed File System (HDFS), Hadoop offers scalability, flexibility and fault tolerance. It is designed to handle massive quantities of structured, semi-structured, or unstructured data, which makes it well suited to Big Data.
As part of the Apache ecosystem, Hadoop has many complementary Apache projects, such as Hive, Pig and ZooKeeper, that further extend its applications and usability.
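To make the MapReduce model mentioned above more concrete, here is a minimal sketch of a word count, the classic introductory example. It is plain Python that runs on a single machine and only imitates the map, shuffle, and reduce steps; it does not use Hadoop itself, and the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative choices, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data with hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

On a real cluster, the map and reduce steps would run in parallel on many machines, with the input lines read from HDFS rather than from a Python list.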
Project Demo Day and Certificates:
The course takes you from building your first cluster to enterprise-level applications of Hadoop clusters, and ends with Project Demo Day. On Demo Day you will showcase a project of your choosing, using the tools and skills taught throughout the course. We encourage you to be creative!
After successfully completing the course, you will qualify for one of three certificates: Extraordinary Standing, Honorable Graduation, or Active Participation.
Certificates are awarded according to your understanding, skill, and participation.
Course Outline:
(Content may be adjusted based on how the class progresses)
Week 1: Introduction to the origin and system of Hadoop
Week 2: Build a Hadoop cluster
Week 3: Principles and operation of the Hadoop Distributed File System (HDFS)
Week 4: HDFS API programming
Week 5: Principles, architecture, and working mechanism of MapReduce; Hadoop data flow
Week 6: MapReduce programming practice; connecting Eclipse to a Hadoop cluster
Week 7: Advanced Hadoop application
Week 8: Installation and application of Pig
Week 9: Architecture and installation of Hive; Application of HiveQL
Week 10: Data Mining with Mahout
Week 11: Architecture of HBase and ZooKeeper
Week 12: Installation and management of HBase
Week 13: Data model of HBase; Analysis of application
Week 14: Interacting with applications; REST and Thrift interfaces; Application of UDFs
Week 15: Data integration with Sqoop; Flume; Chukwa; Business databases and their connection with a Hadoop cluster; RHadoop
Week 16: A deeper look at the Hadoop source code
Week 17: Enterprise-level application of Hadoop cluster; Business application cases of Hadoop.
Homework Requirement:
To encourage our students to submit homework on time, NYC Data Science Academy offers a 10% rebate on the cost of this course, applied as a credit toward your next NYC Data Science Academy course.
Students are eligible for the course credit provided they complete all required coursework and pass the final exam.
There are two types of homework: written submissions and forum activities.
• Written Submission
Each homework submission receives one of three types of feedback: Pass, Yellow Warning, or Red Warning.
Note: Late submissions will not be accepted!
○ Pass: good quality work, reflective of a dedicated effort.
○ Yellow Warning: poor quality; sloppy work, rife with errors. (Two Yellow Warnings are equivalent to one Red Warning.)
○ Red Warning: terrible quality, reflective of no effort. Receiving a Red Warning disqualifies you from the 10% course credit.
• Forum Activities
Students are required to post several times on the class forum before the given deadlines. Missing four posts will disqualify you from the Extraordinary Standing certificate. Continuing to miss posts will also disqualify you from the next-highest certificate level.
