Big Data by Hadoop (Intermediate level, 15 weeks, online)--H002

Details

Instructor: Vivian Zhang

Date: April 1, 2014 to July 15, 2014

This 15-week class accepts new enrollment on the 1st of every month; for example, you may enroll on April 1st, May 1st, and so on.

Venue: Online Course through NYC Data Science Academy (http://nycdatascience.com/course/hadoop-application-development-with-real-cases/)

Cost: $2500

For group (5 or more persons) and enterprise pricing, please email vivian.zhang@supstat.com

To reserve your seat, we prefer that you pay by PayPal to stoneapple@gmail.com (SupStat Inc business account) and pay only $1 on meetup.com, since Meetup charges a 15% transaction fee.

Refund Policy:

Students may receive a full tuition refund at any time before the start of the course and up to the completion of the first class day. After that, students may receive a refund of 50% of paid tuition up to the completion of the second class. No refund will be given after the completion of the second class.

Online Class Information:

NYC Data Science Academy offers this course online only. Online instructors will provide coursework, class videos, PowerPoint lectures, and homework assignments every week. Use our online platform to study with other students in the class, or reach out to the instructor for extra help. Just make sure you finish your homework on time!

Course Overview:

NYC Data Science Academy is pleased to offer our intermediate Hadoop class: a 15-week online course suited to students who have completed the Hadoop Data Analytic Platform class or who already have a professional understanding of Hadoop.

This course analyzes real-world examples of Hadoop usage in depth to advance our students' understanding of and competency with Hadoop.

Potential students, please thoroughly read the course syllabus to ensure you are at an appropriate level.

What is Hadoop:
Hadoop is an open-source framework for storing and processing large data sets in parallel across clusters of machines. Built on the MapReduce programming model, which originated at Google, and the Hadoop Distributed File System (HDFS), Hadoop offers scalability, flexibility, and fault tolerance. It is designed to handle massive quantities of structured, semi-structured, or unstructured data, which makes it well suited to Big Data.
Within the Apache ecosystem, a range of complementary projects such as Hive, Pig, and ZooKeeper further extend Hadoop's applications and usability.
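For readers new to the MapReduce model mentioned above, the following is a minimal sketch in plain Python, not Hadoop code. It only imitates the three stages (map, shuffle, reduce) in a single process on a word-count task. The function names map_phase, shuffle, reduce_phase, and run_job are illustrative assumptions and are not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key (a real cluster does this across machines)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    lines = ["big data on hadoop", "hadoop stores big files"]
    print(run_job(lines))
```

On a real cluster the map and reduce functions run in parallel on many machines, with the input read from HDFS; this sketch keeps everything in memory so the data flow is easy to follow.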

Project Demo Day and Certificates:
The course runs from building your first cluster to enterprise-level application of Hadoop clusters, and it ends with Project Demo Day. On Demo Day you will showcase a project of your choosing, using the tools and skills taught throughout the course. We encourage you to be creative!
After successfully completing the course, you will qualify for one of three certificates: Extraordinary Standing, Honorable Graduation, or Active Participation.
Certificates are awarded according to your understanding, skill, and participation.

Course Outline:

(Content may be adjusted based on the needs of the class.)

Week 1: Review of Hadoop basics
Week 2: Summary of Hadoop applications
Week 3: Analysis of a high-volume website log system; retrieving KPI data (MapReduce)
Week 4: LBS application for a telecommunication company; analysis of users' mobile phone traces (MapReduce)
Week 5: User analysis for a telecommunication company; labeling duplicate users by call fingerprint (MapReduce)
Week 6: Recommendation system for an e-commerce company (MapReduce)
Week 7: Complex recommendation system application (Mahout)
Week 8: Social networks; distance between users; community detection (Pig)
Week 9: Importance of nodes in a social network (MapReduce)
Week 10: Application of clustering algorithms; analysis of VIPs (MapReduce, Mahout)
Week 11: Financial data analysis; retrieving reverse repurchase information from historical data (Hive)
Week 12: Setting stock strategies with data analysis (MapReduce, Hive)
Week 13: GPS application; sign-in data analysis (Pig)
Week 14: Implementation and optimization of sorting on MapReduce
Week 15: Middleware development; cooperation of multiple Hadoop clusters

Homework Requirement:
To encourage students to submit homework on time, NYC Data Science Academy offers a credit worth 10% of the cost of this course, to be applied toward your next NYC Data Science Academy course.
Students are eligible for the course credit provided they complete all required coursework and pass the final exam.
There are two types of homework: written submissions and forum activities.

• Written Submission

Each homework submission receives one of three grades: Pass, Yellow Warning, or Red Warning.
Note: Late submissions will not be accepted!
○ Pass: good-quality work, reflecting a dedicated effort.
○ Yellow Warning: poor quality; sloppy work, rife with errors. (Two Yellow Warnings are equivalent to one Red Warning.)
○ Red Warning: terrible quality, reflecting no effort. Receiving a Red Warning disqualifies you from the 10% course credit.

• Forum Activities

Students are required to post several times on the class forum before given deadlines. Missing four posts will disqualify you from the Extraordinary Standing certificate. Continuing to miss posts will also disqualify you from the next-highest certificate level.

NYC Data Science Academy
NYC Data Science Academy Online Courses
www.nycdatascience.com · New York, NY