Data Science by R Programming (Intensive Intermediate Level, Four Sundays) R004


Details
Length of Time: 35 hours
Date: April 27th, May 4th, 11th, 18th (four Sundays)
Time: 10:00am-5:30pm
There are two breaks: 12:30pm-1:00pm and 2:30pm-3:00pm.
Extra teaching: one 1-hour video per week for 5 weeks
Instructor:
Vivian Zhang, CTO@Supstat Inc, Founder of NYC Data Science Academy http://www.nycdatascience.com
Venue:
http://photos3.meetupstatic.com/photos/event/6/d/b/2/600_353668082.jpeg
http://photos1.meetupstatic.com/photos/event/6/e/2/a/600_353668202.jpeg
Course Overview:
NYC Data Science Academy is now offering an R Intensive Intermediate course: a five-week course designed for students who have taken NYC Data Science Academy’s R Beginner course or who already have a solid working knowledge of R.
Project Demo Day and Certificates:
The course covers topics from data mining to time series models and ends with Project Demo Day, where you will showcase a project of your choosing built with the tools and skills taught throughout the course. We encourage you to be creative! Students have chosen projects ranging from digital marketing simulation to finding relationships between people using natural language processing. The instructors will help you implement your project.
After successfully completing the course, you will qualify for one of three certificates: Extraordinary Standing, Honorable Graduation, or Active Participation.
Certificates are awarded according to your understanding, skill, and participation.
Cost: $2100 for five classes
For group (5 or more people) and enterprise pricing, please email vivian.zhang@supstat.com
To reserve your seat, we prefer that you pay via PayPal to stoneapple@gmail.com (SupStat Inc business account) and pay only $1 on meetup.com, since Meetup charges a 15% transaction fee.
Refund Policy:
We offer a full refund if you are not happy with the first class and decide to drop the course.
Course Outline:
(Content may be adjusted depending on how the classes progress.)
Part 1: Introducing Data Mining (6 hours)
- what data mining is and how to do it
- steps to apply data mining to your data
- supervised versus unsupervised learning
- regression versus classification problems
review of linear models
- simple linear regression
- logistic regression
- generalized linear models
Part 2: Performance Measures and Dimension Reduction (6 hours)
evaluating model performance
- confusion matrices
- beyond accuracy
- estimating future performance
extensions of linear models
- subset selection
- shrinkage methods
- dimension reduction methods
Part 3: kNN and Naive Bayes Models (6 hours)
k-nearest neighbors models
- understanding the kNN algorithm
- calculating distance
- choosing an appropriate k
- case study
Naive Bayes models
- understanding joint probability
- the Naive Bayes algorithm
- the Laplace estimator
- case study
Part 4: Trees and SVM (6 hours)
tree models
- regression trees
- classification trees
- tree models with the party package
- tree models with the rpart package
- random forest model
- GBM model
support vector machines
- maximal margin classifier
- support vector classifiers
- support vector machines
Part 5: Association Rules and More Models (6 hours)
market basket analysis
- understanding association rules
- the apriori algorithm
- case study
unsupervised learning
- k-means clustering
- hierarchical clustering
- case study
time series models
- fundamental concepts
- stationary time series
- ARIMA model
- seasonal models
