Data Science by R programming (Intermediate level) R006

Hosted By
Vivian Z.

Details
Instructor: Vivian Zhang, CTO at Supstat Inc
Outline:
Week 1:
Introducing data mining
- what data mining is and how to do it
- steps to apply data mining to your data
- supervised versus unsupervised learning
- regression versus classification problems
Review of linear models
- simple linear regression
- logistic regression
- generalized linear models
Week 2:
Evaluating model performance
- confusion matrices
- beyond accuracy
- estimating future performance
Extensions of linear models
- subset selection
- shrinkage methods
- dimension reduction methods
Week 3:
K-nearest neighbors models
- understanding the kNN algorithm
- calculating distance
- choosing an appropriate k
- case study
Naive Bayes models
- understanding joint probability
- the Naive Bayes algorithm
- the Laplace estimator
- case study
Week 4:
Tree models
- regression trees
- classification trees
- tree model with party
- tree model with rpart
- random forest model
- GBM model
Support vector machines
- maximal margin classifier
- support vector classifiers
- support vector machines
Week 5:
Market basket analysis
- understanding association rules
- the apriori algorithm
- case study
Unsupervised learning
- K-means clustering
- hierarchical clustering
- case study
Time series models
- fundamental concepts
- stationary time series
- ARIMA model
- seasonal model

NYC Data Science Academy
AlleyNYC
500 7th Ave, 17th Floor · New York, NY