Data Science by R programming (Intermediate level) R006

Hosted By
Vivian Z.

Details

Instructor: Vivian Zhang, CTO at Supstat Inc

Outline:

Week 1:
Introduction to data mining

  • what data mining is and how to do it
  • steps to apply data mining to your data
  • supervised versus unsupervised learning
  • regression versus classification problems

Review of linear models

  • simple linear regression
  • logistic regression
  • generalized linear models

Week 2:

evaluating model performance

  • confusion matrices
  • beyond accuracy
  • estimating future performance

extensions of linear models

  • subset selection
  • shrinkage methods
  • dimension reduction methods

Week 3:

K-nearest neighbors models

  • understanding the kNN algorithm
  • calculating distance
  • choosing an appropriate k
  • case study

Naive Bayes models

  • understanding joint probability
  • the naive Bayes algorithm
  • the Laplace estimator
  • case study

Week 4:
tree models

  • regression trees
  • classification trees
  • tree model with party
  • tree model with rpart
  • random forest model
  • GBM model

support vector machines

  • maximal margin classifier
  • support vector classifiers
  • support vector machines

Week 5:

market basket analysis

  • understanding association rules
  • the apriori algorithm
  • case study

unsupervised learning

  • K-means clustering
  • Hierarchical clustering
  • case study

time series models

  • fundamental concepts
  • stationary time series
  • ARIMA model
  • seasonal model
NYC Data Science Academy
AlleyNYC
500 7th Ave, 17th floor · New York, NY