
Data Science by R Programming (Intensive Beginner Level, Five Sundays) R003

Hosted By
Vivian Z.

Details

Date: March 16th, 23rd, 30th, April 6th, 13th (five Sundays)

Venue:

http://photos1.meetupstatic.com/photos/event/a/c/6/e/600_341384142.jpeg

http://photos3.meetupstatic.com/photos/event/a/f/2/a/600_341384842.jpeg

Time: 10:00am - 5:00pm

Instructors:

Vivian Zhang (CTO at SupStat Inc, Master's degrees in Computer Science and Statistics)

Cost:

Individual: $220/class, $1100 for all five classes.

Note: we do not sell individual classes. It is a big commitment for us to help you reach the point of doing significant analytic work, and we ask for your commitment to do a good job in the class.

For group (5 or more people) and enterprise pricing, please email vivian.zhang@supstat.com

The class has been extended from the 20 hours of the first offering to 35 hours, and the new price is $1100 for the five classes. To reserve your seat, we prefer that you pay via PayPal to stoneapple@gmail.com (SupStat Inc business account) and pay $1 on meetup.com, since Meetup charges a 15% transaction fee.

Refund Policy:

We offer a full refund if you are not happy with the first class and decide to drop the course.

Course Outline:

(Content may be adjusted depending on how the classes progress.)

Basics: 12 hours
Abstract: This unit covers the basics of working in R. Students learn the characteristics of R, how to find resources, and the fundamentals of programming in it.
Case and Exercise: Solve selected Project Euler problems in R.

  • How to learn R
  • How to get help
  • R language resources and books
  • RStudio
  • Packages
  • Workspace
  • Custom Startup Items
  • Batch Mode
  • Data Objects
  • Custom Functions
  • Control statements
  • Vectorized operations

Getting data: 6 hours

Abstract: This unit explains the various ways R can read data. Participants learn enough web basics to crawl web pages, connect to a database and retrieve data with SQL statements, and read local files such as Excel spreadsheets.
Case studies and exercises: Crawl data from the Douban site and write a custom function.

  • Web data capture
  • API data source
  • Connect to the database
  • Local files
  • Other data sources
  • Data Export

Data manipulation: 6 hours

Abstract: How to use R to manipulate data and perform all kinds of data conversion, with particular attention to string processing.
Case studies and exercises: Find a QQ group (QQ is the most widely used instant messenger tool), then discuss research options using its text features.

  • Data sorting
  • Merge Data
  • Summarize data
  • Reshape data
  • Take a subset of data
  • String manipulation
  • Date operations

Data Visualization: 6 hours

Abstract: Covers two advanced plotting packages, lattice and ggplot2, and the various visualization methods used to explore data.
Case and Exercise: Use graphics to describe the movie, text, and other data from the earlier exercises.

  • Histogram
  • Point
  • Column
  • Line
  • Pie
  • Box Plot
  • Scatter
  • Matrix related
  • Map

(If we finish the class early, we will cover selected topics based on your needs.)

Elementary statistical methods:
Abstract: This unit mainly explains how to use R for statistical analysis and regression analysis. Students learn the basic meaning and role of statistical models.
Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.

  • Descriptive Statistics
  • Statistical Distributions
  • Frequency and contingency tables
  • Correlation
  • t-test
  • Non-parametric statistics
  • Linear Regression
  • Regression Diagnostics
  • Robust Regression
  • Nonlinear regression
  • Principal Component Analysis
  • Logistic Regression
  • Statistical Simulation

Preliminary data mining:

Abstract: This unit explains how to use R's data mining packages and functions. Students learn two kinds of mining methods: supervised learning and unsupervised learning.
Case and Exercise: Use R to take part in a Kaggle data mining competition.

  • General Mining Process
  • Rattle package
  • Hierarchical clustering
  • K-means clustering
  • Decision Trees
  • Backpropagation (BP) neural network
NYC Data Science Academy
SumAll Foundation SumAll.org
247 / 241 Centre Street, 6th Floor (between Broome & Grand) · New York, NY