Date: March 16th, 23rd, 30th, April 6th, 13th (five Sundays)
Time: 10:00am - 5:00pm
Instructor: Vivian Zhang (CTO @ SupStat Inc, Master's degrees in Computer Science and Statistics)
Individual: $220/class, $1100 for all five classes.
Note: we don't sell individual classes. It is a big commitment for us to help you become able to do significant analytic work, and it also takes your commitment to do a good job in the class.
For group (5 or more people) and enterprise pricing, please email [masked]
The class has been extended from the 20 hours of the first offering to 35 hours; the new charge is $1100 for five classes. We prefer that you pay via PayPal to [masked] (SupStat Inc business account) to reserve your seat and pay $1 on meetup.com, since Meetup charges a 15% transaction fee.
We offer a full refund if you are not happy with the first class and decide to drop it.
(Content may be adjusted depending on how the classes actually progress.)
Basics: 12 hours
Abstract: Covers the fundamentals of working with R. In this unit, students learn the characteristics of R, where to find resources, and the basics of programming in it.
Case and Exercise: Use R to solve selected Project Euler problems.
* How to learn R
* How to get help
* R language resources and books
* Extension packages
* Custom startup settings
* Batch Mode
* Data Objects
* Custom Functions
* Control statements
* Vectorized operations
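As a taste of this unit, here is a minimal sketch that combines a custom function, a control statement, and a vectorized operation. The listing does not say which Project Euler problems are used in class; Problem 1 (sum the multiples of 3 or 5 below a limit) is picked here purely for illustration, and the function names are made up.

```r
# Vectorized version: test every candidate at once, then sum the matches.
sum_multiples <- function(limit) {
  n <- seq_len(limit - 1)                    # candidates 1 .. limit - 1
  is_multiple <- n %% 3 == 0 | n %% 5 == 0   # element-wise logical vector
  sum(n[is_multiple])
}

# Loop version: the same logic written with explicit control statements.
sum_multiples_loop <- function(limit) {
  total <- 0
  for (n in seq_len(limit - 1)) {
    if (n %% 3 == 0 || n %% 5 == 0) {
      total <- total + n
    }
  }
  total
}

result <- sum_multiples(1000)
print(result)
```

Both versions return the same value; the vectorized one is usually shorter and faster in R because the work happens inside built-in operators rather than in an interpreted loop.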
Getting data: 6 hours
Abstract: Covers the various ways R can read data. Participants learn enough web basics to scrape web pages, connect to a database and retrieve data with SQL statements, and read local files such as Excel spreadsheets.
Case studies and exercises: Scrape data from the Douban site and write a custom function.
* Web scraping
* API data source
* Connect to the database
* Local files
* Other data sources
* Data Export
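For the local-file and export topics, a minimal sketch in base R might look like the following. The data frame and its column names are invented for illustration, and a temporary file stands in for a real path. Web scraping, databases, and Excel files need extra packages and are not shown.

```r
# A small made-up data set to export.
scores <- data.frame(
  name  = c("Ann", "Bob", "Cho"),
  score = c(91, 78, 85),
  stringsAsFactors = FALSE
)

# Data export: write the data frame to a CSV file (a temp file here).
path <- tempfile(fileext = ".csv")
write.csv(scores, path, row.names = FALSE)

# Local files: read the same file back into a new data frame.
loaded <- read.csv(path, stringsAsFactors = FALSE)
print(loaded)
```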
Data manipulation: 6 hours
Abstract: How to use R for all kinds of data transformation, with particular attention to string processing.
Case studies and exercises: Take data from a QQ group (QQ is the most used instant messaging tool), then discuss ways to analyze it using text features.
* Data sorting
* Merging data
* Summarizing data
* Reshaping data
* Subsetting data
* String manipulation
* Date operations
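Several of the topics above can be sketched with base R alone. The two data frames below are made up for illustration; the class may well use other data and other packages.

```r
orders <- data.frame(
  customer = c("ann", "bob", "ann", "cho"),
  amount   = c(20, 35, 15, 50),
  date     = as.Date(c("2014-03-16", "2014-03-23", "2014-03-30", "2014-04-06")),
  stringsAsFactors = FALSE
)
customers <- data.frame(
  customer = c("ann", "bob", "cho"),
  city     = c("New York", "Boston", "Chicago"),
  stringsAsFactors = FALSE
)

# Sorting: largest amount first.
sorted <- orders[order(-orders$amount), ]

# Merging: attach each customer's city to their orders.
merged <- merge(orders, customers, by = "customer")

# Summarizing: total amount per customer.
totals <- aggregate(amount ~ customer, data = orders, FUN = sum)

# Subsetting: keep only orders above 18.
big <- subset(orders, amount > 18)

# String manipulation: capitalize the first letter of each name.
labels <- paste0(toupper(substring(orders$customer, 1, 1)),
                 substring(orders$customer, 2))

# Date operations: extract the month number as text.
month <- format(orders$date, "%m")
```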
Data visualization: 6 hours
Abstract: Covers two advanced plotting packages, lattice and ggplot2, and the various visualization methods used to explore data.
Case and Exercise: Use graphics to describe the previously collected movie, text, and other data.
* Box Plot
* Matrix related
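A box plot in lattice can be sketched as follows. The built-in iris data set is an assumption made for illustration, not the data used in class. The lattice package ships with standard R installations; ggplot2 has to be installed separately and is not shown.

```r
library(lattice)

# Build a box-and-whisker plot of sepal length for each species.
box <- bwplot(Sepal.Length ~ Species, data = iris,
              xlab = "Species", ylab = "Sepal length (cm)")

# Calling print(box) draws the plot on the active graphics device.
```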
(If we finish the class early, we will cover selected topics based on your needs.)
Elementary statistical methods:
Abstract: An introduction to using R for statistical analysis and regression analysis. Students learn what the basic statistical models mean and what they are used for.
Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.
* Descriptive Statistics
* Statistical Distributions
* Frequency and contingency tables
* t-test
* Non-parametric statistics
* Linear Regression
* Regression Diagnostics
* Robust Regression
* Nonlinear regression
* Principal Component Analysis
* Logistic Regression
* Statistical Simulation
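Two of these topics, linear regression and statistical simulation, can be sketched in a few lines of base R. Everything here is an assumption for illustration: the prices are simulated rather than real commodity data, and roulette is chosen only because the listing does not name the casino game.

```r
set.seed(42)  # fixed seed so the run is reproducible

# Linear regression on simulated data: price = 2 + 3 * size + noise.
size  <- 1:50
price <- 2 + 3 * size + rnorm(50, sd = 1)
fit   <- lm(price ~ size)
slope <- unname(coef(fit)["size"])

# Predict the price for a size that was not in the data.
predicted <- unname(predict(fit, newdata = data.frame(size = 60)))

# Statistical simulation: bet on red in American roulette,
# where 18 of the 38 pockets are red.
spins <- sample(c("red", "other"), size = 10000, replace = TRUE,
                prob = c(18, 20) / 38)
win_rate <- mean(spins == "red")
```

Calling summary(fit) prints the coefficient table and R-squared, which is a natural starting point for the regression diagnostics topic.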
Preliminary data mining:
Abstract: Covers R's data mining packages and how to use their functions. Students learn two families of mining methods: supervised learning and unsupervised learning.
Case and Exercise: Use R to take part in a Kaggle data mining competition.
* General Mining Process
* Rattle package
* Hierarchical clustering
* K-means clustering
* Decision Trees
* Backpropagation (BP) neural network
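The two clustering methods in this unit can be sketched with functions from base R's stats package. The built-in iris data set and the choice of three clusters are assumptions made for illustration; the class material may differ.

```r
set.seed(1)  # k-means starts from random centers, so fix the seed

# Standardize the four numeric measurements before clustering.
measurements <- scale(iris[, 1:4])

# K-means clustering with 3 clusters and 20 random restarts.
km <- kmeans(measurements, centers = 3, nstart = 20)

# Hierarchical clustering on Euclidean distances, cut into 3 groups.
tree      <- hclust(dist(measurements), method = "complete")
hc_groups <- cutree(tree, k = 3)

# Cross-tabulate the k-means clusters against the known species.
comparison <- table(kmeans = km$cluster, species = iris$Species)
print(comparison)
```

Both methods are unsupervised: the species labels are used only afterwards, to check how well the clusters line up with them.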