Data Science by R Programming (Intensive Beginner Level, Five Sundays) R003


Details
Date: March 16th, 23rd, 30th, April 6th, 13th (five Sundays)
Venue:
http://photos1.meetupstatic.com/photos/event/a/c/6/e/600_341384142.jpeg
http://photos3.meetupstatic.com/photos/event/a/f/2/a/600_341384842.jpeg
Time: 10:00am - 5:00pm
Instructors:
Vivian Zhang (CTO @ SupStat Inc, Master's degrees in Computer Science and Statistics)
Cost:
Individual: $220/class, $1100 for all five classes.
Note: we don't sell individual classes. It is a big commitment for us to help you become able to do significant analytic work, and it takes your commitment to do a good job in the class.
For group (5 or more people) and enterprise pricing, please email vivian.zhang@supstat.com
The class has been extended from the 20 hours of the first offering to 35 hours, and the new charge is $1100 for five classes. To reserve your seat, we prefer that you pay via PayPal to stoneapple@gmail.com (SupStat Inc business account) and pay $1 on meetup.com, since Meetup charges a 15% transaction fee.
Refund Policy:
We offer a full refund if you are not happy with the first class and decide to drop the course.
Course Outline:
(Content may be adjusted based on how the classes actually progress.)
Basics: 12 hours
Abstract: This unit covers basic R operations. Students learn the characteristics of R, where to find resources, and the fundamentals of programming in R.
Case and Exercise: Use R to solve selected Project Euler problems.
- How to learn R
- How to get help
- R language resources and books
- RStudio
- Packages (extensions)
- Workspace
- Custom Startup Items
- Batch Mode
- Data Objects
- Custom Functions
- Control statements
- Vectorized operations
Getting data: 6 hours
Abstract: This unit covers the various ways R can read data. Participants learn the basic web knowledge needed for web crawling, connect to a database and retrieve data with SQL statements, and read local files such as Excel files.
Case studies and exercises: Crawl data from the Douban website and write a custom function.
- Web data capture
- API data source
- Connect to the database
- Local files
- Other data sources
- Data Export
Data manipulation: 6 hours
Abstract: This unit covers how to use R for all kinds of data transformation, especially string processing.
Case studies and exercises: Find a QQ group (QQ is the most widely used instant messenger tool), then discuss research options based on its text features.
- Sorting data
- Merging data
- Summarizing data
- Reshaping data
- Taking a subset of data
- String manipulation
- Date operations
Data Visualization: 6 hours
Abstract: This unit covers two advanced plotting packages, lattice and ggplot2, and the various visualization methods used to explore data.
Case and Exercise: Use graphics to describe the movie, text, and other data from the earlier units.
- Histogram
- Point
- Column
- Line
- Pie
- Box Plot
- Scatter
- Matrix related
- Map
(If we finish the class early, we will cover selected topics based on your needs.)
Elementary statistical methods:
Abstract: This unit introduces using R for statistical analysis and regression analysis. Students learn the meaning and role of basic statistical models.
Case and Exercise: Use regression to predict commodity prices; simulate the winner of a casino game.
- Descriptive Statistics
- Statistical Distributions
- Frequency and contingency tables
- Correlation
- t-test
- Non-parametric statistics
- Linear Regression
- Regression Diagnostics
- Robust Regression
- Nonlinear regression
- Principal Component Analysis
- Logistic Regression
- Statistical Simulation
Preliminary data mining:
Abstract: This unit explains how to use R's data mining packages and functions. Students learn two kinds of mining methods: supervised learning and unsupervised learning.
Case and Exercise: Use R to participate in a Kaggle data mining competition.
- General Mining Process
- Rattle package
- Hierarchical clustering
- K-means clustering
- Decision Trees
- Backpropagation (BP) neural network
