This is a group of R enthusiasts living in the Greater Houston area.
R is an open-source environment for statistical computing, used in a wide range of academic, hobbyist, and professional applications. This group exists to promote the use of R in the Houston area and to provide a forum for exchanging tips, tricks, ideas, and code.
People of all experience levels with R, from novice to expert, are welcome. Our members include people who are just starting out with R as well as several who have contributed packages and code to the R project.
We will use the RODBC package to connect to a SQL database. We will review basic SQL syntax and the relational database model. Then we'll run an analysis on data stored in a database and write our results back to it.
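For anyone who wants a preview of the read, analyze, and write-back workflow, here is a minimal sketch. It is not the session material: it uses Python's built-in sqlite3 module and an in-memory database as a stand-in, because an RODBC session needs a configured data source. The table names, columns, and values are all made up for illustration. In RODBC the corresponding steps would go through odbcConnect, sqlQuery, and sqlSave.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the SQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Relational model: each order row points at a customer through customer_id.
# (Hypothetical tables and data, for illustration only.)
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "amount REAL)"
)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Houston"), (2, "Katy"), (3, "Houston")],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 2, 12.5), (3, 3, 10.0), (4, 1, 5.0)],
)

# Analysis: join the two tables and total the order amounts per city.
summary = cur.execute(
    """
    SELECT c.city, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.city
    ORDER BY c.city
    """
).fetchall()

# Write the results back to the database as a new table.
cur.execute("CREATE TABLE city_totals (city TEXT PRIMARY KEY, total REAL)")
cur.executemany("INSERT INTO city_totals VALUES (?, ?)", summary)
conn.commit()

# Read the stored results back to confirm they were written.
stored = cur.execute(
    "SELECT city, total FROM city_totals ORDER BY city"
).fetchall()
conn.close()
```

The SQL statements themselves are the portable part; only the connection and data-transfer calls differ between this stand-in and an ODBC connection from R.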
We will use rspark to connect to a distributed database and run queries that are too large for a laptop. We'll review the basics of distributed systems and what Spark is. Then we'll take an analysis to completion using the rspark package.