The dplyr (https://github.com/hadley/dplyr) package for R provides a unified domain-specific language for tabular data. A powerful feature is its support for multiple sources of tabular data, including data.frames, data.tables, and several dialects of relational databases. But data.frames are not databases! How can this work? In this talk, I will dive into the details of how high-level dplyr verbs get translated into SQL, and walk through tips and tricks, optimizations, and gotchas.
Our Instructor - Harlan Harris
Data Scientist with experience and skills in Statistics and Statistical Programming, Machine Learning, Enterprise Software, Operations Research, Analytical Marketing, Enterprise Analytics, Higher Education, Cognitive Psychology, Cognitive Modeling, Linguistics/Psycholinguistics, Artificial Intelligence, and more. Harlan applies statistical, mathematical, computational, and scientific methods to understanding how things, processes, and people work. He's also deeply involved in the data science professional community and helps organize professional events and services in the DC area.