Deep Dive Into Dplyr And Databases

Details

The dplyr (https://github.com/hadley/dplyr) package for R provides a unified domain-specific language for tabular data. One powerful feature is its support for multiple sources of tabular data, including data.frames, data.tables, and several relational database dialects. But data.frames are not databases! How can this work? In this talk, I will dive into the details of how high-level dplyr verbs get translated into SQL, and walk through tips and tricks, optimizations, and gotchas.

Our Instructor - Harlan Harris

Data Scientist with experience and skills in Statistics and Statistical Programming, Machine Learning, Enterprise Software, Operations Research, Analytical Marketing, Enterprise Analytics, Higher Education, Cognitive Psychology, Cognitive Modeling, Linguistics/Psycholinguistics, Artificial Intelligence, and more. Harlan applies statistical, mathematical, computational, and scientific methods to understanding how things, processes, and people work. He's also deeply involved in the data science professional community, and helps organize professional events and services in the DC area.

Data Engineers DC
GWU, Funger Hall, Room 103
2201 G St. NW · Washington, DC