In this sequel to John Tillinghast's March talk, Ryan Harvey will discuss tools for connecting to databases that make data wrangling and analysis repeatable and reversible. Topics include:
• The basics of using a relational database and querying it with SQL
• Database pros and cons, and database selection (spoiler: the examples use PostgreSQL)
• Doing analysis directly within the database using PostgreSQL's native functionality
• Connecting R to databases for more manageable data storage and easier exploratory analysis
• Doing hybrid analysis between R and the database using PostgreSQL
Ryan will discuss why repeatable and reversible processes and scripting matter in enterprise settings and other production environments, and how to achieve them more easily with good tools.
Ryan Harvey is a local coder, datahead, project manager, wonk and dad. For work, he manages several government web apps at the Office of Management and Budget. On the side, he does data science, software architecture, and application and database engineering for Kitchology, Inc. He's also a PhD candidate in the Applied Mathematics, Applied Statistics and Scientific Computing program at the University of Maryland's Norbert Wiener Center for Harmonic Analysis and Applications. Ryan lives in Lanham, MD, with his wife, two children and several pets.