Past Meetup

The unsexy part of data science: data munging

99 people went


Real-world data is usually dirty, messy, and full of errors. To get reasonable results from your statistical modeling, you'll need to explore the data, clean it up, transform it, and prepare it for modeling. Data scientists commonly spend 80% of their time on this tedious task of data munging. But how should you go about it?

In this meetup we'll have five short talks of 10-15 minutes each on data munging. We'll then open the floor for Q&A to address further issues.


1. Szilard Pafka: Intro and overview

2. Yasmin Lucero: Munging date-times in R: tools, tricks, gotchas

3. Daniel Gutierrez: Data Munging: the Good, the Bad and the Ugly

4. Neal Fultz: Tidy Data, Facts & Rules for R

5. Eric Klusman: Plyr for split-apply-combine


- 6:00pm food/drinks and networking

- 7:00pm talks start promptly

Please arrive by 6:55pm at the latest.

Please RSVP as places are limited.

Venue: General Assembly will kindly host this meetup. No parking is provided. Q will kindly sponsor and provide the food and drinks.