
Tom Levine on Data Cleaning

6:30 Doors open - socializing & individual questions

7:00 Announcements & Introduction

7:10 Tom Levine on Data Cleaning

8:30 Wrap up & socializing

9:00 Out the door! 


  • Harold B.

    I'd be interested in doing a presentation on how to use R in a Microsoft Office environment: essentially, tips and tricks from working in an environment that isn't R-friendly.

    It would cover issues of interfacing with Excel, how to use knitr and Markdown to produce a formatted report, and how to convert that report to a Word document.

    February 19, 2014

    • Jim P.

      Harold, love it! For many of us, this is an important topic. Would you be ready next month on the 18th? -Jim

      February 19, 2014

    • Harold B.

      Yeah, the 18th would be fine.

      February 19, 2014

  • Ira S.

    I am VERY interested in Excel interface issues. I hope that you will be our next speaker. I also suggest that each speaker provide a short write-up prior to the talk.

    February 19, 2014

    • Jim P.

      Ira, agreed, we need a short abstract for each talk. I'll be more tough-minded in future. -Jim

      February 19, 2014

  • Harold B.

    Agree with Chris, and I think it is important to be clear that what Tom presented is called web scraping: the extraction of data from web pages that were not designed to serve as an analytic data source.
    I think of it more as raking, but there are better metaphors, like harvesting. :-)

    Data cleaning is using logic and code to correct errors in existing data tables. Most of us do a lot of data cleaning, but it isn't as interesting as scraping web pages. ;-)

    February 19, 2014

  • Chris S.

    Tom has a lot of room for improvement. He needs to prepare his talk ahead of time, i.e., make sure the Internet is accessible. He needs to have run through his talk at least once beforehand, so he is familiar with the R commands that he will be using (like how to read in a file and how to use grep for matching). He also needs to make sure his actual talk matches the title of the talk. With the title Data Cleaning in R, I had expected a talk on how one cleans data, not just generally, but in R: how to parse names or dates, and how to detect and clean up inconsistencies in R itself. Tom has a lot of enthusiasm and has clearly done data cleaning work, but the talk itself, while giving some good advice about poppler-utils, did not focus on using R itself for data cleaning. A 2.0 version of this talk could be very interesting.

    February 18, 2014

  • A former member

    Any room available for a few from the waiting list to make it through?

    February 18, 2014

35 went

