
Obtaining, Scrubbing, and Exploring Data at the Command Line

To start the new year we have Jeroen Janssens from YPlan discussing how the command line can be used for data science.

About the talk:

We data scientists love to create exciting data visualizations and insightful models. However, before we get to that point, usually much effort goes into obtaining, scrubbing, and exploring the required data.

The *nix command line, although invented decades ago, remains a powerful environment for such data science tasks. It provides a read-eval-print loop (REPL) that is often much more convenient for exploratory data analysis than the edit-compile-run-debug cycle associated with scripts or even programs. Even if you're already comfortable processing data with, for example, R or Python, being able to also leverage the power of the command line can make you a more efficient data scientist.

In this one-hour presentation we'll look at the following subjects:

• Essential concepts of the *nix command line;

• Setting up an efficient environment;

• Filters such as cut, grep, sed, and awk;

• Scraping websites using curl, scrape, xml2json, and jq;

• Managing your data science workflow using drake;

• Parallelizing and distributing data-intensive pipelines; and

• Turning one-liners and existing code into reusable command-line tools.
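As a rough illustration of the kind of filter pipeline named in the list above, here is a minimal sketch. It is not taken from the talk: the sample data, the column layout, and the function names `sample_data` and `totals_per_city` are all hypothetical, and it assumes a standard *nix shell with cut, grep, sed, awk, and sort available.

```shell
#!/usr/bin/env bash
# Hypothetical sample data (not from the talk): name,city,amount
sample_data() {
  printf '%s\n' \
    'name,city,amount' \
    'alice,New York,10' \
    'bob,Boston,5' \
    'carol,new york,7' \
    'dave,Boston,3'
}

# Sum the amount column per city:
#   sed  - drop the header line and normalize one inconsistent city spelling
#   grep - keep only rows whose last field is numeric
#   cut  - select the city and amount columns
#   awk  - aggregate the amounts per city
#   sort - order the result so the output is stable
totals_per_city() {
  sample_data |
    sed -e '1d' -e 's/new york/New York/' |
    grep -E ',[0-9]+$' |
    cut -d',' -f2,3 |
    awk -F',' '{ total[$1] += $2 } END { for (c in total) print c "," total[c] }' |
    sort
}

totals_per_city
```

Each stage reads from standard input and writes to standard output, so any stage can be removed or swapped while exploring the data interactively, which is the convenience the abstract attributes to the command line.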

The main goal of this presentation is to give you an understanding of why, when, and how you could use the command line for your next data science project.

Bio:

Jeroen Janssens is a senior data scientist at YPlan, tonight's going out app, where he's responsible for making event recommendations more personal. Jeroen holds an M.Sc. in Artificial Intelligence from Maastricht University and a Ph.D. in Machine Learning from Tilburg University. He is authoring a book called "Data Science at the Command Line", which will be published by O'Reilly in summer 2014. Jeroen enjoys biking the Brooklyn Bridge, building tools, and blogging at http://jeroenjanssens.com. He can be found on Twitter @jeroenhjanssens.


As per usual, pizza begins at 6:30, the speaker at 7, and then the bar whenever he finishes.


  • Jeroen J.

    A video of the presentation, which was kindly recorded by Hakka Labs, is now available at http://www.hakkalabs.co/articles/obtaining-scrubbing-exploring-data-command-line

    2 · February 23, 2014

  • Joshua H.

    Another useful website:
    http://www.commandlinefu.com/commands/browse Very useful for seeing how some complex commands are constructed, which you can then tweak to your needs.

    1 · February 3, 2014

  • Jeroen J.

    Thanks everybody for coming out last night. All your questions, feedback, and ratings are much appreciated!

    2 · January 30, 2014

    • Jeff E.

      Thank you, Jeroen! Could you post a copy of the presentation? It would be extremely helpful.

      4 · January 31, 2014

    • Jeroen J.

      The presentation can be found in the files section (top menu: "More" > "Files").

      3 · February 1, 2014

  • Yana K.

    This was really useful and fun. Many thanks. As others already mentioned, getting a copy of the presentation would be great.

    January 30, 2014

    • Jared L.

      Will work on that.

      1 · January 30, 2014

    • Joshua H.

      Agreed, that was awesome. Hope to see some more shell programming related programming (no pun, well maybe a little) in the future!

      January 30, 2014

  • Arnab B.

    Jared, I had to leave in a hurry after yesterday's talk, so I couldn't have a word with either Jeroen or Saar. Do you happen to have their emails? I was really interested in talking with them about some job postings they mentioned yesterday.
    Thanks,
    Arnab
    [masked]

    January 30, 2014

    • Jared L.

      Sent you an email.

      January 30, 2014

  • Greg W.

    Any chance of a video?

    January 28, 2014

    • Andy E.

      yeah, would love a slideshare or screencast

      1 · January 29, 2014

    • Jared L.

      Sorry for the poor quality. Hakka Labs usually puts out their high quality version a few weeks after.

      January 30, 2014

  • Daniel K.

    Many thanks to Mr. Janssens aka the Stroopwafel Savant. Really good overview of the command line tools available for data work. Hit a nice middle ground irrespective of where in the spectrum of shell proficiency you are coming from.

    January 30, 2014

    • Christopher E.

      Tom for president! Good work man.

      January 29, 2014

    • Daniel C.

      Tom, I'm at a Software Carpentry workshop right now. Do you mind if I share this document to show what is possible with bash?

      January 30, 2014

  • Arnab B.

    Great meetup. Jeroen was great.

    1 · January 29, 2014

  • A former member

    I cannot make it either, something came up at the office. I can't seem to change my RSVP now, but there should be an extra seat.

    January 29, 2014

  • Bernard W.

    Afraid it looks like a late night at the office. Changed my RSVP to no if someone wants to snag a spot.

    January 29, 2014

  • Ahmad R.

    Will we need laptops, or would bringing one be productive?

    January 29, 2014

    • Jared L.

      This should be presentation style so no laptop needed. Though if you want to try what is being demonstrated then by all means code along with the presentation.

      January 29, 2014

  • Oliver S.

    I'm releasing my RSVP back into the pool. My lady's insisting on a date night.

    1 · January 29, 2014

    • Phil K.

      Nice...my wife would go to sleep!

      3 · January 29, 2014

    • Cristiana G.

      So would my husband.

      3 · January 29, 2014

  • Jose C.

    I've changed my RSVP to no, last minute work event.

    January 29, 2014

  • David M.

    I will not be able to attend tonight. The waitlist/refund is a little unclear as to who would get my place, so could the organizer please transfer my ticket to the next in line? I would accept either a refund or someone could Paypal me directly. Than

    January 29, 2014

    • David M.

      I've released my spot. If it is taken, would it be possible to get a refund?

      January 29, 2014

    • Jared L.

      Sent you a message.

      January 29, 2014

  • Filipe B.

    Would love to attend this. Is anyone not able to go?

    January 29, 2014

  • Daniel M.

    Hey all! I heard there's a wait list for the event and I can't make it tomorrow. How can I give away my spot?

    January 29, 2014

    • Dan H.

      All good - just got a spot. Thanks!

      January 29, 2014

    • Daniel M.

      Nice! I'm trying to see if I can make it myself

      January 29, 2014

  • Melissa (Berry) D.

    Woo hoo, I snagged a spot. Can't wait. :)

    1 · January 28, 2014

    • Joshua H.

      Me too I've been trying all day! Finally haha

      January 28, 2014

    • Jared L.

      Nice, see you tonight.

      January 29, 2014

  • Arnab B.

    Really, really waiting for a spot to open up. Hi Jared, I'm really interested in attending this one. If no spot opens up till the last minute, can I still drop by? :)

    January 28, 2014

    • Arnab B.

      Jared, just to let you know, I was able to get a spot finally. Great. See you guys in the evening. :)

      January 29, 2014

    • Jared L.

      Yep, saw that, congrats.

      January 29, 2014

  • A former member

    RSVP'd but can't attend :(

    January 28, 2014

  • Parag P.

    Any hands-on at the end, after the presentation?

    1 · January 28, 2014

  • Daniel C.

    Just realized I'm leaving for Boston to help with the MIT software boot camp that night. I'll be at the next one!

    January 25, 2014

  • Dyan G.

    I am a Home Theater enthusiast and like to tweak Audio/Visual systems during my spare time.

    1 · January 19, 2014

Your organizer's refund policy for Obtaining, Scrubbing, and Exploring Data at the Command Line

Refunds are not offered for this Meetup.
