Details

Now that you've acquired all your data, is it time to start crunching numbers? Not so fast! An important but often tedious step of data wrangling is parsing the textual data into a usable form. We've all written a short script or regex to accomplish this, but it's often a one-off and rarely is it flexible or maintainable.

In this talk we will do something better than parsing. We will write a grammar, a mini-language for our data. This paradigm shift offers a unique perspective on parsing and will help shape the way you approach these problems in the future. We will be using the Python library pyparsing for all of our examples in this talk:

http://pyparsing.wikispaces.com/Download+and+Installation
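
As a taste of the grammar-first approach, here is a minimal sketch using pyparsing. The "name = number" record format and every name in the code (`identifier`, `number`, `assignment`, `parse_assignment`) are assumptions made up for illustration; they are not taken from the talk itself.

```python
from pyparsing import Combine, Optional, Suppress, Word, alphanums, alphas, nums

# Hypothetical record format, assumed for illustration only: "name = number"
identifier = Word(alphas, alphanums + "_")
integer = Word(nums)

# A number is an optional sign, digits, and an optional fractional part.
# Combine joins the matched pieces into a single token such as "21.5".
number = Combine(Optional("-") + integer + Optional("." + integer))
number.setParseAction(lambda tokens: float(tokens[0]))  # convert text to float

# The grammar for one record: a named key, a discarded "=", a named value.
assignment = identifier("key") + Suppress("=") + number("value")


def parse_assignment(line):
    """Parse one 'name = number' line and return (key, value)."""
    result = assignment.parseString(line, parseAll=True)
    return result["key"], result["value"]


key, value = parse_assignment("temperature = 21.5")
print(key, value)
```

Each piece of the grammar is a named, reusable object, so supporting a new field type would mean adding a rule rather than rewriting a regular expression.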

About Travis Hoppe

Travis was born and raised in Las Vegas and studied Mathematics (BS) and Physics (Ph.D.) from Reno to Philadelphia. Currently a post-doc at the National Institutes of Health, he's a strong advocate for open academic publishing, and a data wrangler for fun and profit in Python and C++.

When you see him, ask him about his addiction to imaginary internet points on Stack Overflow.

Sponsors

Booz Allen

DC2 Org Sponsor

GWU

The skills you need to develop and apply modern data solutions.

Anant Corporation

Program Sponsor

ByteCubed

Tech Innovators located in Crystal City
