
DataPhilly April 2013 Meetup - Machine Learning and Natural Language Processing

Hey Everyone,

Our April meetup will be a joint meetup with PhillyPUG. The topics will be machine learning and natural language processing on the Python stack. If you RSVP to this meetup, please don't RSVP to the PhillyPUG meetup as well; you only need to RSVP to one or the other.


6:30 PM - 7:00 PM - Food, Networking, and a word from our sponsors (SunGard and AWeber)

7:00 PM - 7:30 PM - scikit-learn, by Michael Becker

7:30 PM - 8:00 PM - NLTK, by Chris Brown

8:00 PM - 8:30 PM - Lightning Talks

8:30 PM - Leave for Prohibition Taproom


More details:

scikit-learn, by Michael Becker

In this talk Michael will lead us through section 2 of the scikit-learn tutorial. Topics covered will include: feature extraction, classification, regression, principal component analysis (PCA), clustering, detecting and avoiding overfitting, and measuring performance. During this tutorial we'll be making use of the iris dataset in most of the examples. If participants wish to follow along with the tutorial, they should follow the steps listed at before coming to the talk.
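For readers who want a feel for the topics listed above before the talk, here is a minimal sketch on the iris dataset. It is not taken from Michael's slides or from the tutorial itself; the choice of logistic regression, k-means with three clusters, and a 70/30 split are assumptions made for illustration, and the imports assume a recent scikit-learn release (where the splitting helpers live in `sklearn.model_selection`).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# The iris dataset ships with scikit-learn: 150 flowers, 4 measurements each.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out a test set so performance is measured on unseen data.
# (The 70/30 split is an arbitrary choice for this sketch.)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Classification: fit on the training split only.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# A large gap between these two scores is a common sign of overfitting.
train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)

# Cross-validation gives a steadier performance estimate than one split.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

# PCA: project the 4 features down to 2 components (e.g. for plotting).
X_2d = PCA(n_components=2).fit_transform(X)

# Clustering: group the flowers without looking at the labels.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
print(f"mean 5-fold CV accuracy: {cv_scores.mean():.2f}")
print(f"PCA output shape: {X_2d.shape}")
```

Comparing the training score with the held-out and cross-validated scores is the simplest way to see the overfitting and performance-measurement ideas in practice.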

Michael Becker is the Senior Data Engineer at AWeber and founder of the DataPhilly Meetup group. On a day-to-day basis, he spends most of his time acquiring, scrubbing, exploring, and visualizing data. He loves machine learning and gets his kicks out of clustering, regression, and classification algorithms.


NLTK (Natural Language Toolkit), by Chris Brown

This talk will be modeled after the NLTK talk given at PyData NYC 2012. It covers out-of-the-box features available in NLTK, including stemming, tokenization, stripping HTML, WordNet integration, named entity recognition, quickly building a corpus of data, and basic classifiers. Chris will motivate the use of these tools with a practical tutorial on building a topic classifier for news articles and then classifying political news according to political sentiment (conservative or liberal).
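As a rough idea of how tokenization, stemming, and a basic classifier fit together, here is a small sketch of a topic classifier. It is not Chris's code: the four training sentences are invented toy data, the `featurize` helper is a hypothetical name, and the bag-of-stems features are an assumption made for illustration. It sticks to NLTK pieces that need no extra corpus downloads (`RegexpTokenizer`, `PorterStemmer`, `NaiveBayesClassifier`).

```python
from nltk.classify import NaiveBayesClassifier
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# A regex tokenizer avoids downloading NLTK's tokenizer models.
tokenizer = RegexpTokenizer(r"\w+")
stemmer = PorterStemmer()


def featurize(text):
    """Hypothetical helper: turn text into a bag-of-stems feature dict."""
    tokens = tokenizer.tokenize(text.lower())
    return {stemmer.stem(token): True for token in tokens}


# Invented toy training data, for illustration only.
training_data = [
    ("The team won the match after scoring two late goals", "sports"),
    ("The striker scored and the coach praised the players", "sports"),
    ("The senate passed the budget bill after a long vote", "politics"),
    ("Voters elected a new governor in the state election", "politics"),
]

# NaiveBayesClassifier.train expects (feature_dict, label) pairs.
train_set = [(featurize(text), label) for text, label in training_data]
classifier = NaiveBayesClassifier.train(train_set)

# Classify an unseen headline.
predicted = classifier.classify(featurize("The players scored three goals in the match"))
print(predicted)
```

A real topic classifier would need far more training text than this; the same pattern applies when the labels are "conservative" and "liberal" instead of topics.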

Chris Brown is a PhD candidate in political science at the University of Pennsylvania. He has worked on research projects that include determining how personal characteristics of leaders increase the likelihood of international conflict, measuring political polarization across different issue areas in Congress using roll call votes, and assigning political partisanship scores to blogs using natural language processing. He uses Python almost every day while working on his dissertation, which uses natural language processing to test theories about political parties and polarization with Congressional speeches. In his free time, Chris has been developing a Django-based website to help Pennsylvanians monitor and track their state legislators.


  • Bobby H.

    Two really great talks. Everyone from the speakers to the attendees created a really interesting discussion.

    April 17, 2013

  • Christopher B.

    Hey everyone!

    You all had some great questions! Really enjoyed talking with everyone.

    My slides can be found on my website:

    And the code for the examples and everything is on my github. I updated the documentation a little bit there as well.

    April 17, 2013

  • Michael B.

    Thanks everyone who came out tonight!
    the slides for my talk can be found on my github @

    While I have your attention... I'd like to plug a talk I'm giving next week (with Kelly O'Brien) on the 23rd on Data Processing with Mechanical Turk. Tickets are free.

    Additionally, AWeber is having an open house on April 30th. If you don't believe Bobby House about how awesome it is working @ AWeber, you can find out for yourself!

    April 16, 2013

  • A former member

    Sorry I can't make it tonight.

    April 16, 2013

  • Mike C.

    Sorry for the late notice, but I'll be stuck at work for a while tonight - hopefully someone else can use my spot.

    April 16, 2013

  • Ramaa N.

    Just realized that this is at the same time as the Understanding Data workshop. Wish that wasn't so. This meetup also seems very interesting. Is there any way I could get details or presentation material for later reference?

    April 16, 2013

  • Alex Z.

    Is there onsite parking available @ SunGard or do we just park on the street?

    April 15, 2013

    • Michael B.

      There is street parking and there are a few commercial parking lots right near the building.

      April 15, 2013

  • Matthew W.

    Sorry! Fire to put out.

    April 15, 2013

  • A former member

    Unfortunately, a conflict has me unable to attend. Hopefully someone else can enjoy the talk now!

    March 28, 2013

55 went
