PhillyPUG April 2013 Meetup - Machine Learning and Natural Language Processing

  • Apr 16, 2013 · 6:30 PM
  • SunGard Availability Services


6:30 PM - 7:00 PM - Food, Networking, and a word from our sponsors (SunGard and AWeber)

7:00 PM - 7:30 PM - scikit-learn, by Michael Becker

7:30 PM - 8:00 PM - NLTK, by Chris Brown

8:00 PM - 8:30 PM - Lightning Talks

8:30 PM - Leave for Prohibition Taproom


More details:

scikit-learn, by Michael Becker

In this talk Michael will lead us through section 2 of the scikit-learn tutorial. Topics covered will include feature extraction, classification, regression, principal component analysis (PCA), clustering, detecting and avoiding overfitting, and measuring performance. During this tutorial we'll be making use of the iris dataset in most of the examples. If participants wish to follow along with the tutorial, they should complete the tutorial's setup steps before coming to the talk.
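For a taste of the workflow the abstract describes, here is a minimal sketch (not Michael's actual tutorial code) that loads the iris dataset, reduces it with PCA, fits a classifier, and compares training and held-out accuracy as a basic overfitting check. The choice of k-nearest neighbors, the 70/30 split, and the two PCA components are assumptions made for illustration, and the imports follow the current scikit-learn module layout, which may differ from the version used at the talk.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset: 150 flowers, 4 measurements each, 3 species.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the samples so performance is measured on unseen data.
# (Split size and random_state are illustrative choices.)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Project the 4 features onto 2 principal components, fitted on training data only.
pca = PCA(n_components=2).fit(X_train)
X_train_2d = pca.transform(X_train)
X_test_2d = pca.transform(X_test)

# Fit a simple classifier on the reduced features.
clf = KNeighborsClassifier(n_neighbors=5).fit(X_train_2d, y_train)

# A training score far above the test score would suggest overfitting.
train_accuracy = clf.score(X_train_2d, y_train)
test_accuracy = clf.score(X_test_2d, y_test)
print(f"train accuracy: {train_accuracy:.2f}, test accuracy: {test_accuracy:.2f}")
```

Fitting the PCA on the training split only keeps information from the held-out samples out of the model, so the test score stays an honest estimate.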

Michael Becker is the Senior Data Engineer at AWeber and founder of the DataPhilly Meetup group. On a day-to-day basis, he spends a majority of his time acquiring, scrubbing, exploring, and visualizing data. He loves machine learning and gets his kicks out of clustering, regression, and classification algorithms.


NLTK (Natural Language Toolkit), by Chris Brown

This talk will be modeled after the NLTK talk given at PyData NYC 2012. The talk covers out-of-the-box features available in NLTK, which include stemming, tokenization, stripping HTML, WordNet integration, named entity recognition, quickly building a corpus of data, and basic classifiers. Chris will motivate the use of these tools with a practical tutorial on building a topic classifier for news articles and then classifying political news according to political sentiment (conservative or liberal).
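To make the idea of a topic classifier concrete, here is a small plain-Python sketch of a bag-of-words naive Bayes classifier. It is not NLTK's API and not Chris's code: the function names (`tokenize`, `train`, `classify`) and the four training sentences are hypothetical, made up purely for illustration. NLTK ships its own tokenizers and classifiers that cover the same steps.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in labeled_docs:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return label_counts, word_counts


def classify(text, label_counts, word_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior, then add one log likelihood per token.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy training data; a real classifier needs a far larger corpus.
TRAINING = [
    ("the senate passed the budget bill", "politics"),
    ("congress debates a new tax vote", "politics"),
    ("the team won the championship game", "sports"),
    ("the striker scored a late goal", "sports"),
]

model = train(TRAINING)
print(classify("senate vote on the tax bill", *model))  # prints "politics"
```

The same structure extends to the sentiment task in the abstract by swapping the topic labels for "conservative" and "liberal" and training on labeled political articles.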

Chris Brown is a PhD candidate in political science at the University of Pennsylvania. He has worked on research projects that include determining how personal characteristics of leaders increase the likelihood of international conflict, measuring political polarization across different issue areas in Congress using roll call votes, and assigning political partisanship scores to blogs using natural language processing. He uses Python almost every day while working on his dissertation, which uses natural language processing to test theories about political parties and polarization with Congressional speeches. In his free time, Chris has been developing a Django-based website to help Pennsylvanians monitor and track their state legislators.



  • A former member

    Amazing, great topics, great presentations, and great followup question sessions.

    April 17, 2013

  • Christopher B.

    Hey everyone!

    You all had some great questions! Really enjoyed talking with everyone.

    My slides can be found on my website.

    And the code for the examples and everything is on my GitHub. I updated the documentation a little bit there as well.

    April 17, 2013

  • john A.

    Great talks, lots of good questions and answers, and thanks very much for putting the talks up online!

    April 17, 2013

  • Michael B.

    Thanks everyone who came out tonight!
    The slides for my talk can be found on my GitHub.

    While I have your attention... I'd like to plug a talk I'm giving next week (with Kelly O'Brien) on the 23rd on Data Processing with Mechanical Turk. Tickets are free.

    Additionally, AWeber is having an open house on April 30th. If you don't believe Bobby House about how awesome it is working at AWeber, you can find out for yourself!

    April 16, 2013

  • Alexander P.

    Fascinating subject, interesting content and questions - sure will ignite a few ideas. My puzzler is sore.

    1 · April 16, 2013

  • John W. O.

    Thanks to Michael and Chris for their informative talks. I'll be chalking up these topics in my (infinitely expanding) TO-HACK queue.

    1 · April 16, 2013

  • Peter G.

    Good talks!

    1 · April 16, 2013

  • Lauren

    I was waitlisted in two groups, so I'm freeing up a spot for someone else.

    April 15, 2013

  • Rob H.

    Have a company meeting conflict; was really looking forward to these talks.

    April 15, 2013

  • Q

    I'm registered through the DataPhilly Meetup already. Trying to open a spot for someone else here.

    April 15, 2013

  • Chris H.

    I am using NLTK now and would really love to see Chris Brown's talk. Pretty please!

    April 2, 2013

  • Steve D.

    Looking forward to the first meet.

    March 16, 2013

  • A former member

    RSVP'd through DataPhilly. Looking forward to it.

    March 10, 2013

  • Anthony S.

    Already RSVP'ed with DataPhilly

    March 5, 2013

  • Ben M.

    Geospatial consultant at UD... both topics sound very relevant to my work... exciting!

    March 5, 2013

  • Rob H.

    Sounds like some really interesting talks.

    March 5, 2013
