PhillyPUG April 2013 Meetup - Machine Learning and Natural Language Processing

  • April 16, 2013 · 6:30 PM
  • SunGard Availability Services


6:30 PM - 7:00 PM - Food, Networking, and a word from our sponsors (SunGard and AWeber)

7:00 PM - 7:30 PM - scikit-learn, by Michael Becker

7:30 PM - 8:00 PM - NLTK, by Chris Brown

8:00 PM - 8:30 PM - Lightning Talks

8:30 PM - Leave for Prohibition Taproom


More details:

scikit-learn, by Michael Becker

In this talk Michael will lead us through section 2 of the scikit-learn tutorial. Topics covered will include feature extraction, classification, regression, principal component analysis (PCA), clustering, detecting and avoiding overfitting, and measuring performance. During this tutorial we'll be making use of the iris dataset in most of the examples. If participants wish to follow along with the tutorial, they should follow the tutorial's setup steps before coming to the talk.
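
For anyone who wants a feel for the workflow ahead of time, here is a minimal sketch (not taken from Michael's slides) that touches three of the listed topics on the iris dataset: classification, measuring performance on held-out data, and PCA. It assumes a recent scikit-learn release where `train_test_split` lives in `sklearn.model_selection`; the function name `evaluate_iris`, the k-nearest-neighbors classifier, and the 70/30 split are illustrative choices, not part of the tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def evaluate_iris(random_state=0):
    """Fit a classifier on iris and report train/test accuracy plus a 2-D PCA shape.

    Hypothetical helper for illustration only.
    """
    iris = load_iris()

    # Hold out 30% of the samples so performance is measured on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data,
        iris.target,
        test_size=0.3,
        random_state=random_state,
        stratify=iris.target,
    )

    # Classification: k-nearest neighbors is an arbitrary, simple choice here.
    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X_train, y_train)

    # A large gap between these two scores is one sign of overfitting.
    train_acc = clf.score(X_train, y_train)
    test_acc = clf.score(X_test, y_test)

    # PCA: project the four iris features down to two components.
    X_2d = PCA(n_components=2).fit_transform(iris.data)

    return train_acc, test_acc, X_2d.shape


if __name__ == "__main__":
    train_acc, test_acc, shape = evaluate_iris()
    print("train accuracy:", train_acc)
    print("test accuracy:", test_acc)
    print("PCA output shape:", shape)
```

Comparing the training score with the held-out score is the simplest version of the overfitting check mentioned above; cross-validation would give a more stable estimate.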

Michael Becker is the Senior Data Engineer at AWeber and founder of the DataPhilly Meetup group. On a day-to-day basis, he spends most of his time acquiring, scrubbing, exploring, and visualizing data. He loves machine learning and gets his kicks out of clustering, regression, and classification algorithms.


NLTK (Natural Language Toolkit), by Chris Brown

This talk will be modeled after the NLTK talk given at PyData NYC 2012. The talk covers out-of-the-box features available in NLTK, which include stemming, tokenization, stripping HTML, WordNet integration, named entity recognition, quickly building a corpus of data, and basic classifiers. Chris will motivate the use of these tools with a practical tutorial on building a topic classifier for news articles and then classifying political news according to political sentiment (conservative or liberal).
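
As a rough idea of how tokenization, stemming, and a basic classifier fit together, here is a minimal sketch (not Chris's code). It uses only NLTK pieces that need no extra corpus downloads: `wordpunct_tokenize`, `PorterStemmer`, and `NaiveBayesClassifier`. The six training sentences, the `politics`/`sports` labels, and the helper names `features` and `train_classifier` are made up for illustration; a real topic classifier would be trained on a proper news corpus.

```python
from nltk import NaiveBayesClassifier
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

stemmer = PorterStemmer()


def features(text):
    """Turn raw text into a bag-of-stems feature dict (hypothetical helper)."""
    # Tokenization: a regex-based tokenizer that needs no downloaded models.
    tokens = wordpunct_tokenize(text.lower())
    # Stemming: collapse inflected forms ("debated", "debating") to one stem.
    return {stemmer.stem(tok): True for tok in tokens if tok.isalpha()}


# Toy labeled data, invented for this sketch.
TRAIN = [
    ("the senate passed the budget bill", "politics"),
    ("voters elected a new governor", "politics"),
    ("congress debated the tax bill", "politics"),
    ("the team won the championship game", "sports"),
    ("the pitcher threw a perfect game", "sports"),
    ("fans cheered as the team scored", "sports"),
]


def train_classifier(examples=TRAIN):
    """Train a Naive Bayes topic classifier on (text, label) pairs."""
    labeled = [(features(text), label) for text, label in examples]
    return NaiveBayesClassifier.train(labeled)


if __name__ == "__main__":
    classifier = train_classifier()
    print(classifier.classify(features("the senate debated the bill")))
    print(classifier.classify(features("the team won the game")))
```

The same pattern extends to the sentiment task described above by swapping the topic labels for `conservative` and `liberal`, given suitably labeled articles.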

Chris Brown is a PhD candidate in political science at the University of Pennsylvania. He has worked on research projects that include determining how personal characteristics of leaders increase the likelihood of international conflict, measuring political polarization across different issue areas in Congress using roll call votes, and assigning political partisanship scores to blogs using natural language processing. He uses Python almost every day while working on his dissertation, which uses natural language processing to test theories about political parties and polarization with Congressional speeches. In his free time, Chris has been developing a Django-based website to help Pennsylvanians monitor and track their state legislators.



  • Edward Bujak

    Amazing, great topics, great presentations, and great followup question sessions.

    April 17, 2013

  • Christopher Brown

    Hey everyone!

    You all had some great questions! Really enjoyed talking with everyone.

    My slides can be found on my website.

    And the code for the examples and everything is on my GitHub. I updated the documentation a little bit there as well.

    April 17, 2013

  • john Ashmead

    Great talks; lots of good questions & answers; and thanks very much for putting the talks up online!

    April 17, 2013

  • Michael Becker

    Thanks everyone who came out tonight!
    The slides for my talk can be found on my GitHub.

    While I have your attention... I'd like to plug a talk I'm giving next week (with Kelly O'Brien) on the 23rd on Data Processing with Mechanical Turk. Tickets are free.

    Additionally, AWeber is having an open house on April 30th (http://aweberopenhouse.eventbri...). If you don't believe Bobby House about how awesome it is working @ AWeber, you can find out for yourself!

    April 16, 2013

  • Alexander Popov

    Fascinating subject, interesting content and questions - sure will ignite a few ideas. My puzzler is sore.

    1 · April 16, 2013

  • John W. O'Brien

    Thanks to Michael and Chris for their informative talks. I'll be chalking up these topics in my (infinitely expanding) TO-HACK queue.

    1 · April 16, 2013

  • Peter Gebhard

    Good talks!

    1 · April 16, 2013

  • Casey Vaughan


    1 · April 16, 2013

  • Lauren

    I was wait listed in two groups - freeing up a spot for someone else

    April 15, 2013

  • Rob Harrigan

    Have a company meeting conflict, was really looking forward to these talks.

    April 15, 2013

  • Q

    I'm registered through the DataPhilly Meetup already. Trying to open a spot for someone else here.

    April 15, 2013

  • Chris Hunter

    I am using NLTK now and would really love to see Chris Brown's talk. Pretty please!

    April 2, 2013

  • Steve Devlin

    Looking forward to the first meet.

    March 16, 2013

  • Patrick Sweet

    RSVP'd through dataphilly. Looking forward to it.

    March 10, 2013

  • Anthony So

    Already RSVP'ed with DataPhilly

    March 5, 2013

  • Ben Mearns

    Geospatial consultant at UD ... both topics sound very relevant to my work .. exciting!

    March 5, 2013

  • Rob Harrigan

    Sounds like some really interesting talks.

    March 5, 2013
