
DataPhilly October 2013 - Data Science Tools and Techniques

This month will be a joint meetup with the newly formed DataBucks meetup group. To kick things off, we'll be discussing various data science tools and techniques.


6:00 PM - 6:30 PM - Food, Networking, and a word from AWeber

6:30 PM - 7:00 PM - What is Principal Component Analysis?, by Rob Lass

7:00 PM - 7:30 PM - Realtime predictive analytics using RabbitMQ & scikit-learn, by Michael Becker

7:30 PM - 8:00 PM - Lightning Talks

8:00 PM - Leave for Blue Dog

What is Principal Component Analysis?, by Rob Lass


Too many features got you down?  Come learn about principal component analysis (PCA)!  In this talk, you will receive a gentle introduction to the theory behind PCA, as well as instruction on using existing tools to make doing PCA for exploratory data analysis as easy as duck soup.
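As a rough illustration of the idea behind PCA (not material from the talk), the sketch below finds the first principal component of a tiny made-up 2-D dataset. It is dependency-free and uses power iteration on the covariance matrix; the data values and all function names are assumptions chosen for illustration. For real exploratory work an existing tool, such as the PCA class in scikit-learn, would normally be used instead.

```python
import math

def mean_center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))
    return v, eigenvalue

# Hypothetical toy data: two features that move together, roughly y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]

centered = mean_center(data)
cov = covariance(centered)
component, variance = top_component(cov)

# Share of total variance captured by the first component.
explained = variance / sum(cov[i][i] for i in range(len(cov)))

# One-dimensional representation of each point along that component.
scores = [sum(a * b for a, b in zip(row, component)) for row in centered]

print(component, explained)
```

Because the two features are almost perfectly correlated, a single component carries nearly all of the variance, which is the situation where reducing the number of features loses very little information.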


Rob Lass is a software engineer for the Data Analysis and Management Ninjas at AWeber Communications, and a PhD candidate and instructor at Drexel University.  He has over 30 publications focusing on constraint reasoning and mobile ad-hoc networking.  When not writing autobiographical text, he typically does not refer to himself in the third person.

Realtime predictive analytics using RabbitMQ & scikit-learn, by Michael Becker


In this talk, you'll learn how to deploy a predictive model in a production environment using RabbitMQ and scikit-learn. You'll see a realtime content classification system to demonstrate this design.


Michael Becker is a Software Engineer and Data Analysis and Management Ninja at AWeber Communications. He is passionate about using data analysis, machine learning and ninjas to provide business value. He refers to himself in the third person.


If you'd like a ride on the AWeber shuttle from the train station to the office and back, please post in the comments below.


  • Michael B.

    Thanks everyone who came out last night! Let me know what you thought of the location. If people liked it, we'll definitely hold future joint meetups with DataBucks! A few links from my talk:
    A video of my talk can be found here:
    My Slides can be found at:
    Code for the language classifier can be found at:

    October 24, 2013

    • Joshua

      This was my first meetup, and one of the reasons I attended was that the location was close to where I work in Huntingdon Valley. Love the location and the awesome building.

      October 27, 2013

    • Jane E.

      I wasn't able to make this meeting. Thanks for posting your talk materials!

      November 28, 2013

  • Sujan K.

    I liked how the scikit-learn / RabbitMQ talk used an example that everyone could mentally grasp.

    1 · October 24, 2013

    • Michael B.

      Thanks for the great feedback! I should have prefaced my talk with a very important disclaimer: I am not a linguist or an NLP expert. However, I'm pretty sure that many languages share exact alphabets (http://en.wikipedia.o...). In these cases, looking at the characters used alone would not be enough. Looking at the frequency of the characters used helps distinguish the languages, and that's essentially what the n-gram approach does. I actually learned about this technique through the google machine. It is also outlined in the Russell & Norvig book, in section[masked] N-gram character models. The solution I employed doesn't require me to be an expert. I can apply standard ML techniques (which I didn't cover in my talk) to ensure that the algorithm is accurate, without any deep knowledge of linguistics. This is the power of machine learning, and it extends to many other problem domains.

      October 25, 2013
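A minimal sketch of the character n-gram idea described in the comment above. This is not the code from the talk: the tiny training sentences, the function names, the choice of trigrams, and the add-one smoothing are all assumptions made for illustration. Each language gets a table of character trigram counts, and a new text is assigned to the language whose table makes its trigrams most likely.

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    """Overlapping character n-grams of lowercased, space-padded text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(samples, n=3):
    """Build one n-gram count table per language from {language: [texts]}."""
    return {lang: Counter(g for text in texts for g in char_ngrams(text, n))
            for lang, texts in samples.items()}

def log_likelihood(text, counts, vocab_size, n=3):
    """Log-probability of the text's n-grams, with add-one smoothing."""
    total = sum(counts.values())
    return sum(math.log((counts[g] + 1) / (total + vocab_size))
               for g in char_ngrams(text, n))

def classify(text, models, n=3):
    """Return the language whose n-gram table best explains the text."""
    # Shared vocabulary size, plus one slot for n-grams never seen in training.
    vocab_size = len(set().union(*models.values())) + 1
    return max(models,
               key=lambda lang: log_likelihood(text, models[lang], vocab_size, n))

# Hypothetical toy training data; a real system needs far more text.
samples = {
    "english": [
        "the quick brown fox jumps over the lazy dog",
        "this is a simple sentence in the english language",
        "where is the nearest train station",
    ],
    "spanish": [
        "el rápido zorro marrón salta sobre el perro perezoso",
        "esta es una oración sencilla en el idioma español",
        "dónde está la estación de tren más cercana",
    ],
}

models = train(samples)
print(classify("the dog is in the station", models))
print(classify("el perro está en la estación", models))
```

English and Spanish share most of their alphabet, so the set of characters alone would not separate them; the trigram frequencies do, which is the point made in the comment.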

    • A former member

      Comparing very similar languages would be more enlightening. Eric: thanks for the suggestion of taking a broader view with abstract categories.

      October 25, 2013

  • Jeffrey N.

    Good meetup! Another cool application of PCA is in facial recognition, check out the Eigenfaces wiki:
    I think you can also get a hold of the reference images to try training your own models with it.

    1 · October 25, 2013

  • Alex Z.

    Good talk! I like the new location!

    October 24, 2013

  • Ken L.

    I've played with enterprise data and even created tools for real-time data display for a professional race team on weekends, but would be interested in the formalisms. Glad I don't have to drive into Philly for this :)

    1 · October 22, 2013

    • Ken L.

      Thanks for the useful entries to data science. Had very good chats with other members. And thanks to AWeber for the O'Reilly shirt :)

      October 24, 2013

  • Joshua

    I am a noob to Data Science, and I picked up some direction on where to start learning.

    October 23, 2013

  • john A.

    Very high content talks; looking forward to seeing the slides & the refs.

    October 23, 2013

  • Lauren

    won't make this one.

    October 23, 2013

50 went

Our Sponsors

  • Azavea

    Speakers, Space, food and more
