Big Open Data with Common Crawl

  • Jul 23, 2014 · 6:00 PM

Please join us for an evening of talking about big open data! 

There will be two excellent presentations describing projects done with Common Crawl data. As always, there will be lots of smart, interesting people in attendance and ample opportunity to talk with them. After the presentations (at 8pm) we will adjourn to a nearby bar to continue the conversations.

Wednesday, July 23rd

22 Battery Street, San Francisco

Speaker: Stephen Merity


Have you ever been curious about how widely Google Analytics is used across the web? Stop pondering, start coding! Stephen will discuss how he used the Common Crawl dataset to perform wide-scale analysis over billions of web pages and what this means for privacy on the web at large.

Speaker: Oskar Singer


Performing text analytics and NLP on Twitter data can be a challenge because users frequently disregard standard spelling and the semantic differences between homophones (e.g. two, too, to). In this presentation Oskar will discuss his experience addressing this challenge and the creative solution that he developed with his colleague at Lexalytics.

Thanks to RiskIQ for hosting the event! RiskIQ is a super cool security company - learn more about them here:

Thanks to O'Reilly for sponsoring the meeting! They will be providing delicious food and awesome O'Reilly books for a raffle, and have generously provided a discount code for the Strata conference. Strata is *the* conference for big data and you should definitely attend. Click here for more information and a discount code:

  • Raymond Y.

    I enjoyed the talks, the people, the location, the organization, and the food -- there wasn't anything to dislike. :-)

    1 · July 25, 2014

  • Stephen M.

    Slides from my talk "Measuring the impact of Google Analytics with Common Crawl":

    If anyone is interested in tackling a project of their own with the Common Crawl dataset, I'd love to hear about it and will offer any help I can! I'm Smerity on various social networks (see profile) and [masked] via email.

    Looking forward to seeing you at the next Open Data meetup =]

    July 23, 2014

    • Petr V.

      Great talk. Thanks for sharing.

      1 · July 24, 2014

  • Abhilash I.

    Is this going to be broadcast? I will not be able to attend.

    July 23, 2014

    • Ted F.

      I've spoken with the organizer and unfortunately there will be no recording.

      July 23, 2014

  • Tyler A.

    First time attending!

    2 · July 14, 2014

Our Sponsors

  • O'Reilly Media

    Provides books and discounts for the group members.

  • Code for America

    CfA team members participate in the planning and organization.

  • Common Crawl

    CC team members participate in planning and organization.

  • Internet Archive

    Internet Archive team members participate in planning and organization.

  • Jetpac

    Jetpac team members participate in planning and organizing this group.

  • Kaggle

    Kaggle team members participate in the planning and organization.

  • Mendeley

    Mendeley team members participate in the planning and organization.

  • Open Knowledge Foundation

    OKF team members participate in the planning and organization.

  • Wikimedia

    Wikimedia team members participate in planning and organization.
