
Scraping together a dataset to predict Oscar winners

Pizza sponsored by DataXu, drinks afterward by MassChallenge.

Deborah Hanus, How to scrape together a dataset using things you found on the internet.

Using Jupyter notebooks and scikit-learn, I’ll predict whether a movie is likely to win an Oscar or be a box office hit. I’ll walk through the most important steps of creating an effective dataset from information you find on the Internet: asking a question your data can answer, writing a web scraper, and answering that question using nothing but Python libraries and data from the Internet. To illustrate how these steps fit together, I’ll build a dataset from IMDb data and use it to predict what makes an Oscar-winning movie.
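As a rough illustration of the scraping step, here is a minimal sketch that turns an HTML table into records and then into features and labels. It is not the speaker's code: the markup in `SAMPLE_HTML`, the placeholder film rows, the column names, and the helpers `TableScraper`, `scrape_table`, and `to_features` are all assumptions made for the example, and the snippet does not reflect IMDb's actual page structure. It uses only the Python standard library.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded page.
# The rows are placeholders, not real IMDb data or real IMDb HTML.
SAMPLE_HTML = """
<table>
  <tr><th>title</th><th>rating</th><th>won_oscar</th></tr>
  <tr><td>Placeholder Film A</td><td>8.1</td><td>yes</td></tr>
  <tr><td>Placeholder Film B</td><td>6.4</td><td>no</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._in_cell:
            self._row[-1] = self._row[-1].strip()
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data


def scrape_table(html):
    """Return one dict per data row, keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, cells)) for cells in body]


def to_features(records):
    """Split records into numeric features X and binary labels y."""
    X = [[float(r["rating"])] for r in records]
    y = [1 if r["won_oscar"] == "yes" else 0 for r in records]
    return X, y


records = scrape_table(SAMPLE_HTML)
X, y = to_features(records)
```

Lists shaped like `X` and `y` are what a scikit-learn classifier's `fit(X, y)` method accepts, so a real dataset built this way could feed directly into the prediction step.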

Plus a few lightning talks

Pizza will be provided by DataXu.

MassChallenge is hosting drinks after the Meetup, so plan to stick around and say hello:

"MassChallenge is the most startup-friendly accelerator on the planet. No equity and not-for-profit, we are obsessed with helping entrepreneurs across any industry. We also reward the highest-impact startups through a competition to win a portion of several million dollars in equity-free cash awards. Through our global network of accelerators in Boston, London, Jerusalem, Lausanne and Mexico City and unrivaled access to our corporate partners, we can have a massive impact - driving growth and creating value the world over.

"We are expanding the use of our Accelerate Platform within our international programs and plan to make it available to a broader community of organizations with similar needs.  Currently the platform is a single Python Django web application that focuses on individual accelerator competitions.  To achieve MassChallenge's ambitious goals we need to re-architect the existing system and create entirely new web-services that will provide needed functionality at the increasing scale of the organization.  We are looking for an experienced Principal Software Engineer to join our team and help us catalyze a global startup renaissance that embraces diversity, creates real value, and takes on the world's biggest problems."




Our Sponsors

  • OM1

    OM1 is sponsoring pizza on the 1/23 presentation night.

  • Man Numeric Investors

    Man Numeric Investors is feeding us on the January 10th project night.

  • PluralSight

    PluralSight is sponsoring the 2/9 project night.

  • Merrimack College

    Merrimack College is providing pizza for the 12/20 presentation night.

  • Carbon Black

    Carbon Black is sponsoring the 12/5 project night.

  • DataXu

    DataXu is providing pizza for the November presentation night.

  • MassChallenge

    MassChallenge is sponsoring drinks on 11/21.

  • EverQuote

    EverQuote is sponsoring pizza at the October 20th presentation night.

  • InsightSquared

    InsightSquared is supplying drinks for the 10/20 presentation night.

  • Akamai

    Akamai is hosting and sponsoring the October project night.
