Past Meetup

Machine Learning and Crowdsourcing

51 people went

Crowdsourcing has become an important technique in machine learning because it facilitates the construction of labeled data sets for supervised learning. Interestingly, different workers will often assign different labels to the same item, prompting the question "what is the true label?" This can itself be treated as a machine learning problem, in which each worker's label is a piece of evidence about the true label. I'll discuss the use of generative models of crowdsourced data for three canonical tasks: 1) forced single choice between categorical labels; 2) forced single choice between ordinal labels (aka "Hot or Not"); and 3) forced choice of zero or more categorical labels. For these cases I will demonstrate how to model differences in crowdsourced workers' ability, bias, and intent (sometimes adversarial!), along with differences in item difficulty. Examples of real software (from the open source project ) operating on real data will be provided.
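
To make the idea of treating the true label as a latent variable concrete, here is a minimal sketch in Python. It is not the software from the talk. It assumes the simplest possible generative model (binary labels, a uniform prior on the true label, and a single accuracy parameter per worker, sometimes called a "one-coin" model) and fits it with EM. The function name `fit_one_coin`, the worker names, and the toy data are all hypothetical, and item difficulty is not modeled.

```python
import math
from collections import defaultdict


def fit_one_coin(labels, n_iter=20, eps=1e-6):
    """EM for an assumed one-coin model of binary crowdsourced labels.

    labels: list of (worker, item, label) tuples with label in {0, 1}.
    Returns (posterior, accuracy): P(true label = 1) per item and the
    estimated probability that each worker reports the true label.
    """
    by_item = defaultdict(list)
    for worker, item, label in labels:
        by_item[item].append((worker, label))

    # Initialise each item's posterior with its fraction of "1" votes.
    posterior = {
        item: sum(label for _, label in votes) / len(votes)
        for item, votes in by_item.items()
    }
    accuracy = {}

    for _ in range(n_iter):
        # M-step: expected fraction of each worker's labels that are correct.
        agree = defaultdict(float)
        count = defaultdict(int)
        for worker, item, label in labels:
            p1 = posterior[item]
            agree[worker] += p1 if label == 1 else 1.0 - p1
            count[worker] += 1
        accuracy = {
            w: min(max(agree[w] / count[w], eps), 1.0 - eps) for w in count
        }

        # E-step: combine votes as log-odds weighted by worker reliability.
        # A worker with accuracy below 0.5 gets a negative weight, so an
        # adversarial vote counts as evidence for the opposite label.
        for item, votes in by_item.items():
            log_odds = 0.0
            for worker, label in votes:
                weight = math.log(accuracy[worker] / (1.0 - accuracy[worker]))
                log_odds += weight if label == 1 else -weight
            log_odds = max(min(log_odds, 30.0), -30.0)  # avoid overflow
            posterior[item] = 1.0 / (1.0 + math.exp(-log_odds))

    return posterior, accuracy


# Toy data (made up): two reliable workers, one who errs once, one adversary.
truth = [1, 0, 1, 0, 1, 0]
labels = []
for item, t in enumerate(truth):
    labels.append(("a", item, t))
    labels.append(("b", item, t))
    labels.append(("c", item, 0 if item == 0 else t))  # wrong on item 0
    labels.append(("adv", item, 1 - t))                # always flips

posterior, accuracy = fit_one_coin(labels)
for item in sorted(posterior):
    print(item, round(posterior[item], 3))
print({w: round(a, 3) for w, a in accuracy.items()})
```

On item 0 the raw votes are tied two to two, so majority vote cannot decide. The model breaks the tie because it learns that the hypothetical worker `adv` is reliably wrong, which makes that worker's vote informative. Richer models along the lines described above would replace the single accuracy with a per-worker confusion matrix (capturing bias) and add a per-item difficulty term.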