Data Mining – A Winning Example

Hosted by Sheamus M.

For our June meet-up we are excited to bring you something a little different. We will host the winners and other competitors of the recent Dunnhumby data mining competition. Three sets of speakers will give quick talks on how they approached this data mining problem, including the team that came up with the winning solution.

The competition focused on retail product launches. The goal was to predict how successful each of a number of product launches would be 26 weeks after launch, based only on information available up to the 13th week after launch.

Talk 1:

Winning Team - The winning team of MIT students will present the models and techniques they used to build their prediction model. Team members are:

Alex Levin is currently a fifth-year graduate student in Mathematics at MIT. He studies algorithms for speeding up computations on large networks and also does work in computational biology (most recently on using genetic data to infer human population history). Alex recently defended his thesis and is excited to be joining a tech company next year, where he will apply his quantitative skills to challenging data analysis problems.

William Li is a fourth-year graduate student in Computer Science at MIT focusing on applied machine learning. Most recently, he’s been working on speech recognition and dialog systems, user activity prediction from mobile data, and natural language processing of large text collections.

George Tucker is currently a fifth-year PhD student in Mathematics at MIT. He studies applications of machine learning in computational biology. George has worked on problems related to protein-protein interactions and signaling networks and, most recently, on determining gene-disease associations in medical genomics.

Talk 2:

Daniel Gerlanc will be presenting a short talk on the Dunnhumby and hack/reduce Product Launch Challenge. He will discuss different statistical learning techniques for solving the problem and the corresponding R implementations.

Daniel Gerlanc has worked in data science and analytics for almost a decade. He is the founder of Enplus Advisors Inc, a data science consultancy. At Enplus he works with clients in different industries to build predictive models and design data-driven analytics applications. Before starting Enplus he spent 5 years as a quantitative analyst with two Boston hedge funds, where he developed models and code used to manage hundreds of millions of dollars. He has coauthored several open source R packages, published in peer-reviewed journals, and is active in local predictive analytics groups. He is a graduate of Williams College.

Talk 3:

For non-programmers, Ralf Klinkenberg, co-founder of Rapid-I, will demonstrate how to build some of the models discussed above in RapidMiner, a popular GUI-based data mining tool.

Ralf Klinkenberg holds Master of Science degrees in computer science with a focus on machine learning, data mining, text mining, and predictive analytics from the Technical University of Dortmund in Germany and Missouri University of Science and Technology in the USA. He initiated the RapidMiner open source data mining project in 2001 with Ingo Mierswa and Simon Fischer and founded the company Rapid-I with Ingo Mierswa in 2006. Rapid-I is the company behind the open source software solution RapidMiner and its server version, RapidAnalytics, and offers a range of analytics solutions and services. Ralf Klinkenberg has more than 15 years of experience in consulting for and training large and small corporations and organizations in many different sectors on how to best leverage data mining for their needs. He has carried out data mining, text mining, and business analytics projects for telecoms, banks, insurers, manufacturers, retailers, pharmaceutical companies, healthcare, IT, aviation, automotive, and market research companies, utility and energy providers, and government.

Open Data Science (Hosted by ODSC)