Big Data, Search and Recommendation

Details
This Friday, we'll have two talks followed by drinks. The academic talk will be given by Claudia Minardi from UvA. Wilco van Duinkerken from Trivago will give the industry talk.
Program:
16:00 - 16:30 Claudia Minardi (UvA)
16:30 - 17:00 Wilco van Duinkerken (Trivago)
17:00 - 18:00 Drinks & Snacks
Details of the talks:
-----------------------------------------------------------------------------------
Claudia Minardi: Recommending documents using implicit user feedback
Innovation inside a multinational retail company with thousands of employees generates a surprisingly large number of documents: catalogs, regulations, spreadsheets, and so on. When we started this project, no specific effort was made to target the documents to the people who need to see them, and as a consequence many of them were lost in the clutter.
To solve this problem, we devised a ranking mechanism that makes a personalized selection of documents visible, according to the viewer's preferences. To do so, we exploit the user interactions that are currently available, which tell us what a specific user considers relevant.
We designed and implemented a recommender system that applies Online Learning to Rank techniques to target documents to users. It is based on a combination of business rules and machine learning, and exploits user interactions to evaluate the ranking algorithms online and improve the precision of the suggestions.
Claudia Minardi holds an MSc in Computer Science (Artificial Intelligence) from the UvA/VU joint program, where she was supervised by Anne Schuth and Maarten de Rijke. Her main interest is applying Information Retrieval techniques for search optimization to recommender systems, using real customer data for evaluation.
She currently works as a Software Engineer at Collaborne, where her focus is on combining data science and machine learning techniques on customer-generated data to guide users through the process of innovation and design thinking.
-----------------------------------------------------------------------------------
Wilco van Duinkerken: Keeping up the pace: experimenting on Big Data
When data gets big and technology hits “production”, experiments get slow. Or do they? Over the last 9 months the Amsterdam team of trivago has been setting up data pipelines that calculate scores on various aspects of hotels using structured and unstructured data. Does a hotel have a pool? How good is it? Can you swim laps? Is it kid-friendly? We can’t share all the algorithms with you, mostly because we are still figuring them out ourselves, but we can share our experience of designing data pipelines and data structures that make it possible to experiment freely and flexibly on our hotel “inventory” using a natural language processing pipeline, data transformations, mappings and scoring methods, all running in the Amazon cloud.
Wilco van Duinkerken is currently building a team of software engineers and data scientists for trivago (www.trivago.com) in Amsterdam. The team’s goal is to get to know everything there is to know about as many hotels as possible, using mainly public data sources. Over the last 5 years Wilco has focused on building highly scalable data and research infrastructures using Amazon Web Services (a.k.a. the cloud). He was part of the successful FP7-funded Opener project (www.opener-project.eu), which focused on standardizing natural language processing pipelines across domains and languages, together with researchers from the VU as well as partners in Spain and Italy.
