SEA: User Behaviour and Topical Diversity

University of Amsterdam

Science Park 904, Room G3.02 · Amsterdam

How to find us

The venue is an even longer walk than our last Meetup location. Please plan 5-10 minutes of extra time!

This Friday, we'll again have two talks followed by drinks.

Andreas Brückner (SDL Fredhopper) - Search behavior in e-commerce search

On-site search in online retail is a specific field in IR and requires different approaches than traditional full-text search. Even within the domain there are subfields with distinct searching and buying patterns. To further add complexity, balancing customer needs and business interests is crucial because of the strong financial incentives in retail.

In this talk we will delve into the specific use cases of online shops and their customers. Andreas will share real-world examples of user behavior in matching, ranking and language specifics. In general, we will not limit ourselves to end users but will also look at the needs and wants of marketers responsible for day-to-day maintenance of search.

Andreas Brückner works as a Senior Merchandising Consultant at SDL Fredhopper. SDL Fredhopper is a search and merchandising solution for online retailers that enables marketers to control how shoppers explore their online shops. In his role, Andreas is the subject matter expert for search and visual merchandising and is responsible for running optimization projects at some of Europe’s largest online shops, including Otto, Selfridges, and De Bijenkorf.

Hosein Azarbonyad (ILPS) - Measuring Topical Diversity of Text Documents Using Parsimonious Topic Models

The availability of user-generated text reviews on the web has stimulated research into automatically assessing the interestingness of texts. High topical diversity is considered an important characteristic of interesting documents. A text's diversity is measured based on the number of topics it covers, and these topics are assigned using a topic model trained with standard techniques such as LDA.
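As a rough illustration of counting the topics a text covers, the sketch below takes a document's topic distribution (as a topic model such as LDA would produce) and counts the topics whose probability reaches a cutoff. The function names, the example distributions, and the 0.1 threshold are assumptions made for illustration only; they are not taken from the talk.

```python
def topics_covered(doc_topic_probs, threshold=0.1):
    """Return the indices of topics whose probability meets the threshold.

    doc_topic_probs: per-topic probabilities for one document.
    threshold: hypothetical cutoff, chosen here only for illustration.
    """
    return [k for k, p in enumerate(doc_topic_probs) if p >= threshold]


def topic_count_diversity(doc_topic_probs, threshold=0.1):
    """Simple diversity score: the number of topics the document covers."""
    return len(topics_covered(doc_topic_probs, threshold))


# Made-up topic distributions over four topics.
focused = [0.90, 0.05, 0.03, 0.02]  # dominated by a single topic
broad = [0.30, 0.30, 0.25, 0.15]    # spread over several topics

print(topic_count_diversity(focused))  # 1
print(topic_count_diversity(broad))    # 4
```

A count like this depends directly on the quality of the topic distribution: an uninformative general topic that passes the threshold would inflate the score.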

Approaches proposed for measuring a text's diversity rely on the accuracy of topic models, which is limited. Specifically, there are two key issues with topic models. First, some general topics extracted from the training corpus are not informative and contain only general information about the corpus. Second, some topics contain general words, which makes the extracted topics impure.

To address these issues, we propose several topic refinement approaches based on parsimonious language models. We evaluate the performance of our approach in a text clustering setting and a topical diversity setting. The results show that, in the best case, using parsimonious topic models (PTM) improves the purity of the text clustering method by more than 10% over LDA without parsimonization on different standard text clustering datasets. Moreover, the accuracy of the diversity classifier using PTM is 11% better than that of a simple LDA-based classifier.
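For readers unfamiliar with the purity measure mentioned above, here is a minimal sketch of the standard definition: each cluster is credited with its most frequent true label, and purity is the fraction of documents that match their cluster's majority label. The function name and the toy data are illustrative assumptions and do not come from the talk or its experiments.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Fraction of items whose true label is the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each true label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Sum the size of the majority label in every cluster.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Toy example: six documents assigned to two clusters.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["sports", "sports", "politics", "politics", "politics", "sports"]
score = purity(clusters, labels)
print(round(score, 3))  # 0.667
```

Purity ranges up to 1.0, which is reached when every cluster contains documents of a single class.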

Hosein Azarbonyad is a PhD student in the Information and Language Processing Systems (ILPS) group at the University of Amsterdam. His PhD focuses on exploratory search over political documents, and he is supervised by Professor Maarten Marx and Professor Maarten de Rijke. Prior to joining the University of Amsterdam, he worked on learning to rank for cross-language information retrieval in his master's thesis. His current research interests include text mining, machine learning, and exploratory search.