Sparsely Activated Layers for NLP and Side Information for Recommendation

70 people went

Every last Friday of the month until September 26, 2019

Details

This Friday we'll have two talks followed by drinks.

16:00 Wilker Aziz (University of Amsterdam) Sparsely Activated Layers for NLP

16:30 Yifan Chen (University of Amsterdam) Leveraging High-Dimensional Side Information for Top-N Recommendation

========================

16:00 Wilker Aziz (University of Amsterdam) Sparsely Activated Layers for NLP

In this talk I will present sparsely activated layers for neural network classifiers. We achieve sparsity via a differentiable reparameterisation of a mixture of discrete and continuous distributions, which allows for efficient and unbiased gradient estimation. I will show applications in natural language processing where we employ these layers to (1) improve the interpretability of neural network models, (2) reduce overfitting, and (3) speed up marginalisation over large lookup tables. Find more information here: http://wilkeraziz.github.io/
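
The abstract does not spell out the construction, but one well-known way to build a differentiable mixture of discrete point masses and a continuous distribution is the stretched-and-rectified ("hard concrete") gate. The NumPy sketch below is a minimal illustration of that idea, not necessarily the speaker's exact method; the function name, hyperparameters (`beta`, `gamma`, `zeta`), and toy values are assumptions for demonstration.

```python
import numpy as np

def hard_concrete_sample(log_alpha, beta=0.5, gamma=-0.1, zeta=1.1, rng=None):
    """Sample a sparse gate in [0, 1] via a stretched-and-rectified
    (hard-concrete) construction: a continuous relaxation that places
    point masses at exactly 0 and 1, so many gates are truly off."""
    rng = np.random.default_rng(rng)
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=np.shape(log_alpha))
    # Binary-concrete sample in (0, 1); reparameterised, hence amenable
    # to low-variance pathwise gradient estimation
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1.0 - u) + log_alpha) / beta))
    # Stretch to (gamma, zeta) and rectify: mass pushed below 0 becomes
    # an exact zero, mass pushed above 1 becomes an exact one
    return np.clip(s * (zeta - gamma) + gamma, 0.0, 1.0)

# Gate a layer's activations: strongly negative log_alpha values drive
# their gates to exact zeros most of the time (illustrative numbers)
log_alpha = np.array([-4.0, -4.0, 2.0, 0.0, -4.0])
gates = hard_concrete_sample(log_alpha, rng=0)
activations = np.array([1.0, 2.0, 3.0, 4.0, 5.0]) * gates
```

Because the rectification produces exact zeros rather than merely small values, downstream computation over the gated activations can skip the switched-off units entirely.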

========================

16:30 Yifan Chen (University of Amsterdam) Leveraging High-Dimensional Side Information for Top-N Recommendation

Top-N recommendation is widely used to present users with ranked lists of items so as to help them identify the items that best fit their personal tastes. By effectively utilizing rating information, i.e., the historical interactions between users and items, recent top-N recommendation methods have achieved great success via Collaborative Filtering (CF). However, it is also widely recognized that CF-based methods suffer from rating sparsity. While the auxiliary information associated with items, referred to as *side information*, is typically utilized to compensate for rating sparsity, its high dimensionality brings a new challenge: how to effectively exploit side information for top-N recommendation while overcoming the issues caused by the curse of dimensionality. This talk will discuss how to utilize feature reduction for this problem.
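
The abstract does not name a specific feature-reduction technique; as one hedged illustration of the general idea, the NumPy sketch below projects high-dimensional item side information to a few dimensions with a truncated SVD, then ranks items for a user by similarity to the items they have already rated (item-kNN style CF). All data, dimensions, and the scoring rule are made up for demonstration and are not the speaker's method.

```python
import numpy as np

# Toy data: 6 items with 1000-dim binary side features (e.g. tags/attributes)
rng = np.random.default_rng(0)
side = (rng.random((6, 1000)) < 0.05).astype(float)

# Feature reduction: a truncated SVD compresses the side information to
# k dimensions before any similarity is computed, sidestepping the
# curse of dimensionality in the raw feature space
k = 3
U, S, Vt = np.linalg.svd(side, full_matrices=False)
item_factors = U[:, :k] * S[:k]          # 6 x k low-dimensional item profiles

# Item-item cosine similarity in the reduced space
unit = item_factors / np.linalg.norm(item_factors, axis=1, keepdims=True)
sim = unit @ unit.T

# Top-N scoring for a user who rated items 0 and 2: score every item by
# its summed similarity to the user's rated items, excluding those items
rated = [0, 2]
scores = sim[:, rated].sum(axis=1)
scores[rated] = -np.inf
top2 = np.argsort(-scores)[:2]           # indices of the 2 recommended items
```

In a real system the reduction step is where the interesting choices live: which projection to use, how many dimensions to keep, and how to combine the reduced side information with the (sparse) rating signal.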