10th Belgium NLP Meetup

Belgium NLP Meetup

What we do

It's time for a celebration! On Thursday, May 23rd, the Belgium NLP Meetup celebrates its tenth edition. The place to be is Faktion in Antwerp, one of the leading NLP companies in Belgium. Many of you will know the drill by now: doors open at 7pm, talks start around 7.30pm, and after 9pm there's time for networking.

Here are the three talks on our programme:

Insights into building the Faktion NLP pipeline
Aleksandra Vercauteren & Vilen Jumutcs (Faktion)
In this talk we will touch on the intricacies of training the latest language models, such as BERT, in the cloud when only limited corpora are available for non-English languages. We will explore different training strategies and discuss the extra preprocessing needed to cope with language- and corpus-related limitations. We will present some practical, hands-on examples of Google's TPU distribution strategy and the problems one might face when working with it. Finally, we will illustrate the practical side of applying huge state-of-the-art language models in everyday NLP pipelines.

Fill the gap: Machine reading comprehension for medicine
Simon Šuster (University of Antwerp)
Reading comprehension by machines has recently generated a lot of interest in the NLP community, where a number of datasets and model architectures have been proposed. In this talk, Simon will present his own work on a comprehension task in which the goal is to fill a query blank by reading the corresponding document. He will focus on the creation of a new dataset for the medical domain, show how to approach this task with neural networks and reflect on why this problem is so challenging.

Dealing with data scarcity in NLP
Yves Peirsman (NLP Town)
In this age of big data, NLP professionals are all too often faced with a lack of data: written language is abundant, but labeled text is much harder to come by. Yves will outline the most effective ways of addressing this challenge, from the semiautomatic construction of labeled training data to transfer learning approaches that reduce the need for labeled training examples.