20th Belgium NLP Meetup

Details
Now that the summer break is over, the Belgium NLP Meetups are back! Our next event is hosted by the VUB AI Lab, which means that on Tuesday, October 17th, we'll finally be returning to Brussels.
The first talk is by the AI Lab's Paul Van Eecke, who will discuss some shortcomings of current large language models and present an alternative inspired by human language learning. Next, Miryam de Lhoneux (KU Leuven) will show how we can tackle the over-representation of English in multilingual NLP. Finally, Mike Friedman and Cesar Legendre will present their text-driven analysis of the Belgian data job market.
As usual, doors open at 7pm and talks start around 7.30pm. Detailed information about how to find us is at the bottom of this text. After the talks, you can join us for drinks at the Pilar bar, located in the same building.
Human-like language learning in artificial agents
Paul Van Eecke, VUB AI Lab
Today's large language models excel at exploiting the statistical properties of huge amounts of textual data to tackle a wide variety of NLP subtasks. They meticulously capture the co-occurrence of characters, syllables and words, and make use of numerical operations over these co-occurrences to perform mappings between linguistic input and output. While high-performing, these systems are increasingly being criticised for their lack of human-like understanding. In this talk, we will argue that the reasons why large language models struggle so much with logical and pragmatic inference are largely attributable to the fundamental differences between how large language models are constructed and how human languages are acquired. We will also discuss an alternative paradigm that aims to overcome these issues by mechanistically modelling the acquisition of language in artificial agents.
Typologically fair NLP
Miryam de Lhoneux, KU Leuven
The field of NLP has historically had a strong bias towards work that primarily uses English as a language of investigation. The situation is changing and multilingual NLP is booming. This talk first describes the state of multilingual NLP, highlighting both its successes and its limitations. In particular, large multilingual pretrained language models (PLMs) such as mBERT or XLM-R have shown surprising cross-lingual capabilities, but they cover only a small fraction of the world's languages, with large inequalities in performance. These inequalities stem from at least two sources: 1) NLP datasets are highly imbalanced with regard to typological diversity and 2) NLP models tend to be developed for English first and then adapted to other languages, which leads to biases in the model assumptions. I describe attempts at overcoming both of these limitations using, among other things, pixel-based representations of language as well as methods from fairness in AI.
Beyond the Buzzwords: A Closer Look at Data Job Descriptions in Belgium
Mike Friedman & Cesar Legendre
This investigation examines the Belgian data job market by analyzing job posts using GPT-3.5-family models. The objective is to provide insights and guidance to job seekers, helping them identify career opportunities that match their skills and interests. The study gathers data from various online job portals, focusing on the period from January to April 2023. Using the search term "data scientist," relevant job posts are extracted for assessment. To extract valuable information from the unstructured job descriptions, GPT-3.5-family models are employed via the API for skill and tool extraction, document embedding computation, text generation, and information extraction. The results highlight the enduring importance of programming languages, cloud-based tools, and software development skills in the data job market. Moreover, there is an increasing emphasis on soft skills such as communication and collaboration.
How to get to the event:
The meetup is in Building I (between entrances 6 and 7), in room I.2.03 (second floor, room 03); see the campus plan. The campus is easy to reach by public transport.
If you come by car:
- Your car can get access to the campus if you register your license plate in advance (for free).
- You can only access the campus via entrances 6, 8, 11 or 13.
- Please observe the campus parking regulations to avoid getting towed.
- The campus is in the Brussels Low Emission Zone (LEZ). You can check here if your vehicle is allowed in the LEZ. If not, you can buy a day pass.

Canceled