What we're about

🖖 Welcome to the Open NLP Group!
It’s about open source projects on natural language processing.
It’s open to people with all kinds of backgrounds: students, industry, …
It’s open access: watch the livestream from anywhere, join us on-site if the situation permits, or check out the recording made available after the event.
And last but not least, the Open NLP Group is more than high-quality talks from industry and research perspectives. It’s also the place to talk with NLP enthusiasts, connect with peers, get ideas on how to integrate NLP techniques into your applications, …

📣 Past Speakers

  • Max Callaghan (Postdoc at Mercator Research Institute on Global Commons and Climate Change)
  • Vladimir Blagojevic (Long-time open-source contributor who recently joined deepset as Senior Software Engineer)
  • Betty van Aken (PhD Student at BHT)
  • Michel Bartels (NLP Engineering Intern at deepset)
  • Sophia Althammer (PhD student at TU Vienna)
  • Max Irwin (Managing Consultant at OpenSource Connections)
  • Dmitry Kan (Principal AI Scientist at Silo AI)
  • Vincent Warmerdam (Research Advocate at Rasa)
  • Stefan Decker (Freelance Data Scientist)
  • Nandan Thakur (PhD Candidate at University of Waterloo)
  • Branden Chan (Machine Learning Engineer at deepset)

Upcoming events (1)

Incorporating New Knowledge Into LMs & Building a Domain-Specific Search Engine

On September 29th, we will hold our first hybrid meetup, which you can join remotely via Zoom or in person at the AI Campus in Berlin! Two great speakers have agreed to give presentations on "Incorporating New Knowledge Into Language Models" & "Building a Domain-Specific Search Engine": Nils Reimers from co:here and Matthias Richter from ML6! 🎉

The talks will start at 7pm. After the talks, there will be small Zoom breakout rooms to connect and discuss. To coordinate registration, there are two separate meetup pages: this page is for remote participants, and a separate page is for on-site participants. For on-site participants, admission begins at 6:30pm.

Save the date and register now, either here to join remotely via Zoom or on the other page to join in person in Berlin! We look forward to meeting you!

Incorporating New Knowledge Into Language Models
by Nils Reimers from co:here
Language models work well for many NLP tasks, but they have one big weakness: with each day that passes after they were pre-trained or fine-tuned, their knowledge becomes more obsolete. For example, the BERT model still thinks that Barack Obama is the current US president. This is a big issue especially in semantic search, as we often search for the most recent events. In this talk, I will give an overview of how to incorporate new knowledge into language models like BERT, with a special focus on search. I will then present Generative Pseudo Labeling (GPL), an efficient method to adapt semantic search models to new domains & datasets.

Building a Domain-Specific Search Engine
by Matthias Richter from ML6
Semantic search engines are receiving more and more attention, and at ML6 we deal with many different domains and datasets. In this talk, I will give some insights into practical use cases where semantic search beats classical lexical search engines. The latest version of the Haystack framework already includes an implementation of Generative Pseudo Labeling (GPL). I will demonstrate how you can easily use GPL to adapt a dense retriever to any domain-specific dataset and build a semantic search engine on top. To this end, I will showcase a small demo that compares the results of different search approaches.

