VDSG Knowledgefeed - Natural Language Understanding and Feature Engineering


Details
Dear Data Enthusiasts,
it is great to be back after the summer break with a Meetup on the 27th of September at Ernst&Young. Doors open at 18:00.
We invite you to learn why deep neural networks are not always the best solution in natural language processing, and how to improve machine learning workflows and feature engineering using the transform design pattern.
Schedule
- Welcome by the VDSG team
- Welcome by E&Y
- Talks
  - Gábor Recski: Transparent Natural Language Understanding
  - Heinz Eckert: Elegant Feature Engineering with the Transform Design Pattern
- OpenMic - announcements by the community
After the talks comes, as always, the favourite part for many: socializing and networking while enjoying refreshments from the buffet.
🎤🎤 Open Mic
We are going to open up the stage before the talks for community announcements. If you'd like to announce something, open this slide deck, make sure you are signed in with a Google account, and click "View Only" -> "Request Edit Access". Explain in the text box what you want to announce, and we'll give you edit access to the slide deck.
🎤🎤
Gábor Recski: Transparent Natural Language Understanding
Transformer-based deep learning models have become the most widely used tool in natural language processing (NLP). When the goal is to extract structured information from text, nearly all solutions involve training such neural networks on human-annotated data and then using the resulting models directly for text processing. But the black-box nature of these solutions greatly limits their applicability in domains that require transparency, predictability, or configurability. Rule-based solutions can offer all of these, but for complex domains they are difficult and costly to build and maintain.
Our group at TU Wien has developed an approach to information extraction that uses human-in-the-loop (HITL) learning for the semi-automatic creation of rule-based solutions. Our tool allows domain experts to build white-box solutions in highly technical domains such as legal or medical NLP. The method is based on graph-based representations of natural language syntax and semantics, which we will introduce together with one or two recent use cases.
Gábor Recski is a computational linguist and a postdoctoral researcher at TU Wien. His research focuses on symbolic models of natural language semantics and their applications to information extraction tasks. He has published over 60 peer-reviewed papers with more than 400 citations.
Heinz Eckert: Elegant Feature Engineering with the Transform Design Pattern
Inspired by insights from the book "Machine Learning Design Patterns" by Lakshmanan et al., this session will highlight how the Transform Design Pattern can improve your machine learning workflows. We will discuss how clearly separating raw input values from transformed features can yield significant advantages in model maintenance, code organization, and faster deployment.
Using a hands-on example from the telco industry, we will see how building custom transformation elements into your model pipeline can help you streamline your projects and improve overall efficiency.
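For readers new to the idea, here is a minimal sketch of the pattern in plain Python. It is not the speaker's implementation: the field names (`calls`, `call_minutes`, `data_mb`), the class `TransformingModel`, and the hand-picked weights are all hypothetical, chosen only to show how the transformations can travel with the model so that callers only ever pass raw inputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

RawRecord = Dict[str, float]


# Hypothetical transformations on raw telco usage fields (illustrative names).
def minutes_per_call(raw: RawRecord) -> float:
    """Average call length; 0.0 when the customer made no calls."""
    calls = raw["calls"]
    return raw["call_minutes"] / calls if calls else 0.0


def data_gb(raw: RawRecord) -> float:
    """Data usage converted from megabytes to gigabytes."""
    return raw["data_mb"] / 1024.0


@dataclass
class TransformingModel:
    """Bundles the feature transformations with the predictor.

    Callers pass raw records only; the same transformations are applied
    at training and at prediction time because they live inside the model.
    """
    transforms: List[Callable[[RawRecord], float]]
    predictor: Callable[[List[float]], float]

    def features(self, raw: RawRecord) -> List[float]:
        return [transform(raw) for transform in self.transforms]

    def predict(self, raw: RawRecord) -> float:
        return self.predictor(self.features(raw))


def toy_predictor(features: List[float]) -> float:
    """Stand-in for a trained model: a weighted sum with made-up weights."""
    weights = [0.5, 0.25]
    return sum(w * f for w, f in zip(weights, features))


model = TransformingModel([minutes_per_call, data_gb], toy_predictor)
raw = {"calls": 4, "call_minutes": 10.0, "data_mb": 2048.0}
score = model.predict(raw)  # the caller never computes features itself
```

Because the raw record is the only interface, a transformation can be changed or added in one place without touching the code that calls `predict`.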
Heinz Eckert has worked as a data scientist in the telecommunications sector for over four years and currently leads Magenta’s data science team. Combining a background in psychology and computer science, he is passionate about delivering sustainable data applications and continuous improvement.
-----
A special thanks to EY for sponsoring the event location.
-----
During the event, photos will be taken and later posted on social media. Please notify us if you do not agree.
-----
Attention, attendees with food allergies: please be aware that the food and drinks provided may contain or come into contact with common allergens, such as dairy, eggs, wheat, soybeans, tree nuts, peanuts, fish, or shellfish.
