From a Fintech lens: MCP server live-coding & feature selection data hacks


Details
PyData Amsterdam is excited to announce our next meetup on Tuesday, August 26, at the Mollie Amsterdam office. Join us for an evening focused on practical machine learning engineering featuring three compelling talks that cut through AI buzzwords to showcase real-world implementations and open-source innovations. Stick around for good chats, new connections, and a relaxed evening with an enthusiastic data community.
SCHEDULE:
- 17:30-18:10: Welcome with snacks and drinks
- 18:10-18:15: Mollie introduction
- 18:15-18:45: Talk 1 - "Don't over-egg the pudding: the MCP and its server", Wessel Huising
- 18:45-19:15: Talk 2 - "Genetic Algorithms + Feature Importance For Feature Selection", Claudio Salvatore Arcidiacono
- 19:15-19:45: Break + Food (🍕)
- 19:45-20:15: Talk 3 - "Addressing the Spark and Pandas duality. Smart feature creation with Stile", Gilles Verbockhaven
- 20:15-21:00: 🤝 Networking with drinks 🍻
TALKS:
[Talk 1] "Don't over-egg the pudding: the MCP and its server" by Wessel Huising
In an era where everything even slightly related to generative AI is considered the new meta, it is hard to keep track of the technical increments that are actually useful to our work and domains. The announcement of the Model Context Protocol (MCP) from Anthropic has generated a lot of buzz and is a serious attempt by Anthropic to take the lead among the various LLM providers. This talk will give an overview and an honest take on what the MCP server will bring us and what it feels like to develop one for Mollie, bringing those emotions and experiences together to answer the question of whether it really lives up to the promise. Defying established presentation best practices, I will try to live-code a new MCP server providing functionality for a to-be-chosen service.
Wessel Huising is a Senior Machine Learning Engineer at Mollie. He started scripting in PHP at the age of 14 and has never stopped since, except for the PHP part. He has been deep in the rabbit hole of ML platform engineering and MLOps for a few years now. Currently, he is trying to become full-stack in order to create end-to-end ML-focused products. He uses a lot of words to explain that he just likes to build stuff.
[Talk 2] "Genetic Algorithms + Feature Importance For Feature Selection" by Claudio Salvatore Arcidiacono
This presentation introduces the Genetic Algorithms + Feature Importance Feature Selection technique, implemented in the open source Python package felimination. Genetic algorithms are a powerful optimization technique that can be effectively utilized for feature selection in machine learning models. By combining genetic algorithms with feature importance, we aim to enhance the feature selection process, leading to more robust and interpretable models. We will start by reviewing genetic algorithms, detailing the steps of pool initialization, crossover, mutation, and selection. The presentation will continue by showcasing some code snippets using felimination, a Python package containing a suite of algorithms for feature selection, including the genetic algorithm with feature importance selector.
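For readers who want a feel for the four steps named above (pool initialization, crossover, mutation, and selection), here is a minimal plain-Python sketch of a genetic algorithm over feature masks. It is not felimination's implementation or API, and it leaves out the feature importance component that the talk covers. The function names, the `relevance` scores, and the size penalty are all hypothetical and exist only for illustration; in practice the fitness would be something like a cross-validated model score.

```python
import random


def toy_fitness(mask, relevance, penalty=0.05):
    """Hypothetical fitness: summed relevance of the kept features minus a size penalty."""
    kept = sum(r for r, keep in zip(relevance, mask) if keep)
    return kept - penalty * sum(mask)


def genetic_feature_selection(relevance, pool_size=20, generations=30,
                              mutation_rate=0.1, seed=0):
    """Return the best boolean feature mask found (assumes at least 2 features)."""
    rng = random.Random(seed)
    n = len(relevance)

    # Pool initialization: random boolean masks, one entry per feature.
    pool = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pool_size)]

    for _ in range(generations):
        # Selection: keep the fitter half of the pool as parents.
        pool.sort(key=lambda m: toy_fitness(m, relevance), reverse=True)
        parents = pool[: pool_size // 2]

        children = []
        while len(parents) + len(children) < pool_size:
            a, b = rng.sample(parents, 2)
            # Crossover: single-point splice of two parent masks.
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)

        pool = parents + children

    return max(pool, key=lambda m: toy_fitness(m, relevance))


if __name__ == "__main__":
    # Made-up relevance scores for six features.
    scores = [0.9, 0.0, 0.8, 0.01, 0.7, 0.0]
    print(genetic_feature_selection(scores))
```

Because the parents survive into the next generation unchanged, the best score in the pool never gets worse from one generation to the next. That is a design choice of this sketch, not a claim about how the package behaves.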
Claudio Salvatore Arcidiacono is a Senior Machine Learning Engineer at Mollie. He has been working in the fintech sector for the past 7 years and has extensive experience with classical machine learning problems, mainly binary classification. He loves to contribute to data science open source libraries like feature-engine, scikit-learn, and narwhals. He maintains a couple of open source libraries himself (felimination and sklearo). In his free time, he is a coffee scientist, using a data-driven approach to dial in the perfect cup of espresso.
[Talk 3] "Addressing the Spark and Pandas duality. Smart feature creation with Stile" by Gilles Verbockhaven
The presentation will introduce the “Stile ecosystem” developed at ING Analytics to speed up the time to market of machine learning models for the instant lending domain. The main issue to solve is the duality between Spark and Pandas for feature generation. Spark is used during development, when dealing with the billions of transactions stored in the data warehouse. Pandas is used in production, where applications are scored one by one in real time.
During the presentation, Gilles will explain how the template for model development works, with a specific focus on feature creation. He will also show how Pandas and PySpark are integrated behind common functionality, present the user-friendly testing framework developed to ensure consistency between the two worlds, and, finally, show how to easily trim the code so that it produces only the features required for the final model.
Gilles Verbockhaven is Chapter Lead at ING Retail Banking Analytics and manages a team of five Data Scientists. He has been working at ING for 20 years now and has experience in various domains, ranging from market risk to modelling. Since 2017, he has been working in the Machine Learning area and has specialized in designing analytic solutions for collections and pricing. In his free time, he spends his energy running and biking.
DIRECTIONS:
Mollie Amsterdam Office
Address: Keizersgracht 126, 1015 CW, Amsterdam
Mollie’s office is right in Amsterdam’s historic city center. It’s a nice ~20-minute walk (about 1.5 km) from Central Station. If you’d rather hop on public transport, trams 2, 12, or 17 will get you from Central Station to Keizersgracht in ~8 minutes.
