

What we’re about
london.pydata.org
Submit a talk: https://london.pydata.org/submit-a-talk/
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
The PyData Code of Conduct governs this meetup. To discuss any issues or concerns relating to the code of conduct or the behavior of anyone at a PyData meetup, please contact NumFOCUS Executive Director Leah Silen (+1 512-222-5449; leah@numfocus.org) or the group organizer.
Upcoming events (1)
PyData London - 98th Meetup
Venue: Riverbank House, 2 Swan Ln, London EC4R 3AD
Please note:
- 🚨🚨🚨 A valid photo ID is required by building security. 🚨🚨🚨
- This event follows the NumFOCUS Code of Conduct. Please familiarise yourself with it before the event.
If your RSVP status says "You're going", you will be able to get in. There is no need to show your RSVP confirmation when signing in.
If you can no longer make it, please unRSVP as soon as you know.
***
Code of Conduct:
This event follows the NumFOCUS Code of Conduct. Please get in touch with the organisers with any questions or concerns regarding the Code of Conduct.
***
As always, there'll be free food & drinks, generously provided by our host, Man Group.
***
Main Talks
1. How to prepare your AI Agents for the Ice(berg) Age - Serhii Sokolenko
In a future world where AI agents interact with billions of users, many of these agents will also have to interact with data querying tools to provide answers grounded in facts. As enterprise data analytics is rapidly moving towards open table formats like Apache Iceberg, these agents need to be able to speak to Iceberg-based data. In this talk, we will discuss how Apache Iceberg tooling and portable application runtimes make agents grounded in facts and enable them to run across different GPU stacks and deployment models.
2. Fuzzy, Not Fussy: Using AI to Tackle Data Entity Resolution at Scale - Yash Sakhuja @sakhuja_yash
Messy customer data — typos, inconsistent formats, and duplicates — can make it surprisingly difficult to answer basic questions, such as “How many unique customers do we have?” In this talk, I’ll share how I built an AI-powered fuzzy-matching system using Python and vector embeddings to accurately group similar customer records. Drawing on a real-world e-commerce use case, I’ll walk through a scalable, end-to-end solution that automates deduplication and delivers clean, reliable customer insights.
⚡ Lightning Talks
1. Continuous Prompt Evaluation and Optimisation in Production - Anand Rawat
Prompt engineering is no longer a one-time creative task; it has become an iterative, data-driven process. In production environments, prompts must be evaluated, optimised, and versioned just like code or machine-learning models. In this talk, I will demonstrate how to build a CI/CD pipeline for prompt development using tools such as TruLens, Weights & Biases, and custom evaluation metrics.
2. TBD
Logistics
Doors open at 6.30 pm (get there early, as you have to sign in via building security). Talks start at 7 pm, with drinks from 9 pm in the bar. We will have reduced capacity for this event, but there will be plenty of people to discuss data science questions with!
Please unRSVP in good time if you realise you can't make it. We're limited by building security on the number of attendees, so please free up your place for your fellow community members!
Follow @pydatalondon (https://twitter.com/pydatalondon) for updates and early announcements.