Agenda:
• 18:30 - Doors open
• 19:00 - Welcome to PyBerlin! // Organisers
• 19:10 - Welcome from the host - ThoughtWorks
• 19:20 - Conquering PDFs: document understanding beyond plain text // Ines Montani
NLP and data science could be so easy if all of our data came as clean and plain text. But in practice, a lot of it is hidden away in PDFs, Word documents, scans and other formats that have been a nightmare to work with. In this talk, Ines will present a new and modular approach for building robust document understanding systems, using state-of-the-art models and the awesome Python ecosystem. Ines will show you how you can go from PDFs to structured data and even build fully custom information extraction pipelines for your specific use case.
Speaker's bio:
Ines Montani is a developer specializing in tools for AI and NLP technology. She’s the co-founder and CEO of Explosion and a core developer of spaCy, a popular open-source library for Natural Language Processing in Python, and Prodigy, a modern annotation tool for creating training data for machine learning models.
• 19:50 - Short break
• 20:20 - Building EU-AI Act Compliant AI Agents for Legacy Systems // Aemal Sayer
In this talk, Aemal will introduce a fully self-hosted, EU AI Act-compliant framework for building AI agents capable of operating any software system, legacy or modern, through its UI. Inspired by OpenAI’s Operator but designed for real-world compliance and flexibility, this framework combines LLMs, virtualization, and OS-level automation to let agents interact with applications as a human would: by clicking, typing, and navigating interfaces. Unlike API-bound or browser-focused tools, this solution enables true system-wide autonomy for AI agents, making integration with non-API systems not only possible, but seamless. The framework also embraces a human-in-the-loop design. When an agent encounters an issue it can’t resolve, it notifies a human operator, who can then remotely connect to the agent’s virtual environment, intervene to unblock the task, and hand control back to the agent to continue its work.
Speaker's bio:
Aemal Sayer is a freelance AI engineer based in Berlin, Germany, with over 20 years of experience in software development and 8 years specializing in artificial intelligence. He works with small and medium-sized businesses to automate financial processes such as bookkeeping, invoicing, and compliance. His focus is on building privacy-first, self-hosted AI agents tailored to industries like e-commerce, logistics, and manufacturing. Currently, he’s building a GDPR- and EU AI Act-compliant agent that integrates with DATEV, Germany’s leading accounting platform, as a public, open-source project, aiming to drive transparency and innovation in AI-powered business automation.
• 20:50 - Implementing Distributed Systems in Python // Shahriyar Rzayev
In this talk, Shahriyar will demonstrate how to implement distributed systems patterns in Python through a step-by-step guide to building a distributed key-value storage system, filling a gap left by the abundance of examples in Go, Rust, and Java.
Speaker's bio:
Software Engineer. Moving forward on Clean Code and Clean Architecture. Previous accomplishments include contributing to open source, providing technical direction, and sharing knowledge about Clean Code and Clean Architecture. An empathetic team player and mentor. Azerbaijan Python User Group Leader. Former QA Engineer and Bug Hunter.
• 21:20 - Closing session // Organisers
This event is in-person only. Please check our Code of Conduct and the official health regulations in Berlin before coming. If you feel any signs of sickness, please consider skipping this event and attending another time. We will have plenty of events in different formats in the future.
Looking forward to seeing you all soon!