Details

Join our in-person meetup on April 24th to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.

Register to reserve your seat. Space is limited!

Date, Time and Location

Apr 24, 2026
5:30 PM - 8:30 PM

MotionLab
Bouchéstraße 12/Halle 20
12435 Berlin

Kaputt: A Large-Scale Dataset for Visual Defect Detection

We present a novel large-scale dataset for defect detection in a logistics setting. Recent work on industrial anomaly detection has primarily focused on manufacturing scenarios with highly controlled poses and a limited number of object categories. Existing benchmarks like MVTec-AD (Bergmann et al., 2021) and VisA (Zou et al., 2022) have reached saturation, with state-of-the-art methods achieving AUROC scores of up to 99.9%. In contrast to manufacturing, anomaly detection in retail logistics faces new challenges, particularly in the diversity and variability of object pose and appearance. Leading anomaly detection methods fall short when applied to this new setting.

To bridge this gap, we introduce a new benchmark that overcomes the limitations of existing datasets. With over 230,000 images (and more than 29,000 defective instances), it is 40 times larger than MVTec-AD and contains more than 48,000 distinct objects. To validate the difficulty of the problem, we conduct an extensive evaluation of multiple state-of-the-art anomaly detection methods, demonstrating that they do not surpass 56.96% AUROC on our dataset. Further qualitative analysis confirms that existing methods struggle to leverage normal samples under heavy pose and appearance variation. With our large-scale dataset, we set a new benchmark and encourage future research towards solving this challenging problem in retail logistics anomaly detection. The dataset is available for download at https://www.kaputt-dataset.com.
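
For readers less familiar with the metric, the sketch below shows one common way to compute AUROC: as the fraction of (defective, normal) pairs in which the defective item receives the higher anomaly score, with ties counted as half. A detector that scores at random lands at 0.5, which puts the 56.96% figure above in context. The scores and labels are made up for illustration and are not taken from the dataset; the function name `auroc` is likewise just a placeholder.

```python
def auroc(scores, labels):
    """Pairwise AUROC: probability that a random defective sample
    scores higher than a random normal sample (ties count as 0.5)."""
    defective = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not defective or not normal:
        raise ValueError("need at least one defective and one normal sample")

    wins = 0.0
    for d in defective:
        for n in normal:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(defective) * len(normal))


# Hypothetical anomaly scores (higher = more likely defective); 1 = defective.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUROC: {auroc(scores, labels):.3f}")
```

The quadratic loop is fine for a handful of samples; at the scale of hundreds of thousands of images, a rank-based implementation from an established library would be the practical choice.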

About the Speaker

Sebastian Höfer is an Applied Science Manager at Amazon Fulfillment Technologies & Robotics, leading machine learning and computer vision research for large-scale robotics and warehouse automation. He received his PhD from the Robotics & Biology Lab at TU Berlin, focusing on Sim2Real transfer and robotic perception. His recent work, “Kaputt: A Large-Scale Dataset for Visual Defect Detection” (ICCV 2025), established a major benchmark for industrial anomaly detection, reflecting his expertise at the intersection of academic research and real-world deployment.

Data Foundations for Vision-Language-Action Models

Model architectures get the papers, but data decides whether robots actually work. This talk introduces VLAs from a data-centric perspective: what makes robot datasets fundamentally different from image classification or video understanding, how the field is organizing its data (Open X-Embodiment, LeRobot, RLDS), and what evaluation benchmarks actually measure. We'll examine the unique challenges such as temporal structure, proprioceptive signals, and heterogeneity in embodiment, and discuss why addressing them matters more than the next architectural innovation.
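
To make the three challenges concrete, here is a minimal sketch of what a single robot episode might look like as a data record. Every class and field name (`Step`, `Episode`, `proprio`, and so on) is hypothetical and chosen for illustration; this is not the schema of Open X-Embodiment, LeRobot, or RLDS. It shows time-ordered steps, a proprioceptive vector next to each camera frame, and action vectors whose length depends on the robot.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Step:
    timestamp_s: float       # temporal structure: steps are ordered in time
    image_id: str            # reference to the camera frame for this step
    proprio: List[float]     # proprioceptive signal, e.g. joint positions
    action: List[float]      # command sent to the robot at this step


@dataclass
class Episode:
    embodiment: str          # which robot produced the data
    instruction: str         # natural-language task description
    steps: List[Step]


def is_time_ordered(episode: Episode) -> bool:
    """True if timestamps strictly increase from one step to the next."""
    steps = episode.steps
    return all(a.timestamp_s < b.timestamp_s for a, b in zip(steps, steps[1:]))


def action_dims(episodes: List[Episode]) -> Dict[str, Set[int]]:
    """Collect the action-vector lengths seen for each embodiment."""
    dims: Dict[str, Set[int]] = {}
    for ep in episodes:
        for step in ep.steps:
            dims.setdefault(ep.embodiment, set()).add(len(step.action))
    return dims


# Two made-up episodes from different robots.
arm = Episode("arm_7dof", "pick up the box", [
    Step(0.0, "frame_000", [0.0] * 7, [0.1] * 7),
    Step(0.1, "frame_001", [0.1] * 7, [0.1] * 7),
])
base = Episode("mobile_base", "drive to the shelf", [
    Step(0.0, "frame_000", [0.0, 0.0], [0.5, 0.0]),
    Step(0.2, "frame_001", [0.1, 0.0], [0.5, 0.1]),
])
print(action_dims([arm, base]))
```

The mismatch in action dimensions between the two robots is a small instance of embodiment heterogeneity: a model trained on both needs some way to reconcile the 7-dimensional and 2-dimensional action spaces.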

About the Speaker

Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in VLMs, Visual Agents, Document AI, and Physical AI.

Most AI Agents Are Broken. Let’s Fix That

AI agents are having a moment, but most of them are little more than fragile prototypes that break under pressure. Together, we’ll explore why so many agentic systems fail in practice, and how to fix that with real engineering principles. In this talk, you’ll learn how to build agents that are modular, observable, and ready for production. If you’re tired of shiny agent demos that don't deliver, this talk is your blueprint for building agents that actually work.

About the Speaker

Bilge Yücel is a Senior Developer Relations Engineer at deepset, helping developers build agentic AI apps with Haystack. Passionate about AI, she makes complex concepts approachable through hands-on tutorials, both online and at real-life events.

Operationalizing Computer Vision for Overhead Lines: Beyond the Demo

At first glance, visual inspection of high-voltage power lines seems straightforward: collect imagery, run one or two AI models, and report the findings. In practice, moving beyond a proof of concept reveals a range of issues that can make or break a campaign. Common concerns include data quality and coverage, scarcity of the most relevant cases and abundance everywhere else, variations in pylon geometry and asset types across regions, calibration and GIS alignment challenges, and a long tail of edge cases that emerge in real-world operations.

This talk introduces Siemens Energy’s end-to-end overhead line inspection solution and shares key learnings from inspecting more than 10,000 km of power lines for real customers across several continents. We will show how raw 2D/3D data is transformed into structured information, delivering insights into asset inventory as well as defects, and supporting maintenance and planning decisions for critical infrastructure. The focus is on the combination of algorithmic building blocks and scalable processing, designed for robustness and consistency at scale, where even low error rates can become operationally significant.

About the Speaker

Stefan Wakolbinger is the Development Team Lead for AI & Analytics at SIEAERO, Siemens Energy's digital powerline inspection service. He leads the development of cutting-edge AI and analytics solutions that transform aerial powerline inspection through multi-sensor technology. His team creates digital twins of powerline infrastructure, automates fault detection, and monitors vegetation management, making powerline inspection safer, more precise, and more efficient. Stefan has been driving innovation in this role since September 2022.

Search Your Video Library Like a Database

Drop in YouTube URLs or upload files and query content four ways: exact keyword matching, semantic search across transcripts, visual scene search via SigLIP2, and LLM-generated answers that synthesise across segments.
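
As a rough illustration of how the first two query modes differ, the sketch below runs an exact keyword match and an embedding-based ranking over a few timestamped transcript segments. It is not the speaker's implementation: the `Segment` type, the function names, and the three-dimensional vectors are all invented for the example. In a real system the vectors would come from a text or image encoder rather than being written by hand.

```python
import math
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    video: str
    start_s: float
    text: str
    embedding: List[float]   # toy hand-written vector, stands in for an encoder output


def keyword_search(segments: List[Segment], term: str) -> List[Segment]:
    """Exact whole-word match, case-insensitive."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [s for s in segments if pattern.search(s.text)]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(segments: List[Segment], query_embedding: List[float],
                    k: int = 2) -> List[Segment]:
    """Return the k segments whose embeddings are closest to the query."""
    ranked = sorted(segments,
                    key=lambda s: cosine(s.embedding, query_embedding),
                    reverse=True)
    return ranked[:k]


segments = [
    Segment("talk_a", 12.0, "We fine-tune the detector on warehouse images.", [0.9, 0.1, 0.0]),
    Segment("talk_a", 95.5, "Lunch is served in the main hall.", [0.0, 0.1, 0.9]),
    Segment("talk_b", 30.0, "Training an object detection model from scratch.", [0.8, 0.3, 0.1]),
]

# Keyword search only finds the literal word "detector".
print([s.start_s for s in keyword_search(segments, "detector")])

# Embedding search also surfaces the related "object detection" segment.
print([s.start_s for s in semantic_search(segments, [1.0, 0.0, 0.0])])
```

The keyword query misses the segment that talks about "object detection" because the literal word differs, while the embedding ranking returns both related segments. That gap is the usual motivation for offering both modes side by side.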

About the Speaker

Paras Mehta is a Berlin-based AI engineer and CTO/co-founder of Sylby, a language learning app he built from scratch, reaching 10,000 users and raising €350K. Previously: data scientist at Motionlogic, senior software engineer at Volkswagen, a PhD from Freie Universität Berlin, and a visiting stint at Cambridge. He now works as an AI engineer at HPI's AI Service Centre.

