Feb 26 - Exploring Video Datasets with FiftyOne and Vision-Language Models
62 attendees from 48 groups
Hosted by Berlin AI Machine Learning and Computer Vision Meetup
Details
Join Harpreet Sahota for a virtual workshop to learn how to use Facebook's Action100M dataset and FiftyOne to build an end-to-end workflow.
Date, Time and Location
Feb 26, 2026
9am - 10am Pacific
Online. Register for the Zoom!
Video is the hardest modality to work with. You're dealing with more data, temporal complexity, and annotation workflows that don't scale. This hands-on workshop tackles a practical question: given a large video dataset, how do you understand what's in it without manually watching thousands of clips?
In this workshop you'll learn how to:
- Navigate and explore video data in the FiftyOne App, filter samples, and understand dataset structure
- Compute embeddings with Qwen3-VL to enable semantic search, zero-shot classification, and clustering
- Generate descriptions and localize events using vision-language models like Qwen3-VL and Molmo2
- Visualize patterns in your data through embedding projections and the FiftyOne App
- Evaluate model outputs against Action100M's hierarchical annotations to validate what the models actually capture
By the end of the session, you'll have a reusable toolkit for understanding any video dataset at scale, whether you're curating training data, debugging model performance, or exploring a new domain.
