Feb 26 - Exploring Video Datasets with FiftyOne and Vision-Language Models

Name: Feb 26 - Exploring Video Datasets with FiftyOne and Vision-Language Models
Start: 2026-02-26T12:00:00-05:00
End: 2026-02-26T13:00:00-05:00

Network event

61 attendees from 48 groups hosting

Hosted by NYC Computer Vision in Production

Details

Join Harpreet Sahota for a virtual workshop on using Facebook's Action100M dataset and FiftyOne to build an end-to-end workflow for exploring and understanding video data.

Date, Time and Location

Feb 26, 2026
9am - 10am Pacific
Online. Register for the Zoom!

Video is the hardest modality to work with. You're dealing with more data, temporal complexity, and annotation workflows that don't scale. This hands-on workshop tackles a practical question: given a large video dataset, how do you understand what's in it without manually watching thousands of clips?

In this workshop you'll learn how to:

Navigate and explore video data in the FiftyOne App, filter samples, and understand dataset structure
Compute embeddings with Qwen3-VL to enable semantic search, zero-shot classification, and clustering
Generate descriptions and localize events using vision-language models like Qwen3-VL and Molmo2
Visualize patterns in your data through embedding projections and the FiftyOne App
Evaluate model outputs against Action100M's hierarchical annotations to validate what the models actually capture
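
To give a feel for what embedding-based semantic search and zero-shot classification involve, here is a minimal sketch in plain Python. It does not use FiftyOne or Qwen3-VL: the three-dimensional vectors, the clip IDs, and the label names are made-up stand-ins for real model embeddings, and the helper names (`cosine`, `rank_by_similarity`, `zero_shot_label`) are hypothetical. The idea it illustrates is only that clips and text prompts live in the same vector space, so ranking or labeling reduces to comparing vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, clip_embeddings):
    """Semantic search: sort clips from most to least similar to the query."""
    scored = [(clip_id, cosine(query, emb)) for clip_id, emb in clip_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def zero_shot_label(clip_embedding, label_embeddings):
    """Zero-shot classification: pick the label whose embedding is closest."""
    return max(label_embeddings, key=lambda label: cosine(clip_embedding, label_embeddings[label]))


# Toy data (assumption): in practice these would come from a vision-language model.
clip_embeddings = {
    "clip_001": [0.9, 0.1, 0.0],
    "clip_002": [0.1, 0.8, 0.1],
    "clip_003": [0.7, 0.3, 0.1],
}
label_embeddings = {
    "cooking": [1.0, 0.0, 0.0],
    "cycling": [0.0, 1.0, 0.0],
}

# Search for clips that match the "cooking" prompt, then label every clip.
ranking = rank_by_similarity(label_embeddings["cooking"], clip_embeddings)
predictions = {
    clip_id: zero_shot_label(emb, label_embeddings)
    for clip_id, emb in clip_embeddings.items()
}

for clip_id, score in ranking:
    print(f"{clip_id}: similarity={score:.3f}, predicted={predictions[clip_id]}")
```

With real embeddings the same comparison runs over thousands of clips, and the predicted labels can then be checked against a dataset's ground-truth annotations.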

By the end of the session, you'll have a reusable toolkit for understanding any video dataset at scale, whether you're curating training data, debugging model performance, or exploring a new domain.
