Practical Computer Vision with PyTorch and FiftyOne

Hosted By
Antonio Rueda T.
Details

About the Workshop Series

Join us for a 12-part, hands-on series that teaches you how to work with images, build and train models, and explore tasks like image classification, segmentation, object detection, and image generation. Each session combines straightforward explanations with practical coding in PyTorch and FiftyOne, allowing you to learn core skills in computer vision and apply them to real-world tasks.

These are hands-on maker workshops that run in GitHub Codespaces, Kaggle notebooks, and Google Colab environments, so no local installation is required (though you are welcome to work locally if you prefer).

Workshop Resources

You can find the workshop materials in this GitHub repository: https://github.com/andandandand/practical-computer-vision

About the Instructor

Antonio Rueda-Toicen has extensive experience in deploying machine learning models and has taught over 300 professionals. He is currently a Research Scientist at the Hasso Plattner Institute and an AI Engineer and DevRel for Voxel51. Since 2019, he has organized the Berlin Computer Vision Group and taught at Berlin’s Data Science Retreat. He specializes in computer vision, cloud technologies, and machine learning. Antonio is also a certified instructor of deep learning and diffusion models at NVIDIA’s Deep Learning Institute.

Workshop 1 – Foundations of Computer Vision

Tuesday, March 4, 2025

In this session, we’ll introduce core computer vision tasks and fundamental image representations using PIL, NumPy, and PyTorch. Build a simple neural network to classify handwritten digits and explore dataset visualizations with FiftyOne.

Workshop 2 – Neural Networks Fundamentals: Multilayer Perceptrons for Regression

Tuesday, March 11

In this session, we’ll cover the basics of neural networks, build a multilayer perceptron (MLP) and delve into matrix multiplications. Participants will create an MLP for regression (predicting car prices) and inspect its forward pass and predictions with FiftyOne.
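As a rough preview of what "a forward pass as matrix multiplications" means, here is a minimal, dependency-free sketch in plain Python. It is not taken from the workshop materials (which use PyTorch); the weights, biases, and the two input features are made-up values chosen only for illustration.

```python
def matvec(W, x):
    """Multiply a matrix W (a list of rows) by a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, a) for a in v]


def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with ReLU, followed by a linear output layer."""
    hidden = relu([h + b for h, b in zip(matvec(W1, x), b1)])
    return [o + b for o, b in zip(matvec(W2, hidden), b2)]


# Hypothetical inputs: two scaled features describing a car.
x = [1.0, 2.0]
W1 = [[0.5, -1.0], [1.0, 1.0]]  # hidden layer weights (2 units)
b1 = [0.0, -1.0]
W2 = [[2.0, 3.0]]               # output layer weights (1 regression output)
b2 = [1.0]

prediction = mlp_forward(x, W1, b1, W2, b2)
print(prediction)
```

For regression the output layer has no activation, so the network can produce any real value rather than a probability.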

Workshop 3 – Training & Evaluation of Classification Models

Tuesday, March 18

In this session, we’ll focus on training feedforward networks and evaluating model performance. Learn about datasets, data loaders, and classification metrics, then apply these concepts by classifying breeds of dogs and analyzing outputs with FiftyOne.

Workshop 4 – Convolutional Neural Networks: LeNet-5

Tuesday, March 25

In this session, we’ll explore CNN fundamentals by diving into the mechanics of convolutions and pooling. Participants will implement LeNet-5 to grasp how basic convolutional layers operate, with practical insights on the features produced by convolutions using FiftyOne.
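To give a feel for the mechanics of convolution and pooling, the sketch below slides a small kernel over a toy image and then downsamples the result. It is a plain-Python illustration under stated assumptions, not the workshop's implementation: the image and kernel are invented, the "convolution" is the unflipped cross-correlation that deep learning frameworks compute, and max pooling is shown as one common pooling choice.

```python
def conv2d(image, kernel):
    """Valid cross-correlation with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling with a square window."""
    return [
        [
            max(fmap[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 5x5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical vertical-edge kernel: right column minus left column.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map
pooled = max_pool2d(features)     # 2x2 after pooling
print(features)
print(pooled)
```

The feature map responds only where the kernel straddles the dark-to-bright edge, and pooling keeps that response while halving the resolution.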

Workshop 5 – Training Techniques for Convolutional Networks

Tuesday, April 1

In this session, we’ll examine strategies such as normalization and skip connections for better training. Build a model to run inference on a fruits dataset, then use FiftyOne to inspect predictions and performance.

Workshop 6 – Multi-label Classification with Binary Cross Entropy: Amazon Satellite Images

Tuesday, April 8

In this session, we’ll focus on multi-label classification for real-world imagery. Build a model that identifies multiple environmental labels from Amazon satellite images, applying binary cross-entropy for training and analyzing predictions with FiftyOne.
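The idea behind binary cross-entropy for multi-label problems is that each label gets its own independent yes/no decision, so one image can carry several labels at once. The following is a minimal plain-Python sketch of that idea, assuming made-up label names and logits; it is not the workshop's code, and in PyTorch the same computation would normally be done with a built-in loss.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over labels, one sigmoid per label."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)


# Hypothetical labels and model outputs for one satellite image.
labels = ["forest", "river", "road", "cloudy"]
logits = [2.0, -1.0, 1.5, -3.0]
targets = [1, 0, 1, 0]  # the image shows both forest and road

loss = binary_cross_entropy(logits, targets)
predicted = [name for name, z in zip(labels, logits) if sigmoid(z) >= 0.5]
print(round(loss, 4), predicted)
```

Unlike softmax-based single-label classification, the per-label probabilities here do not need to sum to one, and the 0.5 threshold is a tunable assumption rather than a fixed rule.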

Workshop 7 – Interpretability in Computer Vision: CAM & Grad-CAM

Tuesday, April 15

In this session, we’ll learn interpretability techniques such as Class Activation Mapping and Grad-CAM. Build a model to analyze predictions and visualize important image regions with FiftyOne.

Workshop 8 – Convolutional Neural Networks: Advanced Upsampling & U-Net for Semantic Segmentation

Tuesday, April 22

In this session, we’ll delve deeper into CNN architectures by focusing on upsampling, channel mixing, and semantic segmentation techniques. Build a U-Net model for semantic image segmentation and inspect its predictions with FiftyOne.

Workshop 9 – Model Optimization: Data Augmentation & Regularization

Tuesday, April 29

In this session, we’ll introduce optimization strategies including data augmentation, dropout, batch normalization, and transfer learning. Train a network with data augmentation on a fruits dataset using models like VGG-16 and ResNet-18, and analyze the results with FiftyOne.

Workshop 10 – Image Embeddings: Zero-shot Classification with CLIP

Tuesday, May 6

In this session, we’ll cover image embeddings, vision transformers, and CLIP. Build a model for zero-shot classification and semantic search using CLIP, then inspect how image embeddings influence predictions with FiftyOne.
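Zero-shot classification with embeddings boils down to comparing an image embedding against the embeddings of candidate text prompts and picking the closest one. The sketch below illustrates that comparison with cosine similarity in plain Python. The three-dimensional vectors are invented stand-ins for real CLIP encoder outputs, and `zero_shot_classify` is a hypothetical helper name, not part of CLIP or FiftyOne.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_embedding, text_embeddings):
    """Return the prompt most similar to the image, plus all scores."""
    scores = {
        prompt: cosine_similarity(image_embedding, emb)
        for prompt, emb in text_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up embeddings standing in for text-encoder outputs.
text_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
# Made-up embedding standing in for an image-encoder output.
image_embedding = [0.8, 0.3, 0.1]

best_prompt, scores = zero_shot_classify(image_embedding, text_embeddings)
print(best_prompt)
```

No classifier is trained for these classes: changing the set of prompts changes the set of classes, which is what makes the approach "zero-shot". The same similarity score can rank images against a text query for semantic search.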

Workshop 11 – Object Detection & Instance Segmentation: YOLO in Practice

Tuesday, May 13

In this session, we’ll introduce object detection and instance segmentation methods. Build a YOLO-based network to perform object detection and instance segmentation, and analyze detection results with FiftyOne.

Workshop 12 – Image Generation: Diffusion Models & U-Net

Tuesday, May 20

In this session, we’ll explore image generation techniques using diffusion models. Participants will build a U-Net-based model to generate MNIST-like images and then inspect the generated outputs with FiftyOne.

Berlin Computer Vision Group
FREE