About the Workshop Series
Join us for a 12-part, hands-on series that teaches you how to work with images, build and train models, and explore tasks like image classification, segmentation, object detection, and image generation. Each session combines straightforward explanations with practical coding in PyTorch and FiftyOne, allowing you to learn core skills in computer vision and apply them to real-world tasks.
These are hands-on maker workshops that use GitHub Codespaces, Kaggle notebooks, and Google Colab environments, so no local installation is required (though you are welcome to work locally if you prefer).
Workshop Resources
You can find the workshop materials in this GitHub repository: https://github.com/andandandand/practical-computer-vision
About the Instructor
Antonio Rueda-Toicen has extensive experience in deploying machine learning models and has taught over 300 professionals. He is currently a Research Scientist at the Hasso Plattner Institute and an AI Engineer and DevRel for Voxel51. Since 2019, he has organized the Berlin Computer Vision Group and taught at Berlin’s Data Science Retreat. He specializes in computer vision, cloud technologies, and machine learning. Antonio is also a certified instructor of deep learning and diffusion models at NVIDIA’s Deep Learning Institute.
Workshop 1 – Foundations of Computer Vision
Tuesday, March 4, 2025
In this session, we’ll introduce core computer vision tasks and fundamental image representations using PIL, NumPy, and PyTorch. Build a simple neural network to classify handwritten digits and explore dataset visualizations with FiftyOne.
Workshop 2 – Neural Networks Fundamentals: Multilayer Perceptrons for Regression
Tuesday, March 11
In this session, we’ll cover the basics of neural networks, build a multilayer perceptron (MLP) and delve into matrix multiplications. Participants will create an MLP for regression (predicting car prices) and inspect its forward pass and predictions with FiftyOne.
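As a rough preview of how a forward pass reduces to matrix multiplications, here is a minimal sketch in plain Python rather than PyTorch. It is not taken from the workshop materials; the features, weights, and biases are made-up illustration values.

```python
# Toy forward pass of a multilayer perceptron (MLP) for regression.
# All numbers below are hypothetical values chosen for illustration.

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def add_bias(rows, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in rows]

def relu(rows):
    """Elementwise ReLU: negative values become zero."""
    return [[max(0.0, v) for v in row] for row in rows]

def mlp_forward(x, w1, b1, w2, b2):
    """One hidden layer with ReLU, followed by a linear output layer."""
    hidden = relu(add_bias(matmul(x, w1), b1))
    return add_bias(matmul(hidden, w2), b2)

x = [[2.0, 1.0]]                  # one sample with two (hypothetical) features
w1 = [[0.5, -1.0], [1.0, 1.0]]    # 2 inputs -> 2 hidden units
b1 = [0.0, 0.5]
w2 = [[2.0], [1.0]]               # 2 hidden units -> 1 output
b2 = [0.1]

prediction = mlp_forward(x, w1, b1, w2, b2)
print(prediction)  # one row with one value, approximately 4.1
```

In PyTorch, the same computation would typically be written with `torch.nn.Linear` layers and `torch.relu`, which also handle batching and gradients.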
Workshop 3 – Training & Evaluation of Classification Models
Tuesday, March 18
In this session, we’ll focus on training feedforward networks and evaluating model performance. Learn about datasets, data loaders, and classification metrics, then apply these concepts by classifying breeds of dogs and analyzing outputs with FiftyOne.
Workshop 4 – Convolutional Neural Networks: LeNet5
Tuesday, March 25
In this session, we’ll explore CNN fundamentals by diving into the mechanics of convolutions and pooling. Participants will implement LeNet5 to grasp how basic convolutional layers operate, with practical insights on the features produced by convolutions using FiftyOne.
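To give a feel for the mechanics of convolution and pooling, here is a minimal plain-Python sketch. It is not LeNet5 and not from the workshop materials: the 5x5 image and the 2x2 vertical-edge kernel are assumptions chosen so the result is easy to check by hand.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    This is cross-correlation, which is what deep learning libraries
    usually call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling (stride equal to the window size)."""
    return [[max(fmap[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# Toy image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
vertical_edge = [[-1, 1],
                 [-1, 1]]  # responds where brightness increases left to right

features = conv2d(image, vertical_edge)  # 4x4 feature map
pooled = max_pool2d(features)            # 2x2 after pooling

print(features)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)    # [[2, 0], [2, 0]]
```

The feature map is nonzero only in the column where the dark-to-bright edge sits, and pooling keeps that response while halving the resolution.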
Workshop 5 – Training Techniques for Convolutional Networks
Tuesday, April 1
In this session, we’ll examine strategies such as normalization and skip connections for better training. Build a model to run inference on a fruits dataset, then use FiftyOne to inspect predictions and performance.
Workshop 6 – Multi-label Classification with Binary Cross Entropy: Amazon Satellite Images
Tuesday, April 8
In this session, we’ll focus on multi-label classification for real-world imagery. Build a model that identifies multiple environmental labels from Amazon satellite images, applying binary cross-entropy for training and analyzing predictions with FiftyOne.
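The idea behind binary cross-entropy for multi-label problems is that each label gets its own independent yes/no prediction. The sketch below shows this in plain Python for a single image; the label names, logits, and targets are hypothetical and are not drawn from the Amazon satellite dataset.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over the labels of one sample."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for z, y in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)

labels = ["forest", "river", "road"]  # hypothetical label names
logits = [2.0, -1.0, 0.0]             # made-up model outputs, one per label
targets = [1, 0, 1]                   # several labels can be true at once

loss = binary_cross_entropy(logits, targets)

# Each label is thresholded independently, unlike softmax classification
predicted = [name for name, z in zip(labels, logits) if sigmoid(z) > 0.5]

print(round(loss, 4))
print(predicted)  # ['forest']
```

In PyTorch, this loss is usually computed with `torch.nn.BCEWithLogitsLoss`, which applies the sigmoid internally in a numerically stable way.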
Workshop 7 – Interpretability in Computer Vision: CAM & Grad-CAM
Tuesday, April 15
In this session, we’ll learn interpretability techniques such as Class Activation Mapping and Grad-CAM. Build a model to analyze predictions and visualize important image regions with FiftyOne.
Workshop 8 – Convolutional Neural Networks: Advanced Upsampling & U-Net for Semantic Segmentation
Tuesday, April 22
In this session, we’ll delve deeper into CNN architectures by focusing on upsampling, channel mixing, and semantic segmentation techniques. Build a U-Net model for semantic image segmentation and inspect its predictions with FiftyOne.
Workshop 9 – Model Optimization: Data Augmentation & Regularization
Tuesday, April 29
In this session, we’ll introduce optimization strategies including data augmentation, dropout, batch normalization, and transfer learning. Implement an augmented network using a fruits dataset with models like VGG-16 and ResNet18, and analyze the results with FiftyOne.
Workshop 10 – Image Embeddings: Zero-shot Classification with CLIP
Tuesday, May 6
In this session, we’ll cover image embeddings, vision transformers, and CLIP. Build a model for zero-shot classification and semantic search using CLIP, then inspect how image embeddings influence predictions with FiftyOne.
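Zero-shot classification with embeddings comes down to comparing an image embedding against the embeddings of candidate text prompts and picking the closest one. The sketch below illustrates only that comparison step in plain Python. The three-dimensional vectors are invented toy values, not real CLIP outputs, and `zero_shot_classify` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_embedding, text_embeddings):
    """Return the prompt whose embedding is most similar to the image embedding."""
    scores = {prompt: cosine_similarity(image_embedding, emb)
              for prompt, emb in text_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy stand-ins for embeddings that a text encoder and an image encoder would produce
text_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.3, 0.1]

best_label, scores = zero_shot_classify(image_embedding, text_embeddings)
print(best_label)  # a photo of a dog
```

No classifier is trained here: adding a new class only requires adding a new text prompt, which is what makes the approach zero-shot. Semantic search works the same way, ranking many image embeddings against one text query.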
Workshop 11 – Object Detection & Instance Segmentation: YOLO in Practice
Tuesday, May 13
In this session, we’ll introduce object detection and instance segmentation methods. Build a YOLO-based network to perform object detection and instance segmentation, and analyze detection results with FiftyOne.
Workshop 12 – Image Generation: Diffusion Models & U-Net
Tuesday, May 20
In this session, we’ll explore image generation techniques using diffusion models. Participants will build a U-Net-based model to generate MNIST-like images and then inspect the generated outputs with FiftyOne.