Berlin Computer Vision Group

1,171 members · Public group

Organized by Antonio Rueda Toicen

What we’re about

This is a group for anyone interested in computer vision.

All skill levels are welcome.

We host free and practical workshops on computer vision with Python.

Upcoming events (3)

Sat, Sep 27, 2025, 8:30 AM UTC
Building AI Agents with Multimodal Models: NVIDIA DLI Workshop for Academia
Ready to build cutting-edge AI that understands the world through more than just text?
Join our hands-on workshop and learn how to build neural network agents that can see, read, and reason across multiple data types! We’ll explore advanced techniques like data fusion, OCR, and NVIDIA's powerful AI Blueprints to tackle real-world challenges in robotics, healthcare, and beyond.

We'll start with a robotics use case, apply those principles to supercharge Large Language Models (LLMs), and finish by orchestrating a team of models to work together seamlessly. You can find the full workshop description here: https://learn.nvidia.com/courses/course-detail?course_id=course-v1:DLI+C-FX-17+V1

Who is this for
This certification workshop is completely free for academic staff and students. A valid academic email address is required to access the NVIDIA DLI compute environment. If you are in industry, please contact info@kineto.ai to request a quote for you or your team.

Register
Please remember to fill in the form with your current institutional email address:
https://forms.gle/YEETAidJqUzEkNS56
The access code to the NVIDIA DLI platform will be sent to your academic email address.

What You Will Learn

🧠 Data Fusion Mastery: Discover the difference between early, late, and intermediate fusion to combine camera, LiDAR, and other data types.

📄 PDF & Document AI: Learn to extract and process text from PDFs using Optical Character Recognition (OCR).

🌐 Agent Orchestration: Understand how to make multiple AI models collaborate to solve complex problems.

🪜 NVIDIA AI Blueprints: Get hands-on with the Video Search and Summarization (VSS) blueprint to build powerful applications.

🗣️ Vision-Language Models: Turn a standard Language Model into a Vision Language Model (VLM) that can process images and documents.

Agenda
Part 1: Early & Late Fusion (1.0 hr)

Fuse camera and LiDAR data to predict object positions.

Prep various data types for your neural networks.
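
To make the Part 1 vocabulary concrete, here is a minimal sketch of the two fusion styles in plain Python. It is not taken from the workshop notebooks: the `linear` helper, the toy feature vectors, and the weights are all hypothetical stand-ins for real camera and LiDAR networks.

```python
def linear(features, weights, bias=0.0):
    """Toy stand-in for a neural network: a weighted sum of the inputs."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def early_fusion(camera, lidar, weights):
    """Concatenate both inputs first, then run ONE model on the joint vector."""
    return linear(camera + lidar, weights)


def late_fusion(camera, lidar, camera_weights, lidar_weights):
    """Run one model PER modality, then average the two predictions."""
    camera_pred = linear(camera, camera_weights)
    lidar_pred = linear(lidar, lidar_weights)
    return 0.5 * (camera_pred + lidar_pred)


# Hypothetical features: two camera values and one LiDAR depth value.
camera_features = [1.0, 2.0]
lidar_features = [3.0]

early = early_fusion(camera_features, lidar_features, [1.0, 1.0, 1.0])
late = late_fusion(camera_features, lidar_features, [1.0, 1.0], [1.0])
```

Early fusion lets a single model see cross-modal interactions from the start, while late fusion keeps the per-modality models independent and only merges their outputs.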

Part 2: Intermediate Fusion (1.0 hr)

Dive into the theory of multimodal model architecture.

Train a Contrastive Pretraining model and create a vector database.

Part 3: Cross-modal Projection (2.0 hrs)

Transform an LLM into a Vision Language Model (VLM).

Process PDFs like a pro with OCR tools.

Part 4: Model Orchestration (2.0 hrs)

Analyze video with Cosmos Nemotron.

Use the VSS Blueprint to find answers in video content.

Part 5: Final Assessment (1.0 hr)

Put your new skills to the test by converting a pre-trained model to accept a new data type.
18 attendees
Sat, Oct 4, 2025, 8:30 AM UTC
Generative AI with Diffusion Models: NVIDIA Certification Workshop for Academia
https://learn.nvidia.com/courses/course-detail?course_id=course-v1:DLI+C-FX-08+V1

Thanks to improvements in computing power and scientific theory, generative AI is more accessible than ever before. Generative AI plays a significant role across industries due to its numerous applications, such as creative content generation, data augmentation, simulation and planning, anomaly detection, drug discovery, personalized recommendations, and more.

In this course, learners will take a deeper dive into denoising diffusion models, which are a popular choice for text-to-image pipelines.

This course is online and offered free of charge to academic participants (students and staff). If you are in industry and would like to take the training, please send an email to info@kineto.ai.

Important
Access links to the NVIDIA DLI environment, which you need to complete the graded assessment and obtain the certification, will be sent to your academic email shortly before the event. Please be sure to fill in the access form to receive them.

Learning Objectives

Build a U-Net to generate images from pure noise

Improve the quality of generated images with the denoising diffusion process

Control the image output with context embeddings

Generate images from English text prompts using the Contrastive Language–Image Pretraining (CLIP) neural network

Topics Covered

U-Nets

Diffusion

CLIP

Text-to-image Models

From U-Net to Diffusion

Build a U-Net architecture

Train a model to remove noise from an image

Diffusion Models

Define the forward diffusion function

Update the U-Net architecture to accommodate a timestep

Define a reverse diffusion function
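
As a rough preview of the forward diffusion step, the sketch below uses the closed-form noising rule from the DDPM literature, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The linear beta schedule, the function names, and the four-value "image" are assumptions for illustration; the workshop notebooks may define these differently and will work on PyTorch tensors rather than Python lists.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffusion(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0; returns the noisy values and the noise used."""
    a_bar = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * n
        for x, n in zip(x0, noise)
    ]
    return x_t, noise


alpha_bars = make_alpha_bars(num_steps=1000)
rng = random.Random(0)           # fixed seed so the sketch is repeatable
pixels = [0.5, -0.25, 1.0, 0.0]  # hypothetical flattened image
noisy, noise = forward_diffusion(pixels, t=999, alpha_bars=alpha_bars, rng=rng)
```

At the last timestep almost none of the original signal is left, which is why the reverse process can start from pure noise. The returned `noise` is the quantity a denoising network is typically trained to predict.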

Optimizations

Implement Group Normalization

Implement GELU

Implement Rearrange Pooling

Implement Sinusoidal Position Embeddings
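
For readers new to sinusoidal position embeddings, here is a small sketch of one common formulation: half of the vector holds sines and half holds cosines, with frequencies falling geometrically from 1 to 1/10000. The function name and the exact frequency layout are assumptions; the workshop's implementation may differ.

```python
import math


def sinusoidal_embedding(timestep, dim):
    """Map a timestep to a dim-length vector of sines followed by cosines."""
    if dim % 2 != 0 or dim < 4:
        raise ValueError("dim must be even and at least 4")
    half = dim // 2
    # Frequencies fall geometrically from 1 down to 1/10000.
    freqs = [math.exp(-math.log(10000.0) * i / (half - 1)) for i in range(half)]
    sines = [math.sin(timestep * f) for f in freqs]
    cosines = [math.cos(timestep * f) for f in freqs]
    return sines + cosines


start = sinusoidal_embedding(0, 4)   # → [0.0, 0.0, 1.0, 1.0]
later = sinusoidal_embedding(50, 8)  # a different pattern for a later step
```

Because every timestep maps to a distinct pattern of values in [-1, 1], the U-Net can condition on how noisy its input is without needing one learned embedding per step.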

Classifier-free Diffusion Guidance

Add categorical embeddings to a U-Net

Train a model with a Bernoulli mask
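
The idea behind the Bernoulli mask can be sketched in a few lines: during training, each sample's context is randomly zeroed out so the same network learns both conditional and unconditional denoising, and at sampling time the two predictions are blended. The helper names, the drop probability, and the guidance formula (1 + w)·cond − w·uncond are assumptions based on the usual classifier-free guidance setup, not code from the workshop.

```python
import random


def bernoulli_mask(batch_size, drop_prob, rng):
    """Return 1 (keep context) or 0 (drop context) for each sample."""
    return [0 if rng.random() < drop_prob else 1 for _ in range(batch_size)]


def apply_mask(context_batch, mask):
    """Zero out the context vectors whose mask entry is 0."""
    return [
        [value * keep for value in context]
        for context, keep in zip(context_batch, mask)
    ]


def guided_noise(eps_cond, eps_uncond, weight):
    """Blend conditional and unconditional noise predictions."""
    return [(1.0 + weight) * c - weight * u for c, u in zip(eps_cond, eps_uncond)]


rng = random.Random(0)
contexts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical class embeddings
mask = bernoulli_mask(len(contexts), drop_prob=0.1, rng=rng)
masked_contexts = apply_mask(contexts, mask)
```

With `weight=0` the blend reduces to the plain conditional prediction; larger weights push samples harder toward the requested class.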

CLIP

Learn how to use CLIP Encodings

Use CLIP to create a text-to-image neural network
8 attendees
Sat, Oct 25, 2025, 8:30 AM UTC
Fundamentals of Deep Learning: NVIDIA DLI Certification Workshop for Academia
https://www.nvidia.com/en-eu/training/instructor-led-workshops/fundamentals-of-deep-learning/

Deep Learning with PyTorch Workshop

In this workshop, you’ll learn how deep learning works through hands-on exercises in computer vision and natural language processing. You’ll train deep learning models from scratch, learning tools and tricks to achieve highly accurate results. You’ll also learn to leverage freely available, state-of-the-art pre-trained models to save time and get your deep learning application up and running quickly.

Learning Objectives

By participating in this workshop, you’ll:

Learn the fundamental techniques and tools required to train a deep learning model

Gain experience with common deep learning data types and model architectures

Enhance datasets through data augmentation to improve model accuracy

Leverage transfer learning between models to achieve efficient results with less data and computation

Build confidence to take on your own project with a modern deep learning framework


Preparation for the Workshop

Fill in the form at https://forms.gle/8iNZN3PToUh6iveC9 to receive the access codes for the event (they will be emailed shortly before it starts)

Install Google Chrome or Mozilla Firefox to use the NVIDIA DLI Environment

Create an account at https://learn.nvidia.com/

Mechanics of Deep Learning
Explore the fundamental mechanics and tools involved in successfully training deep neural networks:

Train your first computer vision model to learn the process of training

Introduce convolutional neural networks to improve accuracy of predictions in vision applications

Apply data augmentation to enhance a dataset and improve model generalization
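
To illustrate what data augmentation means in practice, the sketch below randomly mirrors an image stored as a nested list of pixel rows. It is only a toy example under that assumption: the function names are hypothetical, and the workshop itself uses a deep learning framework's own augmentation tools rather than hand-written code.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def augment(image, rng, flip_prob=0.5):
    """Return a flipped copy with probability flip_prob, else an unchanged copy."""
    if rng.random() < flip_prob:
        return horizontal_flip(image)
    return [row[:] for row in image]


rng = random.Random(0)
image = [[1, 2, 3],
         [4, 5, 6]]
# Each pass over the data can draw a fresh variant of the same labelled image.
variants = [augment(image, rng) for _ in range(4)]
```

Because a mirrored dog is still a dog, each variant keeps its label while giving the model a slightly different input, which tends to help generalization on small datasets.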

Pre-trained Models
Leverage pre-trained models to solve deep learning challenges quickly. Train recurrent neural networks on sequential data:

Integrate a pre-trained image classification model to create an automatic doggy door

Leverage transfer learning to create a personalized doggy door that only lets in your dog

Assessment Challenge: Image Classification
Apply computer vision to create a model that distinguishes between fresh and rotten fruit:

Create and train a model that interprets color images

Build a data generator to make the most out of small datasets

Improve training speed by combining transfer learning and feature extraction

Discuss advanced neural network architectures and recent areas of research where students can further improve their skills

Final Review

Review key learnings and answer questions

Complete the assessment and earn a certificate

Complete the workshop survey

Learn how to set up your own AI application development environment
9 attendees

Past events (84)

Fri, Aug 1, 2025, 10:00 AM CEST
Voxel51 + HPI Hackathon - Computer Vision for Food Waste Reduction
6 attendees
