67th Vienna Deep Learning Meetup: CLIP Deep Dive


Details
Dear Deep Learners,
Our last meetup before summer takes place on June 16. Here is the agenda in short:
- A Deep Dive into CLIP embeddings - by Damian Stewart
- LLMs for Abusive Language Detection - by Julia Pardatscher (cancelled)
Please find the details below:
***
Agenda:
18:30
- Introduction & Welcome by the meetup organizers
18:45
- Talk 1: A Deep Dive into CLIP embeddings
by Damian Stewart
19:30
- Announcements
- Networking Break & Discussions
20:00
- ** CANCELLED ** Talk 2: Revisiting Implicitly Abusive Language Detection: Evaluating LLMs in Zero-Shot and Few-Shot Settings
by Julia Pardatscher
-> this talk will be postponed to fall
20:40
- Networking & Discussions
22:00 Wrap up & End
***
Talk Details:
Talk 1: A Deep Dive into CLIP embeddings
CLIP embeddings are at the heart of multimodal AI. This talk moves beyond basic applications to delve into how CLIP maps language to images, critically examining the power and unexpected limitations of its mathematical similarity measures through concrete examples. We’ll explore creative ways to manipulate CLIP’s latent space, uncovering untapped potential for generative and search applications. Finally, we'll broaden our focus to the challenge of modelling visual meaning more generally. Taking a very gentle step into poststructuralist philosophy, we'll consider the logical limits of systems like CLIP, and the pitfalls of web-scale visual pre-training. By the end we'll have a solid understanding of what CLIP is, what it can and cannot do - and why.
Talk Outline:
1. Understanding CLIP Embeddings:
An introduction to how CLIP models map images and text into a shared latent space: what embeddings are, how they are trained, and what they enable. Examples: image search, text-to-image generation.
2. The Limits of Mathematical Meaning:
How cosine similarity, zero-shot classification, and semantic proximity work, and where these approaches break down. Examples: successful classifications, revealing failures.
3. Manipulating Conceptual Space:
Using embeddings as a creative tool: vector arithmetic (adding, subtracting, blending), semantic pathfinding, interpolation. Examples: semantic exploration, search augmentations, prompt engineering beyond weighting and word selection.
4. Meaning Beyond Mathematics:
A deeper reflection on relational meaning through CLIP embeddings, drawing (very gently) on poststructuralist philosophy: how CLIP mirrors Saussurean linguistics, what this means for the influence of culture and ideology on embedding spaces, and why understanding these forces is crucial for building next-generation ML-powered systems.
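For anyone who would like a concrete picture of outline points 2 and 3 before the talk, here is a minimal sketch of cosine similarity, zero-shot classification, and embedding blending and interpolation. It is not taken from the talk. The 3-dimensional vectors are made up by hand and stand in for real CLIP embeddings (which have hundreds of dimensions and come from a trained model), and the function names and the temperature value are illustrative assumptions.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Dot product of the two unit-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def zero_shot_classify(image_vec, label_vecs, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities.

    The temperature is an illustrative assumption, not a value from the talk.
    """
    sims = {label: cosine_similarity(image_vec, v) for label, v in label_vecs.items()}
    top = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp((s - top) / temperature) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def slerp(a, b, t):
    """Spherical interpolation between the directions of a and b (0 <= t <= 1)."""
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)
    if omega < 1e-8:  # vectors (almost) identical: nothing to interpolate
        return a
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

# Hand-made toy "text embeddings" (hypothetical, NOT real CLIP output).
text_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
# Hand-made toy "image embedding" that lies closest to the cat direction.
image_embedding = [0.8, 0.3, 0.1]

# Zero-shot classification: pick the label whose embedding is most similar.
probs = zero_shot_classify(image_embedding, text_embeddings)
best_label = max(probs, key=probs.get)

# Vector arithmetic: blend two concepts by adding and re-normalizing.
cat = text_embeddings["a photo of a cat"]
dog = text_embeddings["a photo of a dog"]
blended = normalize([c + d for c, d in zip(cat, dog)])

# Interpolation: the point halfway along the arc between cat and dog.
midpoint = slerp(cat, dog, 0.5)

print(best_label, blended, midpoint)
```

With real embeddings, the same blended or interpolated vectors could be used as search queries, which is one way to read "search augmentations" in point 3.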
About the speaker:
Damian Stewart is a software engineer with a distinctive combination of technical depth and humanistic insight. With over 25 years of experience across industry and research, he designs and builds systems that extend capability, foster creativity, and make innovation accessible to a wider world.
We are very much looking forward to seeing you at our next meetup.
Thanks to SBA Research, drinks and snacks will be provided at this meetup.
Your VDLM Organizers
