
66th Vienna Deep Learning Meetup

Hosted By
Tom L. and 4 others

Details

Dear Deep Learners,

Our next meetup is on May 22, featuring:

  • A deep dive into CLIP embeddings - by Damian Stewart
  • Insights from the winning team of the WSDM Cup Multilingual Chatbot Competition - by Michael Pieler

Please find the details below:

***
Agenda:

18:30

  • Introduction & Welcome by the meetup organizers

18:45

  • Talk 1: Multimodal Meaning: Math at the Limits of CLIP
    by Damian Stewart

19:30

  • Announcements
  • Networking Break & Discussions

20:00

  • Talk 2: Winner of the WSDM Cup Multilingual Chatbot Arena Kaggle Competition: Summary & Details
    by Michael Pieler

20:40

  • Networking & Discussions

22:00 Wrap up & End
***
Talk Details:
Talk 1: Multimodal Meaning: Math at the Limits of CLIP
CLIP embeddings are at the heart of multimodal AI. This talk moves beyond basic applications to delve into how CLIP maps language to images, critically examining the power and unexpected limitations of its mathematical similarity measures through concrete examples. We'll explore creative ways to manipulate CLIP's latent space, uncovering untapped potential for generative and search applications. Finally, we'll broaden our focus to the challenge of modelling visual meaning more generally. Taking a very gentle step into poststructuralist philosophy, we'll consider the logical limits of systems like CLIP, and the pitfalls of web-scale visual pre-training. By the end, we'll have a solid understanding of what CLIP is, what it can and cannot do, and why.

Outline:
1. Understanding CLIP Embeddings:
An introduction to how CLIP models map images and text into a shared latent space: what embeddings are, how they are trained, and what they enable. Examples: image search, text-to-image generation.
2. The Limits of Mathematical Meaning:
How cosine similarity, zero-shot classification, and semantic proximity work, and where these approaches break down. Examples: successful classifications, revealing failures.
3. Manipulating Conceptual Space:
Using embeddings as a creative tool: vector arithmetic (adding, subtracting, blending), semantic pathfinding, interpolation. Examples: semantic exploration, search augmentations, prompt engineering beyond weighting and word selection.
4. Meaning Beyond Mathematics:
A deeper reflection on relational meaning through CLIP embeddings, drawing (very gently) on poststructuralist philosophy: how CLIP mirrors Saussurian linguistics, what this means for the influence of culture and ideology on embedding spaces, and why understanding these forces is crucial for building next-generation ML-powered systems.
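
For readers new to the ideas in outline items 2 and 3, here is a minimal sketch of cosine similarity, zero-shot classification as a softmax over similarities, and blending two embeddings. It is not taken from the talk. The 3-dimensional vectors are made-up stand-ins for real CLIP embeddings, and the function names and the temperature of 0.01 are assumptions chosen for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_similarity(a, b):
    """Dot product of the two unit-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def zero_shot_classify(image_vec, label_vecs, temperature=0.01):
    """Softmax over cosine similarities between one image vector and each label vector.

    The temperature value is an illustrative assumption.
    """
    sims = {label: cosine_similarity(image_vec, v) for label, v in label_vecs.items()}
    m = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp((s - m) / temperature) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def blend(a, b, t):
    """Linear interpolation between two embeddings, renormalized to unit length."""
    return normalize([(1 - t) * x + t * y for x, y in zip(a, b)])

# Toy 3-d vectors standing in for real CLIP embeddings (hypothetical values).
text_vecs = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
}
image_vec = [0.8, 0.3, 0.1]

probs = zero_shot_classify(image_vec, text_vecs)
best = max(probs, key=probs.get)

# A point halfway between the two text embeddings.
midpoint = blend(text_vecs["a photo of a cat"], text_vecs["a photo of a dog"], 0.5)
```

With real embeddings the vectors would come from a CLIP image encoder and text encoder; the arithmetic on top stays the same.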

About the speaker:
Damian Stewart is a software engineer with a distinctive combination of technical depth and humanistic insight. With over 25 years of experience across industry and research, he designs and builds systems that extend capability, foster creativity, and make innovation accessible to a wider world.

Talk 2: Winner of the WSDM Cup Multilingual Chatbot Arena Kaggle Competition: Summary & Details
In the WSDM Cup Multilingual Chatbot Arena Kaggle competition, the challenge was to predict which responses users would prefer in head-to-head battles between LLM-powered chatbots. Our winning solution combined two parts: model training with a pre-training stage, a teacher-training stage, and a distillation stage; and an optimized inference setup to get the highest performance within the given time frame on the provided hardware.

Michael Pieler, an independent researcher, will give a short presentation of his team's winning solution, summarizing the approach and sharing some details about the entry.

We are grateful to be hosted by the University of Vienna Biology Center this time. Please note that we cannot provide food or drinks at this meetup.

Looking forward to seeing you at our next meetup,
Your VDLM Organizers

Vienna Deep Learning Meetup
FREE