Part of AI, Machine Learning and Computer Vision Meetup Network - 48 groups

London AI, Machine Learning and Computer Vision Meetup

4.5•140 ratings

About us

🖖 This virtual group is for data scientists, machine learning engineers, and open source enthusiasts.

Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.

Are you interested in speaking at a future Meetup?
Is your company interested in sponsoring a Meetup?

Send me a DM on LinkedIn

This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.

Upcoming events

Network event
March 5 - AI, ML and Computer Vision Meetup
Thu, Mar 5 · 5:00 PM GMT · Online
405 attendees from 47 groups
Join our virtual meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.

Date and Location

Mar 5, 2026
9 - 11 AM Pacific
Online. Register for the Zoom!

MOSPA: Human Motion Generation Driven by Spatial Audio

Enabling virtual humans to dynamically and realistically respond to diverse auditory stimuli remains a key challenge in character animation, demanding the integration of perceptual modeling and motion synthesis. Despite its significance, this task remains largely unexplored. Most previous work has focused on mapping modalities such as speech, audio, and music to human motion. To date, these models typically overlook the impact of spatial features encoded in spatial audio signals on human motion.

To bridge this gap and enable high-quality modeling of human movements in response to spatial audio, we introduce the first comprehensive Spatial Audio-Driven Human Motion (SAM) dataset, which contains diverse and high-quality spatial audio and motion data. For benchmarking, we develop a simple yet effective diffusion-based generative framework for human MOtion generation driven by SPatial Audio, termed MOSPA, which faithfully captures the relationship between body motion and spatial audio through an effective fusion mechanism. Once trained, MOSPA can generate diverse, realistic human motions conditioned on varying spatial audio inputs. We perform a thorough investigation of the proposed dataset and conduct extensive experiments for benchmarking, where our method achieves state-of-the-art performance on this task.

About the Speaker

Zhiyang (Frank) Dou is a Ph.D. student at MIT CSAIL, advised by Prof. Wojciech Matusik, working with the Computational Design and Fabrication Group and the Computer Graphics Group.

Securing the Autonomous Future: Navigating the Intersection of Agentic AI, Connected Devices, and Cyber Resilience

With billions of devices now in our infrastructure, and increasingly acting as autonomous AI agents, we face a very real question: how can we create intelligent systems that are both secure and trusted? This talk explores the intersection of agentic AI and IoT and demonstrates how the same AI systems can provide robust defense mechanisms. At its core, however, this is a challenge of trusting people with technology, ensuring their safety, and providing accountability. It calls for a new way of thinking, one in which security is built in, autonomous action has oversight, and innovation ultimately leads to greater human well-being.

About the Speaker

Samaresh Kumar Singh is an engineering principal at HP Inc. with more than 21 years of experience in designing and implementing large-scale distributed systems, cloud native platform systems, and edge AI / ML systems. His expertise includes agentic AI systems, GenAI / LLMs, Edge AI, federated and privacy preserving learning, and secure hybrid cloud / edge computing.

Plugins as Products: Bringing Visual AI Research into Real-World Workflows with FiftyOne

Visual AI research often introduces new datasets, models, and analysis methods, but integrating these advances into everyday workflows can be challenging. FiftyOne is a data-centric platform designed to help teams explore, evaluate, and improve visual AI, and its plugin ecosystem is how the platform scales beyond the core. In this talk, we explore the FiftyOne plugin ecosystem from both perspectives: how users apply plugins to accelerate data-centric workflows, and how researchers and engineers can package their work as plugins to make it easier to share, reproduce, and build upon. Through practical examples, we show how plugins turn research artifacts into reusable components that integrate naturally into real-world visual AI workflows.

About the Speaker

Adonai Vera is a Machine Learning Engineer and DevRel at Voxel51 with over 7 years of experience building computer vision and machine learning models using TensorFlow, Docker, and OpenCV.

Transforming Business with Agentic AI

Agentic AI is reshaping business operations by employing autonomous systems that learn, adapt, and optimize processes independently of human input. This session examines the essential differences between traditional AI agents and Agentic AI, emphasizing their significance for project professionals overseeing digital transformation initiatives. Real-world examples from eCommerce, insurance, and healthcare illustrate how autonomous AI achieves measurable outcomes across industries. The session addresses practical orchestration patterns in which specialized AI agents collaborate to resolve complex business challenges and enhance operational efficiency. Attendees will receive a practical framework for identifying high-impact use cases, developing infrastructure, establishing governance, and scaling Agentic AI within their organizations.

About the Speaker

Joyjit Roy is a senior technology and program management leader with over 21 years of experience delivering enterprise digital transformation, cloud modernization, and applied AI programs across insurance, financial services, and global eCommerce.
28 attendees from this group
Network event
March 11 - Strategies for Validating World Models and Action-Conditioned Video
Wed, Mar 11 · 5:00 PM GMT · Online
142 attendees from 47 groups
Join us for a one-hour hands-on workshop where we will explore emerging challenges in developing and validating world foundation models and video-generation AI systems for robotics and autonomous vehicles.

Time and Location

Mar 11, 2026
10 - 11 AM Pacific
Online. Register for the Zoom!

Industries from robotics to autonomous vehicles are converging on world foundation models (WFMs) and action-conditioned video generation, where the challenge is predicting physics, causality, and intent. But this shift has created a massive new bottleneck: validation.

How do you debug a model that imagines the future? How do you curate petabyte-scale video datasets to capture the "long tail" of rare events without drowning in storage costs? And how do you ensure temporal consistency when your training data lives in scattered data lakes?

In this session, we explore technical workflows for the next generation of Visual AI. We will dissect the "Video Data Monster," demonstrating how to build feedback loops that bridge the gap between generative imagination and physical reality. Learn how leading teams are using federated data strategies and collaborative evaluation to turn video from a storage burden into a structured, queryable asset for embodied intelligence.

About the Speaker

Nick Lotz is a chemical process engineer turned developer who is currently a Technical Marketing Engineer at Voxel51. He is particularly interested in bringing observability and security to all layers of the AI stack.
9 attendees from this group
Network event
March 12 - Agents, MCP and Skills Virtual Meetup
Thu, Mar 12 · 7:00 PM GMT · Online
513 attendees from 48 groups
Join us for a special edition of the AI, ML and Computer Vision Meetup where we will focus on Agents, MCP and Skills!

Date, Time, Location

Mar 12, 2026
9 - 11 AM Pacific
Online. Register for the Zoom!

Agents Building Agents on the Hugging Face Hub

Discover how coding agents can run or support your fine-tuning experiments, from quick dataset validation and preprocessing, to optimal GPU hardware selection, to automated job submission based on metric tracking, to evaluation. Ben will demonstrate how Hugging Face skills can be used to define best practices for agents to support machine learning experiments. Bring Claude, Codex, or Mistral Vibes, and we’ll show you how to get it training models with GRPO, SFT, and DPO.

About the Speaker

Ben Burtenshaw is a Machine Learning Engineer at Hugging Face, focusing on building agents with fine-tuning and reinforcement learning. He led educational projects like the Agents Course, the MCP Course, and the LLM course, which bridge the gap between complex Reinforcement Learning (RL) techniques and practical application. Ben focuses on democratizing access to efficient AI, empowering the community to align, evaluate, and deploy transparent agentic systems.

Claude Code Templates

This talk explores how to configure and align Claude Code agents using templates and custom components. I'll demonstrate practical configuration patterns that ensure your CLI agent executes exactly what you intend, covering Skills setup, hooks implementation, and template customization. Drawing from real-world examples building Claude Code Templates, attendees will learn how to structure their agent configurations for consistent, reliable behavior and create reusable components that maintain alignment across different use cases.

About the Speaker

Daniel Avila is an AI Engineer at Hedgineer building agentic systems and creator of Claude Code Templates.

Move Faster in Computer Vision by Teaching Agents to See Your Data

Computer vision teams spend too much time writing scripts just to find bad labels, blurry images, and edge cases. In this talk, I’ll show how to move that work to agents by using FiftyOne as a visual operating system. With Skills and MCP, agents can see inside your datasets, explore them visually, and handle common data cleanup tasks, so you can spend less time on data and more time shipping models.

About the Speaker

Adonai Vera is a Machine Learning Engineer and DevRel at Voxel51 with over 7 years of experience building computer vision and machine learning models using TensorFlow, Docker, and OpenCV. Having started as a software developer, Adonai moved into AI, led teams, and served as CTO, and today connects code and community to build open, production-ready AI, making technology simple, accessible, and reliable.

Skills As Documentation

Skills are self-contained recipes - each one a piece of a larger puzzle. Instead of trying to modify human-centric documentation to better fit agents, skills let us build capabilities into our agents directly. This talk will showcase how to think about leveraging skills to enhance how users interact with your software!

About the Speaker

Chris Alexiuk is a deep learning developer advocate at NVIDIA, working on creating technical assets that help developers use the incredible suite of AI tools available at NVIDIA. Chris comes from a machine learning and data science background, and he is obsessed with everything and anything about large language models.
27 attendees from this group
Network event
March 18 - Vibe Coding Production-Ready Computer Vision Pipelines Workshop
Wed, Mar 18 · 4:00 PM GMT · Online
276 attendees from 48 groups
Join us for an interactive workshop where we'll build production-ready computer vision pipelines using vibe-coded FiftyOne plugins.

Register for the Zoom

Plugins enable you to customize the open-source FiftyOne computer vision app to match your exact workflow by easily incorporating data annotation, curation, model evaluation and inference.

We'll demonstrate how FiftyOne Skills and the MCP Server can streamline the journey from prototype to production-ready pipelines, keeping your development flow intact.

Perfect for open-source contributors, researchers, and enterprise teams seeking hands-on experience. All participants receive slides, notebooks, and access to GitHub repositories and videos from the workshop.
19 attendees from this group