
What we’re about
🖖 This virtual group is for data scientists, machine learning engineers, and open source enthusiasts.
Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.
- Are you interested in speaking at a future Meetup?
- Is your company interested in sponsoring a Meetup?
This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.
Upcoming events (4+)
- Network event, 421 attendees from 44 groups hosting: Aug 28 - AI, ML and Computer Vision Meetup
Date and Time
Aug 28, 2025 at 10 AM Pacific
Location
Virtual - Register for the Zoom
Exploiting Vulnerabilities In CV Models Through Adversarial Attacks
As AI and computer vision models are leveraged more broadly in society, we should be better prepared for adversarial attacks by bad actors. In this talk, we'll cover some of the common methods for performing adversarial attacks on CV models. Adversarial attacks are deliberate attempts to deceive neural networks into generating incorrect predictions by making subtle alterations to the input data.
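As a rough illustration of the idea (not taken from the talk), the sketch below applies one step of the fast gradient sign method (FGSM) to a toy logistic classifier. The weights, input, and `epsilon` are hypothetical values chosen so the effect is visible; real attacks target image models and use far smaller perturbations per pixel.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of the positive class under a logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def fgsm(weights, bias, x, label, epsilon):
    """One FGSM step: shift each input by epsilon in the sign of the loss gradient."""
    p = predict(weights, bias, x)
    # For binary cross-entropy, d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, sign)]

# Hypothetical toy model and input (assumptions for illustration only).
weights, bias = [2.0, -1.0, 0.5], 0.0
x, label = [0.3, 0.1, 0.2], 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.25)

print(predict(weights, bias, x))      # above 0.5: classified as class 1
print(predict(weights, bias, x_adv))  # below 0.5: the prediction has flipped
```

No input value moves by more than `epsilon`, yet the predicted class changes, which is the core property an adversarial example exploits.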
About the Speaker
Elisa Chen is a data scientist at Meta on the Ads AI Infra team with 5+ years of experience in the industry.
EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation
Recent 3D deep networks such as SwinUNETR, SwinUNETRv2, and 3D UX-Net have shown promising performance by leveraging self-attention and large-kernel convolutions to capture the volumetric context. However, their substantial computational requirements limit their use in real-time and resource-constrained environments.
In this paper, we propose EffiDec3D, an optimized 3D decoder that employs a channel reduction strategy across all decoder stages and removes the high-resolution layers when their contribution to segmentation quality is minimal. Our optimized EffiDec3D decoder achieves a 96.4% reduction in #Params and a 93.0% reduction in #FLOPs compared to the decoder of the original 3D UX-Net. Our extensive experiments on 12 different medical imaging tasks confirm that EffiDec3D not only significantly reduces the computational demands, but also maintains a performance level comparable to the original models, thus establishing a new standard for efficient 3D medical image segmentation.
About the Speaker
Md Mostafijur Rahman is a final-year Ph.D. candidate in Electrical and Computer Engineering at The University of Texas at Austin, advised by Dr. Radu Marculescu, where he builds efficient AI methods for biomedical imaging tasks such as segmentation, synthesis, and diagnosis. By uniting efficient architectures with data-efficient training, his work delivers robust and efficient clinically deployable imaging solutions.
What Makes a Good AV Dataset? Lessons from the Front Lines of Sensor Calibration and Projection
Getting autonomous vehicle data ready for real use, whether for training, simulation, or evaluation, isn’t just about collecting LIDAR and camera frames. It’s about making sure every point lands where it should, in the right frame, at the right time.
In this talk, we’ll break down what it actually takes to go from raw logs to a clean, usable AV dataset. We’ll walk through the practical process of validating transformations, aligning coordinate systems, checking intrinsics and extrinsics, and making sure your projected points actually show up on camera images. Along the way, we’ll share a checklist of common failure points and hard-won debugging tips.
Finally, we’ll show how doing this right unlocks downstream tools like Omniverse NuRec and Cosmos, enabling powerful workflows like digital reconstruction, simulation, and large-scale synthetic data generation.
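The projection check described in this abstract can be sketched roughly as follows, assuming a simple pinhole camera without lens distortion. The rotation `R`, translation `t`, intrinsics (`fx`, `fy`, `cx`, `cy`), and image size are hypothetical values, not calibration data from the talk.

```python
def lidar_to_camera(point, rotation, translation):
    """Apply the extrinsic transform: p_cam = R @ p_lidar + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera, so it cannot appear in the image
    return (fx * x / z + cx, fy * y / z + cy)

def in_image(pixel, width, height):
    """True if the projected pixel exists and falls inside the image bounds."""
    return pixel is not None and 0 <= pixel[0] < width and 0 <= pixel[1] < height

# Hypothetical extrinsics: LIDAR frame (x forward, y left, z up)
# to camera frame (x right, y down, z forward), with no offset.
R = [[0.0, -1.0, 0.0],
     [0.0, 0.0, -1.0],
     [1.0, 0.0, 0.0]]
t = [0.0, 0.0, 0.0]
fx = fy = 1000.0          # hypothetical focal lengths in pixels
cx, cy = 960.0, 540.0     # hypothetical principal point for a 1920x1080 image

ahead = project(lidar_to_camera([10.0, 1.0, 0.5], R, t), fx, fy, cx, cy)
behind = project(lidar_to_camera([-10.0, 0.0, 0.0], R, t), fx, fy, cx, cy)

print(ahead, in_image(ahead, 1920, 1080))    # (860.0, 490.0) True
print(behind, in_image(behind, 1920, 1080))  # None False
```

A point 10 m ahead, slightly left and above the sensor, lands left of and above the image center, which is a quick sanity check that the axis conventions in the extrinsics are not swapped or flipped.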
About the Speaker
Daniel Gural is a seasoned Machine Learning Engineer at Voxel51 with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data.
Clustering in Computer Vision: From Theory to Applications
Clustering techniques are crucial in today’s AI landscape. They help organize unstructured data into meaningful groups, aiding knowledge discovery, feature analysis, and retrieval-augmented generation. From k-means to DBSCAN and hierarchical approaches like FINCH, selecting the right method is key, and it means balancing scalability, noise sensitivity, and computational demands. This presentation provides an in-depth exploration of state-of-the-art clustering techniques, with a strong focus on their applications within computer vision.
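For readers new to the topic, here is a minimal sketch of k-means (Lloyd's algorithm), the simplest of the methods named above. The 2D points stand in for image embeddings and, like the initial centroids, are hypothetical; this is not material from the talk.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2D points with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2D "embeddings" forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])

print(clusters[0])  # the three points near the origin
print(clusters[1])  # the three points near (10, 10)
```

k-means needs the number of clusters up front and assigns every point to a cluster, including outliers; density-based methods such as DBSCAN trade that simplicity for explicit noise handling.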
About the Speaker
Constantin Seibold leads a research group on the development of machine learning methods in the diagnostic and interventional radiology department at the University Hospital Heidelberg. His research aims to improve the daily life of both doctors and patients.
- Network event, 227 attendees from 44 groups hosting: Aug 29 - Visual Agents Workshop Part 3: Teaching Machines to See and Click
Welcome to the three-part Visual Agents Workshop virtual series, your hands-on opportunity to learn about visual agents: how they work, how to develop them, and how to fine-tune them.
Date and Time
Aug 29, 2025 at 9 AM Pacific
Part 3: Teaching Machines to See and Click - Model Finetuning
From Foundation Models to GUI Specialists
Foundation models, such as Qwen2.5-VL, demonstrate impressive visual understanding, but they require specialized training to master GUI interactions. In this final session, you'll transform a general-purpose vision-language model into a GUI specialist that can navigate interfaces with human-like precision.
We'll explore modern fine-tuning strategies specifically designed for GUI tasks, from selecting the right architecture to handling the unique challenges of coordinate prediction and multi-step reasoning. You'll implement training pipelines that can handle the diverse formats and platforms in your dataset, evaluate models on metrics that actually matter for GUI automation, and deploy your trained model in a real-world testing environment.
About the Instructor
Harpreet Sahota is a hacker-in-residence and machine learning engineer with a passion for deep learning and generative AI. He’s got a deep interest in RAG, Agents, and Multimodal AI.
- Network event, 289 attendees from 44 groups hosting: Sept 10 - Visual AI in Manufacturing and Robotics (Day 1)
Join us for the first in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics.
Date and Time
Sept 10 at 9 AM Pacific
Location
Virtual. Register for the Zoom!
Detecting the Unexpected: Practical Approaches to Anomaly Detection in Visual Data
Anomaly detection is one of computer vision's most exciting and essential challenges today, from spotting subtle defects in manufacturing to identifying edge cases in model behavior. In this session, we’ll do a hands-on walkthrough using the MVTec AD dataset, showcasing real-world workflows for data curation, exploration, and model evaluation. We’ll also explore the power of embedding visualizations and similarity searches to uncover hidden patterns and surface anomalies that often go unnoticed.
This session is packed with actionable strategies to help you make sense of your data and build more robust, reliable models. Join us as we connect the dots between data, models, and real-world deployment—alongside other experts driving innovation in anomaly detection.
About the Speaker
Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technological field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia.
Scaling Synthetic Data for Industrial AI: From CAD to Model in Hours
This talk explores how we generate high-performance computer vision datasets from CAD—without real-world images or manual labeling. We’ll walk through our synthetic data pipeline, including CPU-optimized defect simulation, material variation, and lighting workflows that scale to thousands of renders per part. While Blender plays a role, our focus is on how industrial data (like STEP files) and procedural generation unlock fast, flexible training sets for manufacturing QA, even on modest hardware. If you're working at the edge of 3D, automation, and vision AI—this is for you!
About the Speaker
Matt Puchalski is the founder and CEO of Bucket Robotics, a Y Combinator-backed startup building self-serve computer vision systems for manufacturing. Previously, he led robotics reliability at Argo AI and helped build and deploy autonomous vehicles at Stack AV and Uber ATG.
Swarm Intelligence: Solving Complex Industrial Optimization in Seconds
Manufacturing and logistics companies face increasingly complex operational challenges that traditional AI and human planning struggle to solve effectively. Collide Technology harnesses Swarm Intelligence algorithms to transform intractable problems—like scheduling hundreds or thousands of maintenance employees while simultaneously optimizing production capacity, inventory levels, and cross-sector resource allocation—into solutions delivered in seconds rather than weeks.
Unlike rigid Operations Research approaches that require specialized expertise and expensive implementations, our platform democratizes industrial optimization by making sophisticated decision-making accessible to any factory or logistics operation. We deliver holistic, data-driven solutions that optimize across multiple business entities and sectors simultaneously, adapting to real-world constraints and evolving operational needs.
About the Speaker
Frederick Gertz, PhD, has worked in AI for the manufacturing space for over a decade, delivering data science insights for medical and pharmaceutical manufacturing. Prior to that, he worked in nanotechnology with a focus on biophysics and nanomagnetics; his dissertation research on Magnonic Holographic Devices was named a runner-up for the 2014 Physics Breakthrough of the Year by Physics World.
- Network event, 198 attendees from 44 groups hosting: Sept 11 - Visual AI in Manufacturing and Robotics (Day 2)
Join us for day two in a series of virtual events to hear talks from experts on the latest developments at the intersection of Visual AI, Manufacturing and Robotics.
Date and Time
Sept 11 at 9 AM Pacific
Location
Virtual. Register for the Zoom!
Bringing Specialist Agents to the Physical World to Improve Manufacturing Output
U.S. manufacturing productivity (output per labor hour) has been stagnant since 2008, driven by a stall in both technology integration and the available workforce. RIOS Agents are collaborative AI perception and control systems that act as plant managers' eyes on the ground. Our Agents become specialists in a process, observing process steps, reporting on them, and ultimately controlling them by integrating into new or existing equipment. This enables factory production to be optimized in a way that was previously not possible.
About the Speaker
Clinton Smith is the co-founder and CEO of RIOS, whose AI agents watch, optimize and control production in various industrial facilities, including deep penetration into wood products and lumber. Clinton previously was a Senior Member of the Research Staff at Xerox PARC, leading multiple Dept. of Energy & Dept. of Defense projects, and holds a PhD in Electrical Engineering from Princeton University and a BS in Computer Engineering from Georgia Tech.
Accelerating Robotics with Simulation
In this session, Steve Xie, CEO of Lightwheel, shares how simulation-first workflows and high-quality SimReady assets are transforming the development of visual AI in manufacturing. From warehouse anomaly detection to worker safety and object identification, Steve will explore how physics-accurate simulation and synthetic datasets can drive scalable AI training with minimal real-world data. Drawing from Lightwheel’s deployment of robot models like GR00T N1 in factory environments, the talk highlights how unifying vision, language, and action in simulation accelerates real-world deployment while improving safety, generalization, and efficiency.
About the Speaker
Dr. Steve Xie is founder and CEO of Lightwheel, a company leading simulation infrastructure for embodied AI. Steve is a pioneer in generative-AI-powered simulation for robotics. He holds a B.S. from Peking University and a Ph.D. from Columbia University. Steve has led simulation efforts at NVIDIA and Cruise, where he built end-to-end synthetic data pipelines that set industry benchmarks for realism, scalability, and sim2real transfer.
Anomalib 2.0: Edge Inference and Model Deployment
When deploying models for inference, simply exporting the models and calling them via the inferencers does not work. There are challenges related to pre-processing and post-processing, and any deviation in these steps during inference impacts performance. This talk is about how we redesigned components of Anomalib to integrate the pre- and post-processing steps into the model graph.
About the Speaker
Samet Akcay is an AI Research Engineer at Intel who leads ML research and development efforts across multiple Open Edge Platform libraries, including Intel Geti, Datumaro, Anomalib, Training Extensions, and Vision Inference libraries. His research specializes in semi/self-supervised learning, zero/few-shot learning, and multi-modal object and anomaly detection. He is the creator of Anomalib, a major open-source anomaly detection library.
Exploring Robotic Manipulation Datasets using FiftyOne: DROID and Amazon Armbench
About the Speaker
Allen Lee is currently a Machine Learning Engineer at Voxel51. Before that, Allen was the Co-Founder and Consulting Engineer at Leap Scientific LLC, where they provided scientific software consultancy services related to computation, machine learning, and computer vision.
Past events (180)
- Network event, 416 attendees from 44 groups hosting: Aug 22 - Visual Agent Workshop Part 2: From Pixels to Predictions (this event has passed)