
What we're about
🖖 This group is for data scientists, machine learning engineers, and open source enthusiasts.
Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.
- Are you interested in speaking at a future Meetup?
- Is your company interested in sponsoring a Meetup?
This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.
Upcoming events (4+)
- Network event · 166 attendees from 36 organizer groups · May 30 - Best of WACV 2025
This is a virtual event taking place on May 29, 2025 at 9 AM Pacific.
Welcome to the Best of WACV 2025 virtual series, which highlights some of the groundbreaking research, insights, and innovations that defined this year's conference, live-streamed from the authors to you. The IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) is the premier international computer vision event, comprising the main conference and several co-located workshops and tutorials.
Iris Recognition for Infants
Non-invasive, efficient, physical token-less, accurate, and stable identification methods for newborns may prevent baby swapping at birth, limit baby abductions, and improve post-natal health monitoring across geographies, within both formal (e.g., hospitals) and informal (e.g., humanitarian and fragile settings) health sectors. This talk explores the feasibility of applying iris recognition as a biometric identifier for 4-to-6-week-old infants.
About the Speaker
Rasel Ahmed Bhuiyan is a fourth-year PhD student at the University of Notre Dame, supervised by Adam Czajka. His research focuses on iris recognition at life extremes, specifically infants and post-mortem cases.
Advancing Autonomous Simulation with Generative AI
Autonomous vehicle (AV) technology, including self-driving systems, is rapidly advancing but is hindered by the limited availability of diverse and realistic driving data. Traditional data collection methods, which deploy sensor-equipped vehicles to capture real-world scenarios, are costly, time-consuming, and risk-prone, especially for rare but critical edge cases.
We introduce the Autonomous Temporal Diffusion Model (AutoTDM), a foundation model that generates realistic, physics-consistent driving videos. By leveraging natural language prompts and integrating semantic sensory data inputs like depth maps, edge detection, segmentation maps, and camera positions, AutoTDM produces high-quality, consistent driving scenes that are controllable and adaptable to various simulation needs. This capability is crucial for developing robust autonomous navigation systems, as it allows for the simulation of long-duration driving scenarios under diverse conditions.
AutoTDM offers a scalable, cost-effective solution for training and validating autonomous systems, enhancing safety and accelerating industry advancements by simulating comprehensive driving scenarios in a controlled virtual environment, which marks a significant leap forward in autonomous vehicle development.
About the Speaker
Xiangyu Bai is a second-year PhD candidate at ACLab, Northeastern University, specializing in generative AI and computer vision, with a focus on autonomous simulation. His research centers on developing innovative, physics-aware generative vision frameworks that enhance simulation systems to provide realistic, scalable solutions for autonomous navigation. He has authored six papers in top-tier conferences and journals, including three as first author, highlighting his significant contributions to the field.
Classification of Infant Sleep–Wake States from Natural Overnight In-Crib Sleep Videos
Infant sleep plays a vital role in brain development, but conventional monitoring techniques are often intrusive or require extensive manual annotation, limiting their practicality. To address this, we develop a deep learning model that classifies infant sleep–wake states from 90-second video segments using a two-stream spatiotemporal architecture that fuses RGB frames with optical flow features. The model achieves over 80% precision and recall on clips dominated by a single state and demonstrates robust performance on more heterogeneous clips, supporting future applications in sleep segmentation and sleep quality assessment from full overnight recordings.
About the Speaker
Shayda Moezzi is pursuing a PhD in Computer Engineering at Northeastern University in the Augmented Cognition Lab, under the guidance of Professor Sarah Ostadabbas. Her current research focuses on computer vision techniques for video segmentation.
Leveraging Vision Language Models for Specialized Agricultural Tasks
Traditional plant stress phenotyping requires experts to annotate thousands of samples per task – a resource-intensive process limiting agricultural applications. We demonstrate that state-of-the-art Vision Language Models (VLMs) can achieve F1 scores of 73.37% across 12 diverse plant stress tasks using just a handful of annotated examples.
This work establishes how general-purpose VLMs with strategic few-shot learning can dramatically reduce annotation burden while maintaining accuracy, transforming specialized agricultural visual tasks.
About the Speaker
Muhammad Arbab Arshad is a Ph.D. candidate in Computer Science at Iowa State University, affiliated with AIIRA. His research focuses on Generative AI and Large Language Models, developing methodologies to leverage state-of-the-art AI models with limited annotated data for specialized tasks.
- Network event · 104 attendees from 37 organizer groups · June 17 - Databricks Mosaic AI + FiftyOne: Scaling Physical AI
When and Where
June 17, 2025 | 9:00 AM Pacific
About the Workshop
Ever tried to find something specific in your image or video datasets that wasn't already labeled? It's always been a frustrating and time-consuming experience.
Until now.
When you combine the FiftyOne computer vision toolkit with Mosaic AI from Databricks, you unlock lightning-fast vector search for the millions of images and videos in your data lake – to find exactly what you are looking for, even if there’s no label for it.
In this technical session, machine learning engineer Dan Gural will show you how the Mosaic AI integration works inside FiftyOne, featuring real-world mobility and autonomous driving use cases where you can search massive, state-of-the-art datasets in just seconds.
If you're working with edge cases, building smarter datasets, or are just curious about what Mosaic AI vector search can do for you, this one's for you.
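To make the vector-search idea concrete, here is a minimal from-scratch sketch: embed each item as a vector, then rank items by cosine similarity to a query vector. The embeddings and filenames below are made up for illustration; this is the underlying concept only, not the Mosaic AI or FiftyOne API.

```python
# Toy illustration of vector search: rank items by cosine similarity
# between their embedding and a query embedding.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(index, query_vec, k=2):
    """Return the k item ids whose embeddings are closest to query_vec."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(kv[1], query_vec),
                    reverse=True)
    return [item_id for item_id, _ in scored[:k]]

# Hypothetical 3-d "embeddings" standing in for real image embeddings
index = {
    "img_cat.jpg": [0.9, 0.1, 0.0],
    "img_dog.jpg": [0.8, 0.3, 0.1],
    "img_car.jpg": [0.0, 0.2, 0.9],
}
print(search(index, [1.0, 0.0, 0.0]))  # nearest neighbors of the query
```

Production systems replace the brute-force sort with an approximate nearest-neighbor index so searches over millions of embeddings stay fast, which is the role Mosaic AI vector search plays here.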
- Network event · 42 attendees from 38 organizer groups · June 18 - Getting Started with FiftyOne Workshop
When and Where
June 18, 2025 | 9:00 – 10:30 AM Pacific
About the Workshop
Want greater visibility into the quality of your computer vision datasets and models? Then join us for this free 90-minute, hands-on workshop to learn how to leverage the open source FiftyOne computer vision toolset.
At the end of the workshop, you'll be familiar with:
- Object detection
- Embeddings
- Mistakenness
- Deduplication
This workshop will explore the importance of taking a data-centric approach to computer vision workflows. We will start with importing and exploring visual data, then move to querying and filtering. Next, we’ll look at ways to extend FiftyOne’s functionality and simplify tasks using plugins and native integrations. We’ll generate candidate ground truth labels, and then wrap things up by evaluating the results of fine tuning a foundational model.
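One of the workshop topics above, deduplication, can be illustrated with a toy embedding-distance check: images whose embeddings fall within a small distance of each other are flagged as near-duplicates. The filenames, embeddings, and threshold below are invented for illustration; this mimics the idea only and is not FiftyOne's actual deduplication API.

```python
# Toy sketch of embedding-based deduplication: flag pairs of items
# whose embeddings are closer than a distance threshold.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_duplicates(embeddings, threshold=0.1):
    """Return pairs of ids whose embeddings are within `threshold`."""
    ids = sorted(embeddings)
    pairs = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if euclidean(embeddings[a], embeddings[b]) < threshold:
                pairs.append((a, b))
    return pairs

embeddings = {
    "frame_001.jpg": [0.12, 0.88],
    "frame_002.jpg": [0.13, 0.87],   # near-duplicate of frame_001
    "frame_099.jpg": [0.90, 0.10],
}
print(find_duplicates(embeddings))
```

In practice the embeddings come from a pretrained model rather than being hand-written, and the pairwise scan is replaced with an indexed similarity search for large datasets.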
Prerequisites: working knowledge of Python and basic computer vision concepts.
All attendees will get access to the tutorials, videos, and code examples used in the workshop.
About the Instructor
Antonio Rueda-Toicen, an AI Engineer in Berlin, has extensive experience in deploying machine learning models and has taught over 300 professionals. He is currently a Research Scientist at the Hasso Plattner Institute. Since 2019, he has organized the Berlin Computer Vision Group and taught at Berlin’s Data Science Retreat. He specializes in computer vision, cloud technologies, and machine learning. Antonio is also a certified instructor of deep learning and diffusion models in NVIDIA’s Deep Learning Institute.
- Network event · 75 attendees from 39 organizer groups · June 19 - AI, ML and Computer Vision Meetup
When and Where
June 19, 2025 | 10:00 AM Pacific
Online. Register for the Zoom.
Multi-Modal Rare Events Detection for SAE L2+ to L4
Events like a burst tire on the highway or a fallen motorcyclist occur rarely, and therefore pose extra challenges for autonomous vehicles. This talk explains methods for tackling such edge cases in road scenarios.
About the Speaker
Wolfgang Schulz is Product Owner for Lidar Perception at Continental. He has worked in the automotive industry since 2005. With his team, he currently works on components for an SAE L4 stack.
Voxel51 + NVIDIA Omniverse: Exploring the Future of Synthetic Data
Join us for a lightning talk on one of the most exciting frontiers in Visual AI: synthetic data. We'll showcase a sneak peek of the new integration between FiftyOne and NVIDIA Omniverse, featuring fully synthetic downtown scenes of San Jose. NVIDIA Omniverse is enabling the generation of ultra-precise synthetic sensor data, including LiDAR, RADAR, and camera feeds, while FiftyOne is making it easy to extract value from these rich datasets. Come see the future of sensor simulation and dataset curation in action, with pixel-perfect labels to match.
About the Speaker
Daniel Gural is a seasoned Machine Learning Engineer at Voxel51 with a strong passion for empowering Data Scientists and ML Engineers to unlock the full potential of their data.
Past events (21)
- Network event · 207 attendees from 36 organizer groups · May 29 - Best of WACV 2025 · This event has already taken place