What we’re about

🖖 This virtual group is for data scientists, machine learning engineers, and open source enthusiasts.

Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.

  • Are you interested in speaking at a future Meetup?
  • Is your company interested in sponsoring a Meetup?

Send me a DM on LinkedIn

This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.

Upcoming events

  • Network event
    Nov 24 - Best of ICCV (Day 4)
    Online
    130 attendees from 44 groups

    Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

    When and Where

    Nov 24, 2025
    9 AM Pacific
    Online.
    Register for the Zoom!

    VLM4D: Towards Spatiotemporal Awareness in Vision Language Models

    Are Vision-Language Models Ready for Physical AI? Humans easily understand how objects move, rotate, and shift, while current AI models that connect vision and language still make mistakes in what seem like simple situations: deciding “left” versus “right” when something is moving, recognizing how perspective changes, or keeping track of motion over time. To reveal these kinds of limitations, we created VLM4D, a testing suite made up of real-world and synthetic videos, each paired with questions about motion, rotation, perspective, and continuity. When we put modern vision-language models through these challenges, they performed far below human levels, especially when visual cues must be combined or the sequence of events must be maintained. But there is hope: new methods such as reconstructing visual features in 4D and fine-tuning focused on space and time show noticeable improvement, bringing us closer to AI that truly understands a dynamic physical world.

    About the Speaker

    Shijie Zhou is a final-year PhD candidate at UCLA, recipient of the 2026 Dissertation Year Award and the Graduate Dean’s Scholar Award. His research focuses on spatial intelligence, spanning 3D/4D scene reconstruction and generation, vision-language models, generative AI, and interactive agentic systems. His work has been recognized at top conferences including CVPR, ICCV, ECCV, ICLR, and NeurIPS, and has also led to practical impact through research internships at Google and Apple.

    DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization

    We tackle the challenge of jointly personalizing content and style from a few examples. A promising approach is to train separate Low-Rank Adapters (LoRA) and merge them effectively, preserving both content and style. Existing methods, such as ZipLoRA, treat content and style as independent entities, merging them by learning masks in LoRA's output dimensions. However, content and style are intertwined, not independent. To address this, we propose DuoLoRA, a content-style personalization framework featuring three key components: (i) rank-dimension mask learning, (ii) effective merging via layer priors, and (iii) Constyle loss, which leverages cycle-consistency in the merging process. First, we introduce ZipRank, which performs content-style merging within the rank dimension, offering adaptive rank flexibility and significantly reducing the number of learnable parameters.

    Additionally, we incorporate SDXL layer priors to apply implicit rank constraints informed by each layer's content-style bias and adaptive merger initialization, enhancing the integration of content and style. To further refine the merging process, we introduce Constyle loss, which leverages the cycle-consistency between content and style. Our experimental results demonstrate that DuoLoRA outperforms state-of-the-art content-style merging methods across multiple benchmarks.
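
    To make the rank-dimension idea concrete, here is a minimal, hypothetical PyTorch sketch (not the DuoLoRA code) of merging a content LoRA and a style LoRA with one learnable gate per rank; the dimensions, sigmoid gating, and simple additive merge are illustrative assumptions.

        import torch

        # Hypothetical sketch of rank-dimension LoRA merging (not the authors' implementation).
        # Each adapter is a pair (A: r x d_in, B: d_out x r); gating happens per rank,
        # so only 2*r parameters are learned for the merge.
        d_in, d_out, r = 768, 768, 16

        A_content, B_content = torch.randn(r, d_in), torch.randn(d_out, r)
        A_style,   B_style   = torch.randn(r, d_in), torch.randn(d_out, r)

        m_content = torch.nn.Parameter(torch.zeros(r))   # learnable per-rank gates
        m_style   = torch.nn.Parameter(torch.zeros(r))

        def merged_delta_w():
            # Scale each rank-1 component before recombining: delta_W = B diag(sigmoid(m)) A
            delta_content = B_content @ torch.diag(torch.sigmoid(m_content)) @ A_content
            delta_style   = B_style   @ torch.diag(torch.sigmoid(m_style))   @ A_style
            return delta_content + delta_style           # added to the frozen base weight

        print(merged_delta_w().shape)                    # torch.Size([768, 768])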

    About the Speaker

    Aniket Roy is a PhD student in Computer Science at Johns Hopkins University. Prior to that, he earned a Master’s degree from the Indian Institute of Technology Kharagpur, where he demonstrated strong research capabilities, publishing multiple papers in prestigious conferences and journals (including ICIP, CVPR Workshops, TCSVT, and IWDW). He was recognized with the Best Paper Award at IWDW 2016 and the Markose Thomas Memorial Award for the best research thesis at the Master’s level. Aniket continued to pursue research as a PhD student under the guidance of renowned vision researcher Professor Rama Chellappa at Johns Hopkins University. There, he explored the domains of few-shot learning, multimodal learning, diffusion models, LLMs, and LoRA merging through publications in leading venues such as NeurIPS, ICCV, TMLR, WACV, and CVPR. He also gained valuable industrial experience through internships at esteemed organizations, including Amazon, Qualcomm, MERL, and SRI International. He was also named an Amazon Fellow (2023-24) at JHU and invited to attend the ICCV'25 doctoral consortium.

    Rethinking Few Shot CLIP Benchmarks: A Critical Analysis in the Inductive Setting

    CLIP is a foundational model with transferable classification performance in the few-shot setting. Several methods have shown improved performance of CLIP using few-shot examples. However, so far, all these techniques have been benchmarked using standard few-shot datasets. We argue that this mode of evaluation does not provide a true indication of the inductive generalization ability using few-shot examples. As most datasets have been seen by the CLIP model, the resultant setting can be termed partially transductive. To solve this, we propose a pipeline that uses an unlearning technique to obtain true inductive baselines. In this new inductive setting, the methods show a significant drop in performance (-55% on average among 13 baselines with multiple datasets). We validate the unlearning technique using oracle baselines. An improved few-shot classification technique is proposed that consistently obtains state-of-the-art performance over 13 other recent baseline methods in a comprehensive analysis of 5880 experiments, varying the datasets, the number of few-shot examples, the unlearning setting, and the random seeds. Thus, we identify the issue with the evaluation of CLIP-based few-shot classification, provide a solution using unlearning, propose new benchmarks, and provide an improved method.
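
    For background on what such a few-shot baseline looks like in practice, the snippet below sketches a generic linear probe on frozen CLIP image features (a standard baseline, not the paper's proposed method or its unlearning pipeline); the image paths and labels are placeholders.

        import torch
        from PIL import Image
        from transformers import CLIPModel, CLIPProcessor
        from sklearn.linear_model import LogisticRegression

        # Generic few-shot linear probe on frozen CLIP features (illustrative only).
        model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
        processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

        def embed(paths):
            images = [Image.open(p).convert("RGB") for p in paths]
            inputs = processor(images=images, return_tensors="pt")
            with torch.no_grad():
                feats = model.get_image_features(**inputs)
            return torch.nn.functional.normalize(feats, dim=-1).numpy()

        # Placeholder few-shot examples; in the inductive setting these would come
        # from a dataset the CLIP backbone has not effectively seen during pretraining.
        support_paths, support_labels = ["cat_1.jpg", "dog_1.jpg"], [0, 1]
        query_paths = ["unknown.jpg"]

        probe = LogisticRegression(max_iter=1000).fit(embed(support_paths), support_labels)
        print(probe.predict(embed(query_paths)))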

    About the Speaker

    Alexey Kravets is a PhD student in AI at the University of Bath, with over five years of experience working as a Lead Data Scientist at Aviva. His current research primarily focuses on vision and language models, few-shot learning, machine unlearning, and mechanistic interpretability. Before his PhD, he led significant machine learning projects at Aviva, a FTSE 100 insurer in the UK, including the development of NLP tools for insurance predictions. His passion for AI extends into writing, where he regularly shares insights through articles on Medium.

    Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)

    Tracking and forecasting the rotation of objects is fundamental in computer vision and robotics, yet SO(3) extrapolation remains challenging as (1) sensor observations can be noisy and sparse, (2) motion patterns can be governed by complex dynamics, and (3) application settings can demand long-term forecasting. This work proposes modeling continuous-time rotational object dynamics on SO(3) using Neural Controlled Differential Equations guided by Savitzky-Golay paths. Unlike existing methods that rely on simplified motion assumptions, our method learns a general latent dynamical system of the underlying object trajectory while respecting the geometric structure of rotations. Experimental results on real-world data demonstrate compelling forecasting capabilities compared to existing approaches.
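
    As a rough illustration of the Savitzky-Golay ingredient only (not the authors' Neural CDE model), the snippet below smooths a noisy, sparsely observed quaternion trajectory into a differentiable control path; the synthetic trajectory, window length, and polynomial order are made-up values.

        import numpy as np
        from scipy.signal import savgol_filter
        from scipy.spatial.transform import Rotation as R

        # Smooth noisy quaternion observations with a Savitzky-Golay filter, then
        # re-project onto unit quaternions; a Neural CDE could use the resulting
        # path as its control signal.
        rng = np.random.default_rng(0)
        t = np.linspace(0.0, 1.0, 50)
        true_rots = R.from_euler("z", 180.0 * t, degrees=True)         # simple spin about z
        noisy_quats = true_rots.as_quat() + 0.02 * rng.standard_normal((50, 4))

        smoothed = savgol_filter(noisy_quats, window_length=11, polyorder=3, axis=0)
        smoothed /= np.linalg.norm(smoothed, axis=1, keepdims=True)    # back onto unit quaternions

        print(R.from_quat(smoothed).as_euler("zyx", degrees=True)[:3])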

    About the Speaker

    Lennart Bastian is a PhD candidate at TU Munich's CAMP lab under Prof. Nassir Navab, and an incoming research fellow at Imperial College London. Originally trained in applied mathematics (with early stints in NYC and California's tech scene), he found his calling at the intersection of geometry, machine learning, and clinical applications. His work focuses on making sense of the real world in 3D, teaching computers to understand geometry and what happens in complex surgical environments.

    UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields

    Neural Radiance Field (NeRF)-based segmentation methods focus on object semantics and rely solely on RGB data, lacking intrinsic material properties. This limitation restricts accurate material perception, which is crucial for robotics, augmented reality, simulation, and other applications. We introduce UnMix-NeRF, a framework that integrates spectral unmixing into NeRF, enabling joint hyperspectral novel view synthesis and unsupervised material segmentation. Our method models spectral reflectance via diffuse and specular components, where a learned dictionary of global endmembers represents pure material signatures, and per-point abundances capture their distribution. For material segmentation, we use spectral signature predictions along with the learned endmembers, allowing unsupervised material clustering. Additionally, UnMix-NeRF enables scene editing by modifying learned endmember dictionaries for flexible material-based appearance manipulation. Extensive experiments validate our approach, demonstrating superior spectral reconstruction and material segmentation compared to existing methods.
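
    The linear mixing model behind this idea fits in a few lines (an illustrative sketch, not the UnMix-NeRF implementation); the sizes and the softmax parameterization of abundances are assumptions.

        import torch

        # Each 3D point's reflectance is a convex combination of a small dictionary of
        # global endmember spectra; the per-point abundances double as soft material
        # assignments for unsupervised segmentation.
        num_points, num_endmembers, num_bands = 1024, 6, 128

        endmembers = torch.rand(num_endmembers, num_bands)            # learned pure-material spectra
        abundance_logits = torch.randn(num_points, num_endmembers)    # predicted per 3D point

        abundances = torch.softmax(abundance_logits, dim=-1)          # non-negative, sum to one
        reflectance = abundances @ endmembers                         # (num_points, num_bands)

        material_labels = abundances.argmax(dim=-1)                   # cluster by dominant endmember
        print(reflectance.shape, material_labels.shape)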

    About the Speaker

    Fabian Perez is a master’s student in computer science at Universidad Industrial de Santander (UIS) in Colombia. He has strong skills in software development and deep learning, and his expertise across both areas allows him to create innovative solutions by bringing them together.

    1 attendee from this group
  • Network event
    Dec 4 - AI, ML and Computer Vision Meetup
    Online
    247 attendees from 47 groups

    Join the virtual Meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.

    Register for the Zoom

    Date and Time

    Dec 4, 2025
    9:00 - 11:00 AM Pacific

    Benchmarking Vision-Language Models for Autonomous Driving Safety

    This workshop introduces a unified framework for evaluating how vision-language models handle driving safety. Using an enhanced BDDOIA dataset with scene, weather, and action labels, we benchmark models like Gemini, FastVLM, and Qwen within FiftyOne. Our results show consistent blind spots where models misjudge unsafe situations, highlighting the need for safer and more interpretable AI systems for autonomous driving.
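
    For readers who want to set up a similar comparison themselves, here is a minimal FiftyOne sketch for scoring safety judgments against ground truth; the file path, field names, and labels are placeholders, not the enhanced BDDOIA setup used in the talk.

        import fiftyone as fo

        # Minimal pattern: store ground-truth safety labels and a VLM's judgments as
        # classifications, then evaluate them (placeholder data, not the talk's dataset).
        dataset = fo.Dataset("vlm_driving_safety_demo")

        sample = fo.Sample(filepath="/data/frame_0001.jpg")           # hypothetical driving frame
        sample["ground_truth"] = fo.Classification(label="unsafe")
        sample["vlm_pred"] = fo.Classification(label="safe")          # one model's safety call
        dataset.add_sample(sample)

        results = dataset.evaluate_classifications(
            "vlm_pred", gt_field="ground_truth", eval_key="vlm_eval"
        )
        results.print_report()

        session = fo.launch_app(dataset)                              # inspect misjudged scenes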

    About the Speaker

    Adonai Vera is a Machine Learning Engineer and DevRel at Voxel51, with over 7 years of experience building computer vision and machine learning models using TensorFlow, Docker, and OpenCV. He started as a software developer, moved into AI, led teams, and served as CTO. Today, he connects code and community to build open, production-ready AI, making technology simple, accessible, and reliable.

    TrueRice: AI-Powered Visual Quality Control for Rice Grains and Beyond at Scale

    Agriculture remains one of the most under-digitized industries, yet grain quality control defines pricing, trust, and livelihoods for millions. TrueRice is an AI-powered analyzer that turns a flatbed scanner into a high-precision, 30-second QC engine, replacing the 2+ hours and subjectivity of manual quality inspection.

    Built on a state-of-the-art 8K image processing pipeline with SAHI (Slicing Aided Hyper Inference), it detects fine-grained kernel defects at scale with high accuracy across grain size, shape, breakage, discoloration, and chalkiness. Now being extended to maize and coffee, TrueRice showcases how cross-crop transfer learning and frugal AI engineering can scale precision QC for farmers, millers, and exporters. This talk will cover the design principles, model architecture choices, and a live demonstration, while addressing challenges in data variability, regulatory standards, and cross-crop adaptation.
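
    For context on the slicing step, here is a minimal SAHI-style sliced-inference sketch; the detector weights, image path, slice sizes, and thresholds are placeholders rather than the TrueRice configuration.

        from sahi import AutoDetectionModel
        from sahi.predict import get_sliced_prediction

        # Slice a very large scan into overlapping tiles, run the detector on each tile,
        # and merge the detections back into full-image coordinates.
        detection_model = AutoDetectionModel.from_pretrained(
            model_type="yolov8",
            model_path="kernel_defect_detector.pt",   # hypothetical detector weights
            confidence_threshold=0.4,
            device="cpu",
        )

        result = get_sliced_prediction(
            "rice_scan_8k.jpg",                       # hypothetical high-resolution flatbed scan
            detection_model,
            slice_height=1024,
            slice_width=1024,
            overlap_height_ratio=0.2,
            overlap_width_ratio=0.2,
        )

        print(len(result.object_prediction_list), "kernel detections")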

    About the Speaker

    Sai Jeevan Puchakayala is an Interdisciplinary AI/ML Consultant, Researcher, and Tech Lead at Sustainable Living Lab (SL2) India, where he drives development of applied AI solutions for agriculture, climate resilience, and sustainability. He led the engineering of TrueRice, an award-winning grain quality analyzer that won India’s first International Agri Hackathon 2025.

    WeedNet: A Foundation Model Based Global-to-Local AI Approach for Real-Time Weed Species Identification and Classification

    Early and accurate weed identification is critical for effective management, yet current AI-based approaches face challenges due to limited expert-verified datasets and the high variability in weed morphology across species and growth stages. We present WeedNet, a global-scale weed identification model designed to recognize a wide range of species, including noxious and invasive plants. WeedNet is an end-to-end real-time pipeline that integrates self-supervised pretraining, fine-tuning, and trustworthiness strategies to improve both accuracy and reliability.

    Building on this foundation, we introduce a Global-to-Local strategy: while the Global WeedNet model provides broad generalization, we fine-tune local variants such as Iowa WeedNet to target region-specific weed communities in the U.S. Midwest. Our evaluation addresses both intra-species diversity (different growth stages) and inter-species similarity (look-alike species), ensuring robust performance under real-world variability. We further validate WeedNet on images captured by drones and ground rovers, demonstrating its potential for deployment in robotic platforms. Beyond field applications, we integrate a conversational AI to enable practical decision-support tools for farmers, agronomists, researchers, and land managers worldwide. These advances position WeedNet as a foundational model for intelligent, scalable, and regionally adaptable weed management and ecological conservation.

    About the Speaker

    Timilehin Ayanlade is a Ph.D. candidate in the Self-aware Complex Systems Laboratory at Iowa State University, where his research focuses on developing machine learning and computer vision methods for agricultural applications. His work integrates multimodal data across ground-based sensing, UAV, and satellite with advanced AI models to tackle challenges in weed identification, crop monitoring, and crop yield prediction.

    Memory Matters: Early Alzheimer’s Detection with AI-Powered Mobile Tools

    Advancements in artificial intelligence and mobile technology are transforming the landscape of neurodegenerative disease detection, offering new hope for early intervention in Alzheimer’s.
    By integrating machine learning algorithms with everyday mobile devices, we are entering a new era of accessible, scalable, and non-invasive tools for early Alzheimer’s detection.
    In this talk, we’ll cover the potential of AI in healthcare systems and the ethical considerations involved, plus a deep dive into the architecture, models, datasets, and framework.

    About the Speaker

    Reetam Biswas has more than 18 years of experience in the IT industry as a software architect, currently working on AI.

    3 attendees from this group
  • Network event
    Dec 11 - Visual AI for Physical AI Use Cases
    Online
    158 attendees from 47 groups

    Join our virtual meetup to hear talks from experts on cutting-edge topics across Visual AI for Physical AI use cases.

    Date, Time and Location

    Dec 11, 2025
    9:00-11:00 AM Pacific
    Online.
    Register for the Zoom!

    From Data to Open-World Autonomous Driving

    Data is key for advances in machine learning, including in mobile applications such as robots and autonomous cars. To ensure reliable operation, the scenarios that occur in the real world must be reflected in the underlying dataset. Since open-world environments can contain unknown scenarios and novel objects, active learning from online data collection and the handling of unknowns are required. In this talk, we discuss different approaches to addressing these real-world requirements.

    About the Speaker

    Sebastian Schmidt is a PhD student in the Data Analytics and Machine Learning group at TU Munich and part of an industrial PhD program with the BMW research group. His work is mainly focused on open-world active learning and perception for autonomous vehicles.

    From Raw Sensor Data to Reliable Datasets: Physical AI in Practice

    Modern mobility systems rely on massive, high-quality multimodal datasets — yet real-world data is messy. Misaligned sensors, inconsistent metadata, and uneven scenario coverage can slow development and lead to costly model failures. The Physical AI Workbench, built in collaboration between Voxel51 and NVIDIA, provides an automated and scalable pipeline for auditing, reconstructing, and enriching autonomous driving datasets.

    In this talk, we’ll show how FiftyOne serves as the central interface for inspecting and validating sensor alignment, scene structure, and scenario diversity, while NVIDIA Neural Reconstruction (NuRec) enables physics-aware reconstruction directly from real-world captures. We’ll highlight how these capabilities support automated dataset quality checks, reduce manual review overhead, and streamline the creation of richer datasets for model training and evaluation.

    Attendees will gain insight into how Physical AI workflows help mobility teams scale, improve dataset reliability, and accelerate iteration from data capture to model deployment — without rewriting their infrastructure.

    About the Speaker

    Daniel Gural leads technical partnerships at Voxel51, where he’s building the Physical AI Workbench, a platform that connects real-world sensor data with realistic simulation to help engineers better understand, validate, and improve their perception systems. He has a background in developer relations and computer vision engineering.

    Building Smarter AV Simulation with Neural Reconstruction and World Models

    This talk explores how neural reconstruction and world models are coming together to create richer, more dynamic simulation for scalable autonomous vehicle development. We’ll look at the latest releases in 3D Gaussian splatting techniques and world reasoning and generation, as well as discuss how these technologies are advancing the deployment of autonomous driving stacks that can generalize to any environment. We’ll also cover NVIDIA open models, frameworks, and data to help kickstart your own development pipelines.

    About the Speaker

    Katie Washabaugh is NVIDIA’s Product Marketing Manager for Autonomous Vehicle Simulation, focusing on virtual solutions for real world mobility. A former journalist at publications such as Automotive News and MarketWatch, she joined the NVIDIA team in 2018 as Automotive Content Marketing Manager. Katie holds a B.A. in public policy from the University of Michigan and lives in Detroit.

    Relevance of Classical Algorithms in Modern Autonomous Driving Architectures

    While modern autonomous driving systems increasingly rely on machine learning and deep neural networks, classical algorithms continue to play a foundational role in ensuring reliability, interpretability, and real-time performance. Techniques such as Kalman filtering, A* path planning, PID control, and SLAM remain integral to perception, localization, and decision-making modules. Their deterministic nature and lower computational overhead make them especially valuable in safety-critical scenarios and resource-constrained environments. This talk explores the enduring relevance of classical algorithms, their integration with learning-based methods, and their evolving scope in the context of next-generation autonomous vehicle architectures.
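
    As a reminder of how lightweight these estimators are, here is a textbook 1D constant-velocity Kalman filter (a generic sketch, not tied to any particular vehicle stack).

        import numpy as np

        dt = 0.1
        F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition (position, velocity)
        H = np.array([[1.0, 0.0]])               # we only measure position
        Q = 1e-3 * np.eye(2)                     # process noise covariance
        R = np.array([[0.05]])                   # measurement noise covariance

        x = np.zeros((2, 1))                     # state estimate
        P = np.eye(2)                            # estimate covariance

        def kalman_step(z):
            global x, P
            x_pred = F @ x                                   # predict
            P_pred = F @ P @ F.T + Q
            y = np.array([[z]]) - H @ x_pred                 # innovation
            S = H @ P_pred @ H.T + R
            K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain
            x = x_pred + K @ y                               # update
            P = (np.eye(2) - K @ H) @ P_pred

        for z in [0.0, 0.11, 0.19, 0.32, 0.40]:              # noisy position readings
            kalman_step(z)
        print(x.ravel())                                     # fused position and velocity estimate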

    About the Speaker

    Prajwal Chinthoju is an Autonomous Driving Feature Development Engineer with a strong foundation in systems engineering, optimization, and intelligent mobility. He specializes in integrating classical algorithms with modern AI techniques to enhance perception, planning, and control in autonomous vehicle platforms.

    3 attendees from this group
  • Network event
    Dec 16 - Building and Auditing Physical AI Pipelines with FiftyOne
    Online
    97 attendees from 47 groups

    This hands-on workshop introduces you to the Physical AI Workbench, a new layer of FiftyOne designed for autonomous vehicle, robotics, and 3D vision workflows. You’ll learn how to bridge the gap between raw sensor data and production-quality datasets, all from within FiftyOne’s interactive interface.

    Date, Time and Location

    Dec 16, 2025
    9:00-10:00 AM Pacific
    Online.
    Register for the Zoom!

    Through live demos, you’ll explore how to:

    • Audit: Automatically detect calibration errors, timestamp misalignments, incomplete frames, and other integrity issues that arise from dataset format drift over time.
    • Generate: Reconstruct and augment your data using NVIDIA pathways such as NuRec, COSMOS, and Omniverse, enabling realistic scene synthesis and physical consistency checks.
    • Enrich: Integrate auto-labeling, embeddings, and quality scoring pipelines to enhance metadata and accelerate model training.
    • Export and Loop Back: Seamlessly export to and re-import from interoperable formats like NCore to verify consistency and ensure round-trip fidelity.

    You’ll gain hands-on experience with a complete physical AI dataset lifecycle—from ingesting real-world AV datasets like nuScenes and Waymo, to running 3D audits, projecting LiDAR into image space, and visualizing results in FiftyOne’s UI. Along the way, you’ll see how Physical AI Workbench automatically surfaces issues in calibration, projection, and metadata—helping teams prevent silent data drift and ensure reliable dataset evolution.
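
    As a taste of the LiDAR-to-image step mentioned above, here is a generic pinhole projection sketch; the intrinsics and extrinsics are placeholder values, and this is plain NumPy rather than the Physical AI Workbench API.

        import numpy as np

        def project_lidar_to_image(points_lidar, T_cam_from_lidar, K):
            """points_lidar: (N, 3) points in the LiDAR frame,
            T_cam_from_lidar: (4, 4) extrinsics, K: (3, 3) camera intrinsics."""
            pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
            pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
            pts_cam = pts_cam[pts_cam[:, 2] > 0]             # keep points in front of the camera
            pix = (K @ pts_cam.T).T
            uv = pix[:, :2] / pix[:, 2:3]                    # perspective divide -> pixel coords
            return uv, pts_cam[:, 2]                         # pixel locations and depths

        K = np.array([[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]])  # placeholder intrinsics
        T = np.eye(4)                                                                  # placeholder extrinsics
        uv, depth = project_lidar_to_image(np.random.rand(1000, 3) * 50.0, T, K)
        print(uv.shape, depth.min(), depth.max())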

    By the end, you’ll understand how the Physical AI Workbench standardizes the process of building calibrated, complete, and simulation-ready datasets for the physical world.

    Who should attend

    Data scientists, AV/ADAS engineers, robotics researchers, and computer vision practitioners looking to standardize and scale physical-world datasets for model development and simulation.

    About the Speaker

    Daniel Gural leads technical partnerships at Voxel51, where he’s building the Physical AI Workbench, a platform that connects real-world sensor data with realistic simulation to help engineers better understand, validate, and improve their perception systems.

Members

1,221