
What we’re about

🖖 This group is for data scientists, machine learning engineers, and open source enthusiasts.

Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.

  • Are you interested in speaking at a future Meetup?
  • Is your company interested in sponsoring a Meetup?

Send me a DM on LinkedIn

This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.

Upcoming events

  • Network event
    Dec 12 - AI, ML and Computer Vision Meetup en Espanol
    Online
    29 attendees from 5 groups

    Join the Meetup to hear talks in Spanish from experts on cutting-edge topics across AI, ML, and computer vision...en Espanol!

    Date and Location

    Dec 12, 2025
    9 AM - Noon Pacific
    Online.
    Register for the Zoom!

    Generative AI in DevSecOps: Intelligent Pipeline Automation

    Discover how generative AI is revolutionizing the creation and management of pipelines in Azure DevOps and GitHub Actions. In this hands-on session, we will explore how to automate the generation of CI/CD pipelines that automatically comply with corporate standards, using intelligent templates and predictive code analysis.
    You will learn how to implement a system that interprets changes in pull requests, predicts quality and security issues, and proactively ensures regulatory compliance. We will also see how to document policies so that the AI makes consistent decisions about warnings and errors.
    ---
    Dachi Gogotchuri is the founder of Arcasiles Group and Platform Engineering Lead at Nationale Nederlanden Spain, shaping platforms, communities, and the future through real innovation.

    Beyond the Lab: Real-World Anomaly Detection for Computer Vision in Agriculture

    Anomaly detection is transforming manufacturing and surveillance, but what about agriculture? Can AI really detect plant diseases and pest damage early enough to make a difference?

    This talk demonstrates how anomaly detection identifies and localizes problems in crops, using coffee leaf health as the main example. We will start with the fundamental theory and then examine how these models detect rust and leaf-miner damage in leaf images.

    The session includes a practical, end-to-end workflow using the open source computer vision tool FiftyOne, covering dataset curation, patch extraction, model training, and result visualization. You will gain both a theoretical understanding of anomaly detection in computer vision and hands-on experience applying these techniques to agricultural challenges and other domains.
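    If you want to experiment with a similar curation step before the talk, here is a minimal FiftyOne sketch in the spirit of the workflow above; the directory layout, label names, and dataset name are hypothetical, and this is not the speaker's actual code.

    import fiftyone as fo
    import fiftyone.brain as fob

    # Hypothetical layout: coffee_leaves/{healthy,rust,miner_damage}/*.jpg
    dataset = fo.Dataset.from_dir(
        dataset_dir="coffee_leaves",
        dataset_type=fo.types.ImageClassificationDirectoryTree,
        name="coffee-leaf-health",
    )

    # Compute a 2D embedding map to spot clusters and outliers during curation
    fob.compute_visualization(dataset, brain_key="leaf_embeddings")

    # Explore images, labels, and the embedding plot in the interactive App
    session = fo.launch_app(dataset)
    session.wait()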
    ---
    Paula Ramos has a PhD in Computer Vision and Machine Learning, with more than 20 years of experience in the technology field. She has been developing novel integrated engineering technologies, mainly in Computer Vision, robotics, and Machine Learning applied to agriculture, since the early 2000s in Colombia.

    Electrical Activation Analysis in Response to Visual Stimuli: An Application in Advertising

    Every day, we are exposed to hundreds of advertising campaigns; however, only about 12% of all this advertising leaves a lasting impression in our brains, highlighting the importance of capturing consumer attention. Evaluating the effectiveness of an advertising campaign allows us to predict its potential success. This practice has been employed by traditional marketing companies for many years, but its results can be influenced. In this research, the acquisition and pre-processing of electroencephalographic (EEG) signals generated while viewing visual advertising campaigns are conducted. These signals reflect individuals' autonomic responses and are not consciously or voluntarily fabricated reactions to stimuli. Subsequently, an electrical activation analysis of the cerebral cortex and visualization of the EEG signals are performed through a three-dimensional representation on a standardized brain model.

    The brain regions with the highest electrical activation are analyzed and compared using two mathematical and computational techniques, one linear and one non-linear. The neural responses to the advertising images are compared against the brain representation during the performance of cognitive tasks involving selective attention and implicit memory. This allows us to infer the occurrence of these cognitive processes, evoked by marketing campaigns (visual ads), which are essential constructs for studying consumer behavior.

    The results indicate that visual advertising campaigns containing linguistic and cultural elements embedded in the graphic designs trigger greater brain activation, which is associated with the cognitive processes of selective attention and implicit memory. Thus, it can be concluded that this type of advertising image increases the probability of influencing purchasing decisions.
    ---
    Victor Alfonso is an Electronic Engineer from the Technological University of Pereira, with postgraduate studies in education and pedagogy. He holds a Master’s degree in Physical Instrumentation and is currently a Ph.D. student in Engineering at UTP.

    Movement as Story: Designing Empowering Workout Experiences with AI

    People have a reason to work out, and behind every reason there is a story, whether the goal is strength, stress relief, or transformation. In contemporary culture, these stories are often shared digitally, where exercise becomes not just a performative act but a resonant gesture within communities of friends, families, and clubs.

    This project explores how real-time workout detection can amplify such narratives by translating physical poses into audiovisual messages. Each detected pose becomes a trigger for text or visual cues, ranging from poetic phrases and evocative song lyrics to quotes aligned with the symbolic act of the movement. This way, exercise becomes both action and expression, creating a personalized narrative layered on top of the physical activity.

    The system leverages computer vision, machine learning, large language models, and design software to construct a responsive application that shapes meaning alongside exercise. It further investigates how personalized outputs can empower individuals and reinforce collective resonance. Use cases developed with Voxel51 dependencies will also be presented.
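    As a toy illustration of the pose-to-message idea described above (an assumption-laden sketch, not the project's implementation), the mapping might look something like this, with detect_pose() standing in for a real pose-classification model:

    from typing import Optional

    # Hypothetical mapping from detected pose labels to the cues they trigger
    POSE_CUES = {
        "squat": "Ground yourself; strength grows from the roots up.",
        "warrior": "Hold the line. This is your story being written.",
        "rest": "Breathe. Recovery is part of the movement.",
    }

    def detect_pose(frame) -> Optional[str]:
        """Hypothetical stand-in for a real-time pose classifier."""
        # A real system would run a pose-estimation model on the video frame here
        return None

    def cue_for_frame(frame) -> Optional[str]:
        """Return the text cue triggered by the pose detected in a video frame."""
        pose = detect_pose(frame)
        return POSE_CUES.get(pose) if pose else None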
    ---
    Jose Bringas is a creative technologist exploring how emerging technologies, real-time systems, and AI can expand human expression. With a background spanning visual effects, motion design, virtual reality, and interactive media, he brings an out-of-the-box mindset to designing responsive experiences.

    17 attendees from this group
  • Network event
    Dec 16 - Building and Auditing Physical AI Pipelines with FiftyOne
    Online
    156 attendees from 47 groups

    This hands-on workshop introduces you to the Physical AI Workbench, a new layer of FiftyOne designed for autonomous vehicle, robotics, and 3D vision workflows. You’ll learn how to bridge the gap between raw sensor data and production-quality datasets, all from within FiftyOne’s interactive interface.

    Date, Time and Location

    Dec 16, 2025
    9:00-10:00 AM Pacific
    Online.
    Register for the Zoom!

    Through live demos, you’ll explore how to:

    • Audit: Automatically detect calibration errors, timestamp misalignments, incomplete frames, and other integrity issues that arise from dataset format drift over time.
    • Generate: Reconstruct and augment your data using NVIDIA pathways such as NuRec, COSMOS, and Omniverse, enabling realistic scene synthesis and physical consistency checks.
    • Enrich: Integrate auto-labeling, embeddings, and quality scoring pipelines to enhance metadata and accelerate model training.
    • Export and Loop Back: Seamlessly export to and re-import from interoperable formats like NCore to verify consistency and ensure round-trip fidelity.

    You’ll gain hands-on experience with a complete physical AI dataset lifecycle—from ingesting real-world AV datasets like nuScenes and Waymo, to running 3D audits, projecting LiDAR into image space, and visualizing results in FiftyOne’s UI. Along the way, you’ll see how Physical AI Workbench automatically surfaces issues in calibration, projection, and metadata—helping teams prevent silent data drift and ensure reliable dataset evolution.
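    To get a feel for the multimodal FiftyOne foundation the workshop builds on, you can browse a small grouped dataset from the zoo ahead of time. This sketch uses the public "quickstart-groups" dataset as a stand-in for an AV dataset and does not show the Physical AI Workbench APIs themselves.

    import fiftyone as fo
    import fiftyone.zoo as foz

    # Small grouped dataset with camera-image and point-cloud slices per scene
    dataset = foz.load_zoo_dataset("quickstart-groups")
    print(dataset.group_slices)  # e.g. ['left', 'right', 'pcd']

    # Work with one modality at a time, e.g. only the point-cloud samples
    pcd_view = dataset.select_group_slices("pcd")
    print(len(pcd_view), "point-cloud samples")

    # Browse images and 3D data side by side in the App
    session = fo.launch_app(dataset)
    session.wait()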

    By the end, you’ll understand how the Physical AI Workbench standardizes the process of building calibrated, complete, and simulation-ready datasets for the physical world.

    Who should attend

    Data scientists, AV/ADAS engineers, robotics researchers, and computer vision practitioners looking to standardize and scale physical-world datasets for model development and simulation.

    About the Speaker

    Daniel Gural leads technical partnerships at Voxel51, where he’s building the Physical AI Workbench, a platform that connects real-world sensor data with realistic simulation to help engineers better understand, validate, and improve their perception systems.

    2 attendees from this group
  • Network event
    Jan 13 - Designing Data Infrastructures for Multimodal Mobility Datasets
    Online
    169 attendees from 47 groups

    This technical workshop focuses on the data infrastructure required to build and maintain production-grade mobility datasets at fleet scale.

    Date, Time and Location

    Jan 13, 2026
    9:00-10:00 AM Pacific
    Online.
    Register for the Zoom!

    We will examine how to structure storage, metadata, access patterns, and quality controls so that mobility teams can treat perception datasets as first-class, versioned “infrastructure” assets. The session will walk through how to design a mobility data stack that connects object storage, labeling systems, simulation environments, and experiment tracking into a coherent, auditable pipeline.

    What you’ll learn:

    • Model the mobility data plane: Define schemas for camera, LiDAR, radar, and HD map data, and represent temporal windows, ego poses, and scenario groupings in a way that is queryable and stable under schema evolution.
    • Build a versioned dataset catalog with FiftyOne: Use FiftyOne’s customized workspaces and views to represent canonical datasets and integrate with your raw data sources, all while preserving lineage between raw logs, curated data, and simulation inputs.
    • Implement governance and access control on mobility data: Configure role-based access and auditable pipelines to enforce data residency constraints while encouraging multi-team collaboration across research, perception, and safety functions.
    • Operationalize curation and scenario mining workflows: Use FiftyOne’s embeddings and labeling capabilities to surface rare events such as adverse weather and sensor anomalies, assign review tasks, and codify “critical scenario” definitions as reproducible dataset views (see the sketch after this list).
    • Close the loop with evaluation and feedback signals: Connect FiftyOne to training and evaluation pipelines so that model failures feed back into dataset updates.
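
    As a concrete example of the last two items, here is a minimal FiftyOne sketch of codifying a “critical scenario” as a reproducible saved view; the dataset name and the weather/sensor fields are hypothetical, not part of the workshop materials.

    import fiftyone as fo
    from fiftyone import ViewField as F

    # Hypothetical catalog entry created elsewhere in the pipeline
    dataset = fo.load_dataset("mobility-fleet-logs")

    # "Critical scenario": adverse-weather frames with a flagged sensor anomaly
    critical = dataset.match(
        (F("weather") == "rain") & (F("sensor_status") == "anomaly")
    )

    # Persist the definition so every team queries the exact same view
    dataset.save_view("critical-rain-sensor-anomaly", critical)

    # Any downstream pipeline can reload it by name
    view = dataset.load_saved_view("critical-rain-sensor-anomaly")
    print(len(view), "critical samples")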

    By the end of the workshop, attendees will have a concrete mental model and reference architecture for treating mobility datasets as a governed, queryable, and continuously evolving layer in their stack.

    2 attendees from this group
  • Network event
    Jan 14 - Best of NeurIPS
    Online
    153 attendees from 47 groups

    Welcome to the Best of NeurIPS series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined the conference, streamed live from the authors to you.

    Jan 14, 2026
    9 AM Pacific
    Online.
    Register for the Zoom!

    EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

    Operating rooms (ORs) demand precise coordination among surgeons, nurses, and equipment in a fast-paced, occlusion-heavy environment, necessitating advanced perception models to enhance safety and efficiency. Existing datasets either provide partial egocentric views or sparse exocentric multi-view context, but do not explore the comprehensive combination of both. We introduce EgoExOR, the first OR dataset and accompanying benchmark to fuse first-person and third-person perspectives. Spanning 94 minutes (84,553 frames at 15 FPS) of two emulated spine procedures, Ultrasound-Guided Needle Insertion and Minimally Invasive Spine Surgery, EgoExOR integrates egocentric data (RGB, gaze, hand tracking, audio) from wearable glasses, exocentric RGB and depth from RGB-D cameras, and ultrasound imagery. Its detailed scene graph annotations, covering 36 entities and 22 relations (568,235 triplets), enable robust modeling of clinical interactions, supporting tasks like action recognition and human-centric perception. We evaluate the surgical scene graph generation performance of two adapted state-of-the-art models and offer a new baseline that explicitly leverages EgoExOR's multimodal and multi-perspective signals. This new dataset and benchmark set a new foundation for OR perception, offering a rich, multimodal resource for next-generation clinical perception.

    About the Speaker

    Ege Özsoy is a final-year PhD student researching multimodal computer vision and vision–language models for surgical scene understanding, focusing on semantic scene graphs, multimodality, and ego-exocentric modeling in operating rooms.

    SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation

    Few-shot segmentation requires recognizing novel object categories from only a few annotated examples, demanding both accurate mask generation and strong visual correspondence. While Segment Anything 2 (SAM2) provides powerful prompt-based segmentation and built-in feature matching, its representations are entangled with tracking-specific cues that limit higher-level semantic generalization. We show that SAM2 nonetheless encodes rich latent semantic structure despite its class-agnostic training. To leverage this, we introduce SANSA, a lightweight framework that makes this structure explicit and adapts SAM2 for few-shot segmentation with minimal modifications. SANSA achieves state-of-the-art generalization performance, outperforms generalist in-context methods, supports flexible prompting, and remains significantly faster and smaller than prior approaches.

    About the Speaker

    Claudia Cuttano is a PhD student in the VANDAL Lab at Politecnico di Torino and is currently conducting a research visit at TU Darmstadt with Prof. Stefan Roth in the Visual Inference Lab. Her work centers on semantic segmentation, particularly on multi-modal scene understanding and leveraging foundation models for pixel-level vision tasks.

    Nested Learning: The Illusion of Deep Learning Architectures

    We present Nested Learning (NL), a new learning paradigm for continual learning that views machine learning models and their training process as a set of nested and/or parallel optimization problems, each with its own context flow, update frequency, and learning algorithm. Based on NL, we design a new architecture, called Hope, that is capable of continual learning and of modifying itself when needed.

    About the Speaker

    Ali Behrouz is a Ph.D. student in the Computer Science Department at Cornell University and a research intern at Google Research. His research spans topics from deep learning architectures to continual learning and neuroscience, and has appeared at conferences including NeurIPS, ICML, KDD, WWW, CHIL, and VLDB. His work has received two Best Paper awards, a Best Paper Honorable Mention, a Best Paper Award candidacy, and oral and spotlight presentations.

    Are VLM Explanations Faithful? A Counterfactual Testing Approach

    VLMs sound convincing—but are their explanations actually true? This talk introduces Explanation-Driven Counterfactual Testing (EDCT), a simple and model-agnostic method that evaluates whether VLM explanations align with the evidence models truly use. By perturbing the very features a model claims to rely on, EDCT exposes mismatches between stated reasoning and real decision pathways. I will show surprising failure cases across state-of-the-art VLMs and highlight how EDCT can guide more trustworthy explanation methods.

    About the Speaker

    Santosh Vasa is a Machine Learning Engineer at Mercedes-Benz R&D North America, working on multimodal perception and VLM safety for autonomous driving. He co-authored the EDCT framework and focuses on explainability, counterfactual testing, and trustworthy AI.

    1 attendee from this group

Members

514