
What we’re about
🖖 This group is for data scientists, machine learning engineers, and open source enthusiasts.
Every month we’ll bring you diverse speakers working at the cutting edge of AI, machine learning, and computer vision.
- Are you interested in speaking at a future Meetup?
- Is your company interested in sponsoring a Meetup?
This Meetup is sponsored by Voxel51, the lead maintainers of the open source FiftyOne computer vision toolset. To learn more, visit the FiftyOne project page on GitHub.
Upcoming events
Oct 23 - AI, ML and Computer Vision Meetup en Español
Hear talks, in Spanish, from experts on cutting-edge topics across AI, ML, and computer vision.
Date and Time
Oct 23 at 9 AM Pacific
Location
Virtual. Register for the Zoom
From Field to Data: Opportunities, Obstacles, and AI Adoption in Agriculture
Artificial intelligence (AI) has become a key tool for tackling some of the biggest challenges in modern agriculture, from optimizing the use of inputs to precise crop monitoring and yield prediction. This seminar presents several real-world case studies in which AI has demonstrated its potential to improve efficiency, sustainability, and profitability at different stages of agricultural production.
Despite these benefits, however, adoption of these technologies in the sector remains limited. Factors such as the lack of generational turnover, an aging farming population, low digitalization in rural areas, and a perception of complexity or distrust toward digital tools pose significant barriers. This talk covers current successes and obstacles, highlighting the importance of designing accessible solutions, backed by training and technical support, that fit the reality of a traditional sector in the midst of transformation.
José Blasco holds a Ph.D. in Computer Science from the Universitat Politècnica de València (2001) and has carried out his research at the Instituto Valenciano de Investigaciones Agrarias (IVIA) since 1996. He has headed the Área de Visión Artificial y Espectroscopia, coordinated the Centro de Agroingeniería, and served as director of IVIA.
Multimodality with Biases: Understand and Evaluate VLMs for Autonomous Driving with FiftyOne
Do your VLMs actually see the danger? Using FiftyOne, I'll show you how to understand and evaluate vision-language models for autonomous driving, making risk and bias visible in seconds. We'll compare models on the same scenes, surface failures and edge cases, and you'll see a simple dashboard for deciding which data to curate and what to adjust. You'll leave with a clear, practical, and replicable method for raising the safety bar.
Adonai Vera is a Machine Learning Engineer & DevRel at Voxel51, with more than 7 years of experience building computer vision and ML models with TensorFlow, Docker, and OpenCV. He started as a software developer, moved into AI, led teams, and served as CTO. Today he connects code and community to build open, production-ready AI, with the goal of making technology simple, accessible, and trustworthy.
Build Your First Visual Search from Scratch
Can you imagine searching for an object within an image the same way you would on Google Images or Bing? In this talk we'll walk step by step through how to design and implement a visual search system from scratch using neural networks and Python.
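The core of such a system can be sketched in a few lines: each image is reduced to an embedding vector, and candidates are ranked by similarity to the query embedding. The embeddings below are made-up placeholders for what a real neural network encoder would produce.

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors, in [-1, 1]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_emb, index):
    # Rank indexed images by similarity to the query embedding
    return sorted(index, key=lambda item: cosine_similarity(query_emb, item[1]),
                  reverse=True)

# Hypothetical 3-dimensional embeddings; a real system would use a CNN or
# ViT encoder producing hundreds of dimensions
index = [
    ("cat.jpg", [0.90, 0.10, 0.00]),
    ("dog.jpg", [0.10, 0.90, 0.20]),
    ("car.jpg", [0.00, 0.20, 0.95]),
]
results = search([0.85, 0.15, 0.05], index)
print(results[0][0])  # the indexed image closest to the query
```

In production, the sorted scan would be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.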
Carlos Bustillo is an artificial intelligence engineer with more than six years of experience in machine learning, computer vision, and data science. He has worked on high-impact projects in sectors such as fintech, banking, retail, cybersecurity, and even the nuclear industry, collaborating with companies across Latin America, Europe, and the United States.
Deep Learning Techniques for HDR Modulo Imaging
This talk explores the transformative potential of modulo imaging for achieving unlimited dynamic range capture, fundamentally reimagining how we approach high dynamic range photography beyond traditional sensor limitations. By introducing cyclical intensity wrapping through the modulo operator, we unlock new opportunities for computational imaging that transcends conventional well-capacity constraints.
The modulo imaging paradigm presents fascinating new challenges in distinguishing authentic scene structure from artificial wrap discontinuities, a problem that pushes the boundaries of classical phase unwrapping into unexplored territory. Deep learning has emerged as a natural solution, providing advanced pattern-recognition capabilities to resolve ambiguities that traditional optimization methods cannot effectively address.
We present complementary approaches leveraging unrolled optimization networks and feature lifting strategies that teach neural architectures to handle wrapped measurements effectively. The introduction of scaling equivariance principles enables robust adaptation across varying exposure conditions, while physics-informed input representations guide networks toward meaningful reconstructions.
This work addresses fundamental questions about unlimited sampling theory in practical imaging systems, revealing how modulo measurements can codify arbitrarily bright scenes within finite bit depths. The implications extend beyond photography into autonomous systems, scientific imaging, and any application demanding extreme dynamic range. These advances establish computational modulo imaging as a viable pathway toward truly unlimited dynamic range capture, opening new frontiers in computational photography.
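The wrapping idea at the heart of the talk can be illustrated in a few lines: the sensor records intensity modulo its saturation level, and reconstruction must recover how many times each measurement wrapped. The capacity and intensity values below are illustrative, not taken from the talk.

```python
WELL_CAPACITY = 256  # assumed sensor saturation level (arbitrary units)

def modulo_capture(intensity):
    # A modulo sensor wraps around instead of saturating
    return intensity % WELL_CAPACITY

def unwrap(measured, wrap_count):
    # Reconstruction adds back the estimated number of wraps; estimating
    # wrap_count from image context is the hard problem the talk addresses
    return measured + wrap_count * WELL_CAPACITY

scene = [100, 300, 700]                # true intensities; two exceed capacity
measured = [modulo_capture(v) for v in scene]
print(measured)                        # [100, 44, 188]
recovered = [unwrap(m, k) for m, k in zip(measured, [0, 1, 2])]
print(recovered)                       # [100, 300, 700]
```

Here the wrap counts are given by hand; the deep learning methods in the talk exist precisely because inferring them from wrapped images alone is ambiguous.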
Brayan Monroy (Student Member, IEEE) received the B.S. and M.Sc. degree in systems engineering in 2022 and 2024, respectively, from the Universidad Industrial de Santander, Bucaramanga, Colombia, where he is currently working toward a Ph.D. in Computer Science. His work includes developing methodologies for self-supervised learning beyond Gaussian noise assumptions, exploring re-corruption strategies, and contributing to HDR image reconstruction through supervised and semi-supervised learning strategies from modulo measurements, including optimization-based algorithms, with applications in autonomous driving scenarios.
Oct 28 - Getting Started with FiftyOne for Agriculture Use Cases
This special AgTec edition of our “Getting Started with FiftyOne” workshop series is designed for researchers, engineers, and practitioners working with visual data in agriculture. Through practical examples using a Colombian coffee dataset, you’ll gain a deep understanding of data-centric AI workflows tailored to the challenges of the AgTec space.
Date and Location
* Oct 28, 2025
* 9:00-10:30 AM Pacific
* Online. Register for the Zoom!

Want greater visibility into the quality of your computer vision datasets and models? Then join us for this free 90-minute, hands-on workshop to learn how to leverage the open source FiftyOne computer vision toolset.
At the end of the workshop, you’ll be able to:
- Load and visualize agricultural datasets with complex labels
- Explore data insights interactively using embeddings and statistics
- Work with geolocation and map-based visualizations
- Generate high-quality annotations with the Segment Anything Model (SAM2)
- Evaluate model performance and debug predictions using real AgTec scenarios
Prerequisites: working knowledge of Python and basic computer vision concepts.
Resources: All attendees will get access to the tutorials, videos, and code examples used in the workshop.
Learn how to:
- Visualize complex datasets
- Explore embeddings
- Analyze and improve models
- Perform advanced data curation
- Build integrations with popular ML tools, models, and datasets
Oct 30 - AI, ML and Computer Vision Meetup
Join the virtual Meetup to hear talks from experts on cutting-edge topics across AI, ML, and computer vision.
Date, Time and Location
Oct 30, 2025
9 AM Pacific
Online. Register for the Zoom!

The Agent Factory: Building a Platform for Enterprise-Wide AI Automation
In this talk we will explore what it takes to build an enterprise-ready AI automation platform at scale. The topics covered will include:
- The Scale Challenge: E-commerce environments expose the limitations of single-point AI solutions, which create fragmented ecosystems lacking cohesion and efficient resource sharing across complex, knowledge-based work.
- Root Cause Analysis Success: Flipkart’s initial AI agent transformed business analysis from days-long investigations to near-instantaneous insights, proving the concept while revealing broader platform opportunities.
- Platform Strategy Evolution: Success across Engineering (SDLC, SRE), Operations, and Commerce teams necessitated a unified, multi-tenant platform serving diverse use cases with consistency and operational efficiency.
- Architectural Foundation: Leveraging framework-agnostic design principles we were able to emphasize modularity, which enabled teams to leverage different AI models while maintaining consistent interfaces and scalable infrastructure.
- The “Agent Garden” Vision: Flipkart’s roadmap envisions an internal ecosystem where teams discover, deploy, and contribute AI agents, providing a practical blueprint for scalable AI agent infrastructure development.
About the Speaker
Virender Bhargav is a seasoned engineering leader at Flipkart whose expertise spans business technology integration, enterprise applications, system design and architecture, and building highly scalable systems. With a deep understanding of technology, he has spearheaded teams, modernized technology landscapes, and managed core platform layers and strategic products. With extensive experience driving innovation at companies like Paytm and Flipkart, his contributions have left a lasting impact on the industry.
Scaling Generative Models at Scale with Ray and PyTorch
Generative image models like Stable Diffusion have opened up exciting possibilities for personalization, creativity, and scalable deployment. However, fine-tuning them in production-grade settings poses real challenges: managing compute, hyperparameters, model size, data, and distributed coordination is nontrivial.
In this talk, we’ll dive deep into learning how to fine-tune Stable Diffusion models using Ray Train (with HuggingFace Diffusers), including approaches like DreamBooth and LoRA. We’ll cover what works (and what doesn’t) in scaling out training jobs, handling large data, optimizing for GPU memory and speed, and validating outputs. Attendees will come away with practical insights and patterns they can use to fine-tune generative models in their own work.
About the Speaker
Suman Debnath is a Technical Lead (ML) at Anyscale, where he focuses on distributed training, fine-tuning, and inference optimization at scale on the cloud. His work centers on building and optimizing end-to-end machine learning workflows powered by distributed computing frameworks like Ray, enabling scalable and efficient ML systems.
Suman’s expertise spans Natural Language Processing (NLP), Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG).
Earlier in his career, he developed performance benchmarking and monitoring tools for distributed storage systems. Beyond engineering, Suman is an active community contributor, having spoken at over 100 global conferences and events, including PyCon, PyData, ODSC, AIE, and numerous meetups worldwide.

Privacy-Preserving Computer Vision through Optics Learning
Cameras are now ubiquitous, powering computer vision systems that assist us in everyday tasks and critical settings such as operating rooms. Yet, their widespread use raises serious privacy concerns: traditional cameras are designed to capture high-resolution images, making it easy to identify sensitive attributes such as faces, nudity, or personal objects. Once acquired, such data can be misused if accessed by adversaries. Existing software-based privacy mechanisms, such as blurring or pixelation, often degrade task performance and leave vulnerabilities in the processing pipeline.
In this talk, we explore an alternative question: how can we preserve privacy before or during image acquisition? By revisiting the image formation model, we show how camera optics themselves can be learned and optimized to acquire images that are unintelligible to humans yet remain useful for downstream vision tasks like action recognition. We will discuss recent approaches to learning camera lenses that intentionally produce privacy-preserving images, blurry and unrecognizable to the human eye, but still effective for machine perception. This paradigm shift opens the door to a new generation of cameras that embed privacy directly into their hardware design.
About the Speaker
Carlos Hinojosa is a Postdoctoral researcher at King Abdullah University of Science and Technology (KAUST) working with Prof. Bernard Ghanem. His research interests span Computer Vision, Machine Learning, AI Safety, and AI for Science. He focuses on developing safe, accurate, and efficient vision systems and machine-learning models that can reliably perceive, understand, and act on information, while ensuring robustness, protecting privacy, and aligning with societal values.
It's a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data
Can we match vision and language embeddings without any supervision? According to the platonic representation hypothesis, as model and dataset scales increase, the distances between corresponding representations become increasingly similar across the two embedding spaces. Our study demonstrates that pairwise distances alone are often sufficient to enable unsupervised matching, allowing vision-language correspondences to be discovered without any parallel data.
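A toy version of the idea: if two embedding spaces induce (approximately) the same pairwise-distance matrix up to a permutation, the correspondence can be recovered by searching for the permutation that best aligns the two matrices. This brute-force sketch is illustrative only; the actual method must scale far beyond exhaustive search.

```python
from itertools import permutations

def dist_matrix(points):
    # Pairwise Euclidean distances within one embedding space
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5 for q in points]
            for p in points]

def match(vision_emb, text_emb):
    # Find the permutation of text items whose pairwise-distance matrix
    # best aligns with the vision one -- no paired data is used
    dv = dist_matrix(vision_emb)
    dt = dist_matrix(text_emb)
    n = len(vision_emb)
    def cost(perm):
        return sum((dv[i][j] - dt[perm[i]][perm[j]]) ** 2
                   for i in range(n) for j in range(n))
    return min(permutations(range(n)), key=cost)

# Toy example: the "text" space is a shuffled copy of the "vision" space,
# so pairwise distances agree exactly under the correct permutation
vision = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
text = [(0.0, 2.0), (0.0, 0.0), (1.0, 0.0)]
print(match(vision, text))  # vision item i corresponds to text item perm[i]
```

Because the three pairwise distances are all distinct, the zero-cost permutation is unique here; real embedding spaces only match approximately, which is what makes the problem hard.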
About the Speaker
Dominik Schnaus is a third-year Ph.D. student in the Computer Vision Group at the Technical University of Munich (TUM), supervised by Daniel Cremers. His research centers on multimodal and self-supervised learning with a special emphasis on understanding similarities across embedding spaces of different modalities.
Nov 5 - Physical AI Data Pipelines with NVIDIA Omniverse NuRec, Cosmos and FiftyOne
Join Voxel51 and NVIDIA as they unveil a breakthrough that’s changing how Physical AI systems are built. In this first-ever demo featuring NVIDIA Omniverse NuRec and NVIDIA Cosmos with FiftyOne, you’ll learn how to create validated, simulation-ready data pipelines—cutting testing costs, eliminating manual data audits, and accelerating development from months to days.
Date and Location
Nov 5, 2025
9:00-10:30 AM Pacific
Online. Register for the Zoom

Developing autonomous vehicles and humanoid robots requires rigorous simulations that capture real-world complexity. The critical barrier that keeps teams from achieving success isn’t the simulation engine itself, but the data that powers it.
As Physical AI systems ingest petabytes of multisensor data, converting this raw input into validated, simulation-ready data pipelines remains a hidden bottleneck. A camera-to-LiDAR projection off by a few pixels, timestamps misaligned by a few milliseconds, or inaccurate coordinate systems will cascade into flawed neural reconstructions and synthetic data.
Without a well-orchestrated data pipeline, even the most advanced simulation platforms end up consuming imperfect data, wasting weeks of effort and thousands of dollars in testing and compute costs.
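The calibration sensitivity described above can be made concrete with a minimal pinhole-camera projection; the intrinsics, point, and error magnitude below are made-up values for illustration, not taken from any real sensor rig.

```python
import math

def project(point, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0):
    # Pinhole model: map a 3D point in camera coordinates (meters) to pixels
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

lidar_point = (1.0, 0.5, 20.0)   # a LiDAR return 20 m ahead of the camera
u, v = project(lidar_point)

# A 0.1-degree extrinsic rotation error shifts the point laterally by
# roughly z * tan(0.1 deg) before projection (first-order approximation)
x_err = 20.0 * math.tan(math.radians(0.1))
u_bad, _ = project((1.0 + x_err, 0.5, 20.0))
print(round(u_bad - u, 1))  # pixels of drift from a 0.1-degree miscalibration
```

Even this tiny rotation error moves the projected point by well over a pixel, which is exactly the kind of silent misalignment that corrupts downstream neural reconstructions.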
In a first-ever demo featuring NVIDIA Omniverse NuRec and NVIDIA Cosmos with FiftyOne, you’ll discover how to:
- Eliminate manual data audits with an automated workflow that calibrates, aligns, and ensures data integrity across cameras, LiDAR, radar, and other sensors
- Curate and enrich the data for neural reconstructions and synthetic data generation
- Reduce Physical AI testing and QA costs by up to 80%
- Accelerate Physical AI development from months to days
Who should attend:
- Data Engineers, MLOps & ML Engineers working with Physical AI data
- Technical leaders and Managers driving Physical AI projects from prototype to production
- AV/Robotics Researchers building safety-critical apps with cutting-edge tech
- Product & Strategy leaders seeking to accelerate development while reducing infra costs and risks.
About the Speakers
Itai H Zadok is a Senior Product Manager, Autonomous Vehicles Simulation, at NVIDIA
Daniel Gural is a Machine Learning Engineer and Evangelist at Voxel51
Past events