
Details

Join us to hear talks from experts on cutting-edge topics across AI, ML, and computer vision!

Pre-registration is mandatory.

Time and Location

Feb 12, 2026
5:30 - 8:30 PM

Union AI Offices
400 112th Ave NE #115
Bellevue, WA 98004

ALARM: Automated MLLM-Based Anomaly Detection in Complex-EnviRonment Monitoring with Uncertainty Quantification

In complex environments, anomalies are often highly contextual and ambiguous, which makes uncertainty quantification (UQ) a crucial capability for a multi-modal LLM (MLLM)-based video anomaly detection (VAD) system. In this talk, I will introduce ALARM, our UQ-supported, MLLM-based VAD framework. ALARM integrates UQ with quality-assurance techniques such as reasoning chains, self-reflection, and MLLM ensembling for robust and accurate performance, and is built on a rigorous probabilistic inference pipeline and computational process.
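To make the ensemble idea concrete, here is a minimal, hypothetical sketch (not the ALARM implementation) of scoring uncertainty from the agreement among several MLLM verdicts on a single clip; all names and numbers are illustrative:

```python
# Hypothetical sketch: ensemble-based uncertainty for an MLLM anomaly verdict,
# measured as disagreement across ensemble members. Not the ALARM codebase.
from collections import Counter

def ensemble_uncertainty(verdicts: list[str]) -> tuple[str, float]:
    """Return the majority verdict and a simple disagreement-based uncertainty."""
    counts = Counter(verdicts)
    label, votes = counts.most_common(1)[0]
    agreement = votes / len(verdicts)
    return label, 1.0 - agreement  # higher value = more uncertain

# Example: three hypothetical MLLM ensemble members judge one video clip.
label, uncertainty = ensemble_uncertainty(["anomaly", "anomaly", "normal"])
print(label, round(uncertainty, 2))  # anomaly 0.33
```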

About the Speaker

Congjing Zhang is a third-year Ph.D. student in the Department of Industrial and Systems Engineering at the University of Washington, advised by Prof. Shuai Huang. She is a recipient of the 2025-2027 Amazon AI Ph.D. Fellowship. Her research interests center on large language models (LLMs) and machine learning, with a focus on uncertainty quantification, anomaly detection and synthetic data generation.

The World of World Models: How the New Generation of AI Is Reshaping Robotics and Autonomous Vehicles

World Models are emerging as the defining paradigm for the next decade of robotics and autonomous systems. Instead of depending on handcrafted perception stacks or rigid planning pipelines, modern world models learn a unified representation of an environment—geometry, dynamics, semantics, and agent behavior—and use that understanding to predict, plan, and act. This talk will break down why the field is shifting toward these holistic models, what new capabilities they unlock, and how they are already transforming AV and robotics research.

We then connect these advances to the Physical AI Workbench, a practical foundation for teams who want to build, validate, and iterate on world-model-driven pipelines. The Workbench standardizes data quality, reconstruction, and enrichment workflows so that teams can trust their sensor data, generate high-fidelity world representations, and feed consistent inputs into next-generation predictive and generative models. Together, world models and the Physical AI Workbench represent a new, more scalable path forward—one where robots and AVs can learn, simulate, and reason about the world through shared, high-quality physical context.

About the Speaker

Daniel Gural leads technical partnerships at Voxel51, where he’s building the Physical AI Workbench, a platform that connects real-world sensor data with realistic simulation to help engineers better understand, validate, and improve their perception systems.

Modern Orchestration for Durable AI Pipelines and Agents - Flyte 2.0

In this talk we’ll discuss how the orchestration space is evolving with the current AI landscape, and provide a peek at Flyte 2.0, which makes truly dynamic, compute-aware, and durable AI orchestration easy for any type of AI application, from computer vision to agents and beyond!

Flyte, the open source orchestration platform, is already used by thousands of teams to build their AI pipelines. In fact, it’s extremely likely you’ve interacted with AI models trained on Flyte while browsing social media, listening to music, or using self-driving technologies.
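For readers new to Flyte, here is a minimal sketch of a pipeline written with the current flytekit Python SDK; the task and function names are illustrative, and the Flyte 2.0 interface previewed in the talk may look different:

```python
# Minimal sketch of a Flyte pipeline using the flytekit Python SDK.
# The Flyte 2.0 API previewed in the talk may differ from this 1.x-style example.
from flytekit import task, workflow

@task
def embed_frames(num_frames: int) -> int:
    # Stand-in for a real computer-vision step (e.g., per-frame embedding).
    return num_frames * 512

@workflow
def vision_pipeline(num_frames: int = 100) -> int:
    # Flyte tracks the task's inputs/outputs so runs are reproducible and durable.
    return embed_frames(num_frames=num_frames)
```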

About the Speaker

Sage Elliott is an AI Engineer at Union.ai (core maintainers of Flyte).

Context Engineering for Video Intelligence: Beyond Model Scale to Real-World Impact

Video streams combine vision, audio, time-series and semantics at a scale and complexity unlike text alone. At TwelveLabs, we’ve found that tackling this challenge doesn’t start with ever-bigger models — it starts with engineering the right context. In this session, we’ll walk engineers and infrastructure leads through how to build production-grade video AI by systematically designing what information the model receives, how it's selected, compressed, and isolated. You’ll learn our four pillars of video context engineering (Write, Select, Compress, Isolate), see how our foundation models (Marengo & Pegasus) and agent product (Jockey) use them, and review real-world outcomes in media, public-safety and advertising pipelines.

We’ll also dive into how you measure context effectiveness — tokens per minute, retrieval hit rates, versioned context pipelines — and how this insight drives cost, latency and trust improvements. If you’re deploying AI video solutions in the wild, you’ll leave with a blueprint for turning raw video into deployable insight — not by model size alone, but by targeted context engineering.
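As a rough illustration of the context-effectiveness metrics mentioned above, here is a small hypothetical sketch (not TwelveLabs’ implementation) of tokens per minute and retrieval hit rate; the function names and sample values are assumptions:

```python
# Hypothetical sketch of two context-effectiveness metrics: tokens spent per
# minute of video, and the hit rate of retrieved clips. Illustrative only.

def tokens_per_minute(context_tokens: int, video_seconds: float) -> float:
    """Context tokens the pipeline spends per minute of processed video."""
    return context_tokens / (video_seconds / 60.0)

def retrieval_hit_rate(retrieved_ids: list[str], relevant_ids: set[str]) -> float:
    """Fraction of retrieved clips that are actually relevant to the query."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for clip_id in retrieved_ids if clip_id in relevant_ids)
    return hits / len(retrieved_ids)

print(tokens_per_minute(context_tokens=1800, video_seconds=300))  # 360.0
print(retrieval_hit_rate(["c1", "c2", "c3"], {"c1", "c3"}))       # ~0.67
```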

About the Speaker

James Le currently leads the developer experience function at TwelveLabs, a startup building foundation models for video understanding. He previously operated in the MLOps space and ran a blog/podcast on the Data & AI infrastructure ecosystem.

Build Reliable AI apps with Observability, Validations and Evaluations

As generative AI moves from experimentation to enterprise deployment, reliability becomes critical. This session outlines a strategic approach to building robust AI apps using Monocle for observability and the VS Code Extension for diagnostics and bug fixing. Discover how to create AI systems that are not only innovative but also predictable and trustworthy.

About the Speaker

Hoc Phan has 20+ years of experience driving innovation at Microsoft, Amazon, Dell, and startups. In 2025, he joined Okahu to lead product and pre-sales, focusing on AI observability and LLM performance. Previously, he helped shape Microsoft Purview via the BlueTalon acquisition and led R&D in cybersecurity and data governance. Hoc is a frequent speaker and author of three books on mobile development and IoT.

Artificial Intelligence
Computer Vision
Machine Learning
Data Science
Open Source
