Nov 21 - Best of ICCV (Day 3)

Name: Nov 21 - Best of ICCV (Day 3)
Start: 2025-11-21T09:00:00-08:00
End: 2025-11-21T11:00:00-08:00

Network event

106 attendees from 44 groups hosting

Hosted by Silicon Valley AI, Machine Learning & Computer Vision Meetup

Details

Welcome to the Best of ICCV series, your virtual pass to some of the groundbreaking research, insights, and innovations that defined this year’s conference. Live streaming from the authors to you.

Date, Time and Location

Nov 21, 2025
9 AM Pacific
Online. Register for the Zoom!

GECO: Geometrically Consistent Embedding with Lightspeed Inference

Recent advances in feature learning have shown that self-supervised vision foundation models can capture semantic correspondences but often lack awareness of underlying 3D geometry. GECO addresses this gap by producing geometrically coherent features that semantically distinguish parts based on geometry (e.g., left/right eyes, front/back legs).
We propose a training framework based on optimal transport, enabling supervision beyond keypoints, even under occlusions and disocclusions. With a lightweight architecture, GECO runs at 30 fps, 98.2% faster than prior methods, while achieving state-of-the-art performance on PFPascal, APK, and CUB, improving PCK by 6.0%, 6.2%, and 4.1%, respectively. Finally, we show that PCK alone is insufficient to capture geometric quality and introduce new metrics and insights for more geometry-aware feature learning.
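
For readers unfamiliar with the PCK (Percentage of Correct Keypoints) metric mentioned above, here is a minimal sketch of how it is commonly computed. It is not taken from the GECO paper: the function name `pck`, the `alpha = 0.1` default, and the choice of normalizing by the longer side of the object bounding box are assumptions for illustration, and benchmarks differ in how they normalize.

```python
import math


def pck(pred, gt, bbox_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * max(bbox_size) of ground truth.

    pred, gt  : lists of (x, y) pixel coordinates, same length and order
    bbox_size : (width, height) of the object box (assumed normalization)
    alpha     : tolerance as a fraction of the longer box side (assumed 0.1)
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of keypoints")
    if not gt:
        return 0.0
    threshold = alpha * max(bbox_size)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Hypothetical example: three keypoints, 100 x 80 box, so the threshold is 10 px.
predicted = [(10, 10), (50, 50), (90, 90)]
ground_truth = [(12, 10), (50, 70), (90, 91)]
score = pck(predicted, ground_truth, bbox_size=(100, 80))
print(f"PCK@0.1 = {score:.3f}")
```

Because the score only counts whether each keypoint lands within a distance threshold, a model can score well while still confusing geometrically distinct parts, which is consistent with the abstract's point that PCK alone does not capture geometric quality.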

About the Speaker

Regine Hartwig is a PhD student at the Technical University of Munich.

Proactive Comorbidity Prediction in HIV: Towards Fair and Trustworthy Care

HIV is a chronic infection that weakens the immune system and exposes patients to a high burden of comorbidities. While antiretroviral therapy has improved life expectancy, comorbidities remain a major challenge, and traditional screening protocols often fail to capture subtle risk patterns early enough. To address this, we develop a novel method trained on lab tests and demographic data from 2,200 patients in SE London. The method integrates feature interaction modeling, attention mechanisms, residual fusion and label-specific attention heads, outperforming TabNet, MLPs and classical machine learning models.

Our experiments show that incorporating demographic information improves predictive performance, though demographic recoverability analyses reveal that age and gender can still be inferred from lab data alone, raising fairness concerns. Finally, robustness checks confirm stable feature importance across cross-validation folds, reinforcing the trustworthiness of our approach.

About the Speaker

Dimitrios Kollias is an Associate Professor in Multimodal AI at Queen Mary University of London, specializing in machine and deep learning, trustworthy AI, computer vision, medical imaging and healthcare, behavior analysis, and HMI. He has published 80+ papers (h-index 39; 6,100+ citations) in top venues (e.g., CVPR, ICCV, ECCV, AAAI, IJCV, ECAI) and is the inventor of a patent in behavior analysis (Huawei); his research is widely adopted by academia and industry. He also serves as an AI consultant and advisor to global companies and has played leading roles in major international AI workshops and competitions.

Toward Trustworthy Embodied Agents: From Individuals to Teams

Modern intelligent embodied agents, such as service robots and autonomous vehicles, interact frequently with humans in dynamic, uncertain environments. They may also collaborate with each other as a team through effective communication to enhance task success, safety, and efficiency. This brings several significant challenges. First, building reliable agents that safely navigate multi-agent scenarios requires scalable and generalizable prediction of surrounding agents’ behaviors and robust decision making under environmental uncertainty in out-of-distribution (OOD) scenarios. Second, effective cooperation between agents requires efficient communication and information fusion strategies and reliable task planning for complex long-horizon tasks.

In this talk, I will introduce a series of our recent works that address these challenges to enable safe and trustworthy embodied agents, along with their application to autonomous driving and service robots. Specifically, I will first demonstrate principled uncertainty quantification techniques and how they enable generalizable prediction and planning in out-of-distribution scenarios. Then, I will talk about effective approaches to enable efficient multi-agent communication and cooperation in centralized and decentralized settings.

About the Speaker

Dr. Jiachen Li is an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) and a cooperating faculty member in the Department of Computer Science and Engineering (CSE) at the University of California, Riverside. He is the Director of the Trustworthy Autonomous Systems Laboratory and is affiliated with the Riverside Artificial Intelligence Research Institute (RAISE), the Center for Robotics and Intelligent Systems (CRIS), and the Center for Environmental Research and Technology (CE-CERT).

DRaM-LHM: A Quaternion Framework for Iterative Camera Pose Estimation

We explore a quaternion adjugate matrix-based representation for rotational motion in the Perspective-n-Point (PnP) problem. Leveraging quadratic quaternion terms within a Determinant Ratio Matrix (DRaM) estimation framework, we extend its application to perspective scenarios, providing a robust and efficient initialization for iterative PnP pose estimation. Notably, by solving the orthographic projection least-squares problem, DRaM provides a reliable initialization that enhances the accuracy and stability of iterative PnP solvers. Experiments on synthetic and real data demonstrate its efficiency, accuracy, and robustness, particularly under high noise conditions. Furthermore, our nonminimal formulation ensures numerical stability, making it effective for real-world applications.
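
As background for the "quadratic quaternion terms" mentioned above, the sketch below shows the standard textbook conversion from a quaternion to a 3x3 rotation matrix, in which every matrix entry is a quadratic expression in the quaternion components. It is not the DRaM estimator or the authors' code; the function name `quat_to_rotmat` and the (w, x, y, z) component ordering are assumptions for illustration.

```python
import math


def quat_to_rotmat(w, x, y, z):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix.

    Each entry is quadratic in the quaternion components. The input is
    normalized implicitly through the factor s, so it need not be unit length.
    """
    n = w * w + x * x + y * y + z * z
    if n == 0.0:
        raise ValueError("zero quaternion does not represent a rotation")
    s = 2.0 / n
    return [
        [1.0 - s * (y * y + z * z), s * (x * y - w * z), s * (x * z + w * y)],
        [s * (x * y + w * z), 1.0 - s * (x * x + z * z), s * (y * z - w * x)],
        [s * (x * z - w * y), s * (y * z + w * x), 1.0 - s * (x * x + y * y)],
    ]


# Example: a 90-degree rotation about the z-axis maps the x-axis onto the y-axis.
half = math.pi / 4.0
R = quat_to_rotmat(math.cos(half), 0.0, 0.0, math.sin(half))
for row in R:
    print([round(v, 6) for v in row])
```

Working with these quadratic terms rather than the quaternion components directly is one common way to sidestep the sign ambiguity between q and -q, since both produce the same matrix.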

About the Speaker

Chen Lin was a Research Fellow at the Simons Foundation, where she specialized in 3D computer vision and visual(-inertial) SLAM. Her research spans from classical multiview geometry to learning-based pose estimation and scene understanding. Her ICCV 2025 paper introduces a new framework for rotation and pose estimation built on advanced algebraic paradigms.

Artificial Intelligence

Computer Vision

Machine Learning

Data Science

Open Source
