ML Pub Club #18: Foundation models in 3D vision and where to find them


Details
Foundation models have made their way into 3D vision. To recall Morpheus's famous line, "Have you ever had a dream, Neo, that you were so sure was real?": we now have models like Sora, Veo 2 and, more recently, Cosmos that have shown emergent capabilities in understanding real-world physics.
One might ask a somewhat "provocative" question: is multiview geometry (the almighty Zisserman and Hartley) still a thing, and is correspondence still the most important problem in computer vision (as per Takeo Kanade), or have foundation models solved these problems? What are the practical aspects of 3D vision: how do you collect data, and what are the HW requirements?
To tackle these questions (& promote discussion), we will be joined by Valentina Zadrija, Tech Lead at Yaak, at our upcoming ML Pub Club!
Valentina holds a PhD in artificial intelligence from FER and has previously worked on computer vision, self-driving and robotics problems at Rimac Automobili and Gideon, as well as spending 10 years of her career as an R&D SW engineer. Her research interests include 3D vision, large multimodal models, robotics and spatial AI in general.
Date: February 4th, 2025
Time: 18:00 - 20:00
Location: CroAI HQ (Zavrtnica 17)
Link: https://lu.ma/s6ef38dd
