Machine Perception and Augmented Reality Meetup
Details
Dear Visionists,
We are very excited to announce that our meetup is back with an event at Google Munich on Thursday, 29 June. The topic will be Machine Perception and Augmented Reality with the following talks:
- Jürgen Sturm - Seeing the World in 3D with Tango
- David Tan - 3D Tracking for AR
- Federico Tombari, Johanna Wald - Real-time Incremental Scene Understanding on the Google Tango
The talks will start at 7pm, but doors open at 6pm, and food will be provided.
If you would like to attend, please fill out this form (https://docs.google.com/forms/d/e/1FAIpQLSeKbv8ViiRZhP1IqXVjtYzM00P2B1jouS3DfHZfJXCvoRp7Gg/viewform) - your registration is needed to enter the building.
We look forward to seeing you there,
Your Meetup Team
--------------------------------------------------------------------------------
Talk abstracts:
Seeing the World in 3D with Tango (Jürgen Sturm)
What if your smartphone could perceive and understand the world in 3D as we do? Imagine the applications this would enable: augmented reality games that take place directly in the space around you, professional applications such as building and object scanning, indoor navigation, and exciting robotics projects. Tango provides visual-inertial 6-DOF motion estimation and dense 3D reconstruction in real time, on-board the device, and can thereby significantly kickstart your AR/VR/robotics project. In my talk, I will give a technical introduction to the hardware and algorithms underlying the Tango platform, including visual-inertial odometry, SLAM, loop closure detection and global localization. The main focus of this talk will be on 3D reconstruction, texturing and floor plan generation. During my talk, I will show several live demos with a Tango device.
3D Tracking for AR (David Tan)
3D object temporal trackers estimate the 3D rotation and 3D translation of a rigid object by propagating the transformation from one frame to the next. To tackle this task, algorithms either learn the transformation between two consecutive frames or optimize an energy function to align the object to the scene. The motivation behind our approach stems from a consideration of the nature of learners and optimizers. Evaluating different types of objects and working conditions, we observe their complementary nature: on one hand, learners are more robust in challenging scenarios, while optimizers are prone to tracking failures due to entrapment in local minima; on the other, optimizers can converge to better accuracy and minimize jitter. Therefore, we propose to bridge the gap between learners and optimizers to attain a robust and accurate RGB-D temporal tracker that runs at approximately 2 ms per frame using one CPU core.
Real-time Incremental Scene Understanding on the Google Tango (Federico Tombari, Johanna Wald)
To help Google Tango fully understand the surrounding environment, we want to obtain full spatial and semantic awareness. This requires extracting information about the objects, their physical relations and the corresponding object classes. Moreover, everything has to be done in real time and directly on the device, as reducing latency as much as possible is key to a successful user experience. In this talk, we present a real-time RGB-D scene understanding method for indoor scenes on Google Tango enabled devices. We use Tango's stable visual-inertial odometry to incrementally merge segments obtained from each input depth image into a global 3D model. We then go further and combine the segmented 3D reconstruction with semantic classification and object detection. For efficient semantic segmentation, we incrementally encode the segments in the global model and use a classifier to determine their high-level semantic classes. Finally, deep learning-based object detection fuses object predictions from RGB frames at different viewpoints according to the estimated camera pose. A live demo will showcase the potential of this approach in real environments.
--------------------------------------------------------------------------------
Speaker bios:
Jürgen Sturm
Jürgen Sturm joined Tango at Google in 2015, where he works on 3D reconstruction and scene understanding. Before joining Google, he led a team at Metaio working on augmented reality and machine learning. From 2011 to 2014, he was a post-doc in the computer vision group of Prof. Daniel Cremers at the Technical University of Munich. Previously, he obtained his PhD in robotics from the University of Freiburg under the supervision of Prof. Wolfram Burgard in 2011. His PhD thesis received the ECCAI best dissertation award 2011 and was short-listed for the euRobotics Georges Giralt Award 2012. His lecture "Visual Navigation for Flying Robots" was distinguished with the TUM best lecture award in 2012 and 2013.
David Tan
David Joseph Tan is finishing his Ph.D. in Computer Science at TU Munich. His research focuses on computer vision, in particular innovative technologies built on 3D data for applications in robotics and augmented reality (AR). Demonstrated in a working prototype, the algorithm he developed during his Ph.D. has shown state-of-the-art performance and attracted interest from various companies.
Federico Tombari
Federico Tombari is a senior research scientist and team leader at the Computer Aided Medical Procedures and Augmented Reality (CAMPAR) Chair of the Technical University of Munich (TUM). He has more than 10 years of research experience in the field of computer vision and 3D perception. He has co-authored more than 120 refereed papers in international conferences and journals, on topics such as visual data representation, RGB-D object recognition, 3D reconstruction and matching, stereo vision, and deep learning for computer vision. He received his Ph.D. in 2009 from the University of Bologna, where he was an Assistant Professor from 2013 to 2016. In 2008 he was an intern at Willow Garage, California. He is a volunteer Senior Scientist for the Open Perception foundation and a developer for the Point Cloud Library, for which he served in the Google Summer of Code as mentor in 2012 and administrator in 2014. In 2015 he was the recipient of a Google Faculty Research Award. His work has been awarded at conferences and workshops such as 3DIMPVT'11, MICCAI'15, and ECCV-R6D'16. He is a research partner of BMW, Google, Toyota, and Zeiss.
Johanna Wald
Johanna Wald is a computer science graduate student at the University of Salzburg with a strong computer vision background. She has experience in efficient real-time detection and tracking algorithms, indoor positioning, and SLAM for augmented reality applications on mobile phones. She found her passion for computer vision during her bachelor studies in Multimedia Technology, specializing in augmented reality and game development. During that time, she interned at Metaio and later wrote her thesis on head pose estimation and deformable face models, supervised by Jürgen Sturm, PhD and Soumitry J. Ray, PhD. Since January 2017 she has been working on her master's thesis in collaboration with the Google Tango team and the computer vision group led by Federico Tombari, PhD, within the CAMP chair at the Technical University of Munich.