
Visual Perception Models for Multi-Modal Video Understanding - Dr. Bertasius

Hosted By
Peter N.

Details

Humans understand the world by processing signals from different modalities (e.g., speech, sound, and vision). Considering multiple modalities is useful (1) for developing systems that do not require manual supervision and (2) for systems that require multi-modal understanding during inference. In this talk, I will present two methods that take a step in this direction.

First, I will present COBE, a large-scale training framework that learns contextual object representations in settings involving human-object interactions. Our approach exploits automatically transcribed narrations from instructional videos and does not require manual annotations.

Afterwards, I will present Vx2Text, a multi-modal video-based text generation framework that outperforms the state of the art on three video-based text generation tasks: captioning, question answering, and dialog.

The talk is based on the paper:

COBE: Contextualized Object Embeddings from Narrated Instructional Video (NeurIPS 2020)
arxiv: https://arxiv.org/abs/2007.07306

Presenter bio:

Gedas Bertasius is a postdoctoral researcher at Facebook AI working on computer vision and machine learning problems. His current research focuses on video understanding, first-person vision, and multi-modal deep learning. He received his Bachelor's degree in Computer Science from Dartmouth College and a Ph.D. in Computer Science from the University of Pennsylvania. His recent work was nominated for the CVPR 2020 best paper award.

His website: https://gberta.github.io/

Please register through the Zoom link right after you RSVP. We will send the link to the Zoom event by email only to those who have registered through Zoom.

-------------------------
Find us at:

All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

2d3d.ai
Online event