
Graph Convolutional Networks in Videos and 3D Point Clouds - Dr. Ali Thabet

Hosted By
Peter N.
Details

In this lecture I will share my journey into the world of Graph Convolutional Networks (GCNs). We will first dive into the realm of 3D point clouds and some of the challenges of applying deep learning to this data modality. 3D point clouds lack the regular grid structure of images, which makes them unsuitable for standard convolution operators. One can circumvent this obstacle by modeling point clouds as graphs, where each point corresponds to a node and edges connect each point to its nearest neighbors. GCN-based point cloud architectures have proven effective in several point cloud tasks, such as classification and segmentation.
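As a rough illustration of the graph construction described above, the following sketch builds a k-nearest-neighbor graph over a point cloud using NumPy only. The function name `knn_graph`, the brute-force distance computation, and the `(2, N*k)` edge layout are assumptions made for this example; they are not taken from the speaker's code.

```python
import numpy as np


def knn_graph(points: np.ndarray, k: int) -> np.ndarray:
    """Build directed k-NN edges for an (N, D) point cloud.

    Returns a (2, N*k) integer array: row 0 holds source node indices,
    row 1 holds the indices of their k nearest neighbors.
    Hypothetical helper for illustration; brute force, O(N^2) memory.
    """
    n = points.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Exclude self-loops by making each point infinitely far from itself.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k closest points for every node, shape (N, k).
    neighbors = np.argsort(dist2, axis=1, kind="stable")[:, :k]
    sources = np.repeat(np.arange(n), k)
    return np.stack([sources, neighbors.ravel()])


# Usage: a random cloud of 256 points in 3D, 8 neighbors per point.
rng = np.random.default_rng(0)
cloud = rng.random((256, 3))
edges = knn_graph(cloud, k=8)
print(edges.shape)  # (2, 2048)
```

For large clouds, a spatial index such as a k-d tree would usually replace the brute-force distance matrix; the dense version is used here only to keep the example short.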
I will introduce our DeepGCN framework and how we used it to effectively train deep graph networks for point clouds. Our work shows how we can train GCNs more than 100 layers deep and improve on the performance of baseline models in 3D scene segmentation. We achieve this by bringing common CNN concepts, like residual connections and dilated convolutions, into GCNs. I will also present G-TAD and MAAS. In G-TAD, we model the frames of a video as a graph, where we learn to connect frames that are not necessarily temporally adjacent. We use this graph formulation to improve the performance of temporal activity detection in video. We show how these new relations correspond to context frames: frames that do not belong to an action but serve as important sources of information for detecting it. With MAAS, we use GCNs to mix multi-modal data, audio and video in this case, to solve the problem of active speaker detection. Since GCNs do not require their nodes to hold homogeneous data, mixing multi-modal data is straightforward. We show how a simple GCN pipeline achieves state-of-the-art performance in active speaker detection. Overall, this talk will focus on practical applications of GCNs and will not delve into theoretical analysis of graphs.
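To make the two CNN-inspired ideas mentioned above concrete, here is a minimal NumPy sketch of a graph layer with a residual connection and a dilated neighborhood. It is a simplified illustration under stated assumptions, not the DeepGCN implementation: the names `dilated_knn` and `residual_graph_layer`, the max aggregation over neighbor features, and the single weight matrix are choices made for this example. Dilation is modeled by taking the `k * dilation` nearest neighbors and keeping every `dilation`-th one.

```python
import numpy as np


def dilated_knn(features: np.ndarray, k: int, dilation: int) -> np.ndarray:
    """Return (N, k) neighbor indices: every `dilation`-th of the
    k * dilation nearest neighbors in feature space (hypothetical helper)."""
    n = features.shape[0]
    if k * dilation >= n:
        raise ValueError("k * dilation must be smaller than the number of nodes")
    diff = features[:, None, :] - features[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a node is never its own neighbor
    order = np.argsort(dist2, axis=1, kind="stable")[:, : k * dilation]
    return order[:, ::dilation]


def residual_graph_layer(features: np.ndarray, weight: np.ndarray,
                         k: int, dilation: int) -> np.ndarray:
    """One simplified graph layer with a skip connection.

    features: (N, C) node features; weight: (2C, C) assumed layer weights.
    """
    neighbors = dilated_knn(features, k, dilation)      # (N, k)
    aggregated = features[neighbors].max(axis=1)        # (N, C), max over neighbors
    combined = np.concatenate([features, aggregated], axis=1)  # (N, 2C)
    update = np.maximum(combined @ weight, 0.0)         # linear map + ReLU
    return features + update                            # residual connection


# Usage: stack three layers with growing dilation, as one might in a deeper network.
rng = np.random.default_rng(0)
h = rng.standard_normal((64, 8))
for dilation in (1, 2, 4):
    w = 0.1 * rng.standard_normal((16, 8))
    h = residual_graph_layer(h, w, k=4, dilation=dilation)
print(h.shape)  # (64, 8)
```

The residual term keeps the input features flowing unchanged through each layer, which is the property that makes very deep stacks easier to train, while the growing dilation enlarges the receptive field without increasing the number of neighbors per node.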

Talk is based on the speaker's papers:
G-TAD: Sub-Graph Localization for Temporal Action Detection (CVPR 2020)
Paper: https://www.deepgcns.org/app/g-tad
Code: https://github.com/frostinassiky/gtad

SGAS: Sequential Greedy Architecture Search (CVPR 2020)
Paper: https://www.deepgcns.org/auto/sgas
Code: https://github.com/lightaime/sgas

DeepGCNs: Can GCNs Go as Deep as CNNs? (ICCV 2019)
Paper: https://www.deepgcns.org/
Code: https://github.com/lightaime/deep_gcns_torch

MAAS: Multi-modal Assignation for Active Speaker Detection
Paper: https://arxiv.org/abs/2101.03682

Presenter BIO:

Ali Thabet is a research scientist in the Image and Video Understanding Lab (IVUL) at King Abdullah University of Science and Technology (KAUST). His interests are in 3D computer vision and video understanding. His recent work focuses on how to effectively and efficiently train and use deep learning models in these two data modalities. He also works with Graph Convolutional Networks and their applications to computer vision. His work on DeepGCNs provided new tools for training deeper graph-based networks and has seen applications in 3D processing, video understanding, biological systems, drug discovery, computational fluid dynamics, and other areas. Ali holds a PhD in Computer Science from the University of Dundee, UK. Prior to his current role, Ali co-founded Fig, an e-commerce platform for smart fashion shopping, where he served as CTO.
More information about Ali can be found at https://www.alithabet.com/

Please register through the Zoom link right after you RSVP. We will send the link to the Zoom event by email only to those who have registered through Zoom.

-------------------------
Find us at:

All lectures are uploaded to our Youtube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

2d3d.ai
Online event