What we're about

Hi all and welcome!

We share knowledge and host online lectures on advances and ideas in image processing, 3D modelling, design, artificial intelligence, deep learning, and any related subjects we are curious about.

All lectures will be recorded and uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

This meetup is the events area of our sub-reddit ➜ https://www.reddit.com/r/2D3DAI/

The group grew out of the awesome https://www.reddit.com/r/learnmachinelearning community. It started with a single post about a live online lecture on automatic 3D modeling from an image using deep learning: https://www.reddit.com/r/learnmachinelearning/comments/gkr44a/free_zoom_lecture_about_advances_in_deep_learning/

Places to find us:

Newsletter for updates ➜ http://eepurl.com/gJ1t-D

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

***When registering for events, be sure to register through the Zoom link you will receive after pressing RSVP on meetup.com. After your Zoom registration we will approve you and send the Zoom connection link by email an hour before the event starts.***

Upcoming events (5)

Putting visual recognition in context

Online event

Recent studies have shown that visual recognition networks can be fooled by placing objects in inconsistent contexts (e.g., a pig floating in the sky). This lecture covers two representative works modeling the role of contextual information in visual recognition. We systematically investigated critical properties of where, when, and how context modulates recognition.
In the first work, we studied the amount of context, the relative resolution of context and object, the geometrical structure of context, context congruence, and the temporal dynamics of contextual modulation on real-world images.
In the second work, we explored more challenging properties of contextual modulation including gravity, object co-occurrences and relative sizes in synthetic environments.
In both works, we conducted a series of experiments to gain insights into the impact of contextual cues on both human and machine vision:
* Psychophysics experiments that establish a human benchmark for out-of-context recognition, which we then compare with state-of-the-art computer vision models to quantify the gap between the two.
* New context-aware recognition models that capture useful information for contextual reasoning, enabling human-level performance and significantly better robustness in out-of-context conditions than baseline models, across both synthetic and existing out-of-context natural-image datasets.

The talk is based on the speakers' papers:

Putting visual object recognition in context (CVPR2020)
Paper: https://arxiv.org/abs/1911.07349
Git: https://github.com/kreimanlab/Put-In-Context

When Pigs Fly: Contextual Reasoning in Synthetic and Natural Scenes
Paper: http://arxiv.org/abs/2104.02215
Git: https://github.com/kreimanlab/WhenPigsFlyContext

Presenters' BIOs:

Philipp Bomatter is a master's student in Computational Science and Engineering at ETH Zurich.
He is interested in artificial intelligence and neuroscience and currently works on a project concerning contextual reasoning in vision at the Kreiman Lab at Harvard University.

Mengmi Zhang completed her PhD in the Graduate School for Integrative Sciences and Engineering at NUS in 2019. She is now a postdoc at the Kreiman Lab at Children's Hospital, Harvard Medical School.
Her research interests include computer vision, machine learning, and cognitive neuroscience. In particular, she studies high-level cognitive functions in humans, including attention, memory, learning, and reasoning, using psychophysics experiments, machine learning approaches, and neuroscience.

** ** Please register through the Zoom link right after your RSVP. We will send the link to the Zoom event via email only to those who have registered through Zoom. ** **

-------------------------
Find us at:

All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

Introduction to Photogrammetry and Points2Surf (ECCV 2020)

This lecture is an introduction to photogrammetry. It covers geometry representations, conversion algorithms, and evaluation methods. We explain methods that convert between images, point clouds, explicit surfaces, and implicit surfaces. Covered algorithms include Structure-from-Motion, Poisson Surface Reconstruction, and Marching Cubes. While we keep the focus on the typical photogrammetry pipeline, we also present other methods, including our own research on surface reconstruction from point clouds (Points2Surf, ECCV 2020). We conclude the lecture with an overview of common evaluation metrics such as the F1-score and the Chamfer distance.
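
As an illustration of the evaluation metrics mentioned above, here is a minimal sketch of the symmetric Chamfer distance between two point sets. The function name and toy data are our own and are not taken from the lecture or the Points2Surf code.

```python
# Illustrative sketch of the symmetric Chamfer distance between two point
# sets; not code from Points2Surf.
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a: (N, 3) points, b: (M, 3) points; mean nearest-neighbor distance
    in both directions."""
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy usage: distance between a reconstructed and a ground-truth point set.
rng = np.random.default_rng(0)
gt = rng.random((1000, 3))
rec = gt + rng.normal(scale=0.01, size=gt.shape)
print(chamfer_distance(rec, gt))
```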

The reconstruction part focuses on the presenter's paper:

Points2Surf: Learning Implicit Surfaces from Point Cloud Patches (ECCV2020)
Paper: https://arxiv.org/abs/2007.10453
Git: https://github.com/ErlerPhilipp/points2surf

Presenter BIO:

Philipp Erler is a PhD student in the rendering and modeling group (headed by Prof. Michael Wimmer) at TU Wien. His area of research is surface reconstruction using deep learning. A collaboration with Prof. Niloy Mitra (UCL / Adobe) and Dr. Paul Guerrero (Adobe) resulted in his first publication, Points2Surf. He also assists with teaching and reviews papers.

** ** Please register through the Zoom link right after your RSVP. We will send the link to the Zoom event via email only to those who have registered through Zoom. ** **

-------------------------
Find us at:

All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

Graph Convolutional Networks in Videos and 3D Point Clouds - Dr. Ali Thabet

In this lecture I will introduce my journey into the world of Graph Convolutional Networks (GCNs). We will first dive into the realm of 3D point clouds and some of the challenges of applying deep learning to this data modality. 3D point clouds lack some of the structural properties of images, making them unsuitable for standard convolution operators. One can circumvent this obstacle by modeling point clouds as graphs, where each point corresponds to a node and edges are built from nearest neighbors. GCN-based point cloud architectures have proven effective in several point cloud tasks such as classification and segmentation.
I will introduce our DeepGCN framework and how we used it to effectively train deep graph networks for point clouds. Our work shows how to train GCNs over 100 layers deep and increase the performance of baseline models in 3D scene segmentation. We achieve this by incorporating common CNN concepts, such as residual connections and dilated convolutions, into GCNs.

I will also present G-TAD and MAAS. In G-TAD, we model the frames of a video as a graph and learn to connect frames that are not necessarily temporally adjacent. We use this graph formulation to improve the performance of temporal activity detection in video, and we show how the newly learned relations correspond to context frames: frames that do not belong to an action but serve as important sources of information for detecting it. With MAAS, we use GCNs to mix multi-modal data, audio and video in this case, to solve the problem of active speaker detection. Since GCNs do not require homogeneous nodes, mixing multi-modal data is straightforward, and we show how a simple GCN pipeline achieves state-of-the-art performance in active speaker detection.

Overall, this talk will focus on practical applications of GCNs and will not delve into the theoretical analysis of graphs.
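
For readers unfamiliar with the graph construction mentioned above, here is a minimal sketch that turns a point cloud into a k-nearest-neighbor graph: each point becomes a node, with edges to its k nearest neighbors. The function name and parameters are illustrative and are not taken from the DeepGCN code.

```python
# Minimal sketch of k-nearest-neighbor graph construction for a point cloud;
# illustrative only, not from the DeepGCN repository.
import numpy as np

def knn_edges(points: np.ndarray, k: int = 16) -> np.ndarray:
    """points: (N, 3) point cloud; returns a (2, N*k) edge index array."""
    # Pairwise squared distances, shape (N, N).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self-loops
    nbrs = np.argsort(d2, axis=1)[:, :k]      # k nearest neighbors per point
    src = np.repeat(np.arange(len(points)), k)
    return np.stack([src, nbrs.reshape(-1)])  # (source, target) node indices

# Toy usage: such an edge index is what a GCN layer would aggregate over.
edges = knn_edges(np.random.rand(1024, 3), k=16)
```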

The talk is based on the speaker's papers:
G-TAD: Sub-Graph Localization for Temporal Action Detection (CVPR 2020)
Paper: https://www.deepgcns.org/app/g-tad
Code: https://github.com/frostinassiky/gtad

SGAS: Sequential Greedy Architecture Search (CVPR 2020)
Paper: https://www.deepgcns.org/auto/sgas
Code: https://github.com/lightaime/sgas

DeepGCNs: Can GCNs Go as Deep as CNNs? (ICCV 2019)
Paper page: https://www.deepgcns.org/
Git: https://github.com/lightaime/deep_gcns_torch

MAAS: Multi-modal Assignation for Active Speaker Detection
Paper: https://arxiv.org/abs/2101.03682

Presenter BIO:

Ali Thabet is a research scientist in the Image and Video Understanding Lab (IVUL) at King Abdullah University of Science and Technology (KAUST). His interests are in 3D computer vision and video understanding. His recent work focuses on how to effectively and efficiently train and use deep learning models in these two data modalities. He also works with Graph Convolutional Networks and their applications to computer vision. His work on DeepGCNs provided new tools for training deeper graph-based networks and has seen applications in 3D processing, video understanding, biological systems, drug discovery, computational fluid dynamics, and more. Ali holds a PhD in Computer Science from the University of Dundee, UK. Prior to his current role, Ali co-founded Fig, an e-commerce platform for smart fashion shopping, where he served as CTO.
More information about Ali can be found at https://www.alithabet.com/

** ** Please register through the Zoom link right after your RSVP. We will send the link to the Zoom event via email only to those who have registered through Zoom. ** **

-------------------------
Find us at:

All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

Adversarial Transferability and Beyond

Online event

Deep Neural Networks have achieved great success in various vision tasks in recent years. However, they remain vulnerable to adversarial examples, i.e., small, human-imperceptible perturbations that fool a target model. This intriguing phenomenon has inspired numerous attack and defense techniques. In this talk, we will mainly focus on the transferability property that makes adversarial examples so dangerous, as well as some of the theories proposed to understand this phenomenon. Here, transferability refers to the property that adversarial examples generated on one model successfully transfer to another, unseen model, thereby constituting a black-box attack.
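
To make the transferability setup concrete, here is a hedged sketch of a transfer attack: adversarial examples are crafted with one-step FGSM on a surrogate model and then evaluated on a separate, unseen target model. The tiny models, toy data, and helper names below are our own illustration, not the speakers' setup.

```python
# Sketch of a transfer attack: craft FGSM adversarial examples on a surrogate
# model and measure how often they also fool an unseen target model.
import torch
import torch.nn as nn

def fgsm(model, x, y, eps=8 / 255):
    """One-step FGSM perturbation within an L-infinity ball of radius eps."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def make_net():
    # Tiny toy classifier standing in for a real vision model.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64),
                         nn.ReLU(), nn.Linear(64, 10))

surrogate, target = make_net(), make_net()   # white-box vs. unseen model
x = torch.rand(16, 3, 32, 32)                # toy images in [0, 1]
y = torch.randint(0, 10, (16,))              # toy labels

x_adv = fgsm(surrogate, x, y)                # attack crafted on the surrogate
fool_rate = (target(x_adv).argmax(1) != y).float().mean().item()
print(f"fooling rate on the unseen target model: {fool_rate:.2f}")
```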

Lecture slides (will be published here close to the event date): https://phibenz.github.io/talk/2d3d.ai/2d3dai_adversarial_transferability_and_beyond.pdf

The talk is based on the speakers' papers:

1. Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs - Workshop on Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges (AML-CV) (Outstanding Paper) - Will be published in June

2. On Strength and Transferability of Adversarial Examples: Stronger Attack Transfers Better
Paper: https://sites.google.com/connect.hku.hk/robustml-2021/accepted-papers/paper-099

Presenters' BIOs:

Chaoning Zhang and Philipp Benz are 4th- and 5th-year PhD students at the Robotics and Computer Vision (RCV) Lab at the Korea Advanced Institute of Science and Technology (KAIST), supervised by Prof. In So Kweon. Their research interests lie in deep learning, with a focus on robustness and security. Through their collaborative efforts, they have published papers at top conferences such as CVPR, NeurIPS, and AAAI, and they are always open to collaborations with other researchers.

Philipp Benz: https://phibenz.github.io

Chaoning Zhang: https://scholar.google.co.kr/citations?user=lvhxhyQAAAAJ&hl=en

RCV-Lab: https://rcv.kaist.ac.kr

A recording of Philipp and Chaoning's previous event in our community: https://www.youtube.com/watch?v=ylEE1HtGNJc

** ** Please register through the Zoom link right after your RSVP. We will send the link to the Zoom event via email only to those who have registered through Zoom. ** **

-------------------------
Find us at:

All lectures are uploaded to our YouTube channel ➜ https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw

Newsletter for updates about more events ➜ http://eepurl.com/gJ1t-D

Sub-reddit for discussions ➜ https://www.reddit.com/r/2D3DAI/

Discord server for, well, discord ➜ https://discord.gg/MZuWSjF

Blog ➜ https://2d3d.ai

AI Consultancy ➜ https://abelians.com

Past events (33)

Few-Shot Patch-Based Training - Dr. Ondřej Texler

Online event
