DataTalks #29: Distilling BNNs with GANs and StarNet


Details
DataTalks #29: Distilling BNNs with GANs and FSL w/ StarNet
Our 29th DataTalks meetup will be held online and will feature two talks: distilling Bayesian neural networks (BNNs) with GANs, and few-shot classification with StarNet.
Zoom link: https://us02web.zoom.us/j/87175114825?pwd=NThPcFF3MzZSeHpLL1NhVmpYTitZdz09
Agenda:
16:30 - 17:15 | Distilling BNNs with GANs - Natan Katz, NICE
17:20 - 17:40 | Few-Shot Learning for Classification with StarNet - Leonid Karlinsky, IBM
---------------------
Distilling BNNs with GANs - Natan Katz, NICE
In this talk I will go over an interesting ICML 2018 paper that proposes a framework for distilling BNNs using GANs:
Bayesian neural networks (BNNs) allow us to reason about uncertainty in a principled way. Stochastic Gradient Langevin Dynamics (SGLD) enables efficient BNN learning by drawing samples from the BNN posterior using mini-batches. However, SGLD and its extensions require storage of many copies of the model parameters, a potentially prohibitive cost, especially for large neural networks.
We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN). At test time, samples are generated by the GAN. We show that this distillation framework incurs no loss in performance on recent BNN applications including anomaly detection, active learning, and defense against adversarial attacks.
By construction, our framework not only distills the Bayesian predictive distribution, but the posterior itself. This allows one to compute quantities such as the approximate model variance, which is useful in downstream tasks. To our knowledge, these are the first results applying MCMC-based BNNs to the aforementioned downstream applications.
Paper link: https://arxiv.org/abs/1806.10317
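
To ground the talk, here is a minimal sketch of the pipeline under assumptions of our own: a toy SGLD sampler collects posterior weight vectors, a small fully connected GAN is fit to those vectors, and at test time weights drawn from the generator form an ensemble whose spread gives the approximate model variance mentioned above. The architectures, hyperparameters, and helper names (sgld_samples, train_gan, predict_with_uncertainty) are illustrative, not the authors' implementation:

import torch
import torch.nn as nn
from torch.nn.utils import vector_to_parameters

def sgld_samples(model, loss_fn, x, y, n_keep=200, burn_in=200, lr=1e-4):
    """Collect flattened weight vectors from the (approximate) SGLD posterior.
    Simplified sketch: full-batch loss, constant step size, no dataset-size scaling."""
    kept = []
    for step in range(burn_in + n_keep):
        model.zero_grad()
        loss_fn(model(x), y).backward()
        with torch.no_grad():
            for p in model.parameters():
                # SGLD step: gradient descent plus Gaussian exploration noise
                p.add_(-lr * p.grad + (2 * lr) ** 0.5 * torch.randn_like(p))
        if step >= burn_in:
            kept.append(torch.cat([p.detach().flatten() for p in model.parameters()]))
    return torch.stack(kept)  # (n_keep, n_params)

def train_gan(weights, z_dim=64, steps=2000, lr=2e-4):
    """Fit a small GAN to the weight samples; the generator then stands in
    for the stored SGLD ensemble (illustrative architecture, not the paper's)."""
    n, dim = weights.shape
    G = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, dim))
    D = nn.Sequential(nn.Linear(dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    bce = nn.BCEWithLogitsLoss()
    real, fake_lbl = torch.ones(n, 1), torch.zeros(n, 1)
    for _ in range(steps):
        fake = G(torch.randn(n, z_dim))
        # Discriminator: real SGLD weight vectors vs. generated ones
        d_loss = bce(D(weights), real) + bce(D(fake.detach()), fake_lbl)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # Generator: try to pass as a posterior sample
        g_loss = bce(D(fake), real)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return G

def predict_with_uncertainty(G, model, x, n=50, z_dim=64):
    """Ensemble over generated weights: the mean is the prediction, the
    variance across samples approximates the model (epistemic) uncertainty."""
    preds = []
    with torch.no_grad():
        for _ in range(n):
            w = G(torch.randn(1, z_dim)).squeeze(0)
            vector_to_parameters(w, model.parameters())
            preds.append(model(x))
    preds = torch.stack(preds)
    return preds.mean(0), preds.var(0)

Because the GAN is trained on the weight samples themselves rather than on predictions, it approximates the posterior over parameters, which is what makes downstream quantities like the per-input predictive variance above available after the SGLD samples are discarded.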
Bio: Natan is a Principal Researcher and Research Leader at NICE. He has over 15 years of experience as an algorithm researcher, data scientist, and research leader in domains such as speech, NLP, quantitative analysis, and risk.
---------------------
Few-Shot Learning for Classification with StarNet - Leonid Karlinsky, IBM
Few-shot learning for classification has advanced significantly in recent years. Yet, these approaches rarely provide interpretability related to their decisions or localization of objects in the scene. In this paper, we introduce StarNet, featuring an end-to-end differentiable non-parametric star-model classification head. Through this head, the backbone is meta-trained using only image-level labels to produce good features for classifying previously unseen categories of few-shot test tasks using a star-model that geometrically matches between the query and support images. This also results in localization of corresponding object instances (on the query and best matching support images), providing plausible explanations for StarNet's class predictions.
We evaluate StarNet on multiple few-shot classification benchmarks attaining significant gains on CUB and ImageNetLOC-FS. In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.
Paper link: https://arxiv.org/abs/2003.06798
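
For intuition ahead of the talk, below is a toy, non-differentiable sketch of the star-model (Hough-style) matching at StarNet's core, under simplifying assumptions of our own: every pair of query/support feature-map locations votes for their relative shift in proportion to feature similarity, the best-supported shift gives the class evidence, and back-projecting the agreeing matches yields a localization heatmap. The function star_match and all details are illustrative, not the published architecture (the real head is end-to-end differentiable):

import torch
import torch.nn.functional as F

@torch.no_grad()  # toy version only; the actual StarNet head is differentiable
def star_match(q_feat, s_feat):
    """Toy star-model matching between a query and a support feature map.
    q_feat, s_feat: (C, H, W) CNN features. Returns (evidence score, query heatmap)."""
    C, H, W = q_feat.shape
    q = F.normalize(q_feat.reshape(C, -1), dim=0)           # unit feature per location
    s = F.normalize(s_feat.reshape(C, -1), dim=0)
    sim = (q.t() @ s).clamp(min=0)                          # (HW, HW) cosine matches
    # Each (query cell, support cell) pair votes for the relative shift between
    # them, weighted by similarity; votes live in a (2H-1) x (2W-1) offset grid.
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pos = torch.stack([ys.flatten(), xs.flatten()], dim=1)  # (HW, 2) cell coordinates
    off = pos[:, None, :] - pos[None, :, :]                 # (HW, HW, 2) pairwise shifts
    idx = (off[..., 0] + H - 1) * (2 * W - 1) + (off[..., 1] + W - 1)
    votes = torch.zeros((2 * H - 1) * (2 * W - 1))
    votes.index_add_(0, idx.flatten(), sim.flatten())
    best = votes.argmax()                                   # most geometrically consistent shift
    # Back-project: query locations whose matches agree with the winning shift
    # form a localization heatmap for the matched object instance.
    heat = (sim * (idx == best).float()).sum(dim=1).reshape(H, W)
    return votes[best], heat

In a few-shot episode one would, for example, run star_match between the query and every support image of each class, aggregate the per-class evidence, and take a softmax; the returned heatmap highlights the query regions that voted for the winning match, which is the localization/explanation property the abstract describes.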
Bio: Leonid Karlinsky leads the CV & DL research team in the Computer Vision and Augmented Reality (CVAR) group @ IBM Research AI. His recent research is in few-shot learning, with a specific focus on object detection, metric learning, and example synthesis methods. He received his PhD from the Weizmann Institute of Science, supervised by Prof. Shimon Ullman.
---------------------
Zoom link: https://us02web.zoom.us/j/87175114825?pwd=NThPcFF3MzZSeHpLL1NhVmpYTitZdz09
