
DataTalks #29: Distilling BNNs with GANs and StarNet ⚗️🔫⭐️

Hosted by Shay Palachy A.

Details

DataTalks #29: Distilling BNNs with GANs and FSL w/ StarNet ⚗️🔫⭐️

Our 29th DataTalks meetup will be held online and will feature a talk on distilling BNNs with GANs and a talk on few-shot classification with StarNet.

Zoom link: https://us02web.zoom.us/j/87175114825?pwd=NThPcFF3MzZSeHpLL1NhVmpYTitZdz09

๐—”๐—ด๐—ฒ๐—ป๐—ฑ๐—ฎ:
๐Ÿ”ถ 16 :30 - 17:15 โ€“ Distilling BNNs with GANs โ€“ Natan Katz, NICE
๐Ÿ”ด 17:20 - 17:40 โ€“ Few-Shot Learning for Classification with StarNet - Leonid Karlinsky, IBM

---------------------

๐——๐—ถ๐˜€๐˜๐—ถ๐—น๐—น๐—ถ๐—ป๐—ด ๐—•๐—ก๐—ก๐˜€ ๐˜„๐—ถ๐˜๐—ต ๐—š๐—”๐—ก๐˜€ โ€“ ๐—ก๐—ฎ๐˜๐—ฎ๐—ป ๐—ž๐—ฎ๐˜๐˜‡, ๐—ก๐—œ๐—–๐—˜

In this talk I will go over an interesting ICML 2018 paper that proposes a framework for distilling BNNs using GANs:

Bayesian neural networks (BNNs) allow us to reason about uncertainty in a principled way. Stochastic Gradient Langevin Dynamics (SGLD) enables efficient BNN learning by drawing samples from the BNN posterior using mini-batches. However, SGLD and its extensions require storage of many copies of the model parameters, a potentially prohibitive cost, especially for large neural networks.
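
For intuition, here is a minimal sketch of a single SGLD update in PyTorch. The names (`net`, `loss_fn`, the step size, the weight-decay prior) are illustrative assumptions, not taken from the paper or the talk; the point is that each noisy iterate is kept as a posterior sample, which is why storing many of them becomes expensive for large networks.

```python
import torch

def sgld_step(net, loss_fn, x, y, dataset_size, eps=1e-4, weight_decay=1e-4):
    """One SGLD update; returns a snapshot of the weights as a posterior sample."""
    net.zero_grad()
    # Scale the mini-batch likelihood up to the full dataset (N/n factor).
    scale = dataset_size / x.shape[0]
    nll = scale * loss_fn(net(x), y)
    # A Gaussian prior on the weights corresponds to L2 regularization.
    prior = 0.5 * weight_decay * sum((p ** 2).sum() for p in net.parameters())
    (nll + prior).backward()
    with torch.no_grad():
        for p in net.parameters():
            # Half-step on the negative log posterior gradient ...
            p.add_(-0.5 * eps * p.grad)
            # ... plus Gaussian noise with variance eps: this noise is what makes
            # the iterates (approximate) posterior samples rather than a point estimate.
            p.add_(torch.randn_like(p) * (eps ** 0.5))
    return [p.detach().clone() for p in net.parameters()]
```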

We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN). At test time, samples are generated by the GAN instead of being read from storage. We show that this distillation framework incurs no loss in performance on recent BNN applications, including anomaly detection, active learning, and defense against adversarial attacks.

By construction, our framework not only distills the Bayesian predictive distribution, but the posterior itself. This allows one to compute quantities such as the approximate model variance, which is useful in downstream tasks. To our knowledge, these are the first results applying MCMC-based BNNs to the aforementioned downstream applications.
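
As a rough illustration of the test-time side (not the paper's actual code), the sketch below assumes a `generator` that has already been trained adversarially on flattened SGLD weight samples, plus a hypothetical helper `set_flat_params` that loads a flat weight vector into the network; drawing several generated weight vectors gives both the Bayesian predictive distribution and an approximate model variance, as described in the abstract.

```python
import torch

@torch.no_grad()
def predictive_stats(generator, net, set_flat_params, x, n_samples=100, z_dim=128):
    """Predictive mean and approximate model variance from GAN-generated weights."""
    probs = []
    for _ in range(n_samples):
        z = torch.randn(1, z_dim)
        w = generator(z).squeeze(0)   # one synthetic posterior sample (flat weight vector)
        set_flat_params(net, w)       # hypothetical helper: plug the weights into the network
        probs.append(torch.softmax(net(x), dim=-1))
    probs = torch.stack(probs)        # [n_samples, batch, classes]
    mean = probs.mean(dim=0)          # Bayesian predictive distribution
    var = probs.var(dim=0)            # approximate model (epistemic) variance
    return mean, var
```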

Paper link: https://arxiv.org/abs/1806.10317

๐—•๐—ถ๐—ผ: Natan is a Principal Researcher and Research Leader at NICE. He has over 15 years of experience as an algorithm researcher, data scientist, and a research leader in a variety of domains such as: Speech, NLP, quantitative analysis and risks.

---------------------

Few-Shot Learning for Classification with StarNet – Leonid Karlinsky, IBM

Few-shot learning for classification has advanced significantly in recent years. Yet, these approaches rarely provide interpretability related to their decisions or localization of objects in the scene. In this paper, we introduce StarNet, featuring an end-to-end differentiable non-parametric star-model classification head. Through this head, the backbone is meta-trained using only image-level labels to produce good features for classifying previously unseen categories of few-shot test tasks, using a star-model that geometrically matches between the query and support images. This also results in localization of corresponding object instances (on the query and best matching support images), providing plausible explanations for StarNet's class predictions.

We evaluate StarNet on multiple few-shot classification benchmarks attaining significant gains on CUB and ImageNetLOC-FS. In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.
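
To make the matching idea concrete, here is a heavily simplified sketch (my own illustration, not StarNet's head): it only computes dense cosine similarities between query and support backbone feature maps and aggregates the best matches into a class score and a coarse query heatmap, omitting the geometric star-model voting and the end-to-end meta-training described in the paper.

```python
import torch
import torch.nn.functional as F

def match_score(query_feat, support_feat):
    """query_feat, support_feat: [C, H, W] backbone feature maps for one image each."""
    c = query_feat.shape[0]
    q = F.normalize(query_feat.reshape(c, -1), dim=0)    # [C, Hq*Wq] unit-norm local features
    s = F.normalize(support_feat.reshape(c, -1), dim=0)  # [C, Hs*Ws]
    sim = q.t() @ s                                       # pairwise cosine similarities
    best, _ = sim.max(dim=1)                              # best support match per query location
    heatmap = best.reshape(query_feat.shape[1:])          # coarse localization evidence on the query
    return best.mean().item(), heatmap

def classify_episode(query_feat, support_feats_by_class):
    """support_feats_by_class: dict mapping class -> list of [C, H, W] support feature maps."""
    scores = {cls: max(match_score(query_feat, f)[0] for f in feats)
              for cls, feats in support_feats_by_class.items()}
    return max(scores, key=scores.get)                    # predicted few-shot class
```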

Paper link: https://arxiv.org/abs/2003.06798

๐—•๐—ถ๐—ผ: Leonid Karlinsky leads the CV & DL research team in the Computer Vision and Augmented Reality (CVAR) group @ IBM Research AI. His recent research is in the areas of few-shot learning with specific focus on object detection, metric learning, and example synthesis methods. He received his PhD degree at the Weizmann Institute of Science, supervised by Prof. Shimon Ullman.

---------------------

Zoom link: https://us02web.zoom.us/j/87175114825?pwd=NThPcFF3MzZSeHpLL1NhVmpYTitZdz09
