Abstract: Generative adversarial networks (GANs) are a powerful approach to unsupervised learning, and they have achieved state-of-the-art performance in several domains. However, GANs are limited in two ways: they often fail to faithfully capture all the modes of the underlying data distribution---a phenomenon known as mode collapse---and they are inherently incompatible with evaluation measures such as predictive log-likelihood. In this paper, we develop semi-implicit generative adversarial networks (SIGANs) to address these two shortcomings. SIGANs model data by adding noise to the output of a density network and optimize an entropy-regularized adversarial loss. The regularizer encourages the generative model of SIGAN to capture all the modes of the data distribution. The added noise enables tractable predictive log-likelihood via importance sampling and stabilizes the training procedure. Fitting SIGANs to data involves computing the intractable gradients of the entropy regularization term. SIGANs sidestep this intractability using unbiased estimates obtained through Hamiltonian Monte Carlo. We evaluate SIGANs on several datasets and find that they mitigate mode collapse, generate high-quality samples, and yield competitive log-likelihood scores when compared to state-of-the-art density estimators specifically trained to maximize log-likelihood.
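To give a feel for why adding noise makes log-likelihood tractable: if the generator output G(z) is perturbed by isotropic Gaussian noise with scale sigma, then p(x) is an expectation of a Gaussian density over the latent prior, which can be estimated by sampling (here with the prior itself as the proposal, the simplest case of the importance-sampling idea the abstract mentions). The sketch below is illustrative only and is not from the paper; the linear `generator`, `sigma`, and all names are hypothetical stand-ins.

```python
import numpy as np

def log_gaussian(x, mean, sigma):
    """Log density of an isotropic Gaussian N(x; mean, sigma^2 I)."""
    d = x.shape[-1]
    return (-0.5 * d * np.log(2 * np.pi * sigma**2)
            - 0.5 * np.sum((x - mean) ** 2, axis=-1) / sigma**2)

def estimate_log_likelihood(x, generator, latent_dim, sigma,
                            n_samples=5000, rng=None):
    """Estimate log p(x) = log E_{z ~ N(0,I)}[ N(x; G(z), sigma^2 I) ]
    by Monte Carlo, using log-sum-exp for numerical stability."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.standard_normal((n_samples, latent_dim))
    log_w = log_gaussian(x, generator(z), sigma)   # shape (n_samples,)
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

# Toy "generator": a fixed linear map standing in for a trained network.
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]])  # latent dim 2 -> data dim 3
generator = lambda z: z @ W.T
x = np.array([0.2, -0.1, 0.05])
ll = estimate_log_likelihood(x, generator, latent_dim=2, sigma=0.5)
```

With a trained generator one would typically use a smarter proposal distribution than the prior to reduce the variance of this estimate; the structure of the computation stays the same.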
- Doors at 6:15 pm (there will be someone downstairs checking you in)
- The talk begins promptly at 7 pm with Q&A following
- Networking & Drinks!
Food & beverages will be available.
------- Sponsored by Comet.ml ---------
Comet.ml is doing for ML what GitHub did for software development. We allow data science teams to automagically track their datasets, code changes, experimentation history, and production models, creating efficiency, transparency, and reproducibility.
About our speaker:
Adji Bousso Dieng is a PhD candidate in the Statistics Department at Columbia University, jointly advised by David Blei and John Paisley. Her research focuses on combining probabilistic graphical modeling and deep learning to design models that are flexible and expressive enough to capture meaningful representations of high-dimensional structured data such as text. She also works on developing inference methods for learning effectively with these models. Prior to joining Columbia, Adji worked as a Junior Professional Associate at the World Bank. She did her undergraduate training in France, where she attended Lycee Henri IV and Telecom ParisTech, part of France's Grandes Ecoles system. Adji holds a Diplome d'Ingenieur from Telecom ParisTech and spent the third year of Telecom ParisTech's curriculum at Cornell University, where she earned a Master's in Statistics. Her research is funded by a Columbia University Dean Fellowship and, more recently, by a Google PhD Fellowship.