Gwanghyun Kim | Text-Guided Diffusion Models for Robust Image Manipulation
Details
Virtual London Machine Learning Meetup - 14.09.2022 @ 18:30 (BST)
Agenda:
- 18:25: Virtual doors open
- 18:30: Talk
- 19:10: Q&A session
- 19:30: Close
Sponsor: Evolution AI - Intelligent data extraction from corporate and financial documents.
Title: DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
Speaker: Gwanghyun Kim, Ph.D. student at Seoul National University (SNU)
Papers:
Abstract: Gwanghyun (Bradley) Kim will talk about DiffusionCLIP, a robust text-guided image manipulation method based on diffusion models. The work was done with Jong Chul Ye and Taesung Kwon while Kim was a Master's student at KAIST.
Recently, GAN inversion methods combined with Contrastive Language-Image Pretraining (CLIP) have enabled zero-shot image manipulation guided by text prompts. However, applying them to diverse real images remains difficult because of the limited capability of GAN inversion. Specifically, these approaches often struggle to reconstruct images whose poses, views, and contents differ substantially from the training data, and they may alter object identity or produce unwanted image artefacts. To mitigate these problems and enable faithful manipulation of real images, Gwanghyun Kim and his colleagues propose a novel method, dubbed "DiffusionCLIP", that performs text-driven image manipulation using diffusion models.
Building on the full inversion capability and high-quality image generation of recent diffusion models, DiffusionCLIP performs zero-shot image manipulation successfully even between unseen domains, and it takes another step towards general application by manipulating images from the widely varying ImageNet dataset. The authors also propose a novel noise combination method that allows straightforward multi-attribute manipulation. Extensive experiments and human evaluation confirm the robust and superior manipulation performance of DiffusionCLIP compared with existing baselines.
Bio: Gwanghyun (Bradley) Kim is a Ph.D. student at Seoul National University (SNU), advised by Prof. Se Young Chun. He completed his M.S. degree at KAIST and earned his B.S. degree at Yonsei University. His recent work explores generative models and their applications, privacy-preserving distributed learning, and multi-modal learning. You can read his full profile here: https://gwang-kim.github.io




