Monthly MLAI Meetup: Stable Diffusion and Bee Neurons
Details
Welcome to our February MLAI meetup! The location is the Community Hub at the Dock (912 Collins St, Docklands); we will be in the multipurpose room.
Info for newcomers: MLAI is a social event for the AI community as well as a talk night. Our format:
- 6:00-6:30: Socialising
- 6:30-6:40: Announcements and AI news
- 6:40-8:00: Talk(s) and Q&A
- 8:00: Head to the nearest pub for dinner and keep socialising
Julian Greentree will give a short talk on building a faux bee brain with neural networks, drawing on his recent research.
Abstract: Recent results have shown that honeybees can count as high as 10, beyond the so-called “subitizing threshold,” and that they have a concept of zero, discerning an empty set from the absence of a set. Most recently, I worked on a project showing that they can also discern odd from even numbers. It is unclear how critical any of these skills are to survival in the wild and, given the bee’s rather tight hardware restrictions, how much “free space” a bee brain has to dedicate to such abilities. Based on previous modelling methods, I created a small, simple recurrent neural network to show that parity discrimination is possible with as few as 5 neurons. This means that, in theory, a bee could perform the task with a tiny fraction of its 960,000-neuron brain, and it shows that an apparently abstract mathematical concept can be cheap to compute. It does not show that a bee “understands” parity, nor that a bee actually forms a small, efficient network like this one.
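For the curious, here is a minimal, hypothetical sketch of the kind of model the abstract describes: a 5-unit recurrent network (in PyTorch) trained to report the parity of a count presented one item at a time. The task framing, hyperparameters, and all names here are illustrative assumptions, not Julian's actual setup.

```python
# Toy sketch (not the speaker's model): a tiny RNN that learns the parity
# of a count, with items presented one at a time as a binary sequence.
import torch
import torch.nn as nn

torch.manual_seed(0)

HIDDEN = 5     # "as few as 5 neurons"
SEQ_LEN = 10   # counts up to 10, echoing the honeybee counting result

class TinyParityRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(input_size=1, hidden_size=HIDDEN, batch_first=True)
        self.readout = nn.Linear(HIDDEN, 1)

    def forward(self, x):
        _, h = self.rnn(x)           # final hidden state after the sequence
        return self.readout(h[-1])   # single logit: odd vs. even

def make_batch(n=256):
    # Each step presents either an item (1) or nothing (0); the label is
    # whether the total number of items seen is odd.
    x = torch.randint(0, 2, (n, SEQ_LEN, 1)).float()
    y = (x.sum(dim=(1, 2)) % 2).unsqueeze(1)
    return x, y

model = TinyParityRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    x, y = make_batch()
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

x, y = make_batch(1024)
acc = ((model(x) > 0).float() == y).float().mean()
print(f"parity accuracy: {acc:.3f}")
```

The point of the sketch is only scale: the trainable state is five hidden units, a vanishingly small budget next to a 960,000-neuron brain.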
Speaker’s bio: Julian is a PhD candidate working in the area of quantum computing at the University of Melbourne. His focus is on helping quantum computers achieve parity with traditional computing.
Louka Ewington-Pitsos will give a talk about text-to-image models (DALL-E, Imagen, Stable Diffusion), how they work, and cover the key ideas from the paper "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion" (https://arxiv.org/abs/2208.01618).
Slides: https://docs.google.com/presentation/d/1kVVNqjA7Ney0cCL-kgvZS99CC8OCeNG5n8Cg-XEPodg/edit?usp=sharing
Abstract: Text-to-image models offer unprecedented freedom to guide creation through natural language. Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes. In other words, we ask: how can we use language-guided models to turn our cat into a painting, or imagine a new product based on our favorite toy? Here we present a simple approach that allows such creative freedom. Using only 3-5 images of a user-provided concept, like an object or a style, we learn to represent it through new "words" in the embedding space of a frozen text-to-image model. These "words" can be composed into natural language sentences, guiding personalized creation in an intuitive way. Notably, we find evidence that a single word embedding is sufficient for capturing unique and varied concepts. We compare our approach to a wide range of baselines, and demonstrate that it can more faithfully portray the concepts across a range of applications and tasks.
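To make the paper's core idea concrete, here is a heavily simplified, hypothetical sketch of a textual-inversion training loop in PyTorch. The linear layers are mere stand-ins for the frozen text encoder and denoiser of a real latent diffusion model, and every name and dimension is an assumption for illustration; the essential point is that only the single new word embedding receives gradients.

```python
# Hypothetical sketch of textual inversion (not the authors' code):
# everything is frozen except one new word embedding, optimised to
# reconstruct latents of the user's 3-5 concept images.
import torch
import torch.nn as nn

torch.manual_seed(0)
EMBED_DIM = 768  # CLIP-style text embedding width (assumption)

# Stand-ins for the frozen pieces of a latent diffusion model.
frozen_text_encoder = nn.Linear(EMBED_DIM, EMBED_DIM).requires_grad_(False)
frozen_denoiser = nn.Linear(EMBED_DIM, EMBED_DIM).requires_grad_(False)

# The one trainable parameter: the embedding of the new pseudo-word "S*".
v_star = nn.Parameter(torch.randn(EMBED_DIM) * 0.02)
opt = torch.optim.Adam([v_star], lr=5e-3)

# Hypothetical latents standing in for the 3-5 user-provided images.
concept_latents = torch.randn(4, EMBED_DIM)

for step in range(1000):
    # In the real method, v_star slots into the tokenizer's embedding
    # table and is encoded as part of a prompt like "a photo of S*".
    cond = frozen_text_encoder(v_star)
    noise = torch.randn_like(concept_latents)
    noisy = concept_latents + noise
    # The frozen denoiser predicts the noise given the conditioning;
    # gradients flow back only into v_star.
    pred = frozen_denoiser(noisy + cond)
    loss = ((pred - noise) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the actual method, the learned embedding behaves like any other word, so it can be composed into ordinary prompts such as "a painting of S* on a beach".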
Speaker’s bio: Louka is a self-taught data scientist who has founded multiple small startups (some successful, some not). He is focused on applying machine learning research to enterprise-level problems.
