New Directions in LLMs


Details
We have four talks about recent trends in Generative AI for you this month (this will be either the 49th or 50th MeetUp we've held for the group, depending on how it's counted):
"Self-Improving LLMs" - Martin Andrews
Several papers on self-improvement of LLMs have been released since the beginning of 2024 (such as Self-Rewarding, AlphaCodium and Self-Discover). Martin will describe how this line of work has progressed since our previous MeetUp on the topic in July last year - with the potential for a demo or two...
"State Space Models 101" - Shubham Gupta
Many feel that SSMs could become a serious challenger to the Transformer in sequence modelling. To get us up to speed, Shubham will present an introduction to the topic, covering HiPPO, S4 and H3! He'll also explain how Flash Attention works (and how those ideas were used for the Selective SSM ~ Mamba).
"Gemini 1.5 Intro and Demos" - Sam Witteveen
With the release of Gemini 1.5 and its million-token context window, we have decided to reschedule the model merging talk and focus instead on this new breakthrough from Google:
In this quick tour, Sam will walk through what has changed in the new Gemini Pro 1.5 and show some demos of what it can do with its 1 million token window - for text, as well as for things like videos and code.
"LLMs + Cognitive Architecture = Generalist Agents?" - Nicholas Chen
In this short talk, Nicholas will introduce the CoALA framework for thinking about intelligent systems, and show how recent works like VOYAGER and Generative Agents can be understood through it. He will also show how using this framework can lead to more capable agents (e.g. FRIDAY, an intelligent agent for automating computer tasks).
***
Talks will start at 7:00pm and end at 8:50pm or so, at which point people normally come up to the front for a bit of a chat with each other and the speakers.
As always, we're actively looking for more speakers - both '30-minute long-form' talks and lightning talks. For the lightning talks, we welcome folks to come and talk about something cool they've done with keras_core, TensorFlow, PyTorch, JAX and/or Deep Learning for 5-10 minutes (so, if you have slides, then #max=10). We believe that the key ingredient for a successful Lightning Talk is simply the cool/interesting factor. It doesn't matter whether you're an expert or an enthusiastic beginner: given the responses to previous talks, we're sure there are lots of people who would be interested to hear what you've been playing with. If you're interested in talking, please just introduce yourself to Martin or Sam at one of the events.