
Details
This will be a journal club event with two talks:

1. Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training. Link to paper.

2. Self-Optimizing Deep Agents (SODA). Link to repo.

Speakers

  1. Rakshak Talwar, Cofounder of [Zenus.ai](https://zenus.ai)
  2. Dmitri Iourovitski, AI Research Scientist

Abstract
1. Diffusion models have achieved remarkable success across a wide range of generative tasks. A key challenge is understanding the mechanisms that prevent their memorization of training data and allow generalization. In this work, we investigate the role of the training dynamics in the transition from generalization to memorization. Through extensive experiments and theoretical analysis, we identify two distinct timescales: an early time τgen at which models begin to generate high-quality samples, and a later time τmem beyond which memorization emerges. Crucially, we find that τmem increases linearly with the training set size n, while τgen remains constant. This creates a window of training times, growing with n, in which models generalize effectively, even though they memorize strongly if training continues beyond it. It is only when n exceeds a model-dependent threshold that overfitting disappears at infinite training times. These findings reveal a form of implicit dynamical regularization in the training dynamics, which allows models to avoid memorization even in highly overparameterized settings. Our results are supported by numerical experiments with standard U-Net architectures on realistic and synthetic datasets, and by a theoretical analysis using a tractable random features model studied in the high-dimensional limit.

2. Deep Agents have proven to be a very powerful concept, capable of accomplishing many tasks on their own. Recent experiments show that `Deep Agents` achieve substantial performance gains with the assistance of `sub-agents`. Configuring the right `sub-agents` and ensuring they have the proper prompts, and even descriptions for other agents, has so far been very challenging. SODA (Self-Optimizing Deep Agents) is an open source project focused on automating the process of developing deep agents for the right use cases and making discovering and refining the correct sub-agents a breeze. The intent is to let users focus on the task at hand and abstract away the `context engineering`.
Acknowledgements: this project leverages `LangChain`'s `DeepAgent` framework and is built on top of it.
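To make the "configuring the right sub-agents" problem concrete, here is a minimal sketch of the kind of sub-agent specification that frameworks like LangChain's `deepagents` accept (a dict with a name, a description other agents use for delegation, and a system prompt). The field names follow that library's convention, but the spec and the `needs_refinement` helper below are illustrative assumptions, not SODA's actual API:

```python
# Hypothetical sub-agent spec, modeled on LangChain's deepagents convention
# (name / description / prompt). Hand-writing these fields well is the part
# SODA aims to automate.
research_subagent = {
    "name": "researcher",
    "description": "Delegate to this agent for literature search and summarization.",
    "prompt": "You are a focused research assistant. Search, then summarize concisely.",
}

def needs_refinement(subagent: dict) -> bool:
    """Trivial stand-in check: flag specs missing any of the key fields
    (a real optimizer would also score prompt quality, not just presence)."""
    return not all(subagent.get(k) for k in ("name", "description", "prompt"))

print(needs_refinement(research_subagent))           # complete spec -> False
print(needs_refinement({"name": "researcher"}))      # missing fields -> True
```

A self-optimizing loop would iterate on specs like this one, rewriting the `prompt` and `description` fields until the parent agent delegates to them effectively.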

Info
Austin Deep Learning Journal Club is a group for committed machine learning practitioners and researchers alike. The group typically meets on the first Tuesday of each month to discuss research publications. The publications are usually ones that laid the foundation for ML/DL or that explore novel, promising ideas, and they are selected by a vote. Participants are expected to read the publications so they can contribute to the discussion and learn from others. This is also a great opportunity to showcase your implementations and get feedback from other experts.

Sponsors:
Thank you to Station Austin for sponsoring Austin Deep Learning. Station Austin is the center of gravity for entrepreneurs in Texas. They bring together the best entrepreneurs in the state and connect them with their first investors, employees, mentors, and customers. To sign up for a Station Austin membership, click here.
