[PDG 468] Nested Learning: The Illusion of Deep Learning Architecture
Details
Link to article: https://abehrouz.github.io/files/NL.pdf
Title: Nested Learning: The Illusion of Deep Learning Architecture
Content: Nested Learning (NL) is a new paradigm that frames ML models as nested optimization problems, revealing that standard optimizers like Adam are essentially associative memory modules compressing gradient information. Building on this insight, the authors develop more expressive optimizers and self-modifying sequence models that learn their own update rules. They combine these with a generalized "continuum memory system" to create Hope, a continual learning module showing promising results in language modeling, few-shot learning, and long-context reasoning.
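As a rough intuition for the "optimizers as associative memory" framing (this sketch is not from the paper; it only illustrates how a standard momentum term already compresses gradient history into a single state vector):

```python
# Illustration only: SGD with momentum keeps an exponential moving average
# of past gradients -- a minimal form of "compressed gradient memory" that,
# per the paper's framing, richer optimizers like Adam generalize.

def momentum_step(param, grad, state, lr=0.1, beta=0.9):
    """One SGD-with-momentum update.

    `state` is the running memory of past gradients; each step mixes the
    new gradient into the old average, so a gradient from k steps ago
    survives only with weight proportional to beta**k.
    """
    state = beta * state + (1.0 - beta) * grad  # compress gradient history
    return param - lr * state, state

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x, m = 5.0, 0.0
for _ in range(200):
    x, m = momentum_step(x, 2.0 * x, m)
print(x)  # close to the minimum at 0
```

The single scalar `m` stands in for the associative-memory module: it is the only place information about past gradients is stored, and the decay factor `beta` controls how lossy that compression is.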
Slack link: ml-ka.slack.com, channel: #pdg. Please join us -- if you cannot join the Slack workspace, please message us here or email mlpaperdiscussiongroupka@gmail.com.
In the Paper Discussion Group (PDG) we meet weekly to discuss recent and fundamental papers in machine learning. If you are interested, please read the paper beforehand and join us for the discussion. If you have not fully understood the paper, you can still participate – everyone is welcome! You can join the discussion or simply listen in. The discussion is held in German or English, depending on the participants.
