
What we’re about
Meet and discuss applications and algorithms related to machine learning and AI. Use our speaker sign-up form.
Other Cool Melbourne Meetups
Statistical Society of Australia: https://www.meetup.com/statistical-society-of-australia-victorian-branch/
MLOps: https://www.meetup.com/melbourne-mlops-community1/
BeerOps: https://www.linkedin.com/company/beerops/?originalSubdomain=au
Melbourne Automation Meetup: https://www.meetup.com/melbourne-automation-meetup/
Upcoming events

The surprising efficiency of recurrent reasoning models
912 Collins St, Docklands, VIC, AU

The MLAI Meetup is a community of AI researchers and professionals that hosts monthly talks on exciting research. Our format is:
- 6:00 - 6:20: Socializing
- 6:20 - 6:40: Announcements and AI news
- 6:40 - 7:40: Talk(s) and Q&A
- 7:40 - 8:00: Networking
- 8:00: Head to the nearest pub for dinner
David Rawlinson & Long Dang: "The surprising efficiency of recurrent reasoning models"
Abstract: Large Language Models (LLMs) still struggle with reasoning problems, defined as devising and executing complex, goal-oriented action sequences. Current solutions, such as Chain-of-Thought (CoT) and Test-Time Compute (TTC) techniques, can suffer from brittle task decomposition. In addition, auto-regressive output generation is prone to errors that usually cannot be rectified.
In 2025, Wang et al. (1) introduced the Hierarchical Reasoning Model (HRM). On three reasoning problems (Sudoku-Extreme, maze navigation, and ARC-AGI tasks), HRM demonstrated performance comparable to large, pre-trained LLMs with orders of magnitude fewer trainable parameters and no pre-training. HRM uses a process of repeated recurrent convergence between two modules to produce a latent that represents a problem solution.
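For intuition, here is a minimal PyTorch sketch of that repeated recurrent convergence between a fast low-level module and a slow high-level module. The GRU cells, dimensions, and step counts are illustrative assumptions, not the HRM architecture itself:

import torch
import torch.nn as nn

class TwoModuleReasoner(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.low = nn.GRUCell(dim, dim)   # fast, low-level refinement module
        self.high = nn.GRUCell(dim, dim)  # slow, high-level planning module

    def forward(self, x, high_steps=4, low_steps=8):
        # x: (batch, dim) embedding of the problem instance.
        z_low = torch.zeros_like(x)
        z_high = torch.zeros_like(x)
        for _ in range(high_steps):
            # Inner loop: the low-level state converges toward a fixed point,
            # conditioned on the current high-level state.
            for _ in range(low_steps):
                z_low = self.low(x + z_high, z_low)
            # Outer step: the high-level state updates from the converged
            # low-level state.
            z_high = self.high(z_low, z_high)
        return z_high  # latent representing the proposed solution

# Example: z = TwoModuleReasoner()(torch.randn(8, 128))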
Shortly after, Jolicoeur-Martineau released a preprint (2) describing a thorough ablation of the ideas in HRM and her derivative model, known as the Tiny Recursive Model (TRM). TRM uses a similar recursive convergence process and even fewer parameters, yet obtains 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., DeepSeek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters.
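The single-network recursion behind TRM can be sketched in a few lines; the network shape and step count below are assumptions for illustration, not TRM's actual design:

import torch
import torch.nn as nn

class TinyRecursiveRefiner(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # One small network, reused at every refinement step.
        self.net = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, steps=16):
        # x: (batch, dim) question embedding; y: current answer embedding.
        y = torch.zeros_like(x)
        for _ in range(steps):
            # Each step re-reads the question and the current answer and
            # proposes a correction, so earlier mistakes can be revised.
            y = y + self.net(torch.cat([x, y], dim=-1))
        return y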
Finally, Dang and Rawlinson’s 2025 preprint (3) explores HRM as a reinforcement learning agent, allowing it to be applied to dynamic, uncertain, or partially observable reasoning problems, or to problems where the “correct” action is undefined (HRM and TRM use supervised learning). They demonstrate that computation from previous environment time-steps can be re-used during execution of a plan, which is crucial to efficiency and continuity of thought.
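A sketch of that reuse idea, assuming a Gym-style environment and hypothetical interfaces (encode, reasoner, policy_head, and the init/steps arguments are all illustrative names, not the paper's API):

def run_episode(env, encode, reasoner, policy_head, steps_per_obs=4):
    # Hypothetical interfaces: encode maps an observation to a latent input,
    # reasoner(x, init, steps) runs a few recurrent refinement steps, and
    # policy_head maps the reasoning latent to action logits.
    obs = env.reset()
    z = None  # reasoning latent carried across environment time-steps
    done = False
    while not done:
        x = encode(obs)
        # Warm start from the previous step's latent, so only a few
        # refinement steps are needed instead of converging from scratch.
        z = reasoner(x, init=z, steps=steps_per_obs)
        action = policy_head(z).argmax(dim=-1)
        obs, reward, done, _ = env.step(action)
    return z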
References:
(1) Hierarchical Reasoning Model
by Guan Wang, Jin Li, Yuhao Sun, Xing Chen, Changling Liu, Yue Wu, Meng Lu, Sen Song, and Yasin Abbasi Yadkori (2025)
https://arxiv.org/abs/2506.21734
(2) Less is More: Recursive Reasoning with Tiny Networks
by Alexia Jolicoeur-Martineau (2025)
https://arxiv.org/abs/2510.04871
(3) HRM-Agent: Training a recurrent reasoning model in dynamic environments using reinforcement learning
by Long H. Dang and David Rawlinson (2025)
https://arxiv.org/abs/2510.22832
Speaker bios TBA.

