Name: [PDG 450] Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Start: 2025-08-26T20:00:00+02:00
End: 2025-08-26T22:00:00+02:00

**Link to article**: https://arxiv.org/pdf/2505.03335
**Title:** Absolute Zero: Reinforced Self-play Reasoning with Zero Data
**Content**: The Absolute Zero paradigm introduces AZR (Absolute Zero Reasoner), a system that generates its own training tasks and improves its reasoning ability without any external data. A code executor both validates the proposed tasks and verifies the answers, which makes it a single, unified source of reward. This addresses a scalability concern with current reinforcement learning methods: even when they avoid direct supervision of the reasoning process, they still depend on human-curated question-answer datasets. Despite training entirely without external data, AZR is reported to achieve state-of-the-art performance on coding and mathematical reasoning benchmarks, outperforming existing zero-setting models that rely on tens of thousands of human-curated examples.
**Slack link**: ml-ka.slack.com, channel: #pdg. Please join us. If you cannot join, please message us here or email mlpaperdiscussiongroupka@gmail.com.
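
As a warm-up for the discussion, here is a minimal sketch of the idea summarized above: a code executor first validates a proposed task and later verifies an answer, and that check is the only reward signal. This is not the paper's implementation. The names `run_program`, `validate_task` and `reward` are hypothetical, the task format (a function `f` plus one input) is an assumption, and a real system would have a language model propose and solve tasks and would run the code in a sandbox.

```python
# Illustrative sketch only, not the AZR implementation.
# Assumption: a task is Python source defining f(x), plus one input value.

def run_program(source, value):
    """Stand-in code executor: run `source` and call its f(value).

    Returns (ok, result). Plain exec is unsafe for untrusted code;
    a real system would need a sandboxed executor.
    """
    namespace = {}
    try:
        exec(source, namespace)
        return True, namespace["f"](value)
    except Exception:
        # Syntax errors, runtime errors, or a missing f all mean "invalid".
        return False, None


def validate_task(source, value):
    """Accept a proposed task only if the executor can run it.

    The executor's output becomes the ground-truth answer, so no
    human-written label is needed.
    """
    ok, expected = run_program(source, value)
    return (source, value, expected) if ok else None


def reward(task, predicted):
    """Verifiable reward: 1.0 if the prediction matches the executor's output."""
    _, _, expected = task
    return 1.0 if predicted == expected else 0.0


if __name__ == "__main__":
    # A proposer (a model in the real system) suggests a program and an input.
    task = validate_task("def f(x):\n    return x * 3 + 1\n", 4)
    # A solver (also a model) would predict the output; here we hard-code guesses.
    print(reward(task, 13), reward(task, 12))
```

In this sketch the same executor plays both roles, which is the sense in which the reward source is "unified": invalid proposals are filtered out, and valid ones come with an automatically computed answer to check the solver against.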

In the Paper Discussion Group (PDG) we discuss recent and fundamental papers in the area of machine learning on a weekly basis. If you are interested, **please read the paper beforehand** and join us for the discussion. If you have not fully understood the paper, you can still participate – everyone is welcome! You can join the discussion or simply listen in. The discussion is in German or English depending on the participants.

DavidFarago

AI Paper Discussion Group Karlsruhe

Online event
