Open Questions in LLM Reasoning


Details
Reasoning is the new frontier in LLMs. At least one successful formula for producing reasoning in LLMs is now in the public domain, thanks to DeepSeek-R1 Zero and DeepSeek-R1. However, many open questions remain, and we want to explore them at this meetup.
The idea is that everyone comes prepared with one question on LLM Reasoning that they want answered. Please also do some research on your question, so that we can learn from you.
To get you started on open questions (if you don't have one already), here's a small selection and references to get inspired.
1. How do LLMs reason? See the paper Procedural Knowledge in Pretraining Drives Reasoning in LLMs. Here's a YouTube interview with the author.
2. Is Reinforcement Learning the only way to get reasoning? It seems purely supervised fine-tuning can go a long way. See the paper s1: Simple test-time scaling.
3. How do we handle reasoning in domains where verification is neither cheap nor scalable? Example domains are law, creative writing, and government policy. See the blog post The Problem with Reasoners and another blog post, Why Reasoning Models will Generalize.
4. Is reasoning a new skill that "emerges" out of large-scale Reinforcement Learning, or was it already there? There are indications that the latter is true. See the paper Understanding R1-Zero-Like Training: A Critical Perspective.
5. How was the dataset for DeepSeek-R1 created, and how can this process be scaled to produce more training data? We haven't found a good reference for this yet; suggestions welcome.
And since we love ARC Prize, here is a nice summary of DeepSeek R1 Zero from the ARC Prize team.
Here is the rough plan for the evening:
🍕🍻 18:30 - 19:00: Arrival and networking with pizza and drinks
🤝 19:00 - 19:30: Introduction
🗫 19:30 - 21:00: Round table discussions. If needed, one of the organizers can first give a quick overview of LLM reasoning to ground the discussion.
Hope to see you there and have a great discussion! 🤗
Cheers!
- Alexandra, Dibya, Mainak and Nico
