Understanding DeepSeek-R1

Hosted By
Junling H.

Details

DeepSeek-R1 is the first open-source LLM that outperforms OpenAI o1. In this talk, I will review the model architecture, training, and fine-tuning of DeepSeek-R1. I will first introduce DeepSeek-V3, the base model behind DeepSeek-R1. DeepSeek-V3 is fine-tuned into DeepSeek-R1-Zero using large-scale reinforcement learning (RL) without any prior supervised fine-tuning (SFT), and the resulting model demonstrates remarkable reasoning capabilities. Powerful reasoning behaviors emerge in DeepSeek-R1-Zero, but the model suffers from poor readability and language mixing. DeepSeek-R1 addresses these issues by incorporating multi-stage training and cold-start data before RL, and it achieves performance comparable to OpenAI-o1-1217 on reasoning tasks.

Speaker: Junling Hu

Links: DeepSeek-R1 Paper, GitHub, Model on Hugging Face

Join online at: https://us02web.zoom.us/meeting/register/l-ox2AFnQIapIwrpPDjnag

7:00-7:05 pm Meet & Chat
7:05-7:50 pm Talk
7:50-8:00 pm Q&A

AI Frontiers Forum
Online event
This event has passed