Understanding DeepSeek-R1

Name: Understanding DeepSeek-R1
Start: 2025-01-31T19:00:00-08:00
End: 2025-01-31T20:00:00-08:00

Hosted by Junling H.

AI Frontiers Forum

Details

DeepSeek-R1 is the first open-source LLM that outperforms OpenAI o1. In this talk, I will review the model architecture, training, and fine-tuning of DeepSeek-R1. I will first introduce DeepSeek-V3, the base model behind DeepSeek-R1. DeepSeek-V3 is trained into DeepSeek-R1-Zero with large-scale reinforcement learning (RL), without supervised fine-tuning (SFT) as a preliminary step, and the resulting model demonstrates remarkable reasoning capabilities. Powerful reasoning behaviors emerge in DeepSeek-R1-Zero through RL, but the model also encounters challenges such as poor readability and language mixing. DeepSeek-R1 addresses these issues by incorporating multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks.

Speaker: Junling Hu

Links: DeepSeek-R1 Paper, GitHub, Model on Hugging Face

Join online at: https://us02web.zoom.us/meeting/register/l-ox2AFnQIapIwrpPDjnag

7:00-7:05 pm Meet & Chat
7:05-7:50 pm Talk
7:50-8:00 pm Q&A
