[Workshop] Mastering RLHF in 1 day (LLM Crash Course #2)

![[Workshop] Mastering RLHF in 1 day (LLM Crash Course #2)](https://secure.meetupstatic.com/photos/event/b/6/4/highres_515642916.webp?w=750)
Details
This is a paid event. Registration is through Eventbrite (a Meetup RSVP does not count as registration). Please register here:
https://RLHF1Day.eventbrite.com
RLHF (Reinforcement Learning from Human Feedback) is the secret weapon behind ChatGPT and Llama. It is a crucial step in enhancing the performance of an LLM. By the end of this workshop, you will have a complete understanding of RLHF and its major components, including PPO, the reward function, and supervised finetuning. You will leave confident that you know the complete RLHF process and how to implement it.
Topics include (see details at the end):
- Supervised finetuning
- Reinforcement learning basics
- PPO and its implementation
- Training a reward function
- Applying RLHF step by step
This workshop is for anyone who wants to improve their LLM and is interested in RLHF.
What you get from this workshop:
- Real-time interaction with the instructor.
- 4 live Python notebooks that you can take home.
- Certificate of completion (upon request).
- Join the community of AI builders.
- A free one-on-one call with the instructor after the workshop.
Schedule details:
9-10 am. Overview of the process of finetuning and enhancing an LLM
10-11 am. Supervised finetuning
Understand the supervised finetuning process for LLMs, from data formatting to the training procedure.
11 am-12 pm. Reinforcement learning fundamentals
Introduction to the reinforcement learning process and its basic concepts, including action, reward, policy, and value function.
12-12:30 pm. Deep reinforcement learning
Major methods in deep reinforcement learning, particularly policy gradient methods and the actor-critic framework.
12:30-1 pm. Break
1-2 pm. Introduction to PPO (Proximal Policy Optimization)
The PPO algorithm and its implementation. See it in action in a Python notebook.
2-3 pm. The reward function and its training process
Learning a reward function for language generation: how to train such a function, with practical steps.
3-4 pm. Training the LLM with PPO
Apply PPO to enhance language generation. How to reframe the language model in a reinforcement learning framework and train it using PPO.
4-5 pm. Bring everything together and Q&A
- Implement RLHF step by step, and enhance your model with feedback from humans. Starting with a pretrained LLM, we will enhance the model through the 3 steps that we have learned in this course.
- Introducing tools for doing RLHF and resources after this workshop.
Instructor: Junling Hu, https://www.linkedin.com/in/junlinghu/
Refund policy:
Risk-free purchase. 100% refundable if you are not happy with the class. Simply submit your refund request within 1 day after the class finishes.
Registration is through Eventbrite: https://RLHF1Day.eventbrite.com
