[Workshop] Mastering RLHF in 1 day (LLM Crash Course #2)

Name: [Workshop] Mastering RLHF in 1 day (LLM Crash Course #2)
Start: 2023-09-30T09:00:00-07:00
End: 2023-09-30T17:00:00-07:00

Hosted by Junling H.

Bay Area Machine Learning

Details

This is a paid event. Registration is through Eventbrite (a Meetup RSVP does not count as registration). Please register here:
https://RLHF1Day.eventbrite.com

RLHF (Reinforcement Learning from Human Feedback) is the secret weapon behind ChatGPT and Llama. It is a crucial step in enhancing the performance of an LLM. By the end of this workshop, you will have a complete understanding of RLHF and its major components, including PPO, the reward function, and supervised finetuning. You will come away confident that you know the complete RLHF process and how to implement it.

Topics include (see details at the end):

Supervised finetuning
Reinforcement learning basics
PPO and its implementation
Training a reward function
Applying RLHF step by step

This workshop is for those who want to enhance their LLM and who are interested in RLHF.

What you get from this workshop:

Real-time interaction with the instructor.
4 live Python notebooks that you can take home.
A certificate on finishing the class (upon request).
Join the community of AI builders.
A free one-on-one call with the instructor after the workshop.

Schedule details:

9-10 am. Overview of the process of finetuning and enhancing an LLM

10-11 am. Supervised finetuning

Understand the supervised finetuning process for LLMs, from data formatting to the training procedure.

11 am-12 pm. Reinforcement learning fundamentals

Introduction to the reinforcement learning process and its basic concepts, including action, reward, policy, and value function.

12 pm-12:30 pm. Deep Reinforcement Learning

Major methods in deep reinforcement learning, particularly the policy gradient method and the actor-critic framework.

12:30-1 pm. Break

1-2 pm. Introduction to PPO (Proximal Policy Optimization)

The PPO algorithm and its implementation. See it in action in a Python notebook.
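
For a taste of what this session covers, here is a minimal sketch of PPO's clipped surrogate objective in plain Python. It is not the workshop's notebook code. The function name, the argument names, and the default clip range of 0.2 are illustrative assumptions.

```python
import math

def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the sampled actions under
    the current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic choice: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the new policy moves far from the old one on an action with a positive advantage, the clip caps how much credit that step receives.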

2-3 pm. The reward function and its training process

Learning a reward function for language generation. How to train such a function, with practical steps.
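
To illustrate the kind of objective involved, here is a minimal sketch of a pairwise preference loss, which is commonly used to train RLHF reward models: the model is penalized when it scores the human-preferred response lower than the rejected one. The listing does not say which loss the workshop uses, so treat this as an assumption. The function and argument names are hypothetical.

```python
import math

def pairwise_reward_loss(chosen_rewards, rejected_rewards):
    """Mean of -log(sigmoid(r_chosen - r_rejected)) over preference pairs.

    chosen_rewards / rejected_rewards: scalar reward-model scores for the
    preferred and the rejected response of each pair (hypothetical inputs).
    """
    losses = []
    for r_c, r_r in zip(chosen_rewards, rejected_rewards):
        margin = r_c - r_r
        # -log(sigmoid(m)) == log(1 + exp(-m)), written in a stable form
        losses.append(max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return sum(losses) / len(losses)
```

The loss is log 2 when both responses get the same score and shrinks toward zero as the preferred response is scored increasingly higher than the rejected one.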

3-4 pm. Training the LLM with PPO

Apply PPO to enhance language generation. How to reframe the language model in the reinforcement learning framework and train it using PPO.

4-5 pm. Bring everything together and Q&A

Implement RLHF step by step, and enhance your model with feedback from humans. Starting with a pretrained LLM, we will enhance the model through the 3 steps that we have learned in this course.
Introduction to tools for doing RLHF and resources for after the workshop.

Instructor: Junling Hu, https://www.linkedin.com/in/junlinghu/

Refund policy:

Risk-free purchase. 100% refundable if you are not happy with the class. Simply submit your refund request within 1 day after the class finishes.

The registration is through Eventbrite. Register at Eventbrite here: https://RLHF1Day.eventbrite.com

Machine Learning

Natural Language Processing

Robotics

Entrepreneurship

Python
