
Details

Getting here: Enter the lobby at 100 University Ave (right next to St Andrew subway station), and message Giles Edkins on the Meetup app or call him at 647-823-4865 to be let up to room 6H.

Reinforcement Learning from Human Feedback (RLHF) and related techniques have been central to aligning large language models with the goals of the companies creating them, and presumably with wider human values.

But how does this technique work, and what are its limitations? Is alignment "solved"? Find out this week!

