Link to paper: https://arxiv.org/abs/2307.15217
AI Safety Paper #1: Open Problems and Fundamental Limitations of RLHF