Details

In this talk, Andy Zou will present the findings of the research paper "Universal and Transferable Attacks on Aligned Language Models". He will describe a simple and effective attack method that causes aligned language models to generate objectionable content.

Artificial Intelligence
Deep Learning
Machine Learning
Data Science
