LLM Interpretability: Tracing Thoughts

Hosted By
Rob

Details

Join us at our next meetup this Thursday, where we'll be diving into the fascinating world of LLM Interpretability. In an age of increasingly powerful language models, understanding their inner workings is crucial for responsible development and deployment. Here, we will explore techniques to trace the thoughts of these models, gaining insights into their decision-making processes.

This session will focus on the practical application of interpretability methods and discuss the challenges and opportunities in this field.

Who is this meeting for?
Anyone interested in Data Science and Machine Learning. We aim to have the discussion at an intermediate to advanced level, but we welcome people of all levels.

Discussion Format:
The format is similar to a reading group: everyone gets a chance to share what they have learned and ask questions about where they got stuck. Those who did not look at the materials beforehand, or who would just like to listen, are welcome to pass their turn.

Suggested materials:
Anthropic's Tracing Thoughts blog: https://www.anthropic.com/research/tracing-thoughts-language-model
Short Explainer Video: https://www.youtube.com/watch?v=Bj9BD2D3DzA
Transformer Circuits Paper 1: https://transformer-circuits.pub/2025/attribution-graphs/methods.html
Transformer Circuits Paper 2: https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Feel free to explore other resources!

Please try to spend some time with the materials before the event. The goal is for everyone to come and talk about what they have learned, so that we can all discuss and help each other.

Meeting Schedule:
6:00 - 6:30 - Arrive / informal chat
6:30 - Session Start - Introductions
6:35 - Discussion
7:25 - Nominate and vote for the next meeting's topic
7:30 - Meeting ends

We hope to see you there!

Data Science Discussion Auckland
Grid AKL - Tech Cafe
Ground Floor - 101 Pakenham Street West, Wynyard Quarter · Auckland