Can we tell when LLMs know what they're talking about?
Details
Using sparse autoencoders (SAEs), researchers have found latent activations that indicate whether an LLM recognizes an entity in its context. Join us to explore and discuss this paper and find out how this is done.
References:
https://arxiv.org/pdf/2411.14257