
Razvan Pascanu, A closer look at some limitations of transformer

Hosted by Traian Rebedea

Details

We are pleased that Razvan Pascanu, one of the top Romanian researchers in ML/AI [1], will give an invited lecture at UPB on Wednesday, May 7th, at 7pm in EC105. Razvan works at Google DeepMind, is a co-organizer of EEML (Eastern European Machine Learning summer school) and Romanian AI Days, among many others, and is a very close friend and mentor to the Romanian AI/ML community. This year he is also one of the PC Chairs of NeurIPS, the most prestigious ML / deep learning conference.
In his talk, Razvan will cover a very relevant research topic: the limitations of transformer architectures. Details and some references are below.

If you are working in AI/ML or have collaborators or students interested in these topics, it would be great if you could attend or forward the invite to them.
Have a nice day,
Traian

------------------------

Title: A closer look at some limitations of transformer

Abstract:
Transformers are becoming a dominant architecture in machine learning, widely used from vision to language to reinforcement learning. They are part of the backbone of modern large language models. In this talk we will discuss some potential limitations of the architecture itself, focusing predominantly on the attention layer as a mechanism for temporal mixing of information. In particular, I will focus on recent work looking at the ability of transformers to reason about position, generalize to longer contexts, explore, or learn full-rank representations. I will finish the talk with some general thoughts on how we can improve our understanding of these architectures, and on the open questions facing us.

References:
https://arxiv.org/abs/2406.04267
https://arxiv.org/abs/2410.01104
https://arxiv.org/abs/2410.06205
https://arxiv.org/html/2504.02732v2
https://arxiv.org/abs/2504.16078
https://arxiv.org/abs/2503.21676
[1] - https://scholar.google.com/citations?user=6nKHDKYAAAAJ&hl=en&oi=ao (65k+ citations)

Bucharest Deep Learning
FREE