Write Your Own GPT from Scratch: Session 2 Attention Part A


Details
New location: Minneapolis College, Room T3000
Sponsors: https://Lab651.com/ and https://CodeSavvy.com
Ms. Sona Maniyan continues her presentations on Generative Pretrained Transformers and coding a GPT from scratch. In the July workshop, she led us through tokenization. Now Sona will help us understand a recent neural network structure called "Attention".
In this workshop we will explore the paper Attention Is All You Need (Vaswani et al., 2017; link below). We will discuss the paper in two parts.
In the first session we will go through the paper up to and including Sec 3.2.1, Scaled Dot-Product Attention. Our discussion will cover
(a) the motivation for the transformer architecture,
(b) how it overcomes some drawbacks of other models for learning representations of variable-length data, and
(c) the intuition behind the math of self-attention.
Bring your laptop for a few exercises implementing toy examples of the math we discuss.
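As a preview of the kind of toy example we might code together, here is a minimal NumPy sketch of scaled dot-product attention as described in Sec 3.2.1 of the paper. The function name, shapes, and random inputs are illustrative assumptions, not the workshop's actual exercises.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for one sequence.

    Q, K: arrays of shape (seq_len, d_k); V: array of shape (seq_len, d_v).
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key dimension turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

# Tiny toy example: 3 tokens, 4-dimensional queries/keys/values.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```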
To get the most out of this discussion, please read the paper (up to and including Sec 3.2.1):
https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf