🤯「Attention Is All You Need」: Decoding the Transformer Paper


Details
OVERVIEW
Join us for an in-depth look at the concepts in the seminal "Attention Is All You Need" paper that paved the way for the generative AI boom. This session is designed for everyone, from complete beginners who are curious about AI to industry professionals seeking a thorough overview of the technology that powers models like ChatGPT and Google's Gemini, to name a few.
This session aims to be one of the most comprehensive out there, catering to all levels of understanding without assuming prior knowledge. We'll also have games to make the learning process more engaging.
We'll start with a beginner-friendly overview of the significance of the Transformer architecture, explaining how it handles long-range dependencies (e.g., relationships between distant words in a sentence, which matters for translation) more efficiently than RNNs, and its impact on the development of models like GPT. This will be followed by an outline of its components before delving into the distinct roles of its parts, namely the encoder and decoder, such as how and why the encoder maps the entire input sequence into a context-aware continuous representation (i.e., a numerical format, enriched with contextual information, that the computer can process).
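To give a feel for what "continuous representation" means in practice, here is a tiny NumPy sketch of the very first step: turning words into the numeric vectors that the encoder then enriches with context. It is an illustration only, not code from the paper; the vocabulary, dimensions, and random values are made-up assumptions.

```python
# Toy sketch (assumed values, not from the paper): words -> continuous vectors.
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2}             # toy vocabulary (assumption)
d_model = 4                                        # tiny embedding size for readability
rng = np.random.default_rng(42)
embedding_table = rng.normal(size=(len(vocab), d_model))

sentence = ["the", "cat", "sat"]
token_ids = [vocab[w] for w in sentence]           # words -> integer ids
x = embedding_table[token_ids]                     # ids -> one continuous vector per word
print(x.shape)                                     # (3, 4): sequence length x model dimension
```

In the session we'll see how self-attention then mixes these vectors so that each one also reflects its surrounding words.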
Next, we will go through each layer, exploring the math and rationale behind it using concrete examples. Don't worry if math isn't your strong suit; we'll break down each formula step by step, including those implied but not fully spelled out in the paper, such as the detailed workings of the softmax function in the attention mechanism or the layer normalization process.
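As a preview of that step-by-step treatment, here is a short NumPy sketch of layer normalization (my own illustration, not code from the paper; the epsilon value is an assumed typical default): each position's vector is rescaled to zero mean and unit variance, then adjusted by learned scale and shift parameters.

```python
# Layer normalization sketch (illustrative; epsilon and inputs are assumptions).
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-6):
    mean = x.mean(axis=-1, keepdims=True)          # mean over the feature dimension
    var = x.var(axis=-1, keepdims=True)            # variance over the feature dimension
    x_hat = (x - mean) / np.sqrt(var + eps)        # normalize each position independently
    return gamma * x_hat + beta                    # learned scale and shift

# Worked example with one 4-dimensional vector:
x = np.array([[2.0, 4.0, 6.0, 8.0]])
gamma, beta = np.ones(4), np.zeros(4)
print(layer_norm(x, gamma, beta))                  # roughly [-1.34, -0.45, 0.45, 1.34]
```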
To make it practical, we'll follow a concrete sentence from the moment it is input into the Transformer to the moment it is translated, showing step by step how that process works.
Do you understand, on an intuitive level, why sine and cosine functions are used for positional encoding? How the Q, K, and V vectors work together to produce an attention score? Or why attention is masked in the decoder? Let's explore these questions and more by methodically working our way through the paper, ensuring you grasp the core concepts and their implications.
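For those who like to peek ahead, the sketch below shows all three ideas in miniature with NumPy. It is an illustrative toy, not the paper's code: the sequence length, model dimension, and random inputs are assumptions.

```python
# Toy sketch of sinusoidal positional encoding, scaled dot-product attention over
# Q/K/V, and the causal (decoder-style) mask. Values are illustrative assumptions.
import numpy as np

def positional_encoding(seq_len, d_model):
    """PE(pos, 2i) = sin(pos / 10000^(2i/d_model)); PE(pos, 2i+1) uses cos."""
    pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]          # (1, d_model/2)
    angles = pos / np.power(10000, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)                   # odd dimensions use cosine
    return pe

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)        # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, optionally masked."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # similarity of each query with every key
    if causal:
        # Decoder-style mask: position t may not attend to positions after t,
        # so future positions get -inf before the softmax (weight ~ 0).
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)             # each row sums to 1: the attention distribution
    return weights @ V, weights

# Toy example: 4 "tokens" with model dimension 8 (illustrative only).
seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
out, attn = scaled_dot_product_attention(x, x, x, causal=True)
print(attn.round(2))                               # upper triangle ~0: no token attends to the future
```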
IMPORTANT
No need to bring anything; all necessary materials will be provided. Feel free to review basic matrix operations beforehand, but we'll do a quick recap of the math to ensure we're all on the same page.
⚠️ Note on Content
While we aim to cover all topics listed, the content or order of the session may be adjusted to best accommodate the flow of the event and ensure a successful learning experience for all. Your flexibility and understanding are appreciated!
***SPECIAL OFFER***
If you're super curious but unsure whether it's worth it, I'm offering free participation in exchange for a video testimonial with your impressions of the session. Feel free to text me and we can work something out!
