Dive into Deep Learning: Coding Session#5 Attention Mechanism II (Americas/EMEA)
Details
📌 Session #5 – Attention mechanism (Transformer and BERT) implementation (Part II)
📌 Introduction, Coding & Discussion
About:
The goal of this series is to provide code-focused sessions by reimplementing selected models from the interactive open source book "Dive into Deep Learning" (http://d2l.ai/) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola. In this session, we will continue our discussion of the attention mechanism and cover parts of Chapter 10, Attention Mechanisms: https://d2l.ai/chapter_attention-mechanisms/
We recommend interested participants read these chapters of the book to take full advantage of the session.
These sessions are meant for people who are interested in implementing models from scratch and/or have never implemented a model before. We hope to help participants either get started on their machine learning journey or deepen their knowledge if they already have experience.
We will try to achieve this by:
- Helping participants to create their models end-to-end by reimplementing models from scratch, and discussing what modules/elements need to be included (e.g. data preprocessing, dataset generation, data transformation, etc.) to train an ML model.
- Discussing and resolving coding questions participants might have during the sessions.
📌 Session Leads: Devansh Agarwal and Kshitij Aggarwal
📌 Prerequisites
Even though we welcome everybody to join the sessions, it is highly recommended to have at least intermediate Python skills as we will be using PyTorch to implement models. We also recommend participants have a foundational knowledge of calculus, linear algebra, and statistics/probability theory.
📌 Session Structure
● 30 min – Introduction
● 60 min – Live Coding
● 30 min – Discussion
📌 Join Zoom Meeting
https://us02web.zoom.us/j/82258057264?pwd=dUVyWkQ3dElyNDR4TTM4L3BDNHVEdz09
📌 ORGANIZER BIO
Devansh Agarwal is a Data Scientist at BMS. He graduated with a Ph.D. in Astronomy, where he developed pipelines using high-performance computing and machine learning to aid the discovery of astronomical objects. https://www.linkedin.com/in/devanshkv/
Kshitij Aggarwal is a 4th-year graduate student at the Department of Physics and Astronomy at West Virginia University. He uses data analysis, machine learning, and high-performance computing to discover and study a new class of astronomical objects called Fast Radio Bursts. https://kshitijaggarwal.github.io/
https://www.linkedin.com/in/kshitijaggarwal13/
● FULL CURRICULUM
📌 Session 1:
Coding env setup example and book presentation
A quick review of ML domains (supervised/unsupervised/RL)
General Architecture/Components of ML code
Implementation of simple MLP-model
http://d2l.ai/chapter_introduction/index.html
📌 Session 2:
CNN model (LeNet/ResNet) implementation
http://d2l.ai/chapter_convolutional-neural-networks/index.html
📌 Session 3:
RNN model (LSTM) implementation
http://d2l.ai/chapter_recurrent-neural-networks/index.html
📌 Session 4:
Attention mechanism (Transformer) implementation
http://d2l.ai/chapter_attention-mechanisms/index.html
📌 Session 5:
Attention mechanism (Transformer and BERT) implementation (Part II)
http://d2l.ai/chapter_attention-mechanisms/index.html
📌 Session 6:
Generative adversarial networks (DCGAN) implementation
http://d2l.ai/chapter_generative-adversarial-networks/index.html
● MLT PATRON
Become an MLT Patron and help us keep MLT meetups like this inclusive and free. https://www.patreon.com/MLTOKYO
● SUBSCRIBE
Subscribe to our monthly newsletter: https://mltokyo.ai/membership-join
● FIND MLT RESOURCES
GitHub: https://github.com/Machine-Learning-Tokyo
YouTube: https://www.youtube.com/MLTOKYO
Slack: https://bit.ly/3ai1kte
● CODE OF CONDUCT
MLT promotes an inclusive environment that values integrity, openness, and respect. https://github.com/Machine-Learning-Tokyo/MLT_starterkit

