
Details

Following on from the previous week's unpacking of the Whisper paper (Robust Speech Recognition via Large-Scale Weak Supervision), in this session we switch gears and dive into the code.

🔧 What we’ll look at together:

  1. Preprocessing of audio data into the token-equivalent representations fed into the encoder
  2. Cross-attention mechanism
  3. Learned positional encoding
  4. Multitask training format specification

Please note that the session is not recorded.

Key links:

Discord joining instructions: https://bit.ly/llm-discord

AI Algorithms
Artificial Intelligence
Machine Learning
Data Science
Machine Learning with Python
