
Write Your Own GPT from Scratch: Session 1 Tokenizer

Hosted by Steph Thompson

Details

As a follow-up to 'Write your own GPT from scratch', presented at the AppliedAI conference this May, we are organizing a series of short workshops that explore, in depth, different elements covered in the presentation.

Sona Maniyan will guide us through constructing a GPT.

The first session will focus on setting up our workspace for the workshop series and on an in-depth look at tokenization. We will explore the role a good tokenizer plays in large language models and build a tokenizer from scratch. After that, we will use Python libraries that handle tokenization for us.
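To give a feel for what "a tokenizer from scratch" can look like, here is a minimal sketch of byte-pair encoding (BPE), the family of algorithms behind GPT-style tokenizers. This is an assumption about the approach, not the workshop's actual code: the function names (`train`, `encode`, `decode`, `merge`, `most_frequent_pair`) are hypothetical, and the sketch works directly on UTF-8 bytes with no pre-tokenization or special tokens.

```python
from collections import Counter


def most_frequent_pair(ids):
    """Return the most common adjacent pair of token ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get) if pairs else None


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merge rules from the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (id_a, id_b) -> new_id, in the order the merges were learned
    for step in range(num_merges):
        pair = most_frequent_pair(ids)
        if pair is None:  # fewer than two tokens left, nothing to merge
            break
        new_id = 256 + step  # ids 0..255 are reserved for raw bytes
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Turn text into token ids by replaying the merges in learned order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():  # dicts preserve insertion order
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Turn token ids back into text by expanding each id to its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


corpus = "low lower lowest low low"
merges = train(corpus, num_merges=5)
token_ids = encode(corpus, merges)
print(len(corpus.encode("utf-8")), "bytes ->", len(token_ids), "tokens")
print(decode(token_ids, merges) == corpus)
```

Each merge adds one entry to the vocabulary and shortens the encoded sequence, which is the trade-off a tokenizer's vocabulary size controls.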

In this workshop, we will instantiate a Tokenizer object with a model, then set its normalizer, pre_tokenizer, post_processor, and decoder attributes to the values we want. We will also work through some examples using tiktoken, the open-source Python tokenizer library from OpenAI.
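The stages named above can be illustrated with a standard-library sketch. Everything here is hypothetical and for illustration only: `ToyTokenizer` and `WordLevelModel` are made-up classes that merely mirror the attribute names from the description, and they are not the API of tiktoken or of any other real library.

```python
import re
import unicodedata


class WordLevelModel:
    """Hypothetical model: maps whole words to ids, with an unknown-token fallback."""

    def __init__(self, vocab, unk_token="[UNK]"):
        self.vocab = vocab
        self.unk_id = vocab[unk_token]
        self.inverse = {i: t for t, i in vocab.items()}

    def token_to_id(self, token):
        return self.vocab.get(token, self.unk_id)

    def id_to_token(self, idx):
        return self.inverse[idx]


class ToyTokenizer:
    """Hypothetical pipeline: normalizer -> pre_tokenizer -> model -> post_processor."""

    def __init__(self, model):
        self.model = model
        self.normalizer = None      # str -> str
        self.pre_tokenizer = None   # str -> list[str]
        self.post_processor = None  # list[int] -> list[int]
        self.decoder = None         # list[str] -> str

    def encode(self, text):
        if self.normalizer:
            text = self.normalizer(text)
        words = self.pre_tokenizer(text) if self.pre_tokenizer else [text]
        ids = [self.model.token_to_id(w) for w in words]
        return self.post_processor(ids) if self.post_processor else ids

    def decode(self, ids):
        tokens = [self.model.id_to_token(i) for i in ids]
        return self.decoder(tokens) if self.decoder else "".join(tokens)


def strip_accents_and_lowercase(text):
    """Decompose characters (NFD), drop combining marks, then lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


vocab = {"[UNK]": 0, "[BOS]": 1, "[EOS]": 2, "hello": 3, "world": 4, "!": 5}
SPECIAL = ("[BOS]", "[EOS]")

tok = ToyTokenizer(WordLevelModel(vocab))
tok.normalizer = strip_accents_and_lowercase
tok.pre_tokenizer = lambda s: re.findall(r"\w+|[^\w\s]", s)  # words and punctuation
tok.post_processor = lambda ids: [vocab["[BOS]"], *ids, vocab["[EOS]"]]
tok.decoder = lambda tokens: " ".join(t for t in tokens if t not in SPECIAL)

ids = tok.encode("H\u00e9llo, world!")
print(ids)
print(tok.decode(ids))
```

The comma is not in the toy vocabulary, so it maps to the unknown token; avoiding that kind of information loss is one reason subword models such as BPE are preferred over word-level vocabularies.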

It's been 3+ years, folks. As a reminder, bring your laptop. 💻

COVID-19 safety measures

COVID-19 vaccination required
Event will be indoors
The event host is instituting the above safety measures for this event. Meetup is not responsible for ensuring, and will not independently verify, that these precautions are followed.
League of Extraordinary Algorithms
550 Vandalia St, Suite 231 · St Paul, MN