Fine-Tuning BERT for the Unstructured Data You Actually Have

Name: Fine-Tuning BERT for the Unstructured Data You Actually Have
Start: 2026-06-16T12:00:00-07:00
End: 2026-06-16T13:00:00-07:00

Hosted by Sage E.

Building AI Together - Seattle

Details

Most fine-tuning attention goes to generative LLMs, but a large share of production NLP still runs on BERT-family encoders. They are small, fast, and cheap to serve, and on the tasks where most real data lives (classifying support tickets, extracting fields from documents, routing emails, semantic search) a fine-tuned BERT often matches or beats a prompted frontier model at a fraction of the cost and latency.

In this hands-on workshop, we'll fine-tune an open-weight BERT model on a custom text dataset and deploy it behind a simple UI. Base BERT is small enough that full fine-tuning runs comfortably on a single GPU. The whole pipeline runs on Flyte 2/Union, so data prep is cached, runs are reproducible and recoverable, and the same code scales from a laptop to a cluster without rewrites.

By the end, you'll have a working fine-tuned model and a reusable pipeline you can point at your own unstructured data.

What we'll cover

Where encoder models like BERT fit, and why they still win on classification, extraction, and embedding tasks
Fine-tuning an open-weight BERT model with Hugging Face Transformers
Orchestrating with Flyte 2: cached data prep, GPU-aware training, reproducible runs at any scale
Deploying behind a UI, with a path to low-latency, scaled inference

What you'll leave with

A fine-tuned BERT model trained on a custom dataset
A reusable training and deployment pipeline you can adapt to your own unstructured data
The knowledge to build and label datasets for classification and extraction tasks
A portfolio-ready project you can adapt to a production scenario at work

Who it's for

ML engineers and practitioners working with unstructured text who want models that are cheap to run and easy to deploy. Whether you're prototyping at work, evaluating infrastructure for a production NLP use case, or building a portfolio project, you'll leave with code you can keep extending.

Hosted by Sage Elliott, AI Engineer at [Union.ai](https://atunion.ai/?utm_source=luma)
