Sept 2 - Document Visual AI Workshop
Network event
32 attendees from 48 groups hosting
Details
In this hands-on workshop, you'll use FiftyOne and the High Quality Invoice Images for OCR dataset to run the full data-centric loop end-to-end: embed invoices with a modern visual document model, cluster them by structure, run LightOnOCR as your base model, and use per-sample evaluation scores layered onto embedding space to find *where* and *why* it fails.
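As a rough illustration of what "per-sample evaluation scores" grouped by cluster can look like, here is a minimal sketch in plain Python. It does not use FiftyOne or LightOnOCR. Character error rate (CER) is assumed as the metric, and the field names (`cluster`, `ocr_text`, `ground_truth`) and the toy invoices are hypothetical. The workshop itself may use different metrics and tooling.

```python
from collections import defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(prediction: str, reference: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


def mean_cer_by_cluster(samples):
    """Average the per-sample CER inside each cluster id."""
    buckets = defaultdict(list)
    for s in samples:
        buckets[s["cluster"]].append(cer(s["ocr_text"], s["ground_truth"]))
    return {c: sum(v) / len(v) for c, v in buckets.items()}


# Hypothetical toy data: cluster ids would come from clustering the embeddings.
samples = [
    {"cluster": 0, "ocr_text": "Invoice 1001", "ground_truth": "Invoice 1001"},
    {"cluster": 0, "ocr_text": "Invoice 1O02", "ground_truth": "Invoice 1002"},
    {"cluster": 1, "ocr_text": "Tota1 $45.O0", "ground_truth": "Total $45.00"},
]

for cluster_id, score in sorted(mean_cer_by_cluster(samples).items()):
    print(f"cluster {cluster_id}: mean CER = {score:.3f}")
```

Clusters with a noticeably higher mean score are candidates for the "where" question. Inspecting the invoices in those clusters is how you would start on the "why".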
Time, Date and Location
Sep 02, 2026
9:00 AM - 11:00 AM Pacific Time
Online. Register for the Zoom!
What You'll Walk Away With
- A working FiftyOne pipeline for any document collection you own
- A repeatable curation query that combines evaluation + embedding signals
- A fine-tuned LightOnOCR checkpoint that demonstrably outperforms the base model on your invoices
- The mental model that data curation — not architecture or hyperparameters — is the highest-leverage thing you can do to improve a document AI system
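To make the idea of a curation query that combines evaluation and embedding signals more concrete, here is a small sketch in plain Python. It is one possible reading rather than the workshop's actual query: it selects samples whose error score is high, plus samples that sit close to them in embedding space. The field names (`id`, `embedding`, `cer`), the thresholds, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def curation_query(samples, cer_threshold=0.1, sim_threshold=0.9):
    """Return sorted ids of failing samples plus their embedding lookalikes."""
    # Evaluation signal: samples the model already gets wrong.
    failing = [s for s in samples if s["cer"] >= cer_threshold]
    selected = {s["id"] for s in failing}
    # Embedding signal: samples that resemble a failing one.
    for s in samples:
        if s["id"] in selected:
            continue
        if any(cosine_similarity(s["embedding"], f["embedding"]) >= sim_threshold
               for f in failing):
            selected.add(s["id"])
    return sorted(selected)


# Hypothetical toy data with 2-D embeddings for readability.
samples = [
    {"id": "inv_a", "embedding": [1.0, 0.0], "cer": 0.02},
    {"id": "inv_b", "embedding": [0.0, 1.0], "cer": 0.30},
    {"id": "inv_c", "embedding": [0.1, 1.0], "cer": 0.04},
    {"id": "inv_d", "embedding": [1.0, 0.1], "cer": 0.01},
]

print(curation_query(samples))  # → ['inv_b', 'inv_c']
```

Here `inv_b` is picked because it fails outright, and `inv_c` is picked because it looks like `inv_b` even though its own score is acceptable. Pulling in lookalikes is one way to build a fine-tuning set that covers a failure mode rather than a single bad sample.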
Related topics
Artificial Intelligence
Computer Vision
Machine Learning
