
About us
Welcome to the Building AI Together meetup!
💬 Join the community Slack group: https://slack.flyte.org/
Our community meetups are for data scientists and engineers in machine learning, infrastructure, and data. Our central topics are:
- best practices for putting ML in production
- ML and data workflow automation
- machine learning at scale
- data and machine learning pipelines
- distributed computing
- Kubernetes-native machine learning and data workflows
- MLOps
This group is run by the wonderful people at Union.ai.
The founding team at Union created Flyte, the data-aware machine learning orchestrator.
Check Flyte out on GitHub ⭐: https://github.com/flyteorg/flyte
Flyte is a Kubernetes-native open-source platform for production-grade data and machine-learning pipelines. It caches executions, tracks data and dependencies, and integrates with many data and ML stacks, including AWS SageMaker, Distributed TensorFlow, PyTorch Distributed, Ray, AWS Batch, Kubernetes Pods, and more.
Union.ai also provides the open-source solutions Pandera for statistical validation and UnionML.
Upcoming events (2)

Fine-Tuning BERT for the Unstructured Data You Actually Have
Online
Most fine-tuning attention goes to generative LLMs, but a large share of production NLP still runs on BERT-family encoders. They are small, fast, and cheap to serve, and on the tasks where most real data lives (classifying support tickets, extracting fields from documents, routing emails, semantic search) a fine-tuned BERT often matches or beats a prompted frontier model at a fraction of the cost and latency.
In this hands-on workshop, we'll fine-tune an open-weight BERT model on a custom text dataset and deploy it behind a simple UI. Base BERT is small enough that full fine-tuning runs comfortably on a single GPU. The whole pipeline runs on Flyte 2/Union, so data prep is cached, runs are reproducible and recoverable, and the same code scales from a laptop to a cluster without rewrites.
By the end, you'll have a working fine-tuned model and a reusable pipeline you can point at your own unstructured data.
What we'll cover
- Where encoder models like BERT fit, and why they still win on classification, extraction, and embedding tasks
- Fine-tuning an open-weight BERT model with Hugging Face Transformers
- Orchestrating with Flyte 2: cached data prep, GPU-aware training, reproducible runs at any scale
- Deploying behind a UI, with a path to low-latency, scaled inference
What you'll leave with
- A fine-tuned BERT model trained on a custom dataset
- A reusable training and deployment pipeline you can adapt to your own unstructured data
- The knowledge to build and label datasets for classification and extraction tasks
- A portfolio-ready project you can adapt to a production scenario at work
Who it's for
ML engineers and practitioners working with unstructured text who want models that are cheap to run and easy to deploy. Whether you're prototyping at work, evaluating infrastructure for a production NLP use case, or building a portfolio project, you'll leave with code you can keep extending.
Hosted by Sage Elliott, AI Engineer at Union.ai
17 attendees
Container-enabled Asyncio is All You Need
Online
As AI applications and agents move from prototypes to production, Python developers are increasingly tasked with orchestrating large numbers of models, tools and external services.
These requirements often push teams toward specialized frameworks or domain-specific languages to manage concurrency and workflows, even though Python’s standard library already provides the core building blocks to solve these problems.
Speaker: Niels Bantilan - Chief ML Engineer at Union.ai
This talk demonstrates how engineers can leverage Python’s native asyncio library together with container orchestration platforms like Kubernetes to build scalable, production-ready AI workflows. It presents a practical explainer of asyncio, emphasizing the aspects most relevant to today’s AI systems, such as structured concurrency, task coordination, backpressure, timeouts and failure isolation. It demonstrates how asyncio can be a highly effective programming paradigm to coordinate compute and data flow on a Kubernetes backend, giving Python developers the scale they need to build production-grade AI applications and agents.
Through concrete examples, the session shows how common workflow patterns, including coordinating LLM calls, executing tools in parallel, streaming responses and interacting with rate-limited APIs, can be implemented directly with asyncio and other Python primitives. Rather than relying on declarative pipelines or custom domain-specific languages (DSLs), these patterns remain explicit, debuggable and easy to reason about using plain Python.
The talk also explores how async Python can serve as a client to a scalable container orchestration backend, enabling AI services and agents to scale predictably while preserving readability and operational control. Topics include handling partial failures, retries, and high-throughput workloads without blocking or over-abstracting the developer’s programming paradigm.
By the end of the session, attendees will understand why asyncio coupled with container orchestrators like Kubernetes is sufficient to build scalable, Pythonic AI workflows and how using the standard library can reduce complexity and improve long-term maintainability.
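For a flavor of the patterns named in the description (parallel tool calls, bounded concurrency, per-call timeouts and failure isolation), here is a minimal standard-library sketch. It is not taken from the talk: `call_tool`, `guarded`, `run_all`, the tool names and the delays are all hypothetical, and `asyncio.sleep` stands in for real network I/O.

```python
import asyncio


async def call_tool(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical stand-in for an LLM or tool call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name}: ok"


async def guarded(sem: asyncio.Semaphore, name: str, delay: float,
                  fail: bool = False, timeout: float = 0.2) -> str:
    """Run one call under a concurrency cap and a per-call timeout."""
    async with sem:  # caps the number of calls in flight (simple backpressure)
        return await asyncio.wait_for(call_tool(name, delay, fail), timeout)


async def run_all(max_in_flight: int = 2) -> list:
    sem = asyncio.Semaphore(max_in_flight)
    calls = [
        guarded(sem, "search", 0.01),
        guarded(sem, "calculator", 0.01, fail=True),  # raises RuntimeError
        guarded(sem, "slow_api", 1.0),                # exceeds the 0.2 s timeout
    ]
    # return_exceptions=True isolates failures: a failing call is returned
    # as an exception object instead of cancelling its siblings.
    return await asyncio.gather(*calls, return_exceptions=True)


results = asyncio.run(run_all())
for item in results:
    print(repr(item))
```

Results come back in submission order, so a caller can inspect each slot and decide whether to retry, fall back, or surface the error. The timeout here starts only after a call acquires the semaphore; wrapping the whole `guarded` call in `asyncio.wait_for` instead would also bound the time spent queued.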
5 attendees
Past events (97)

