GPU Sharing for AI at Enterprise Scale


Details
Abstract
In this talk, we show you how we optimize GPU usage at scale with Hopsworks on Kubernetes. A single Hopsworks cluster can contain 1000s of CPUs and GPUs, and Hopsworks builds on Kueue to enable fair sharing of resources for training (Ray, PyTorch, etc.), inference (KServe/vLLM), and scalable compute workloads (Spark, Flink, etc.) across teams and projects. We show how we simplify Kueue's abstractions (cohorts, cluster queues, local queues) using Hopsworks' project-based multi-tenancy security model.
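For readers unfamiliar with Kueue, the abstractions the talk refers to can be sketched as plain Kueue resources: a ClusterQueue holding quota inside a shared cohort, and a per-project LocalQueue that workloads submit to. The names (`shared-gpus`, `project-a`, the quota figures) are illustrative assumptions, not Hopsworks defaults:

```yaml
# Illustrative Kueue setup (assumed names/quotas): a ClusterQueue in a
# shared cohort with CPU/GPU quota, plus a per-project LocalQueue.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-queue            # hypothetical queue name
spec:
  cohort: shared-gpus           # queues in the same cohort can borrow unused quota
  namespaceSelector: {}         # admit workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: default-flavor
      resources:
      - name: "cpu"
        nominalQuota: 100
      - name: "memory"
        nominalQuota: 400Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 8
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: project-a          # e.g. one namespace per Hopsworks project (assumed)
  name: project-a-queue
spec:
  clusterQueue: team-a-queue    # routes the project's jobs to the shared quota
```

In this model, fair sharing across teams comes from placing each team's ClusterQueue in a common cohort, while a project only ever sees its own LocalQueue.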
About the Speaker
Jim Dowling is CEO of Hopsworks and a former Associate Professor at KTH Royal Institute of Technology. He is the lead architect of the open-source Hopsworks Feature Store platform. He is currently writing a book for O'Reilly on "Building ML Systems: batch, real-time, and LLMs".