Details

We are excited to announce the June meetup scheduled for Thursday, June 18th.

1st Presentation: Scaling AI Workloads on Kubernetes: Reliability Patterns for Production Agentic Platforms

Running AI agents in production exposes failure modes that traditional microservices never surface. This session draws from real-world experience building and operating an enterprise agentic AI platform on Azure Kubernetes Service — where node pool exhaustion, autoscaling misconfigurations, and invisible queue failures collide with live inference workloads. I’ll cover how to design for horizontal scalability across heterogeneous AI workloads, what signals actually predict cluster saturation before pods go Pending, and a concrete set of reliability patterns: when to use HPA vs KEDA vs Cluster Autoscaler for AI workloads, how to benchmark inference throughput under real load, and what a production-grade control plane looks like when the workload is an agent, not a REST API.

Speaker Bio: Jothsna Praveena Pendyala
Jothsna is an AI Platform Architect and Senior Data Scientist at Infosys, where she leads the design and operation of an enterprise-scale agentic AI platform on Azure Kubernetes Service and LangSmith. Her work spans cloud-native infrastructure, distributed systems reliability, and production observability for AI workloads. She is also an executive member of the ACM Dallas Chapter, an active conference speaker and panel moderator, and a researcher with publications at IEEE and NeurIPS venues.

6:30 - 6:45 - Social
6:45 - 6:55 - Club Business
6:55 - 8:30 - Scaling AI Workloads on Kubernetes
8:30 - 8:35 - Social/Wrap-up
