
### Details

Join us to learn how to run open-source Large Language Models (LLMs) with HTTP-based inference endpoints inside your AKS cluster using the Kubernetes AI Toolchain Operator (KAITO). We’ll walk through the setup and deployment of containerized LLMs on GPU node pools and see how KAITO can help reduce the operational burden of provisioning GPU nodes and of tuning model deployment parameters to fit GPU profiles.
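As a rough sketch of the pattern the session covers, KAITO is driven by a Workspace custom resource that pairs a GPU instance type with a preset model; the operator then provisions the node pool and deploys the containerized model behind an in-cluster inference endpoint. The instance type and preset name below are illustrative examples and should be checked against the KAITO documentation for your cluster:

```yaml
# Example KAITO Workspace (illustrative values; verify against the KAITO docs)
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  # GPU VM size KAITO should provision for this model (assumed example)
  instanceType: "Standard_NC12s_v3"
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  # Preset model packaged by the KAITO project (assumed example)
  preset:
    name: "falcon-7b"
```

Once the workspace is ready, the model is reachable over HTTP from inside the cluster via the service KAITO creates, without hand-tuning deployment parameters for the GPU profile.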

### Presenters

Paul Yu | Senior Cloud Advocate, Microsoft
Ishaan Sehgal | Software Engineer, Microsoft

### Learning objectives

  • Learn how to use Prometheus-style metrics with Azure Monitor.
  • Learn how to visualize application and infrastructure state with Azure Managed Grafana.
  • Learn how to use the AKS Cost Analysis add-on to monitor the different aspects of your AKS environment.


Related topics

SaaS (Software as a Service)
Computer Programming
Web Development
Web Technology
DevOps

Sponsors

  • Microsoft Reactor YouTube: Watch past Microsoft Reactor events on-demand anytime
  • Microsoft Learn AI Hub: Learning hub for all things AI
  • Microsoft Copilot Hub: Learning hub for all things Copilot
  • Microsoft Reactor LinkedIn: Follow Microsoft Reactor on LinkedIn
