Details

Most RAG tutorials stop at proof of concept. They don't cover what breaks when traffic spikes or document volumes grow. This lab focuses on production: infrastructure sizing, end-to-end Helm-based deployment of NVIDIA NIMs, and customizing pipeline components. The goal is to give your organization a concrete path from working prototype to a deployment that holds under real load.

Who Is This For
This certification course is available free of charge to academic staff and students. You will need an active academic email address to access the NVIDIA DLI environment and the Google Meet link. Please provide it via the following form: https://forms.gle/WhUsMyTH1iXNrpip7
Learning Objectives
After completing this course you will be able to:

  • Launch RAG pipeline applications onto a Kubernetes cluster using Helm and the NVIDIA NIM Operator
  • Use NVIDIA NIMs for scalable, containerized LLMs and embedding models
  • Connect, update, add, and autoscale application components
  • Monitor application performance with Prometheus and Grafana
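The first two objectives center on Helm-based deployment of NIM services. As a rough sketch of what that workflow looks like on the command line: the repository URL, chart name, namespace, and values file below are illustrative assumptions, not the course's actual artifacts — use the charts and credentials from your own NGC account.

```shell
# Add a Helm repository for NIM charts (URL and chart name are
# illustrative; substitute the ones provided in the lab/NGC).
helm repo add nim https://helm.ngc.nvidia.com/nim
helm repo update

# Deploy an LLM NIM into its own namespace, with model and GPU
# settings supplied in a values file (filename is an assumption).
helm install llm-nim nim/nim-llm \
  --namespace rag --create-namespace \
  --values llm-values.yaml

# Verify the pods came up before wiring the rest of the pipeline to them.
kubectl get pods --namespace rag
```

The same pattern repeats for the embedding-model NIM and the other pipeline components, each installed as its own release so it can be updated and scaled independently.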

Topics Covered

  • NVIDIA NIMs
  • Kubernetes
  • Helm
  • Grafana
  • Prometheus

Course Outline

  • Course Overview: Key application components and course structure
  • Course Setup: The interactive environment, Kubernetes setup, and the commands you'll need
  • Kubernetes-based Deployment: Deploying a RAG pipeline with Kubernetes and Helm, using individual NIM services within the pipeline
  • Monitoring: DCGM-based monitoring, Grafana and Prometheus configuration
  • Autoscaling: HPA-based autoscaling on custom metrics, load testing the application
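The autoscaling module above pairs a HorizontalPodAutoscaler with custom metrics surfaced through Prometheus. A minimal sketch of such an HPA manifest follows; the deployment name, namespace, metric name, and target value are all illustrative assumptions, and exposing a Prometheus metric to the HPA additionally requires an adapter such as prometheus-adapter.

```yaml
# Sketch: scale a NIM deployment on a custom per-pod metric.
# All names and thresholds here are examples, not course values.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-nim-hpa
  namespace: rag
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-nim
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Pods
      pods:
        metric:
          name: gpu_cache_usage_perc   # assumed metric, served via a Prometheus adapter
        target:
          type: AverageValue
          averageValue: "0.75"
```

Load testing the application, as the outline mentions, is how you validate that the chosen metric and thresholds actually trigger scale-out before traffic spikes do.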

Related Topics

Artificial Intelligence
Computer Vision
Machine Learning