LLMs & AI Infrastructure: From Fundamentals to Production

Hosted by manju y.

Details

This mini-series is designed for developers, ML engineers, and AI enthusiasts who want to understand, build, and scale LLMs and AI systems. Across 10 hands-on classes, you will:

  • Learn how LLMs serve predictions efficiently at scale.
  • Understand AI infrastructure, caching, vector databases, and distributed pipelines.
  • Implement retrieval, indexing, multi-modal search, and RL-based inference loops.
  • Build end-to-end AI pipelines and monitoring solutions.

By the end of the course, you will have the skills to design production-ready LLM systems, optimize AI workloads, and handle multi-modal, multilingual, and retrieval-augmented pipelines.

## Class 1: Foundations of AI Infrastructure & LLMs

Topics Covered:

  • LLM inference scaling: batching and speculative decoding (toy sketch below)
  • Neural network and AI architecture fundamentals
  • GPU vs. TPU trade-offs for inference
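
To make the speculative decoding idea concrete before class, here is a toy sketch (not from the course materials): a stubbed draft model proposes K tokens, a stubbed target model verifies them one at a time, and generation falls back to the target's choice at the first mismatch. The models, token names, and K are all placeholder assumptions.

```python
# Toy illustration of speculative decoding's verify loop (stubbed models,
# greedy acceptance): a cheap draft model proposes K tokens, the target
# model checks them, and the first mismatch is replaced with its own token.
K = 4  # tokens drafted per step (illustrative)

def draft_propose(context: list[str], k: int) -> list[str]:
    # Stub for a small, fast draft model.
    return [f"d{len(context) + i}" for i in range(k)]

def target_greedy(context: list[str]) -> str:
    # Stub for the large target model's next-token choice.
    return f"d{len(context)}" if len(context) % 3 else f"t{len(context)}"

def speculative_step(context: list[str]) -> list[str]:
    proposed = draft_propose(context, K)
    accepted: list[str] = []
    for tok in proposed:
        # In a real system the target scores all K drafted tokens in one
        # forward pass; here we compare against its greedy choice one by one.
        choice = target_greedy(context + accepted)
        if choice == tok:
            accepted.append(tok)      # draft verified, keep it
        else:
            accepted.append(choice)   # mismatch: take target's token, stop
            break
    return context + accepted

ctx: list[str] = []
for _ in range(3):
    ctx = speculative_step(ctx)       # several tokens may land per step
print(ctx)
```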

Hands-on:

  • Write an async batch inference server in Python (first sketch below)
  • Implement Redis caching for an inference API (second sketch below)

Goal:
Understand how LLMs serve predictions efficiently, including hardware and software trade-offs, and how caching and batching improve throughput and latency.
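
As a starting point for the first exercise, here is a minimal micro-batching sketch using only asyncio. It is an illustrative assumption, not the course's reference solution: `fake_model` stands in for a real batched forward pass, and the batch size and wait window are arbitrary. Requests queue up, and a worker flushes a batch either when MAX_BATCH requests are waiting or when MAX_WAIT_S elapses.

```python
# Minimal async micro-batching sketch (illustrative, not a reference
# solution). `fake_model` stands in for a real batched LLM forward pass.
import asyncio

MAX_BATCH = 8      # flush when this many requests are queued
MAX_WAIT_S = 0.01  # ...or after this many seconds, whichever comes first

queue: asyncio.Queue = asyncio.Queue()

def fake_model(prompts: list[str]) -> list[str]:
    # Placeholder: a real server would run one batched forward pass here.
    return [f"completion for: {p}" for p in prompts]

async def batch_worker() -> None:
    loop = asyncio.get_running_loop()
    while True:
        batch = [await queue.get()]            # block until work arrives
        deadline = loop.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            timeout = deadline - loop.time()
            if timeout <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break
        outputs = fake_model([p for p, _ in batch])  # one call per batch
        for (_, fut), out in zip(batch, outputs):
            fut.set_result(out)                # wake the waiting caller

async def infer(prompt: str) -> str:
    fut = asyncio.get_running_loop().create_future()
    await queue.put((prompt, fut))
    return await fut                           # resolved by batch_worker

async def main() -> None:
    worker = asyncio.create_task(batch_worker())
    results = await asyncio.gather(*(infer(f"prompt {i}") for i in range(20)))
    print(f"{len(results)} results, e.g. {results[0]!r}")
    worker.cancel()

asyncio.run(main())
```

Batching this way trades a small amount of per-request latency (up to MAX_WAIT_S) for much higher GPU utilization, which is the throughput/latency trade-off the class examines.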
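For the second exercise, a hedged sketch of Redis-backed response caching using the redis-py client (`pip install redis`), assuming a local Redis server on the default port; `run_model`, the key prefix, and the TTL are placeholder assumptions.

```python
# Hedged sketch: cache inference results in Redis, keyed by a hash of the
# prompt. Assumes redis-py and a locally running Redis server; `run_model`
# is a placeholder for the real (expensive) inference call.
import hashlib
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_S = 3600  # expire entries after an hour (illustrative)

def run_model(prompt: str) -> str:
    # Placeholder for the actual LLM call.
    return f"completion for: {prompt}"

def cached_infer(prompt: str) -> str:
    key = "infer:" + hashlib.sha256(prompt.encode()).hexdigest()
    hit = r.get(key)                     # cache lookup
    if hit is not None:
        return hit                       # served from cache, no model call
    out = run_model(prompt)
    r.set(key, out, ex=CACHE_TTL_S)      # populate cache with a TTL
    return out

print(cached_infer("What is speculative decoding?"))  # miss, then cached
print(cached_infer("What is speculative decoding?"))  # hit
```

Keying on a hash of the prompt keeps keys short and uniform; note that exact-match caching like this only helps when identical prompts recur.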

## Class 2: Fault Tolerance & Distributed AI

## Class 3: Vector Representations & Indexing

## Class 4: Scheduling & Optimization

## Class 5: Search & Retrieval

## Class 6: Embeddings & Multilingual AI

## Class 7: Advanced AI Pipelines

## Class 8: Multi-modal & Vision-Language Models

## Class 9: End-to-End LLM Systems

## Class 10: Capstone Simulation & Review

Frontier AI Forum · Online event · Free