LLMs & AI Infrastructure: From Fundamentals to Production

Hosted by manju y.

Details

This mini-series is designed for developers, ML engineers, and AI enthusiasts who want to understand, build, and scale LLMs and AI systems. Across 10 hands-on classes, you will:

  • Learn how LLMs serve predictions efficiently at scale.
  • Understand AI infrastructure, caching, vector databases, and distributed pipelines.
  • Implement retrieval, indexing, multi-modal search, and RL-based inference loops.
  • Build end-to-end AI pipelines and monitoring solutions.

By the end of the course, you will have the skills to design production-ready LLM systems, optimize AI workloads, and handle multi-modal, multilingual, and retrieval-augmented pipelines.

## Class 1: Foundations of AI Infrastructure & LLMs

Topics Covered:

  • LLM inference scaling: batching and speculative decoding (toy sketch below)
  • Neural network and AI architecture fundamentals
  • GPU vs. TPU trade-offs for inference
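
To make the speculative decoding idea concrete before class, here is a toy sketch (not from the course materials): a stubbed draft model proposes K tokens, a stubbed target model verifies them one at a time, and generation falls back to the target's choice at the first mismatch. The models, token names, and K are all placeholder assumptions.

```python
# Toy illustration of speculative decoding's verify loop (stubbed models,
# greedy acceptance): a cheap draft model proposes K tokens, the target
# model checks them, and the first mismatch is replaced with its own token.
K = 4  # tokens drafted per step (illustrative)

def draft_propose(context: list[str], k: int) -> list[str]:
    # Stub for a small, fast draft model.
    return [f"d{len(context) + i}" for i in range(k)]

def target_greedy(context: list[str]) -> str:
    # Stub for the large target model's next-token choice.
    return f"d{len(context)}" if len(context) % 3 else f"t{len(context)}"

def speculative_step(context: list[str]) -> list[str]:
    proposed = draft_propose(context, K)
    accepted: list[str] = []
    for tok in proposed:
        # In a real system the target scores all K drafted tokens in one
        # forward pass; here we compare against its greedy choice one by one.
        choice = target_greedy(context + accepted)
        if choice == tok:
            accepted.append(tok)      # draft verified, keep it
        else:
            accepted.append(choice)   # mismatch: take target's token, stop
            break
    return context + accepted

ctx: list[str] = []
for _ in range(3):
    ctx = speculative_step(ctx)       # several tokens may land per step
print(ctx)
```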

Hands-on:

  • Write an async batch inference server in Python (first sketch below)
  • Implement Redis caching for an inference API (second sketch below)

Goal:
Understand how LLMs serve predictions efficiently, including hardware and software trade-offs, and how caching and batching improve throughput and latency.
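
As a starting point for the first exercise, here is a minimal micro-batching sketch using only asyncio. It is an illustrative assumption, not the course's reference solution: `fake_model` stands in for a real batched forward pass, and the batch size and wait window are arbitrary. Requests queue up, and a worker flushes a batch either when MAX_BATCH requests are waiting or when MAX_WAIT_S elapses.

```python
# Minimal async micro-batching sketch (illustrative, not a reference
# solution). `fake_model` stands in for a real batched LLM forward pass.
import asyncio

MAX_BATCH = 8      # flush when this many requests are queued
MAX_WAIT_S = 0.01  # ...or after this many seconds, whichever comes first

queue: asyncio.Queue = asyncio.Queue()

def fake_model(prompts: list[str]) -> list[str]:
    # Placeholder: a real server would run one batched forward pass here.
    return [f"completion for: {p}" for p in prompts]

async def batch_worker() -> None:
    loop = asyncio.get_running_loop()
    while True:
        batch = [await queue.get()]            # block until work arrives
        deadline = loop.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            timeout = deadline - loop.time()
            if timeout <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                break
        outputs = fake_model([p for p, _ in batch])  # one call per batch
        for (_, fut), out in zip(batch, outputs):
            fut.set_result(out)                # wake the waiting caller

async def infer(prompt: str) -> str:
    fut = asyncio.get_running_loop().create_future()
    await queue.put((prompt, fut))
    return await fut                           # resolved by batch_worker

async def main() -> None:
    worker = asyncio.create_task(batch_worker())
    results = await asyncio.gather(*(infer(f"prompt {i}") for i in range(20)))
    print(f"{len(results)} results, e.g. {results[0]!r}")
    worker.cancel()

asyncio.run(main())
```

Batching this way trades a small amount of per-request latency (up to MAX_WAIT_S) for much higher GPU utilization, which is the throughput/latency trade-off the class examines.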
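For the second exercise, a hedged sketch of Redis-backed response caching using the redis-py client (`pip install redis`), assuming a local Redis server on the default port; `run_model`, the key prefix, and the TTL are placeholder assumptions.

```python
# Hedged sketch: cache inference results in Redis, keyed by a hash of the
# prompt. Assumes redis-py and a locally running Redis server; `run_model`
# is a placeholder for the real (expensive) inference call.
import hashlib
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_S = 3600  # expire entries after an hour (illustrative)

def run_model(prompt: str) -> str:
    # Placeholder for the actual LLM call.
    return f"completion for: {prompt}"

def cached_infer(prompt: str) -> str:
    key = "infer:" + hashlib.sha256(prompt.encode()).hexdigest()
    hit = r.get(key)                     # cache lookup
    if hit is not None:
        return hit                       # served from cache, no model call
    out = run_model(prompt)
    r.set(key, out, ex=CACHE_TTL_S)      # populate cache with a TTL
    return out

print(cached_infer("What is speculative decoding?"))  # miss, then cached
print(cached_infer("What is speculative decoding?"))  # hit
```

Keying on a hash of the prompt keeps keys short and uniform; note that exact-match caching like this only helps when identical prompts recur.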

## Class 2: Fault Tolerance & Distributed AI

## Class 3: Vector Representations & Indexing

## Class 4: Scheduling & Optimization

## Class 5: Search & Retrieval

## Class 6: Embeddings & Multilingual AI

## Class 7: Advanced AI Pipelines

## Class 8: Multi-modal & Vision-Language Models

## Class 9: End-to-End LLM Systems

## Class 10: Capstone Simulation & Review

Frontier AI Forum · Online event · Free