Details

This is the first workshop in our series updating the LLM Zoomcamp content.

It covers the updated Module 1: Introduction to LLMs and RAG.

​In this hands-on session, Alexey Grigorev will show how to build a basic Retrieval-Augmented Generation pipeline for answering questions about course FAQ documents.

​You’ll index FAQ documents from the Zoomcamp courses, retrieve relevant entries, and use the OpenAI API to generate answers based on the retrieved context.
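To give a feel for the index, retrieve, and prompt steps, here is a minimal sketch in plain Python. It is not the workshop code: the sample FAQ entries, the field names (`course`, `question`, `text`), and the word-overlap scoring are all assumptions made for illustration, and the final call to an LLM is left out.

```python
from collections import Counter

# Hypothetical FAQ entries; the real course data may use different fields.
documents = [
    {"course": "data-engineering-zoomcamp",
     "question": "Can I still join the course after the start date?",
     "text": "Yes, you can still join and submit the homework."},
    {"course": "machine-learning-zoomcamp",
     "question": "What Python version is used?",
     "text": "A recent Python 3 release is recommended."},
    {"course": "mlops-zoomcamp",
     "question": "Do I need Docker?",
     "text": "Yes, Docker is used in several modules."},
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def search(query, docs, top_k=2):
    """Toy keyword search: score each document by shared query words."""
    query_words = set(tokenize(query))
    scored = []
    for doc in docs:
        counts = Counter(tokenize(doc["question"] + " " + doc["text"]))
        score = sum(counts[w] for w in query_words)
        if score > 0:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, results):
    """Combine the retrieved FAQ entries and the question into one prompt."""
    context = "\n\n".join(
        f"Q: {doc['question']}\nA: {doc['text']}" for doc in results
    )
    return (
        "Answer the QUESTION using only the CONTEXT from the course FAQ.\n\n"
        f"QUESTION: {question}\n\n"
        f"CONTEXT:\n{context}"
    )


question = "Do I need Docker?"
results = search(question, documents)
prompt = build_prompt(question, results)
print(prompt)
```

The resulting prompt string is what would be sent to the LLM. In the workshop, the toy `search` function is replaced by a real search library, and the answer comes from the OpenAI API.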

​What you’ll learn:

  • ​What LLMs are and how they are used in question-answering systems
  • ​What Retrieval-Augmented Generation is and why it’s useful
  • ​How a basic RAG architecture works
  • ​How to prepare a Python environment for an LLM application
  • ​How to index FAQ documents from Zoomcamp courses
  • ​How to implement keyword search with MinSearch
  • ​How to build prompts with retrieved context
  • ​How to generate answers with the OpenAI API
  • ​How to refactor the RAG pipeline into modular code
  • ​How to replace MinSearch with Elasticsearch for a more realistic retrieval setup
  • ​How to run Elasticsearch with Docker and search indexed documents
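
One way to picture the refactoring and search-backend topics from this list: if the pipeline takes its search function and its LLM function as arguments, either can be swapped without touching the rest. The sketch below is an assumption about how such a structure might look, not the workshop's actual code. `make_rag`, `in_memory_search`, and `stub_llm` are hypothetical names, and the stand-in functions only mimic what a real search engine or LLM client would return.

```python
from typing import Callable

Document = dict[str, str]
SearchFn = Callable[[str], list[Document]]
LLMFn = Callable[[str], str]


def make_rag(search: SearchFn, llm: LLMFn) -> Callable[[str], str]:
    """Build a RAG function from interchangeable search and LLM parts."""

    def rag(question: str) -> str:
        results = search(question)  # retrieval step
        context = "\n".join(doc["text"] for doc in results)
        prompt = f"QUESTION: {question}\n\nCONTEXT:\n{context}"
        return llm(prompt)  # generation step

    return rag


# Stand-in for a real search backend (hypothetical, returns a fixed entry).
def in_memory_search(query: str) -> list[Document]:
    return [{"text": "Docker is used to run the search engine locally."}]


# Stand-in for a real LLM client (hypothetical, only reports prompt size).
def stub_llm(prompt: str) -> str:
    return f"stub answer built from a prompt of {len(prompt)} characters"


rag = make_rag(in_memory_search, stub_llm)
print(rag("Do I need Docker?"))
```

Moving from one search backend to another then means passing a different `search` function to `make_rag`, while the prompt and generation steps stay the same.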


By the end, you’ll have a working RAG pipeline that answers questions using FAQ documents from Data Engineering Zoomcamp, Machine Learning Zoomcamp, and MLOps Zoomcamp.
​Like the other workshops, this will be a live demo with practical tips and time for Q&A.

***

All events in this series:

  1. Build Your First RAG Application with LLMs
  2. From RAG to AI Agents: Function Calling and Tool Use
  3. Vector Databases: Embeddings, Semantic Search, and Hybrid Retrieval
  4. RAG and Agents Evaluation: Measuring Retrieval and LLM Answer Quality
  5. Monitoring LLM Applications: Traces, Feedback, and Production Quality

***

## ​Thinking about Joining LLM Zoomcamp?

​This workshop covers the updated content for Module 1 of the LLM Zoomcamp, our free course on building practical LLM applications with RAG, vector search, evaluation, monitoring, and AI agents.
​You start with a simple RAG pipeline, then improve it with better retrieval, semantic search, function calling, evaluation, monitoring, and production practices.
​The course covers the full lifecycle of an LLM application: from the first working prototype to evaluation, monitoring, and a complete final project.
​The new cohort of LLM Zoomcamp starts on June 8, 2026. You can join it by registering here.

## About the Speaker

​Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series.
​Alexey is a software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books, including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS’17 Criteo Challenge.​

**Join our Slack: https://datatalks.club/slack.html**
