Details

In MC08, you’ll learn how to turn real-world data into a reliable Retrieval-Augmented Generation (RAG) system using LlamaIndex. We’ll cover how to connect and ingest data from common enterprise sources (PDFs, docs, web pages, knowledge bases, databases), clean and structure it, and build an indexing + retrieval pipeline that consistently returns the right context at query time. You’ll implement chunking strategies, metadata design, embedding + vector store setup, and retrieval tuning (top-k, filters, hybrid search, reranking) so your assistant responds with grounded, source-backed answers instead of hallucinations.
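To make the chunking and top-k ideas above concrete, here is a minimal sketch in plain Python. It is not LlamaIndex code: the function names (`chunk_text`, `embed`, `retrieve`) are hypothetical, and the word-count "embedding" is only a stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into windows of `chunk_size` words; consecutive windows share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k (score, chunk) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from either side; `top_k` trades recall (more context) against noise in the prompt.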

Outcomes (what you’ll be able to do after this module)

  • Ingest and standardize documents from multiple data sources into a usable RAG-ready format
  • Build a LlamaIndex pipeline: loaders → preprocessing → chunking → embeddings → index → retriever
  • Design metadata schemas to enable scoped retrieval (by department, policy type, date, user role, etc.)
  • Improve retrieval quality using filters, hybrid retrieval, reranking, and query transformations
  • Produce answers with citations / references and a clear trace back to the source content
  • Plug the RAG module into your AI app (chatbot/assistant) as a reusable component for your project
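The outcomes above around metadata filters, hybrid retrieval, and citations can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the LlamaIndex API: `Chunk`, `hybrid_retrieve`, and `format_context` are hypothetical names, the keyword and vector scorers are toy stand-ins for BM25 and embedding similarity, and the two rankings are combined with reciprocal rank fusion, which is one common choice among several.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)  # e.g. department, source


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, chunk):
    """Count of distinct query terms found in the chunk (stand-in for BM25)."""
    return len(set(tokens(query)) & set(tokens(chunk.text)))


def vector_score(query, chunk):
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    q, c = Counter(tokens(query)), Counter(tokens(chunk.text))
    dot = sum(q[t] * c[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in c.values()))
    return dot / norm if norm else 0.0


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per item."""
    fused = Counter()
    for ranking in rankings:
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    return [idx for idx, _ in sorted(fused.items(), key=lambda kv: kv[1], reverse=True)]


def hybrid_retrieve(query, chunks, filters=None, top_k=2):
    """Apply exact-match metadata filters, then fuse keyword and vector rankings."""
    filters = filters or {}
    candidates = [i for i, c in enumerate(chunks)
                  if all(c.metadata.get(key) == val for key, val in filters.items())]
    by_keyword = sorted(candidates, key=lambda i: keyword_score(query, chunks[i]), reverse=True)
    by_vector = sorted(candidates, key=lambda i: vector_score(query, chunks[i]), reverse=True)
    return [chunks[i] for i in rrf_fuse([by_keyword, by_vector])[:top_k]]


def format_context(chunks):
    """Number each chunk and attach its source so an answer can cite [1], [2], ..."""
    return "\n".join(f"[{n}] {c.text} (source: {c.metadata.get('source', 'unknown')})"
                     for n, c in enumerate(chunks, start=1))


if __name__ == "__main__":
    # Hypothetical sample data; file names and departments are made up for illustration.
    corpus = [
        Chunk("Employees accrue twenty vacation days per year.",
              {"department": "hr", "source": "hr_policy.pdf"}),
        Chunk("Vacation days for contractors are not covered.",
              {"department": "legal", "source": "contracts.pdf"}),
    ]
    hits = hybrid_retrieve("vacation days", corpus, filters={"department": "hr"})
    print(format_context(hits))
```

Filtering before ranking keeps out-of-scope documents from ever reaching the prompt, and the numbered context lets the generated answer point back to a specific source.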

Who this is for

  • Data scientists & ML/AI engineers building assistants on internal/company documents
  • Backend / full-stack developers integrating LLM features into products
  • Product, analytics, and automation teams who need trustworthy “chat with your data” systems
  • Anyone in the AI Residency building a project where accuracy, grounding, and traceability matter

This masterclass is part of the AI Residency.

Join the new AI Residency cohort to build this end-to-end with guided support, project feedback, and a production-ready workflow, from data ingestion → indexing → retrieval → evaluation → deployment.
https://academy.decodingdatascience.com/airesidencyfasttrack
