How ChatGPT Works: The Secrets of Modern LLMs
💡 Perfect for Devs, ML Engineers, Founders, and Curious Professionals
***
## 📅 Total Duration: 2 Hours
Format: 80 mins talk + 25 mins live demo + 15 mins Q&A
Goal: Understand how ChatGPT is actually built, how it thinks, and how to build with or fine-tune LLMs.
***
## 🧠 Full 2-Hour Content Breakdown
***
### ⏱️ 0–10 min: Introduction
- 🤔 What is an LLM?
- 📊 Real-world applications (ChatGPT, GitHub Copilot, Claude, etc.)
- 🧭 Session agenda & what they’ll walk away with
***
### ⏱️ 10–30 min: The Core Brain – Transformers
- 🤖 How transformers work: Self-attention, multi-head attention
- 🔁 Sequence-to-sequence & next-token prediction
- 🧱 Architecture of GPT (blocks, layers, position embeddings)
📊 Diagram: Full GPT model stack
🎥 Analogy: Predict the next word in a sentence like “autocomplete on steroids”
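To make the self-attention bullet concrete, here is a minimal sketch of scaled dot-product attention in plain Python (toy 2-d embeddings; no learned Q/K/V projections or multiple heads, which a real transformer block adds):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: each query mixes all value vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# three "tokens" with 2-d embeddings; each attends to all three
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx = self_attention(x, x, x)
print(ctx)  # each row is a convex mix of the token vectors
```

Each output row stays inside the range of the inputs because the softmax weights sum to 1 — a useful talking point when explaining why attention is "soft lookup" rather than hard selection.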
***
### ⏱️ 30–50 min: Training Pipeline – How LLMs Learn
- 🏗️ Pretraining: Language modeling objective (next-token prediction)
- 📚 Data: What’s used to train GPT-style models (web, code, books)
- 🧠 Fine-tuning:
- Instruction tuning (follow commands)
- RLHF (Reinforcement Learning from Human Feedback)
📊 Explain PPO + Reward Model
💡 Why RLHF makes ChatGPT feel “polite” and “useful”
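The pretraining objective above boils down to cross-entropy on the true next token. A minimal sketch with made-up logits (no real model involved) shows why confident correct predictions give low loss:

```python
import math

def cross_entropy(logits, target_index):
    """Negative log-likelihood of the correct next token under softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))  # log partition
    return log_z - logits[target_index]

# toy vocabulary of 4 tokens; the model scores each as the next token
logits = [2.0, 0.5, 0.1, -1.0]          # model favors token 0
loss_if_target_is_0 = cross_entropy(logits, 0)
loss_if_target_is_3 = cross_entropy(logits, 3)
print(loss_if_target_is_0, loss_if_target_is_3)
```

Pretraining minimizes this loss averaged over trillions of next-token positions — nothing more exotic than that.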
***
### ⏱️ 50–70 min: Inference & System Design
- 🧩 Tokenization (Byte-Pair Encoding): What is a “token”? Why does it matter?
- 🔄 Token flow: Input → Model → Output
- ⚙️ System architecture:
- API, frontend, backend
- GPU inference, context caching, rate limiting
📊 Architecture Diagram: End-to-end flow of a ChatGPT API request
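The BPE tokenization covered above can be demoed with a single merge step in plain Python — find the most frequent adjacent symbol pair and fuse it (toy corpus with made-up frequencies, in the spirit of the classic BPE algorithm):

```python
from collections import Counter

def bpe_merge_step(words):
    """One BPE step: merge the most frequent adjacent symbol pair."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    if not pairs:
        return words, None
    best = max(pairs, key=pairs.get)
    merged = {word.replace(" ".join(best), "".join(best)): freq
              for word, freq in words.items()}
    return merged, best

# toy corpus: words pre-split into characters, with frequencies
vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
vocab, pair = bpe_merge_step(vocab)
print(pair, vocab)  # ('e', 's') gets merged first in this corpus
```

Repeating this step thousands of times yields the subword vocabulary — which is why common words become single tokens while rare words split into pieces, directly affecting context limits and API cost.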
***
### ⏱️ 70–80 min: “Secrets” of ChatGPT’s Performance
| Secret | Insight |
| ------ | ------- |
| 🧠 Mixture-of-Experts (MoE) | GPT-4 may use sparse routing |
| 🚀 FlashAttention | Faster attention = cheaper inference |
| ⚖️ Alignment training | Safety filters & refusal mechanisms |
| 🧩 Prompt Engineering | The real “art” of using LLMs |
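The MoE row is, as noted, speculative for GPT-4, but the general idea — run only the top-k experts per token and renormalize their gate weights — can be sketched with hypothetical gate scores:

```python
import math

def route_top_k(gate_logits, k=2):
    """Sparse MoE routing: keep the top-k experts, renormalize their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in top]
    s = sum(exps)
    return [(i, e / s) for i, e in zip(top, exps)]  # (expert index, weight)

# 8 hypothetical experts; only 2 actually run for this token
gates = [0.1, 2.3, -0.5, 1.7, 0.0, -1.2, 0.4, 0.9]
print(route_top_k(gates))  # experts 1 and 3 are selected
```

The payoff: parameter count grows with the number of experts, but per-token compute only grows with k.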
***
### ⏱️ 80–105 min: Live Demos: How to Use or Build with LLMs
Choose 2–3 of the short demos below:
#### ✅ 1. Use OpenAI GPT-4 API
- Send a prompt using Python (openai SDK)
- Show how token count and cost work
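A hedged starting point for this demo, assuming the `openai` Python SDK (v1-style client) and an `OPENAI_API_KEY` in the environment. The model name and per-1k-token prices below are placeholders — check current pricing before quoting real costs:

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  in_price_per_1k=0.03, out_price_per_1k=0.06):
    """Cost = tokens/1000 * per-1k price, summed for input and output.
    Prices here are illustrative placeholders, not current rates."""
    return (prompt_tokens / 1000) * in_price_per_1k \
         + (completion_tokens / 1000) * out_price_per_1k

def ask_gpt(prompt):
    """Live API call — needs network access and OPENAI_API_KEY to run."""
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; pick whatever model the demo uses
        messages=[{"role": "user", "content": prompt}],
    )
    usage = resp.usage  # prompt_tokens / completion_tokens from the API
    return resp.choices[0].message.content, usage

print(estimate_cost(1000, 500))  # ≈ 0.06 with these placeholder prices
```

Feeding the `usage` numbers from a live response into `estimate_cost` makes the token-count-to-dollars link tangible for the audience.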
#### ✅ 2. Retrieval-Augmented Generation (RAG)
- Build a “Chat with your Docs” app using LangChain or LlamaIndex
- Load PDF → embed → chat
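Before reaching for LangChain or LlamaIndex, the retrieval step of RAG can be shown in isolation with a toy bag-of-words "embedding" — real pipelines use neural embeddings and a vector store, but the retrieve-then-answer shape is the same:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a neural embedder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "transformers use self attention over tokens",
    "zoom meetings need a passcode",
    "fine tuning adapts a pretrained model",
]
query = "how does attention work in transformers"
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
print(best)  # the most similar chunk; in RAG it gets pasted into the prompt
```

In the full demo, "load PDF → embed → chat" is exactly this, with PDF chunks as `docs` and the retrieved chunk prepended to the user's question before calling the model.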
#### ✅ 3. LoRA Fine-tuning (optional if audience is ML-heavy)
- Use Hugging Face + LoRA to fine-tune Mistral/Llama on custom data
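The core LoRA trick — freeze the pretrained weight matrix W and train only a low-rank update BA — is just matrix arithmetic. A pure-Python sketch with made-up numbers (the actual demo would use the Hugging Face PEFT library rather than this):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Frozen 4x4 weight matrix W; LoRA learns a rank-1 update B @ A instead of
# touching W. All numbers are made up for illustration.
W = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
B = [[0.5], [0.0], [0.0], [0.5]]   # 4x1, trainable
A = [[0.1, 0.2, 0.3, 0.4]]         # 1x4, trainable
delta = matmul(B, A)               # rank-1: only 8 trainable numbers vs 16
W_adapted = add(W, delta)
print(W_adapted[0])  # first row ≈ [1.05, 0.1, 0.15, 0.2]
```

For a 4096×4096 layer at rank 8, the same arithmetic means training ~65K numbers instead of ~16.8M per matrix — the reason LoRA fine-tuning fits on a single GPU.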
***
### ⏱️ 105–120 min: Q&A + Wrap-Up
- Top questions: safety, hallucinations, token limits, copyright
- Bonus topics to explore: Agents, Multimodal LLMs, Vector DBs
- Share: GitHub repo, prompt sheet, learning links
Join Zoom Meeting
[https://us02web.zoom.us/j/86369463178?pwd=mhZqUrFbGvomnSgV8oDdUIwrEEUnf1.1](https://us02web.zoom.us/j/86369463178?pwd=mhZqUrFbGvomnSgV8oDdUIwrEEUnf1.1)
Meeting ID: 863 6946 3178
Passcode: 673750
