
Details

ABSTRACT:
Whether you are building RAG (Retrieval-Augmented Generation) applications or fine-tuning a model, a significant portion of your time will be dedicated to data wrangling (cleaning, de-duplicating, removing markup, etc.). Data Prep Kit (DPK, https://github.com/IBM/data-prep-kit) can help you with data wrangling.

Noteworthy features of DPK include: document de-duplication (exact and fuzzy), handling both documents and code, language detection (natural languages and programming languages), malware detection, and embedding creation.
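
To make the difference between exact and fuzzy de-duplication concrete, here is a minimal Python sketch. It does not use DPK or reflect how DPK implements these steps; the function names, the word-shingle size, and the similarity threshold are all assumptions chosen for illustration. Exact de-duplication compares hashes of normalized text, while the fuzzy variant drops documents whose Jaccard similarity to an already kept document reaches the threshold.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not defeat matching."""
    return " ".join(text.lower().split())


def exact_dedupe(docs: list[str]) -> list[str]:
    """Keep the first occurrence of each document, compared by SHA-256 of normalized text."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (hypothetical default k=3)."""
    words = normalize(text).split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two documents' shingle sets."""
    sa, sb = shingles(a, k), shingles(b, k)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def fuzzy_dedupe(docs: list[str], threshold: float = 0.8, k: int = 3) -> list[str]:
    """Keep a document only if it is below the threshold against every kept document.

    This pairwise comparison is quadratic; it is an illustration, not a scalable method.
    """
    kept: list[str] = []
    for doc in docs:
        if all(jaccard(doc, other, k) < threshold for other in kept):
            kept.append(doc)
    return kept
```

The pairwise loop is only practical for small collections; at scale, fuzzy de-duplication is usually done with approximate techniques such as MinHash with locality-sensitive hashing.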

In this hands-on workshop, we will demonstrate how to implement an end-to-end RAG pipeline using entirely open-source technologies:

  1. Data Prep Kit for processing documents
  2. Milvus as vector database
  3. Llama 3 as the LLM
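
The overall flow of such a pipeline (prepare text, store vectors, retrieve, then prompt the LLM) can be sketched in plain Python. Everything here is a stand-in and an assumption for illustration: `embed` is a toy bag-of-words vector rather than a real embedding model, `InMemoryVectorStore` is a hypothetical class standing in for a vector database such as Milvus, and `build_prompt` only assembles the text that would be sent to an LLM such as Llama 3. None of it uses the Data Prep Kit, Milvus, or Replicate APIs.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, top_k: int = 2) -> list[str]:
        """Return the top_k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the retrieved chunks and the question into a single LLM prompt."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


store = InMemoryVectorStore()
for chunk in [
    "milvus is a vector database",
    "llama is a language model",
    "bananas are yellow",
]:
    store.add(chunk)

question = "what is a vector database"
prompt = build_prompt(question, store.search(question, top_k=1))
print(prompt)  # this text would then be sent to the LLM
```

In a real pipeline, each stand-in would be replaced by the corresponding component, but the order of the steps stays the same.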

What do you need to participate in this workshop?

  1. A laptop with a Python development environment (Setup Instructions)
  2. A Replicate account (FREE) - get one at replicate.com

INSTRUCTORS:

  1. Sujee Maniyam, Consulting AI Engineer / Developer Advocate

Sujee Maniyam is a seasoned practitioner focusing on Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He loves teaching and has taught and mentored thousands of professionals.
Sujee is a passionate user of, advocate for, and contributor to open source. He is also an author and enjoys speaking at conferences, running hackathons and workshops, and engaging with the community.

  2. Jiang Chen is the Head of Ecosystem and Developer Relations at Zilliz, the company behind the open-source vector database Milvus. Before joining Zilliz, he served as a tech lead and product manager at Google, where he led the development of web-scale semantic understanding and search indexing that powers innovative search products such as short video search. He has years of industry experience handling massive unstructured data and multi-modal content retrieval. Jiang holds a Master's degree in Computer Science from the University of Michigan.

In his talk, Jiang will give hands-on advice on building RAG applications with the open-source Milvus database and share best practices in search quality and performance optimization.

SPONSOR:
The AI Alliance
