Open-source data ingestion for RAGs with dlt

Hosted by Events D.
Details

About the event

In this hands-on workshop, we’ll learn how to build a data ingestion pipeline with dlt that loads data from a REST API into LanceDB, so that your RAG always works with up-to-date data.
We’ll cover the following steps:

  • Extracting data from REST APIs
  • Loading and vectorizing the data into LanceDB, which, unlike other vector DBs, stores the data alongside the embeddings
  • Keeping your data up to date with incremental loading
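
To make the last step concrete, here is a minimal sketch of the idea behind incremental loading, written in plain Python. It does not use dlt's API (dlt ships its own incremental-loading support, which the workshop covers); the record shape, the `updated_at` cursor field, and the names `fetch_records` and `incremental_load` are all assumptions made for illustration, and an in-memory dict stands in for the LanceDB table.

```python
from __future__ import annotations

# Hypothetical stand-in for a REST API; a real pipeline would fetch these over HTTP.
API_RECORDS = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z", "text": "first doc"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z", "text": "second doc"},
]


def fetch_records() -> list[dict]:
    """Return a copy of everything the (fake) API currently exposes."""
    return [dict(record) for record in API_RECORDS]


def incremental_load(records: list[dict], state: dict, store: dict,
                     cursor_field: str = "updated_at") -> int:
    """Upsert only records newer than the saved cursor; return how many were loaded."""
    # ISO 8601 timestamps in one fixed format sort correctly as plain strings.
    last_value = state.get("last_value", "")
    new_rows = [r for r in records if r[cursor_field] > last_value]
    for row in new_rows:
        store[row["id"]] = row  # upsert keyed on the primary key
    if new_rows:
        # Persist the highest cursor value seen so the next run can skip old rows.
        state["last_value"] = max(r[cursor_field] for r in new_rows)
    return len(new_rows)


state: dict = {}  # would be persisted between runs in a real pipeline
store: dict = {}  # stands in for the destination table

first_run = incremental_load(fetch_records(), state, store)   # loads both records
second_run = incremental_load(fetch_records(), state, store)  # nothing new to load

# A new record appears upstream; only that record is loaded on the next run.
API_RECORDS.append({"id": 3, "updated_at": "2024-01-03T00:00:00Z", "text": "third doc"})
third_run = incremental_load(fetch_records(), state, store)
```

The strict `>` comparison keeps the sketch short, but it would skip a record that shares its timestamp with the saved cursor; a production pipeline typically handles that boundary case with deduplication.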

By the end of this workshop, you’ll be able to write a portable, open-source data pipeline for your RAG that you can deploy anywhere: in Python notebooks, on virtual machines, or in orchestrators such as Airflow, Dagster, or Mage.

This event is sponsored by dltHub.

DataTalks.Club is the place to talk about data. Join our Slack community!

Berlin DataTalksClub Group
FREE