
Details

The Delta Lake storage format gives deep learning practitioners unique data management capabilities for working with their datasets. The challenge is that, as of now, there is no direct way to train PyTorch models on data stored in Delta Lake.

The PyTorch community has recently introduced TorchData, a library for efficient data loading. It supports many formats out of the box, but not Delta Lake. In this Delta Lake Deep Dive, Michael Shtelma will demonstrate how to use the Delta Lake storage format for single-node and distributed PyTorch training with the TorchData framework and delta-rs, a standalone Delta Lake implementation. Let's dive in! 🌊
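To give a rough idea of what reading a Delta table for training can look like, here is a minimal sketch. It is not the speaker's implementation. It assumes the `deltalake` Python package (the delta-rs bindings) is installed, and the helper names `shard` and `iter_delta_rows` are hypothetical, chosen only for this illustration. Each worker takes every `world_size`-th file fragment of the table, which is one simple way to split data for distributed training.

```python
from typing import Any, Dict, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def shard(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the round-robin slice of `items` that belongs to worker `rank`."""
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    return list(items[rank::world_size])


def iter_delta_rows(
    table_uri: str, rank: int = 0, world_size: int = 1
) -> Iterator[Dict[str, Any]]:
    """Yield rows (as dicts) from this worker's share of a Delta table.

    Assumption: every worker opens the same table version and sees the
    fragments in the same order, so the shards do not overlap.
    """
    # Imported lazily so that `shard` stays usable without deltalake installed.
    from deltalake import DeltaTable

    dataset = DeltaTable(table_uri).to_pyarrow_dataset()
    fragments = list(dataset.get_fragments())
    for fragment in shard(fragments, rank, world_size):
        for batch in fragment.to_batches():
            # Convert each Arrow record batch into plain Python dicts.
            yield from batch.to_pylist()
```

A generator like this could be wrapped in a `torch.utils.data.IterableDataset` and passed to a `DataLoader`, with `rank` and `world_size` taken from the distributed training setup. For single-node training the defaults read the whole table.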

Related topics

Big Data
Data Engineering
Python
Open Source
PyTorch
