veScale: A PyTorch-Native Auto-Parallel AI Framework for Ease of Use

Hosted By
Sujata T.

Details
veScale: An Industrial-Level Framework for Ease of Use
- PyTorch Native: veScale is rooted in PyTorch-native data structures, operators, and APIs, so it benefits directly from the PyTorch ecosystem that dominates the ML world.
- Zero Model Code Change: veScale decouples distributed system design from model architecture, requiring zero or near-zero modification to users' model code.
- Single Device Abstraction: veScale provides single-device semantics to users, automatically distributing and orchestrating model execution across a cluster of devices (see the sketch after this list).
- Automatic Parallelism Planning: veScale parallelizes model execution with a synergy of strategies (tensor, sequence, data, ZeRO, and pipeline parallelism) under semi- or full automation [coming soon].
- Eager & Compile Mode: veScale supports not only Eager-mode automation for parallel training and inference but also Compile-mode for ultimate performance [coming soon].
- Automatic Checkpoint Resharding: veScale manages distributed checkpoints automatically, with online resharding across different cluster sizes and parallelism strategies.
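To make the single-device abstraction and zero-model-code-change ideas concrete, here is a minimal sketch using PyTorch's upstream DTensor tensor-parallel API, which veScale builds on; it is not veScale's own API. The model definition, mesh shape, and module names (`MLP`, `up`, `down`, 4 GPUs) are illustrative assumptions, not taken from the talk. The point is that the model code reads like ordinary single-device PyTorch, while parallelization is declared externally.

```python
# Minimal sketch of the "single-device abstraction" idea using PyTorch's
# upstream DTensor tensor-parallel API (not veScale's own API).
# Run under: torchrun --nproc_per_node=4 tp_sketch.py
import os
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    parallelize_module, ColwiseParallel, RowwiseParallel,
)

class MLP(nn.Module):  # ordinary single-device model code, unmodified
    def __init__(self, dim=1024):
        super().__init__()
        self.up = nn.Linear(dim, 4 * dim)
        self.down = nn.Linear(4 * dim, dim)

    def forward(self, x):
        return self.down(torch.relu(self.up(x)))

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (4,))   # 1-D device mesh over 4 GPUs
model = MLP().cuda()

# Parallelism is declared outside the model: shard `up` column-wise and
# `down` row-wise; the forward pass above stays untouched.
model = parallelize_module(model, mesh, {
    "up": ColwiseParallel(),
    "down": RowwiseParallel(),
})

# From the user's perspective this still looks like single-device execution.
out = model(torch.randn(8, 1024, device="cuda"))
```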
Ziang Song: Ziang Song is a research scientist on ByteDance's LLM team. He specializes in scaling up distributed training systems for large language models and multimodal models, and is one of the founding members of veScale, ByteDance's PyTorch-native distributed training framework. Before joining ByteDance, Ziang was a researcher at CMU, collaborating closely with Microsoft Research and JHU-CLSP on both ML algorithms and distributed systems.

Open Source Development

Online event
This event has passed