
Bay Area Ray Community Meetup

Hosted By
Jules D.

Details

Please join us for an evening of community technical talks from users of the data and Ray communities. We want to thank PingCAP for being our gracious host and facilitating the meetup!

👉 PLEASE RSVP/REGISTER FOR THIS MEETUP AT THIS LINK 👉 https://lu.ma/eb8tpu2f

Agenda
(The times are not strict; they may vary slightly.)

  • 5:30-6:00 pm: Networking, snacks & drinks
  • 6:00 pm: Talk 1 (30-35 mins): How to build a serverless database cloud service
  • Q & A (10 mins)
  • 6:45 pm: Talk 2 (30-35 mins): Multi-Region/Cloud Ray Pipeline with Distributed Caching
  • Q & A (10 mins)
  • 7:20 pm: Talk 3 (30-35 mins): Introduction to Ray for ML/AI Applications in Python

Talk 1: How to build a serverless database cloud service
Abstract: Relational databases have long been the core component of application systems, and their reliability and performance are critical to the stability and availability of applications. Distributed SQL, the evolution direction of next-generation databases, offers built-in features such as horizontal scaling and high availability.
In this talk, Li Shen will introduce the architecture and key technologies of TiDB, an open-source distributed SQL database, and how PingCAP uses public-cloud capabilities to build a cloud-native serverless database service.

Bio:
Li Shen is SVP and a founding engineer of PingCAP, the company behind TiDB. He is a maintainer of several popular open-source projects, including TiDB and TiKV, a distributed transactional key-value store and CNCF graduated project. Li has extensive experience in data infrastructure, software architecture design, and cloud computing.
Talk 2: Multi-Region/Cloud Ray Pipeline with Distributed Caching
Abstract: In some cases, the stages of a machine learning pipeline are distributed across regions or clouds. Data preprocessing, model training, and inference run in different regions/clouds to leverage special resource types or services that exist in a particular cloud, and to reduce latency by placing inference near user-facing applications. Additionally, as GPUs remain scarce, it is increasingly common to set up training clusters remote from where the data resides. This multi-region/cloud scenario loses data locality, resulting in latency and expensive data egress costs.
In this talk, Beinan Wang, Senior Staff Software Engineer at Alluxio, will discuss how Alluxio’s open-source distributed caching system integrates with Ray in the multi-region/cloud scenario:

  • The data locality challenges in a multi-region/cloud ML pipeline
  • The Ray+PyTorch+Alluxio stack to overcome these challenges, optimize model training performance, save on costs, and improve reliability
  • The architecture and integration of Ray+PyTorch+Alluxio using POSIX or RESTful APIs
  • ResNet and BERT benchmark results showing performance gains, with a cost savings analysis
  • Real-world examples of how Zhihu, a top Q&A platform, combined Alluxio’s distributed caching and data management with Ray’s scalable distributed computing to optimize multi-cloud model training performance
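The POSIX integration mentioned above means training code can read remotely cached data as ordinary local files once an Alluxio FUSE mount is in place. Below is a minimal sketch of that access pattern only; a local temp directory stands in for a hypothetical mount point such as /mnt/alluxio, since no Alluxio cluster is assumed here:

```python
import os
import tempfile

# Stand-in for a hypothetical Alluxio FUSE mount point (e.g. /mnt/alluxio).
# A local temp directory is used so the sketch runs without an Alluxio cluster.
mount = tempfile.mkdtemp(prefix="alluxio_mount_")

# Pretend a remote training file has already been cached under the mount.
sample_path = os.path.join(mount, "train", "batch_000.csv")
os.makedirs(os.path.dirname(sample_path), exist_ok=True)
with open(sample_path, "w") as f:
    f.write("feature_1,feature_2,label\n0.1,0.2,1\n")

def load_batch(path):
    # Training code reads cached remote data through the mount as a plain local file.
    with open(path) as f:
        header, *rows = f.read().splitlines()
    return header.split(","), [r.split(",") for r in rows]

columns, rows = load_batch(sample_path)
print(columns)  # -> ['feature_1', 'feature_2', 'label']
```

The point of the pattern is that the data loader needs no cloud-specific SDK: the caching layer sits behind the filesystem interface, so the same loader works whether the bytes are local or cached from another region.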

Bio:
Dr. Beinan Wang is a Senior Staff Software Engineer at Alluxio and a Technical Steering Committee (TSC) member of PrestoDB. Prior to Alluxio, he was the tech lead of the Presto team at Twitter, where he built large-scale distributed SQL systems for Twitter’s data platform. He has twelve years of experience in performance optimization, distributed caching, and high-volume data processing. He received his Ph.D. in computer engineering from Syracuse University for work on symbolic model checking and runtime verification of distributed systems.
Talk 3: Introduction to Ray for ML/AI Applications in Python
Abstract: An introduction to Ray (https://www.ray.io/), the system for scaling your Python and machine learning applications from a laptop to a cluster. We'll start with a hands-on exploration of the core Ray API for distributed workloads, covering the basic Ray Core patterns for scaling ML workloads:

  • Remote Python functions as tasks
  • Remote objects as futures
  • Remote Python classes as stateful actors
  • Multi-model training with Ray Core API patterns

