Details

We are delighted to present an exclusive Ray Summit Meetup, hosted by Anyscale and featuring Ray community talks, on the eve of the summit. Invited Ray community speakers will share how they use Ray to scale and solve challenging ML problems.

You don’t have to be registered for the Ray Summit to attend. The meetup is free for the community. Join us for the Ray Summit Happy Hour from 5:00 - 6:00 p.m., followed immediately by the meetup from 6:00 - 8:00 p.m.

IMPORTANT: Don’t RSVP on this meetup page. Instead, you will need to register for the meetup at the Ray Summit website to obtain a ticket.

PLEASE REGISTER HERE

AGENDA

  • 5:00 - 6:00 p.m. Ray Summit Community Happy Hour (in Seacliff Foyer)
  • 6:00 p.m. Welcome remarks, announcements, and agenda - Jules Damji, Anyscale
  • Talk 1: Ray + Arize: Close the ML infrastructure loop - Aparna Dhinakaran, Arize AI
  • Talk 2: Approaching Cluster Multi-tenancy with Ray Job - Jaehyun Sim, Ikigai Labs
  • Talk 3: Large-scale distributed approximate nearest neighbor search with Ray - Daniel Acuña, Syracuse University

Talk 1: Ray + Arize: Close the ML infrastructure loop
Abstract: Detecting, diagnosing, and resolving ML model performance issues can be difficult for even the most sophisticated ML engineers. As more machine learning models are deployed into production, it is imperative that we have tools to monitor, troubleshoot, and explain model decisions. This talk will highlight common challenges seen in models deployed in production, including model drift, data quality issues, distribution changes, outliers, and bias, as well as best practices for model observability and explainability.
Bio: Aparna Dhinakaran is the co-founder and chief product officer at Arize AI, a pioneer and early leader in machine learning (ML) observability. A frequent speaker at top conferences and a thought leader in the space, Dhinakaran was recently named to the Forbes 30 Under 30.

Talk 2: Approaching Cluster Multi-tenancy with Ray Job
Abstract: Maintaining a highly available service with close to zero downtime and optimal performance is challenging and almost mandatory in the world of data-intensive operations. Our strategy for achieving scalability and reliability without compromising on the simplicity of implementation involves utilizing Ray API functionality in simple but clever ways. We'll explore how to incorporate multi-cluster sanity, seamlessly swappable dependencies, and fail-safe cluster transitions with long-running Ray clusters to meet the demands of modern cloud infrastructure.
Bio: Jaehyun Sim is the director of engineering at Ikigai Labs, where he is building a highly scalable and interactive data pipelining platform for raw data. He is a CNCF-certified CKA and CKAD and enjoys solving big data problems with cloud-native tools such as Kubernetes and AWS.

Talk 3: Large-scale distributed approximate nearest neighbor search with Ray
Abstract: One of the simplest and most reliable learning methods in AI is to use memory: retrieve training data points that are closest to the testing data. However, as the datasets grow, such a search gets prohibitively expensive.

This talk will describe how we use Ray to develop large-scale, distributed approximate nearest neighbor search. In particular, I will describe applications to fraud detection in images of the scientific literature. Next, I will describe how we harness Ray to process tens of millions of scientific articles and images and billions of keypoints extracted from these images. Then, I will describe how we use Ray to integrate GPUs into this computation, effectively cutting down billion-scale searches from days to seconds. Finally, I will discuss how we use Ray Serve to provide this search as a service, hiding all the complexity behind a simple interface. I will end by comparing Ray to other tools we have attempted to use in the past, including Message Passing Interface (MPI) and Dask, and by discussing the advantages and disadvantages of Ray.
Bio: Daniel Acuña is an assistant professor in the School of Information Studies at Syracuse University, Syracuse, NY. His current research aims to understand decision-making in science, from helping hiring committees predict future academic success to removing the potential biases that scientists and funding agencies exhibit during peer review. To achieve these goals, Dr. Acuña harnesses vast datasets about scientific activities and applies machine learning and artificial intelligence to uncover rules that make publication, collaboration, and funding decisions more successful.

Related topics

AI Algorithms
Machine Learning
Distributed Systems
Parallel Programming
Python
