This is a developer-centric meetup focused on Apache Spark, Apache Flink, Apache Kafka, Apache Mesos, the related Typesafe and Twitter OSS stacks, and distributed Data Science and Machine Learning more broadly. We're open to all OSS developers, vendors, consultants, and startups, whether they use the tools or build and support them, and we welcome them to attend, present, and organize.
How this meetup complements the original Spark Users, now the Bay Area Spark Meetup: we cover Spark in its end-to-end ecosystem (Mesos, Akka, Kafka, Cassandra, etc.), with a focus on what works for the final goals of the whole pipeline. We will teach you how to use Scala for Spark to make you more effective, and consider devops options so you can get to production faster. We'll invite projects relevant to or inspired by Apache Spark, such as Apache Storm, Apache Flink, and others, and will focus on putting together useful OSS as a system.
This is a joint event with SF Scala. RSVP there! The Zoom link is available only to those who do:
Exabytes Delivered Each Day: Some Lessons Building Large-Scale Cloud Software at Databricks
Cloud service developers need to handle massive-scale workloads from thousands of customers with no downtime or regressions. In this talk, we’ll present our experience building a very large-scale cloud service at Databricks, which provides a data and ML platform as a service, running over AWS and Azure, that is used by some of the largest enterprises in the world. Databricks launches millions of VMs per day in each of these clouds, and they process exabytes of data per day for interactive, streaming, and batch production applications. This means that our control plane has to handle a wide range of workload patterns and cloud issues such as outages. We will describe how we built our control plane for Databricks using Scala services and open source infrastructure such as Kubernetes, Envoy, Prometheus, Apache Spark, and Delta Lake, and the design patterns and engineering processes that we learned along the way to make development and operations reliable.
Speaker: Matei Zaharia is the Chief Technologist and cofounder at Databricks, the creator of Apache Spark, and an Assistant Professor at Stanford University, where he cofounded the DAWN Lab.
Note: Matei will keynote the Scale By the Bay 2020 Conference in November. The CFP is still open at https://scale.bythebay.io!