What we're about

This is a developer-centric meetup focused on Apache Spark, Apache Flink, Apache Kafka, Apache Mesos, the related Typesafe and Twitter OSS stacks, and distributed Data Science and Machine Learning more broadly. We're open to all OSS developers, vendors, consultants, and startups, whether they use the tools or build and support them, and we welcome them to attend, present, and organize.

How this meetup may complement the original Spark Users group, now the Bay Area Spark Meetup: we cover Spark in its end-to-end ecosystem -- Mesos, Akka, Kafka, Cassandra, etc. -- with a focus on what works for the final goals of the whole pipeline. We will teach you how to use Scala for Spark to make you more effective, and consider devops options so you can get to production faster. We'll invite projects relevant to or inspired by Apache Spark, such as Apache Storm, Apache Flink, and others, and will focus on putting together useful OSS as a system.

Upcoming events (2)

First Choice Scale By the Bay 2020 CFP is now Open through May 31

The 8th Annual Scale By the Bay developer conference will be held either online or in person in November 2020. The CFP is now open at https://scale.bythebay.io. The First Choice CFP will run until May 31st, when 1/2 of the program will be selected. The next 1/4 will be selected by June 30th, and so on. The bar will move higher in each iteration, allowing for the strongest talks to still join. Please submit your best talk early. We hope to see you on the program!

Exabytes Delivered Daily: Lessons Building Cloud Software at Databricks

This is a joint event with SF Scala. RSVP there! The Zoom link is available only to those who do: https://www.meetup.com/SF-Scala/events/270818386

Exabytes Delivered Each Day: Some Lessons Building Large-Scale Cloud Software at Databricks

Cloud service developers need to handle massive-scale workloads from thousands of customers with no downtime or regressions. In this talk, we’ll present our experience building a very large-scale cloud service at Databricks, which provides a data and ML platform as a service running over AWS and Azure, used by some of the largest enterprises in the world. Databricks launches millions of VMs per day in each of these clouds that process exabytes of data per day for interactive, streaming and batch production applications. This means that our control plane has to handle a wide range of workload patterns and cloud issues such as outages. We will describe how we built our control plane for Databricks using Scala services and open source infrastructure such as Kubernetes, Envoy, Prometheus, Apache Spark and Delta Lake, and various design patterns and engineering processes that we learned along the way to make development and operations reliable.

Speaker: Matei Zaharia is the Chief Technologist and cofounder at Databricks, the creator of Apache Spark, and an Assistant Professor at Stanford University, where he cofounded the DAWN Lab.

Note: Matei will keynote the Scale By the Bay 2020 Conference in November; the CFP is still open at https://scale.bythebay.io!
