Building a Big Data Stack on Kubernetes


Event Description
There is growing interest in running Apache Spark natively on Kubernetes (see https://github.com/apache-spark-on-k8s/spark). This meetup will discuss how to build a big data stack on Kubernetes. Specifically, you will learn how the Spark scheduler can still provide HDFS data locality on Kubernetes by discovering the mapping from Kubernetes containers to physical nodes to HDFS datanode daemons. You’ll also learn how to give Spark high availability for the critical HDFS namenode service when running HDFS in Kubernetes.
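As a rough illustration of the locality idea described above, the sketch below joins two lookups: executor pod to physical node, and HDFS datanode to physical node. It is not the Spark scheduler's actual implementation; the function name, pod names, node names, and datanode names are all hypothetical, and in a real cluster the mappings would be obtained from the Kubernetes API and the HDFS namenode rather than hard-coded.

```python
from typing import Dict, List


def executors_local_to_block(
    block_datanodes: List[str],
    pod_to_node: Dict[str, str],
    datanode_to_node: Dict[str, str],
) -> List[str]:
    """Return executor pods running on a physical node that also hosts
    a datanode holding a replica of the block (hypothetical helper)."""
    # Physical nodes that hold at least one replica of the block.
    nodes_with_block = {
        datanode_to_node[dn] for dn in block_datanodes if dn in datanode_to_node
    }
    # Executor pods scheduled on one of those physical nodes.
    return sorted(
        pod for pod, node in pod_to_node.items() if node in nodes_with_block
    )


# Hypothetical cluster state, hard-coded for illustration only.
pod_to_node = {
    "spark-exec-1": "node-a",
    "spark-exec-2": "node-b",
    "spark-exec-3": "node-c",
}
datanode_to_node = {
    "hdfs-dn-0": "node-a",
    "hdfs-dn-1": "node-c",
}

# A block whose only replica lives on datanode hdfs-dn-1.
local_executors = executors_local_to_block(
    ["hdfs-dn-1"], pod_to_node, datanode_to_node
)
print(local_executors)
```

A scheduler with this information could prefer the returned executors for tasks that read the block, and fall back to any executor when the list is empty.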
Presenter
Sean Suchter, Pepperdata Co-founder and CTO
Speaker Bio
Sean is the co-founder and CTO of Pepperdata. Previously, Sean was the founding GM of Microsoft’s Silicon Valley Search Technology Center, where he led the integration of Facebook and Twitter content into Bing search. Prior to Microsoft, Sean managed the Yahoo Search Technology team, the first production user of Hadoop. Sean joined Yahoo through the acquisition of Inktomi, and holds a B.S. in Engineering and Applied Science from Caltech.
Agenda
• 7:00pm: Doors open, networking, snacking
• 7:30pm: Start presentation
• 8:15pm: Conclusion + Q and A
Pepperdata: DevOps for Big Data
Pepperdata products enable developers to rapidly debug, optimize, and understand production Big Data applications while also enabling operators to diagnose and automatically solve performance problems in production multi-tenant clusters. Pepperdata products bring critical performance feedback to every phase of the DevOps cycle for Big Data.
A growing list of Fortune 200 companies are Pepperdata customers. They trust Pepperdata to ensure that Big Data applications consistently and predictably deliver business-critical solutions. Pepperdata products are certified on all Big Data distributions, with support for both MapReduce and Spark. Pepperdata supports clusters running on-premises and in the cloud. Learn more at www.pepperdata.com.
About BayLISA
BayLISA stands for Bay Area Large Installation System Administrators. BayLISA is the premier system administration user group in Silicon Valley and the San Francisco Bay Area. Founded in the very early 90s after the fourth Large Installation System Administration (LISA) conference, BayLISA has supported and educated systems, network, storage, virtualization, and other technology professionals in the Bay Area for over 20 years.
We use Meetup to coordinate meeting attendance, announcements, and reminders. Our official web presence, including links to membership, sponsorship, and mailing lists, is at http://www.baylisa.org/.
