A BDaaS Combo: Distributed Data Science, DevOps, and Docker


The data-driven innovations of data scientists can be game-changing. But the siloed efforts and custom-crafted prototypes of individual data scientists can be difficult to scale, reproduce, and share across multiple users. What works for an ad-hoc model in development may not necessarily work in production; what works as a one-off prototype on a laptop might not work as a consistent and repeatable process for large-scale distributed data science.
What’s needed is an approach that brings the agility, automation, and collaboration of DevOps to your data science teams. They require an agile and lean process that enables them to iterate quickly and fail fast. They need the ability to easily share data, models, and code in a secure distributed environment. They need the portability of Docker to eliminate the “works on my machine” problems when collaborating on code with their co-workers. And they require the flexibility to use their own preferred tools and try out new technologies in the rapidly changing field of data science.
In this meetup, we’ll discuss how Big-Data-as-a-Service can help you to:
• Create on-demand environments with your users’ preferred data science tools (e.g. Spark, Kafka, Zeppelin, JupyterHub, RStudio Server)
• Deploy environments once and run them anywhere – on-premises or in the public cloud – using Docker containers
• Provide enterprise-grade security, monitoring, and auditing for your data pipelines
• Improve agility, collaboration, and productivity for your data science and engineering teams
Join this meetup with Nanda Vijaydev (director of solutions management at BlueData and our resident data scientist) and learn how to bring DevOps agility to data science operations.
Agenda
• 6:30-7:00pm Registration, networking, pizza
• 7:00-7:45pm Introduction and presentation
• 7:45-8:00pm Q&A and discussion
Food and drinks will be available.
Use the Twitter hashtag #BDaaS, follow @BDaaSmeetup, and join the BDaaS group on LinkedIn.
Directions and Parking
Google Maps: https://goo.gl/maps/yx7DjF2iXbN2
There is a parking structure behind the twin Mission Towers buildings. Please feel free to park there. The meetup will be in building 3979 (it’s the building to your left as you exit the parking structure, or to the right when standing in front of the two Mission Towers buildings and facing them). Enter the building and the meetup is just past the elevators on the first floor.
About the Speaker
Nanda Vijaydev has more than 10 years of experience in data management and data science. At BlueData, Nanda works with Spark, Hadoop, Kafka, and related technologies to build software solutions for data science operations and big data analytics use cases. She has worked on multiple data science and big data projects for large enterprises in the financial services, insurance, healthcare, media, telecommunications, and other industries.
Prior to BlueData, she was a principal solutions architect at Silicon Valley Data Science and director of solutions engineering at Karmasphere. She has an in-depth understanding of open-source big data frameworks, data integration, ETL, data warehousing, and reporting as well as various IDEs, notebooks, and other data science tools in the R and Python ecosystems.
See you at the meetup!
