
Cloudera, Hortonworks, MapR, and Pivotal come together to discuss Apache Spark

TOPIC: “Top Hadoop Distribution Vendors – Cloudera, Hortonworks, MapR, and Pivotal – come together to discuss Apache Spark”

Many are saying Apache Spark is the next wave of innovation in Big Data – extending the capabilities of Hadoop and providing a unified platform for batch and real-time processing. The Open Source community is excited and supportive, as evidenced by Spark’s recent promotion to a full-fledged Apache project in February 2014. Big Data vendor leaders are responding by introducing Spark’s capabilities into their architectures. Come join us for a lively panel discussion between the top Hadoop distribution vendors – Cloudera, Hortonworks, MapR, and Pivotal – to hear their vision, strategy, and capabilities around Apache Spark. This will be a rare opportunity to see these four leading vendors on one panel, hear from their experts, and get their insight on best practices, real use cases, and solutions around Spark implementation.

Networking starts at 6pm and our meetup will get underway at 6:30.  More details to come!  Please save the date.

Speaker/Panelist Bios (this time in reverse alphabetical order!)

Dan Baskette is a Principal Community Engineer with Pivotal. His role is a combination of field and engineering enablement in which he serves as a product specialist to the field and provides direct feedback to the engineering organization to enable rapid product improvement cycles. He also supports proof of concept work for engineering to help prove out ideas and new potential products. Prior to Pivotal, Dan spent 10 years working at EMC Corporation. His last role was in the EMC CTO Office as a Hadoop and Big Data specialist, where he assisted in proving out large scale data architectures for large web-based and telecommunications customers. Before that, he spent 6 years at Sun Microsystems, where he rode the Dot in Dot Com bubble to its peak before leaping off for a new adventure. Dan graduated in Computer Science from the University of Tennessee, and enjoys spending as much time as possible back in the nearby Smoky Mountains.

Keys Botzum is a Senior Principal Technologist with MapR Technologies. He has over 15 years of experience in large scale distributed system design. At MapR his primary responsibility is working with customers as a consultant, but he also teaches classes, contributes to documentation, and works with MapR engineering. Previously he was a Senior Technical Staff Member with IBM and a respected author of many articles on WebSphere Application Server as well as a book. He holds a Master's degree in Computer Science from Stanford University and a B.S. in Applied Mathematics/Computer Science from Carnegie Mellon University.

Casey Stella is a Principal Architect with Hortonworks with a special focus on Data Science. He spends his time with a variety of clients, large and small, mentoring and helping them use Hadoop to solve their problems. He was an architect and software engineer at Explorys, a startup spun out of the Cleveland Clinic, focusing on data mining and medical informatics using Hadoop and HBase. Prior to that, he worked on a number of ventures across a number of industries, including scientific programming in the oil industry, writing scalable server infrastructure for VoIP, and working on metadata repositories at Oracle. All of these things have one thing in common: they deal with large amounts of data. In a galaxy far, far away and a long time ago, he was a graduate student at Texas A&M in the Department of Mathematics.

Ted Malaska is a Sr. Solutions Architect with Cloudera. He has spent the last four years working with Hadoop and supporting over 40 clients with their Big Data implementations, some with over 200 clusters. He is the co-author of the upcoming O’Reilly book "Hadoop Application Architecture", which will feature Apache Spark in three of the nine chapters. He is also an active contributor to nine Hadoop Ecosystem projects, including two minor contributions to Spark. At Cloudera, he has been very involved with multiple Spark applications for customers and has gained experience with Spark, Spark GraphX, and Spark Streaming.

Parking and Transit

Central Library is located just a few blocks walking distance between the Ballston Metro and the Virginia Square Metro.

Free parking is available in the parking garage and surface lots near Central Library.  There is no time limit after 6pm.

Additional transit options:


  • Chris R.

    Best Course in Apache Spark and Scala Language? Just getting into the technology.

    August 14, 2014

    • Darin J.

      For Scala, look at Odersky's functional and reactive programming in Scala courses on Coursera. For Spark tutorials, go to the Spark Summit website; the training they offered at the conference is online and is pretty good.

      August 15, 2014

  • Trevor L.

    Everyone, here are the Q&A videos from the Meetup session:

    1 · July 30, 2014

  • Dave V.

    Thanks for hosting a terrific event. The expert panel rocked, and Chiny was a terrific moderator. Very good questions and pace. My takeaways: Spark has energy and momentum in the user community, among the Hadoop distributions, and among early adopters with systems in production. It can be used without the Hadoop platform, but it benefits from running on YARN. While known as an in-memory execution engine, Spark degrades gracefully when the data size exceeds available memory. Complementing speed of execution, the developer community raves about speed of development resulting from compact and useful libraries. Common Spark use cases today are ETL, streaming, Machine Learning, and query. The Machine Learning Library (MLlib) is small but growing and may leap-frog Mahout. Spark Streaming is an alternative to Storm for streaming data. Spark SQL promises low-latency query. While Spark is immature, the active Spark developer community will remove bugs rapidly. Spark will only improve.

    3 · July 24, 2014

    • Kartik M.

      Hi Dave, one of the Spark use cases you mentioned is ETL. If you could share a web link on Spark for ETL, it would be a great help.

      July 27, 2014

  • Bernadette H.

    Fantastic panel of experts. Deeply technical, very committed to sharing and advancing the Apache Spark development community -- thanks guys. Thank you to the organizers at MetiStream for assembling such an all-star cast from Cloudera, Pivotal, MapR and Hortonworks. Based on the considerable turnout, questions, energy in the room and post-meeting discussion, there is huge interest in Spark & better tools to wrangle big data. Thanks.

    1 · July 23, 2014

  • Trevor L.

    One of the best meetup sessions I've been to. Ted, Casey, Keys and Dan were great speakers. Lots of good and timely info.

    1 · July 23, 2014

  • Donna F.

    THANKS to all for joining us and packing the house! Kudos to Dan, Keys, Casey, and Ted and our moderator Chiny for a lively and dynamic discussion! Thanks to Pivotal, MapR, Hortonworks, and Cloudera for sponsoring the event and getting your top guys to the table to chat about Spark! They did you proud! Special shout-out and thank you to MapR for the great F&B! Thanks to my MetiStream team and all our friends who helped pull this together! Let's keep the Spark excitement going ...if you have a speaking topic idea or would like to speak, sponsor or provide a space for the next meetup please contact me at [masked]. Thank you and see you next month! -d

    July 22, 2014

  • Christopher K.

    Informative. Stay "up to date" with what's going on in the community. Great ideas. Great panelists and interesting attendees.

    1 · July 22, 2014

  • Ann V.

    Excellent! Panelists were outstanding on an exciting, cutting-edge new technology - 3 cheers from full-house attendance to Donna Fernandez for undertaking all the effort involved in arranging such a successful, professionally productive event!!!

    1 · July 22, 2014

  • Jules S. D.

    Thanks to the distinguished panelists and MetiStream.

    2 · July 22, 2014

  • Kartik M.

    Great discussions and insights. Looking forward to the next meetup.

    2 · July 22, 2014

  • Kartik M.

    Has any member installed a Hadoop cluster using Docker/Vagrant? If so, it would be great to get your guidance.

    July 19, 2014

Our Sponsors

  • IBM

    Speakers, facilities, F&B, and lots of support for Spark

  • MetiStream

    coordination, speakers, F&B, expertise as a Certified Spark SI & Trainer

  • Databricks

    Speakers, F&B, insight from the creators of Apache Spark!

  • Cloudera

    Speakers, F&B, and a passion for Spark!

  • Endgame

    Speakers, F&B, facilities, cool work with Spark!

  • Zoomdata

    F&B, speakers, and Certified on Spark software provider

  • O'Reilly Media

    Catering, conference discounts, books, & Strata Conf. ticket.

  • CodeNeuro

    Speakers and Spark support

  • Neustar

    facilities, F&B, support for Spark community!

  • Orchestro

    Facilities, F&B, speakers, Spark insight

  • Tableau

    F&B, speakers, and Spark + Tableau expertise

  • Platfora

    Speakers, F&B, and Platfora built on Spark

  • Raytheon

    facilities, F&B, AV, and excited for Spark

  • Tetra Concepts

    F&B, collaboration, and a great love for Spark!

  • Salient Federal Solutions

    facilities, F&B, Spark enthusiasts

  • Apex Systems

    prizes, facilities, and a Spark of enthusiasm!

  • Renewable Energy Corporation

    general sponsorship + prizes

  • DataStax

    speakers, F&B, insight on Cassandra + Spark

  • AddThis

    facilities + Spark enthusiasts + support

  • IOT DC

    speakers, joint sponsorship of events, insight to IOT

  • O'Reilly

    free books and eBooks, discounts, trinkets, and other support

  • Hortonworks

    Speakers, F&B, and Spark Ready on YARN

  • MapR

    Speakers, Spark, and F&B

  • Pivotal

    Speakers, F&B, and full Apache Spark stack on Pivotal HD
