Building Real-time Data Pipelines with Spark


Details
Thank you to Motus (https://www.motus.com/) for hosting this event!
Join us for dinner, drinks and some great talks about how you can use Apache Spark with Mesos and DC/OS to build real-time data pipelines.
6:30pm - Arrival and registration
7:00pm - Jörg Schad, Developer Evangelist at Mesosphere
7:30pm - Jeffrey Zampieron, Director of Software & Cloud Services at Beco Inc
8:00pm - Networking
Let’s SMACK! - Building Real-time Data pipelines
Stream processing has helped us achieve lower latencies than traditional batch processing. On the other hand, it also requires a more complex stack of tools to be fast and fault-tolerant. A typical stack in this field is the so-called SMACK stack. SMACK stands for Apache Spark (batch/stream processing), Apache Mesos (cluster manager), Akka (JVM-based actor framework), Apache Cassandra (storage layer), and Apache Kafka (message queue).
In this session, we will first look at the different components of the SMACK stack and the interactions between them. We will also discuss alternative frameworks for the different levels of the stack, for example Apache Flink for stream processing. Next, we will discuss best practices for a) setting up such data pipelines and b) keeping them running (i.e., upgrades, monitoring, debugging, …).
The talk will conclude with a demo of an IoT application analyzing streams of taxi locations throughout New York City in real time.
Jörg Schad is a Developer Evangelist and Software Engineer at Mesosphere in Hamburg. In his previous life he implemented distributed and in-memory databases and conducted research in the Hadoop and cloud area. He has spoken at meetups around the world, at international conferences, and in lecture halls. He spoke at Spark Summit EU 2016 and will be speaking this year in Boston. Watch his talk from last year here (https://www.youtube.com/watch?v=XBDIjkzgoZI&feature=youtu.be).
Using Beacons and Real-Time Data to Bring Buildings Online
This talk will outline how Beco is building a scale-out architecture on DC/OS to drive its novel SaaS IoT solution. It includes a discussion of the alternatives explored and the trade studies conducted. Development and operational challenges, as well as plans for future growth, are also discussed.
Jeffrey Zampieron has over 10 years of experience in the design and development of cloud services, distributed systems, sensor networks, and computer vision and machine learning algorithms within the commercial and defense marketplaces. Since joining Beco Inc. (https://www.beco.io) as Director of Software and Cloud Services in 2015, Mr. Zampieron has been leading SaaS projects supporting the company's novel IoT solutions. Prior to Beco, at Systems & Technology Research, Mr. Zampieron developed software and algorithms for a variety of DoD applications in the areas of GPS-denied navigation, video compression and estimation. He has also worked as a software engineer for a number of organizations, including L-3, DRS, Harris, and Intel, where he designed and implemented a variety of complex software systems across multiple architectures. Mr. Zampieron received his MS (2006) and BS (2006) from RIT in Computer Engineering.
