ClearStory use case + HA Spark Streaming
Details
Live Stream Link: https://www.youtube.com/watch?v=jcJq3ZalXD8
Two presentations this month:
First, a deep-dive on the Spark use case at ClearStory Data, which is hosting the event at their Menlo Park office.
Also, Tathagata Das @tathadas (https://twitter.com/tathadas) (Databricks) and Hari Shreedharan @harisr1234 (https://twitter.com/harisr1234) (Cloudera) will present on the design of Spark Streaming High Availability. Spark Streaming extends Spark’s power to real-time data processing. It can, however, lose small amounts of data if the host running the driver application fails: the data received in the current batch may be lost and never processed. If the system sending the data can re-send it, Spark can process it on arrival; otherwise, recovering it is a tricky problem. Over the last year, engineers from the community, representing several companies, have been working to fix this problem. In this talk, we will first look at the exact problem we want to fix and the scope of the fix. We will then discuss the design and implementation of the solution being built into Spark to ensure that no data is lost in any case.
Agenda
6:00pm - 7:00pm - Drinks, food, mingle
7:00pm - 8:30pm - Presentations
8:30pm - 9:00pm - More mingling
There should be plenty of parking, and we will have wifi access. The talks will be live streamed and recorded -- we'll post the URL on the day of the event.
About The Host
A Gartner Cool Vendor in Big Data for 2014, ClearStory Data (http://www.clearstorydata.com/) is bringing next-generation Data Intelligence to everyone in order to accelerate the way businesses get answers across any number of data sources. ClearStory Data’s end-to-end solution, which includes an integrated platform and an incredibly simple user application, dramatically simplifies data access to internal and external sources, harmonizes disparate data on the fly, and enables fast, collaborative exploration. The company is backed by Andreessen Horowitz, DAG Ventures, Google Ventures, Khosla Ventures, and Kleiner Perkins Caufield & Byers (KPCB).
