Intro to Apache Apex (native Hadoop) & comparison with Spark Streaming


Details
Agenda
5:45pm - Food & Networking
6:00pm - Introduction to Apache Apex - The next generation native Hadoop platform
6:30pm - Q&A
6:40pm - Architectural Comparison of Apache Apex (native Hadoop) and Spark Streaming
7:10pm - Q&A
Talks from Devendra Tagare, DataTorrent engineer, contributor to Apex, and data architect experienced in building highly scalable big data platforms.
More details to come.
Abstract
Talk 1: Apache Apex is a next generation native Hadoop big data platform. This talk will cover how it can be used as a powerful and versatile platform for big data.
Talk 2: Apache Apex is a native Hadoop data-in-motion platform. We will discuss the architectural differences between Apache Apex and Spark Streaming, and how these differences affect use cases like ingestion, fast real-time analytics, data movement, ETL, fast batch, very low latency SLA, high throughput and large scale ingestion.
We will cover fault tolerance, low latency, connectors to sources/destinations, smart partitioning, processing guarantees, computation and scheduling model, state management and dynamic changes. We will also discuss how these features affect time to market and total cost of ownership.
For deeper engagement with Apache Apex (http://apex.apache.org/), follow ApacheApex on Twitter (https://twitter.com/apacheapex) and see the presentations (http://www.slideshare.net/ApacheApex), recordings (https://www.youtube.com/user/datatorrent), downloads (community edition: https://www.datatorrent.com/download/datatorrent-community-edition-download-meetups/, sandbox edition: https://www.datatorrent.com/download/datatorrent-rts-sandbox-edition-download-meetups/), Apache Apex releases (http://apex.apache.org/downloads.html), and docs (http://apex.apache.org/docs.html).
