Almost all companies that use Hadoop for big data processing also have real-time needs, for which they have either built a custom solution or run a separate cluster per application. DataTorrent is a stream processing platform native to Hadoop. The platform processes incoming streams of data in real time, entirely in memory, and outputs the results. We can process incoming streams at tens of millions of events per second, with latencies in milliseconds. We support automatic load scaling and real-time recovery from node outages, with no data loss and no human intervention. We have open-sourced a library of pre-defined operators and generic application templates under the project name Malhar. We also provide tools with a rich interactive user interface for debugging, monitoring, and charting application data.
Amol Kekre (CTO - DataTorrent, Inc.)
Chetan Narsude (Engineer - DataTorrent, Inc.)
Thomas Weise (Engineer - DataTorrent, Inc.)
Sponsored by DataTorrent - https://www.datatorrent.com/
DataTorrent will be providing pizza and beer.
6:00-6:30 Networking, Pizza and Beer
6:30-7:30 Essentials of a Big Data Architecture