StreamSets, For The Coding Minimalist In All of Us


Details
Craig Warman, a Solution Engineer at MapR, will present StreamSets, an open source data collection tool.
Description:
This presentation explores how StreamSets - an open-source data collection tool with a drag-and-drop IDE - can be used to build continuous data ingest pipelines with little or no coding. A live demo examines connectivity between HBase, relational databases, Kafka, Elasticsearch, and the local filesystem.
Abstract:
Two facts about developers: 1. We like to code. 2. We're human (and, therefore, inherently lazy). Which is why we like our IDEs and libraries - they give us the flexibility to code our way through the tough problems, while automating the everyday mundane chores.
And that brings us to StreamSets. The promise: An open source IDE for building ingest pipelines with minimal (or no) code, with real-time data flow monitoring, and the ability to adapt to "data drift" caused by unanticipated changes that occur in data streams. This presentation takes a closer look at these capabilities as we step through a live demo using a sandbox environment - we'll configure connectors between HBase, relational databases (using JDBC), Kafka, Elasticsearch, and the filesystem, then we'll look at how these data flows can be monitored with anomaly alerts. Finally, we'll explore what happens when data starts to drift in directions we didn't anticipate.
Audience Take-Away:
Attendees will gain an understanding of StreamSets' capabilities, along with requirements for configuring data connectors, monitoring, and responding to change.
Prerequisite Knowledge:
Basic knowledge of typical data storage/query platforms (shared filesystems, databases, data warehouses, event logs)
Basic understanding of streaming data use cases
