29. Streaming Analytics Made Easy: Hortonworks DataFlow and Druid

Details
Agenda
• 17.45: Drink, socialize
• 18.00: First talk: Streaming Analytics Made Easy: Hortonworks DataFlow and Druid
There is a growing need to access information for decision making faster than ever before. Part of this is because companies are looking to capitalize on insights that can only be gained from data while it is in motion: perishable, time-sensitive information that may no longer have value by the time it lands in an enterprise data store or lake. Stream processing technologies provide a framework for building applications that continuously run analysis on data in motion. However, the complexity involved in designing, developing, and operationalizing streaming applications requires specific skill sets and may lead to extensive delivery timelines, which lengthen the time it takes companies to realize value from these systems.
Streaming Analytics Manager (SAM) was created to simplify the development and reduce the delivery time of analytic applications geared towards data in motion. Using a drag-and-drop interface, application developers can create complex streaming analytics apps for event correlation, context enrichment, and complex pattern matching, eliminating the need for specialized skill sets. SAM also provides the concepts of service pools and environments, which help users easily define the streaming technologies and environments their application will use for execution, and a streaming operations view that gives users insight into their application’s performance at runtime. With SAM as an analytics solution, users get a rich experience for building and managing streaming analytics applications and can bring these applications to market considerably faster.
Join us as Hellmar Becker, Solutions Engineer at Hortonworks, gives us the latest in Big Data and Streaming Technologies using SAM. During this session, we will discuss how SAM fits into the overall solution for the Hortonworks DataFlow Platform and demonstrate SAM’s capabilities in the context of a familiar use case for streaming analytics.
• 18.45: Eat, drink, socialize (more)
• 19.00: Second talk: Operational Analytics: Things I Wish I Had When I Was A Quant
Away from the idealised depiction of the world of Data Science lies the real world of the working Data Scientist.
In popular fiction, data scientists spend their time creating these beautiful Data Models from large piles of clean, structured, correct data and instantly provide deep insights that change the business landscape.
The reality is that Data Scientists spend 75% of their time moving, cleaning and shaping unwieldy, unclean, multi-structured data into various data containers before they can even start building their models. Once the model-building is underway, collaboration is called for, and this only increases the level of complexity to deal with.
Venkatesh Sellappa (Venky) will walk through a classic use case, the customer churn story, and show how these increasingly complex problems of data engineering, collaborative development, the model development lifecycle, and varying language toolkits can be mitigated using a combination of Apache NiFi and HDP with IBM DSX.
This session is an experience report and, like every experience report, it tells a story: the story of a project, unvarnished and without artifice, programmer to programmer, often involving one or more of comedy, tragedy, farce, etc. This report is no exception.
• 19.45: Drink, socialize (even more)
