The Data Lake Use Case: Insights in days or weeks rather than months

Hosted By
Uli B.
Details

In a traditional data warehouse it takes anywhere from three to nine months to implement a subject area. With a data lake you can significantly shorten the time to insights and also include data types not supported by the data warehouse, e.g. unstructured data. We will describe the concept of the data lake, how it complements the enterprise data warehouse, and have a look at the available toolset to harvest it.

Presentations

Hadoop Data Lake (Uli Bethke, Sonra). What is a data lake? Why do we need it? What are the benefits? What data lake tools are out there? As part of the presentation we will also look into the various components of a data lake: data ingestion, data curation, data wrangling, data discovery, and governance.

Apache Drill (Stephen Holdship, MapR). Drill is a distributed system that delivers interactive analysis of large-scale datasets on Hadoop, using familiar ANSI SQL semantics. Its distinctive value is data exploration without the need for centralised schemas, allowing you to run SQL queries instantly on new or complex data formats, including JSON and HBase tables. The talk covers:

Drill Introduction

Key differentiation for SQL Specialists and Business Analysts

Use Cases
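
To give a flavour of schema-free querying ahead of the talk, here is a minimal sketch (not taken from the presentation) of sending SQL to Drill over its REST interface from Python. It assumes a Drillbit reachable at localhost on Drill's default web port 8047; the JSON file path and the field names (customer.city) are hypothetical and only serve as an illustration.

```python
import json
import urllib.request

# Assumption: a local Drillbit with the REST API on Drill's default web port.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(sql, url=DRILL_URL):
    """Build (but do not send) an HTTP POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical file and fields. `dfs` is Drill's file system storage plugin;
# the nested field customer.city is read straight from the raw JSON,
# with no table definition or schema registered beforehand.
SQL = """
SELECT t.customer.city AS city, COUNT(*) AS order_count
FROM dfs.`/data/orders.json` t
GROUP BY t.customer.city
"""


def run_query(sql):
    """Send the query and return the decoded JSON response (needs a running Drill)."""
    with urllib.request.urlopen(build_drill_request(sql)) as resp:
        return json.load(resp)
```

Calling run_query(SQL) against a running Drill instance should return the result rows as JSON; build_drill_request on its own can be inspected without any server.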

Cold, Warm and Hot: 3 Tier Data Strategy & Tableau (Tableau). This session will cover how leading Big Data organizations prepare data for analysis in Tableau using a 3-tier strategy for cold, warm and hot data. We will cover how to apply the concepts of the data lake, data warehousing and in-memory computing to Tableau analysis.

Demo – Visualizing Data with Tableau 9.0 (Tableau). Clustering for non-statisticians. With the advent of machine learning, advanced methods for clustering (CARS, k-means, support vector machines, ...) have emerged. All of these methods require a deep mathematical and programming background. Visual methods can be used effectively to achieve the same results. Using the Tableau software, I will demonstrate how to create a clustering method in a few minutes. As a bonus, in this example visual clustering performs better than most conventional techniques!

Data Engineering and Data Architecture Group (DEDAG)
WAYRA Ireland
28-29 Sir John Rogerson’s Quay, D2 · Dublin