Enabling fast query response and BI support from Data Lakes with Michael Sick
Details
Our speaker, Michael Sick, is a Senior Manager and Data Architect at Serene Software.
Abstract:
Data Lakes and their Big Data underpinnings were announced with great excitement and promise over the last 10-15 years. Both concepts have received broad industry criticism for over-hyping their capabilities and for failing to lower the overall costs of managing and serving data. A number of tools have emerged that increase the value of the Data Lake by enabling it to directly serve high-performance Business Intelligence (BI) and ad hoc query use cases. This approach reduces (though rarely eliminates) the need for higher-cost Data Warehouses.
Attendees will learn how to enable high-performance BI and queries directly from the Data Lake. The talk includes:
· A quick survey of the relevant established and emerging solutions
· A discussion of how Open Source Software (OSS) is dominating the space
· An introduction to the sample data set
· Sample code and demonstrations for two Query Engines (Dremio and Trino)
· Architecture: What’s the impact on my Data Warehouse?
After the talk, attendees will understand why enhancing their Data Lake for BI will help, how to get started on an evaluation, and how their data architecture and best practices should be modified for success.
Abstract Note: All data and code samples will be made available on cloud storage and GitHub, respectively.
