Details

Everyone is talking about data lakes. The intended use of a data lake is as a central storage facility for performing analytics. But, Jim Scott asks, why have a separate data lake when your entire (or most of your) infrastructure can run directly on top of your storage, minimizing or eliminating the need for data movement, separate processes and clusters, and ETL?

Note: This will be a combined meetup with the Atlanta Apache Spark User Group (https://www.meetup.com/Atlanta-Apache-Spark-User-Group/).
