Distributed Data Lakehouse: Are you building one?

Details
The fast-growing data industry has produced fragmented solutions and unprecedented complexity in data platforms. We’ve seen data silos form across data centers, regions, and clouds. There is strong demand for a simpler solution that unifies data lakes and provides efficient data access and management. Alluxio is a large distributed system that sits as a new layer between compute engines and storage systems. It virtualizes all data sources, so applications can access data without needing to know where it is located.
In this talk, we present an approach to architecting an efficient data platform for multiple data pipelines with Spark, Delta Lake, and Alluxio. The platform is portable across environments, whether private or public clouds, for optimal cost and performance.
About The Speakers
Jasmine Wang is the Head of Community and DevRel at Alluxio. She is a former national debate champion turned traveling yoga teacher, with a strong passion for building teams and acting as a bridge at early-stage startups in Silicon Valley. Previously, she worked as the Head of Global Talent Acquisition and Operations. She is currently building the Alluxio open source community and is responsible for community marketing, developer relations, developer experience, and cross-community collaborations.
David Zhu is a software engineering manager at Alluxio, where he mainly focuses on metadata syncing, the job service, and end-to-end performance benchmarking and optimization. Prior to that, David completed his Ph.D. at UC Berkeley’s AMPLab, with a focus on distributed data management systems and operating systems for the data center. David also holds a Bachelor of Software Engineering from the University of Waterloo.