Introduction to Apache Tajo: Future of Data Warehouse

Presented by Jihoon Son, PMC member and committer for Apache Tajo

https://kr.linkedin.com/in/jihoonson

Abstract: Apache Tajo is a data warehouse system for Web-scale data. It provides virtual integration of a multitude of diverse data sources, which makes data integration, long regarded as an essential but heavy step in business intelligence, easy and fast. In addition, it has a fault-tolerant distributed query engine that speeds up queries. With these “query federation” and “distributed processing” capabilities, Tajo can give users reliable and efficient analysis of Web-scale data spread across multiple sources. I will introduce Apache Tajo, including its overall architecture, current state, and challenges, and discuss the advantages Tajo can bring to users. In addition, I will give a demo of integrated data analysis with Tajo.
