Deep Dive with Shark (Hive on Spark)

Spark is an open-source cluster computing framework that can outperform Hadoop by 30x by keeping datasets in memory across jobs. Shark is a port of Apache Hive onto Spark that provides a similar speedup for SQL queries, allowing interactive exploration of data in existing Hive warehouses. In this meetup, we'll go into detail on the implementation of Shark and show how to get started with the first alpha release.

The meetup will be hosted at Palantir Technologies in Palo Alto. Food will be available at 6:30 PM, with talks starting at 7 PM.

 

More Details on Shark

We have ported Apache Hive, the large-scale Hadoop data warehouse solution, to run queries on Spark. The resulting system, Shark (Hive on Spark), can answer HiveQL queries 30 times faster than Hive without any modification to the existing data. It is backward-compatible with the HiveQL language, the Hive metastore, and Hive user-defined functions. We will cover the architecture and implementation of Shark, including our additions to HiveQL that allow users to cache data in memory, and a new column-oriented format we have designed for storing Hive data efficiently in memory on the JVM as arrays of primitive types.
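To give a feel for why storing columns as arrays of primitive types can help on the JVM, here is a minimal sketch in plain Java. It is not Shark's actual format or API: the class names `Row` and `ColumnarTable`, the two example columns, and the `sumAmountFor` method are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: these names are hypothetical and not part of Shark or Hive.
public class ColumnarSketch {

    // Row-oriented layout: one object per row, each field a boxed value.
    static final class Row {
        final Integer userId;
        final Double amount;

        Row(Integer userId, Double amount) {
            this.userId = userId;
            this.amount = amount;
        }
    }

    // Column-oriented layout: one primitive array per column.
    static final class ColumnarTable {
        final int[] userId;
        final double[] amount;

        ColumnarTable(List<Row> rows) {
            userId = new int[rows.size()];
            amount = new double[rows.size()];
            for (int i = 0; i < rows.size(); i++) {
                userId[i] = rows.get(i).userId;   // unboxed once, at load time
                amount[i] = rows.get(i).amount;
            }
        }

        // Roughly "SELECT SUM(amount) WHERE userId = id": a scan over two flat arrays.
        double sumAmountFor(int id) {
            double total = 0.0;
            for (int i = 0; i < userId.length; i++) {
                if (userId[i] == id) {
                    total += amount[i];
                }
            }
            return total;
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, 10.0));
        rows.add(new Row(2, 5.5));
        rows.add(new Row(1, 2.5));

        ColumnarTable table = new ColumnarTable(rows);
        System.out.println(table.sumAmountFor(1)); // prints 12.5
    }
}
```

In the row layout, every row and every field is a separate heap object with its own header. In the columnar layout, the table is a handful of arrays regardless of row count, which is the general idea behind keeping data as primitive arrays.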

Additionally, we will discuss our ongoing work on integrating SQL processing with machine learning, which we see as a natural future direction for Shark because Spark is inherently efficient at iterative algorithms. In Shark, we allow users to express their machine learning algorithms as Scala-based "distributed UDFs", which then run in the same execution engine as the SQL query processor. This enables much more efficient data pipelines and provides a unified system for data analysis using both SQL and sophisticated statistical learning functions.
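As a rough picture of this kind of pipeline, the single-machine sketch below selects rows with a filter (standing in for a SQL query) and then runs an iterative algorithm over the same in-memory result. It uses plain Java rather than Scala and does not use any Shark or Spark API. The names `selectRows` and `fitSlope`, the sample data, and the choice of one-dimensional gradient descent are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical names, no Shark or Spark API involved.
public class IterativeSketch {

    // Stand-in for the SQL step: keep rows {x, y} with y >= 0, like a WHERE clause.
    static double[][] selectRows(double[][] table) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : table) {
            if (row[1] >= 0.0) {
                kept.add(row);
            }
        }
        return kept.toArray(new double[0][]);
    }

    // Stand-in for the learning step: gradient descent fitting y = w * x.
    // Every iteration re-reads the same in-memory rows, which is where
    // keeping the query result in memory pays off.
    static double fitSlope(double[][] rows, int iterations, double learningRate) {
        double w = 0.0;
        for (int it = 0; it < iterations; it++) {
            double gradient = 0.0;
            for (double[] row : rows) {
                double x = row[0];
                double y = row[1];
                gradient += (w * x - y) * x;
            }
            w -= learningRate * gradient / rows.length;
        }
        return w;
    }

    public static void main(String[] args) {
        double[][] table = { {1.0, 2.0}, {2.0, 4.0}, {3.0, 6.0}, {4.0, -1.0} };
        double[][] selected = selectRows(table);     // drops the last row
        double w = fitSlope(selected, 100, 0.1);
        System.out.println(w);                       // a value very close to 2.0
    }
}
```

When the filter and the learning loop run in one process over one in-memory dataset, there is no export and reload between the query and the algorithm, which is the efficiency argument made above.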

 

These topics will be presented by Reynold Xin, Cliff Engle and Antonio Lupher, the Berkeley research team behind Shark.
