
April: Lightning-Fast Cluster Computing with Spark and Shark

  • Apr 16, 2013 · 6:30 PM
  • Bronto Software, Inc.

Speakers: Mayuresh Kunjir and Harold Lim, Duke University

Spark is an open-source cluster-computing system developed by the AMPLab at the University of California, Berkeley. It is designed for fast performance and ease of development across a variety of data analytics workloads, such as machine learning, graph processing, and SQL-like queries. Spark supports distributed in-memory computation, which can be up to 100x faster than Hadoop MapReduce.

Shark is a Hive-compatible data warehousing system built on Spark. Shark supports the HiveQL query language, the Hive Metastore, and all the serialization formats supported by Hive. Running on Spark, together with a number of built-in optimizations, lets Shark perform up to 100x faster than Hive.

This talk will discuss the internals of Spark and Shark and the applications these systems support, and will include a demo with performance comparisons against Hive.


  • David R.

    Thanks for the time and presentation!

    April 16, 2013

  • Matthew M.

    Looking forward to learning more!

    April 12, 2013

  • Venkat M.

    New to RTP. Good to join the group.

    April 10, 2013

40 went
