MapR & Qubole: Apache Drill and Hive as a Service

Dear Hive members,

We have another great meetup planned for the 28th of June at 6:30 pm.

There will be three interesting talks. Facebook will present a Hive use case, MapR will present and demo Apache Drill, and Qubole will present Hive & Hadoop as a service.


De-duplicating data on the object graph with Hive

Speaker: Abhishek Doshi, Software Engineer, Facebook

Facebook enables users to easily express connections to objects, such as the books they read, the movies they watch, and the TV shows they like. The objects can originate from Facebook or partner platforms. This talk presents a use case showing how Facebook uses Hive for de-duplication across the object graph, for example when an open graph object from IMDb represents the movie Top Gun and another one from Netflix also represents Top Gun.

Abhi studied Electrical Engineering and Computer Science at UC Berkeley and started at Facebook in August 2009. He worked on Ads and platform payments and, after moving to London in December 2012, joined the platform entities team.

[The above talk will be held at another Hive meetup, yet to be planned, due to time constraints on the speaker's side.]


Apache Drill - interactive, ad-hoc query for large-scale datasets

Speaker: Michael Hausenblas, Chief Data Engineer EMEA, MapR Technologies

Apache Drill is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is designed to scale to thousands of servers and to process petabytes of data in seconds. Since its inception in mid-2012, Apache Drill has gained widespread interest in the community, attracting hundreds of people. In this talk we discuss how Apache Drill enables ad-hoc interactive query at scale, walk through use cases, and review the system architecture. We then focus on Apache Drill's extensibility points and the supported query languages and data sources, and close with a demo of the system.

Michael works at MapR Technologies as Chief Data Engineer EMEA. His background is in large-scale data integration research and development, advocacy and standardisation. He has experience with NoSQL databases and the Hadoop ecosystem. Michael speaks at events, blogs about big data, and writes articles and books on the topic. Michael contributes to Apache Drill, a distributed system for interactive analysis of large-scale datasets.


Cloud Optimized Hadoop and Hive

Speaker: Joydeep Sen Sarma, Qubole

Qubole provides an analytics platform as a service in the Cloud. Hadoop and Hive are among the core components of our technology stack. In this talk, I will describe how we conceptualized Hadoop and Hive as a service and the key challenges we have faced in building a multi-tenant implementation in the cloud. I will cover some of the major operational, usability and performance enhancements we have made to Hadoop and Hive so far, which have helped us achieve dramatic improvements over prior-generation cloud offerings.

Joydeep is co-founder/CTO at Qubole, where he's busy building the best analytics platform in the Cloud. He was previously at Facebook, where he bootstrapped the Hadoop-based analytics stack, started the Apache Hive project and led the Data Infrastructure team. Joydeep was a key contributor on the Facebook Messages architecture team that brought Apache HBase to Facebook. He cut his teeth building data-driven applications as the lead engineer on Yahoo's in-house Recommendation Platform.


  • Salih O.

    For a new AWS user, it is good to see value added tools like Qubole around the ecosystem. Thanks Christian for a great event with a friendly environment.

    1 · July 1, 2013

  • Seref A.

    Great meeting. Hearing about the things that go wrong in real life projects was great, very valuable feedback.
    Thanks to Christian for organizing the meetup, the beer and pizza. A pity such a well-organized event had so few attendees.

    1 · July 1, 2013

  • louis d v.

    are there any slides available?

    June 29, 2013

  • Christian P.

    Slides from the awesome, fun and informative Hive London meetup on Friday. Thank you Joydeep Sen Sarma from Qubole and Michael Hausenblas from MapR and Apache Drill.

    2 · July 1, 2013

  • Jacob

    Really interesting and we were able to dig deep into a lot of the technical details of both offerings. Shame there were so many no shows - but it did mean there was a lot of pizza!

    July 1, 2013

  • Sushil

    Very insightful presentation, coming from a techie and devoid of all marketing crap.

    1 · June 29, 2013

  • Andrey D.

    Lots of interesting discussions, tasty pizzas and cold beer! Amazing :)

    1 · June 29, 2013

  • Andrey D.

    Very interesting

    June 29, 2013

  • Christian P.

    Great talks!

    1 · June 28, 2013

  • Shivaprasad n.

    Due to some personal urgency I am not going to this; waiting list guys, pls go ahead.

    June 28, 2013

  • A former member

    Look forward to this meetup. It's my first with you.

    June 23, 2013

  • Rajamannar

    I want to learn more about large data sets

    May 30, 2013
