November Meetup: Strata special and panel discussion

Dear HUG UK members,

I am pleased to announce our November Meetup at London's Hilton Metropole (Blenheim Room) on 11th November.

This event is scheduled to be one of the biggest Meetups this year and we have some really exciting content lined up.

This Meetup will include three great presentations:

Session 1 - Cloudera

Title: Impala: A Modern, Open-Source SQL Engine for Hadoop

Abstract: The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of fast SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. With Impala, the Hadoop community now has an open-sourced codebase that helps users query data stored in HDFS and Apache HBase in real time, using familiar SQL syntax. In contrast with other SQL-on-Hadoop initiatives, Impala's operations are fast enough to do interactively on native Hadoop data rather than in long-running batch jobs. Now you have the freedom to discover relationships and explore what-if scenarios on Big Data datasets. By taking advantage of Hadoop's infrastructure, Impala lets you avoid traditional data warehouse obstacles like rigid schema design and the cost of expensive ETL jobs.

This talk starts out with an overview of Impala from the user's perspective, followed by a presentation of Impala's architecture and implementation. It concludes with a summary of Impala's benefits when compared with Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.

Speaker: Marcel Kornacker is a tech lead at Cloudera for new products development and creator of the Cloudera Impala project. Following his graduation in 2000 with a PhD in databases from UC Berkeley, he held engineering positions at several database-related start-up companies. Marcel joined Google in 2003 where he worked on several ads serving and storage infrastructure projects, then became tech lead for the distributed query engine component of Google's F1 project.


Session 2 - Syncsort

Title: Smarter Big Data Integration for Hadoop

Abstract: Hadoop has become a de facto standard in supporting Big Data analytics. A very common use case for Hadoop is data transformation and a new way to deliver ETL and SQL migration. With this in mind, Syncsort has made a contribution to Apache Hadoop that not only makes sort pluggable, but also facilitates new and difficult real-world ETL use cases and database off-load, working natively within the MapReduce framework. This session will show (including a short demo) how the Syncsort contribution optimises ETL processes which enable vertical scalability and a smarter integration tool set for Hadoop.

Speaker: Steven Haddad, Senior Big Data Solution Consultant, Syncsort


Session 3 - Spotify

Title: From 1 to 100 Hadoop developers: Scaling for developer productivity at Spotify

Abstract: The demand for processed data is increasing exponentially at Spotify and we've found our developer infrastructure to be at least as much of a barrier to that scaling as having the hardware available. This talk will tell the story of what problems we've faced in transitioning from the Hadoop team being a small annexe of analytics to having dozens of developers throughout the organisation writing code to run on the cluster; and how we're working to solve these by building better developer infrastructure; from how data processing jobs are developed, tested and scheduled, to how the resulting datasets are catalogued to be discoverable and used by other developers.

Speaker: David Whiting, Data Infrastructure Engineer, Spotify. David spent 18 months in the data team at Last.fm and since February has been developing data infrastructure at Spotify - making him something of an expert in working with music data sets. He mostly works with Hadoop, but can occasionally be found dabbling in data warehousing, SQL query optimisation and front-end web apps; as well as telling everybody else they're not doing enough testing and that everything is better with static typing. As well as generating music data, he also generates music under the guise of Demoscene Time Machine (http://music.demoscenetimemachine.com/), takes part in the occasional triathlon and has some very unusual dance moves.

The Meetup will also include a panel discussion (compered by Matt Aslett from 451 Research) with the following panelists:

• Doug Cutting from Cloudera

• Steven Totman from Syncsort

• David Whiting from Spotify

• Brett Sheppard from Splunk

The panel will cover topics and questions on the evolution and future of Hadoop, and what the collaboration between open source communities & technologies and traditional technologies & vendors will mean for the next generation of data management solutions.

We look forward to seeing you there.


  • Dan H.

    November 26, 2013

  • Salih O.

    The Spotify presentation is one of the best I have attended recently. It is a Big Data/Hadoop presentation, but it talks about generic data management issues: retention policies, access privileges, team formation, etc. Definitely check it out again.

    2 · November 13, 2013

    • Milan

      Yes, Spotify was totally spot on, including the panel discussion. Big thank you to David!

      November 13, 2013

  • A former member

    A

    November 13, 2013

  • Mil

    Great meetup, presentation and very informative.

    November 12, 2013

  • James H.

    Good meetup, interesting talks and discussion panel.

    November 12, 2013

  • David W.

    Platform-independent (but less pretty) edition of my slides from tonight: http://thewit.ch/presentations/scaling_development - A bit sparse but should jog your memory if you were here for the presentation.

    3 · November 11, 2013

    • Joyce S.

      Great presentation.

      November 12, 2013

  • Dan H.

    You can find the rest of the slides from last night's talks up here:

    http://www.slideshare.net/huguk/hug-london2013 - Cloudera Impala

    http://www.slideshare.net/huguk/sync-sort - Syncsort

    http://www.slideshare.net/huguk/spotify-28146292 - Spotify (Pretty version)

    http://www.slideshare.net/huguk/hadoop-uk-strata-panel - Discussion Panel

    Videos will be coming later this week.

    1 · November 12, 2013

  • A former member

    Working late unfortunately - will not be able to get there in time

    November 11, 2013

  • Bryan F.

    Sorry, can't make this tonight.

    November 11, 2013

  • Alex M.

    Will there be time for socialising/networking before 7pm?

    November 11, 2013

    • Dan H.

      I'm not sure. There's space to sit in the Hilton hotel reception beforehand, but I'm not sure which of the conference rooms people will be allowed in before 7pm.

      November 11, 2013

  • Catherine H.

    Sadly I will no longer be able to make the meetup. Will slides/videos be available afterwards?

    November 11, 2013

    • Dan H.

      Yes, we'll be posting videos and slides again after the meetup.

      November 11, 2013

  • Graham P.

    Sorry to say I will have to miss this.

    November 11, 2013

  • Jurek G.

    Had to change my RSVP to No, so a place is up for grabs.

    November 8, 2013

  • A former member

    At the Big Data London Meetup

    November 5, 2013

  • Dan H.

    We won't be able to schedule a talk on InfiniDB, as we plan further in advance, sorry. But if there's interest, these are good discussion points for the panel and to carry on in person after the talks!

    And as always, please let us know what you'd like to hear in future meetups like this. We do take it on board.

    November 4, 2013

  • A former member

    InfiniDB sounds interesting, if the performance claims stack up - also the MySQLness would be of particular interest to many people

    November 4, 2013

  • A former member

    Hi!

    October 29, 2013

  • sahera k.

    It is a good idea to hear about InfiniDB for Hadoop.

    October 26, 2013

  • Jim T.

    Hi Dan,

    I sent an email as well, but is there interest in hearing about InfiniDB for Hadoop? It is GPL V2, faster/more syntax than other SQL for Hadoop technologies including Impala.

    Let me know, trying to finalize plans for Strata in London.

    Cheers,
    Jim Tommaney
    CTO, Calpont

    October 25, 2013
