October Hadoop Meetup: Streaming analytics and approximation

Dear HUG UK members,

I am pleased to announce our October meetup on 'Streaming analytics and approximation'.

This event, sponsored by Strata, will be held at TechHub @ Campus.

Details below.

Sebastian


TIME:

Tuesday, October 22nd 2013. Doors open at 6:30pm.

Presentations 7:00pm – 8:30pm.

LOCATION:

TechHub @ Campus

5 Bonhill St, London, EC2A 4BX


AGENDA:

Session 1: Apache Samza: Distributed Stream Processing with Kafka and YARN.

Speaker: Jakob Homan, Senior Software Engineer at LinkedIn.

Abstract: Samza is a new distributed stream processing framework developed at LinkedIn and recently incubated into the Apache Software Foundation. Built atop YARN, it provides fault tolerance, durability, scalability and even local state with a simple, Map-Reduce-like interface. 

Short bio: Jakob Homan is a Senior Software Engineer at LinkedIn and an Apache Hadoop committer and PMC member. He works on Samza full time.


Session 2: Storm at spider.io - Cleaning up fraudulent traffic on the internet

Speaker: Ashley Brown, Chief Architect at spider.io.

Abstract: This talk charts spider.io's journey from being an early adopter of Storm, through their freeze of Storm releases and switch to batch-only processing, to coming full circle and implementing new fraudulent-traffic algorithms with Trident.

Short bio: Ashley Brown is the Chief Architect at spider.io. He has previously worked on quantum chemical modelling, pipeline inspection robots and a control system for newspaper presses. He has published papers on the use of speculative hardware optimisations to accelerate key kernels for scientific computations.


Session 3: Scaling by Cheating: Approximation, Sampling and Fault-friendliness for Scalable Big Learning

Speaker: Sean Owen, Director of Data Science at Cloudera.

Abstract: To keep analyzing more data, and faster, we need a secret weapon: cheating. In this brief survey, learn how you may be doing too much work in your analytics and learning processes, and how giving up a little accuracy can gain a lot of performance. With examples from Apache Hadoop, Mahout, and ML tools from Cloudera.

Short bio: Sean Owen is the Director of Data Science at Cloudera. Previously he founded Myrrix, a complete, real-time, scalable clustering and recommender system, evolved from Apache Mahout, which was acquired by Cloudera in July this year.
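
As a rough illustration of the trade-off described in the Session 3 abstract (giving up a little accuracy to gain performance), here is a minimal Python sketch that estimates a mean from a uniform random sample instead of scanning every value. It is not taken from the talk; the function names, the synthetic data, and the sample size are all assumptions made for this example.

```python
import random


def exact_mean(values):
    """Full scan: touches every element."""
    return sum(values) / len(values)


def sampled_mean(values, sample_size, rng):
    """Approximate the mean from a uniform random sample (without replacement)."""
    sample = rng.sample(values, sample_size)
    return sum(sample) / sample_size


# Synthetic data, purely illustrative: 200,000 values uniform in [0, 100).
rng = random.Random(42)
data = [rng.uniform(0, 100) for _ in range(200_000)]

exact = exact_mean(data)
approx = sampled_mean(data, 10_000, rng)  # reads only 5% of the data

print(f"exact mean:  {exact:.3f}")
print(f"approx mean: {approx:.3f}")
print(f"abs error:   {abs(exact - approx):.3f}")
```

With a sample of 10,000 values the standard error of the estimate is roughly 0.3 for this data, so the sampled answer is usually within a fraction of a unit of the exact one while reading a twentieth of the input.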



  • Sebastian S.

    Links to the slides of the three presentations:

    Jakob Homan - Apache Samza
    http://www.slideshare.net/huguk/london-hugsamza

    Ashley Brown - Storm at spider.io
    http://www.slideshare.net/ashleywbrown/storm-at-spiderio-london-storm-meetup[masked]

    Sean Owen - Scaling by Cheating
    http://www.slideshare.net/huguk/october-hug

    October 27, 2013

  • nealobrien

    It's really appreciated when presentations are done so well; it makes it all worthwhile. Engineers streaming in from LinkedIn no less, and that was just one of the three. Looking forward to the next one!

    October 28, 2013

  • sahera k.

    Thank you for providing links to the three topics. You are a really great group.

    October 28, 2013

  • Afzal S

    Nice presentations based on real use cases. Looking forward to the next one.

    October 23, 2013

  • Shruti T.

    Thank you to all the presenters and the organizer for the great presentations last evening, and the freebies for questions :) I've downloaded the code from [masked]:linkedin/hello-samza.git. On importing the projects, Eclipse started complaining that "src/test/java" is missing for both samza-wikipedia and samza-job-package. I've fixed the Eclipse errors, but am curious whether the tests could also be shared.

    October 23, 2013

  • Ian R.

    Three very good presentations last night, but I'm amazed some people spent the whole meetup playing with their phones, etc. It is disrespectful to the speakers and unfair to people who wanted to attend but could not get a place.

    October 23, 2013

  • A former member

    Great first meet for me, very interesting and thankfully for me not too over my head technically and good to see business applications of the technologies... and pizza!

    Are the slides available?

    October 23, 2013

  • Robin M.

    Written up a short blog about the use of approximation, following Sean's interesting presentation - see http://smart421.wordpress.com/2013/10/22/approximations-in-big-data-processing-architectures/ - only had time to write up 1/3 of the session, other 2/3 were just as interesting!

    October 22, 2013

  • A former member

    Great topics, great speakers, great pizza

    October 22, 2013

  • Basil B.

    Very interesting. I learned a lot. It was good to hear about real world applications that used Storm and other big data technologies and hear the story of how they evolved and the business and technical hurdles that drove their development.

    October 22, 2013

  • Mark B.

    Great meetup. I think it works well to have a collection of talks that share a number of themes, in the way that this meetup did. It means that you can really mull over some of the ideas being presented. I realise this isn't always possible, but when it is, it makes for a very stimulating evening.

    October 22, 2013

  • James H.

    Unfortunately I'm under the weather so won't make this.

    October 22, 2013

  • Liam M.

    Sorry, thought I was on track but unforeseen circumstances mean I won't make it. Will catch video online.

    October 22, 2013

  • Seref A.

    Just got an e-mail telling me I now have a spot, but too late for me to change plans for today. Sorry, I hope the spot goes to someone who can make it today :)

    October 22, 2013

  • Rakesh

    I am unable to make this event this evening

    October 22, 2013

  • Rakesh

    Sorry, can't make it tonight. Will catch up at the next session.

    October 22, 2013

  • Kirill K.

    Sorry, can't do it today.

    October 22, 2013

  • Andrew T.

    Sorry cannot make it tonight...hope it goes well :)

    October 22, 2013

  • louis d v.

    Too bad I can't make it. Hope it will be recorded!

    October 22, 2013

  • Rajesh

    Hi, will this event be filmed?

    October 14, 2013

    • Seref A.

      That is great. I must have been confused, I was sure I confirmed my attendance when the announcement arrived, but now I'm on the wait list. It would be great to have a recording just in case I can't make it.

      October 16, 2013

    • A former member

      Hi, if it'll be filmed, where can I watch it then?

      October 21, 2013

  • nealobrien

    I am looking to use Hadoop for a system I'm working on and would love to attend.

    October 16, 2013

  • jason

    I'm Hadooper

    September 17, 2013
