NoSQL in a Hadoop World


Details
Well, finally we have our next meetup. This is the second in a series of presentations by major players in the Hadoop ecosystem. Our intention is to give the vendors an opportunity to present their thoughts on Hadoop and to dive deep into areas that should be considered when evaluating the plethora of "Big Data" solutions. We are calling this series of presentations "Hey big vendor". Second up is Cloudera.
Agenda
6.30pm Networking
7.00pm Welcome
7.05pm "Hey big vendor - Cloudera"
By: Lars George - EMEA Chief Architect
Times are changing fast in the rather young Hadoop world. From the batch-oriented beginnings using MapReduce, complemented with a random read and write store called HBase, the use-cases seemed rather easy to lay out.
That has changed dramatically with the advent of MPP-style query engines like Impala, and of automated as well as interactive ad-hoc querying using Spark or Search. These days it is much more involved to find the right tool for the job at hand. Should you rely on existing "state-of-the-art" advice, or is there more to be said when preparing data for further processing in the Hadoop ecosystem? Is there such a thing as a "single source of truth" and, if not, why? What are the strengths of Impala, Spark, and Search versus MapReduce and HBase?
All of these questions are addressed while going deeper into the architectural bowels of how each of these tools plays its tricks.
In the end there is only physics and you cannot cheat it: which is the ultimate tool for your use-case?
8.30pm - 9.00pm Networking
Presenter's Biography:
Lars has a computer science degree from the University of Applied Science Giessen-Friedberg in Germany. He spent his early career as a software developer, building a human resource system for large organisations. He moved to Australia in 1999, taking on a role as CTO at a start-up offering machine translation services and co-authoring a "one click translation" patent, among others. Lars led the development and infrastructure team as architect, thought leader, and head developer for more than 10 years, building web services used by, for example, Microsoft Office to let users world-wide translate entire documents. During this assignment Lars had to architect a scalable, geodistributed, and failsafe system that could serve millions of users all over the world, all on open-source software. In 2007 an extension to the service required the storage and processing of hundreds of millions of documents, leading to the first production cluster of HBase, backed by Hadoop. Lars started as a solutions architect at Cloudera in 2010, bringing his wide, yet in-depth, experience to the services team. Eventually he took over the role of principal solutions architect and then director of all EMEA services. Today he runs an ever-expanding team of pre- and post-sales engineers and architects to deliver Cloudera's technologies to companies that want to succeed in using their data for future growth and new ways of high-value analytics.