Wikipedia's entry on Big Data begins:
“In information technology, big data is a loosely-defined term used to describe data sets so large and complex that they become awkward to work with using on-hand database management tools.”
This isn't a particularly satisfactory answer, and not simply because it is both self-referential and vacuous.
Okay, maybe that is the reason. Regardless, everyone reading this deserves a more informative answer.
To that end, the talk will start off with a mercifully brief operational definition of Big Data. We will then look at a few examples that would appear to fit this definition, and why they are generally recognized as being “big data challenged” when “traditional database technology” is applied.
The bulk of the talk, then, will consist of a description, explanation and cross comparison of the three primary approaches to Big Data, namely:
1) Parallel, shared-nothing (columnar) database clusters
2) Distributed key/value stores (NoSQL)
3) MapReduce (Hadoop) implementations
Questions will be welcome at any time during the talk. In fact, if you have any burning ones up front, you are welcome to submit them beforehand and, time permitting, they will be addressed during the course of the talk.
Admittedly, this is an ambitious agenda, but since the talk will be completely devoid of PowerPoint animations and sound effects, that extra time can be put to effective use.
Howie Rosenshine has spent more years in “the industry” than he cares to admit. He also has a Master's degree in Computer Science from Penn, where his interests were functional programming and databases. (This he is perfectly happy to admit.)
His formative years included working with device control, assembly language, and logic programming/knowledge representation (Prolog).
He has spent the bulk of his career at Sun Microsystems (including Oracle), where he was a Software Consultant (Unix/C) and a Systems Engineer. Sun was a wonderful place to work for many years; his work there spanned industries such as Wall Street/Finance, PharmChem and Defense (contractors and integrators). He spent the latter part of this time primarily in support of the Intelligence Community working for...oops, that was almost a no-no.
Howie is also the inventor of the Shovelution (shovelution.com), which is both patented and public. Other inventions include an optimal resistance exercise machine and a web search augmentation algorithm that one of his (very few) beta testers has dubbed “SBE” for Search by Example. These two are neither public nor patented...yet. It is true, of course, that neither the name “Search by Example” nor the concept is new, but this algorithm is notable in that it actually works.
Finally, Howie has a Black Belt in Bull Sigma, a paradigm invented by Nathan Myhrvold that corporations and individuals alike can use to effectively inflate their otherwise mediocre credentials and accomplishments.