The venerable MapReduce framework has allowed Hadoop to prove its worth in the big data space, and to store and analyze much larger data sets than was possible before. But much of the current activity in the big data ecosystem centers on other major categories of workflows beyond batch.
The emerging tools in these categories cover low-latency I/O (HBase), interactive queries (Drill), stream processing (Storm), and text processing and indexing (Solr). This talk discusses some of the more interesting developments in Drill and Storm, their capabilities, and how they are being put to use in real-world situations.
Brad Anderson has been wrangling data for 20 years, first with enterprise data warehouses and more recently building and using non-relational big data tools. Previously, he worked on a large-scale video-on-demand platform in Erlang, helped Cloudant build its hosted NoSQL offering based on CouchDB, and organized the NoSQL East 2009 conference in Atlanta. He has recently worked with Cascading, Storm, and Neo4j, and has contributed code to the HBase open-source project. Brad has founded or co-founded four technology companies, and his first company operates today as Mirus Restaurant Solutions.
(Note that Brad's original Apache Drill talk had to be postponed due to weather. We're excited about this new, expanded version!)