Hadoop on Greenplum
Donald Miner, EMC
Greenplum is using Hadoop in several interesting ways as part of a larger big data architecture with EMC Greenplum Database (a scale-out MPP SQL database) and EMC Isilon (a scale-out network-attached storage appliance). After a quick introduction to Greenplum Database and Isilon, I list some ways Greenplum is tightly integrating with Hadoop and explain why we would want to do such a thing. Integration points discussed include: Greenplum Database external tables that seamlessly access data in HDFS, querying HBase tables natively from Greenplum Database, Greenplum Database keeping its underlying storage on HDFS, and Isilon OneFS as a seamless replacement for HDFS.
Dr. Donald Miner serves as a Solutions Architect at EMC Greenplum, advising and helping customers implement and use Greenplum's big data systems. Prior to working with Greenplum, Dr. Miner architected several large-scale and mission-critical Hadoop deployments in the U.S. Intelligence Community. He is the author of the upcoming book "MapReduce Design Patterns", which will be published by O'Reilly in the fall of 2012. He is also involved in teaching, having previously instructed industry classes on Hadoop and a variety of artificial intelligence courses at the University of Maryland, Baltimore County. Dr. Miner received his PhD in Computer Science from the University of Maryland, Baltimore County, where his dissertation focused on Machine Learning and Multi-Agent Systems.
Big Data: Getting Down to Brass Tacks (a consultant's story)
David Douglas, CrinLogic
The Big Data headlines are unrelenting, with each passing day seemingly bringing new discoveries, products, partnerships, venture funds, you name it, into the mix. If anything, it is all a bit confusing. Listening to all this, you might come to the conclusion that Big Data will solve most of your problems, place your company miles ahead of your competition, drive your Net Promoter Scores through the roof, and fall just short of solving world hunger (ok…maybe not that far).
And no one can blame you if you think all you need to do is install the Hadoop ecosystem of projects, conjure up some possible business use cases, throw some commodity hardware into the mix, attend some training, purchase some Big Data analytics software and VOILA, you have arrived and can enjoy the fruits of your Big Data efforts.
Tongue-in-cheek aside, the reality is vastly different. This talk is partially a reality check on Big Data implementation strategies (starting with Big Data is easy, becoming proficient is hard, fully integrating into a broader enterprise data strategy is very hard) and partially an information-sharing session on what we’re learning as we engage with customers in various industries on Big Data. Among other things, we will explore: building the business case; software and hardware requirements analysis; selection process and implementation approaches; what tends to work well, what works not so well, and what to avoid; and how big data is likely to affect enterprise data architecture.
David Douglas is a member of the Hadoop-DC User Group and a co-founder of CrinLogic, a Big Data consultancy based in the greater DC area. He has devoted his 17 years of professional experience to helping clients maximize the value of their strategic IT initiatives. Prior to co-founding CrinLogic, David started two other companies. The first was an angel-backed Sales Force Automation software company that he sold in 2002, and the second is a consulting services company that focuses on Agile and Lean software adoption and large-scale program implementation services. He helped start the Data Warehousing practice at American Management Systems and was one of the first consultants to join IBM’s Business Intelligence practice.