March SF Hadoop Users Meetup

Details
The March SF Hadoop User Group meetup will be held Wednesday, March 11 from 6:00pm to 8:00pm. This meetup will be hosted by Altiscale and held at Splice Machine, 612 Howard Street Suite 300, San Francisco. Food and drinks will be served.
Presentation: Privilege Isolation in Docker Containers
Dinesh Subhraveti, Altiscale
Containers are a fundamentally different form of virtualization designed to directly virtualize applications rather than the operating system. The absence of a guest OS layer makes containers extremely lightweight, leading to almost imperceptible runtime overhead and startup latencies, orders of magnitude higher scalability and simplified management. Additionally, the container model enables a number of use cases such as online OS upgrades which are only possible through application virtualization.
Docker is an ambitious program with a charter to make the container primitives of the kernel trivially accessible to end users. Docker achieves the goal in part through a highly intuitive user interface that hides the complexity of kernel configuration by choosing the most appropriate defaults. It also provides a community-curated repository of self-contained application images that can be portably run on any host, regardless of its underlying state and configuration.
In this talk, Dinesh Subhraveti presents a quick background on containers, followed by Altiscale's recent contribution of user namespace support to make Docker containers secure for use in multitenant environments. User namespaces prevent containerized applications from compromising the security of the host or other containers by isolating the scope of their privilege to the container in which they run (see this blog post for details: https://www.altiscale.com/making-docker-work-yarn/). This feature will be employed in Altiscale's purpose-built Hadoop as a Service to securely isolate the Hadoop tasks of different tenant customers.
Bio: Dinesh Subhraveti is responsible for the multi-tenancy and virtualization infrastructure at Altiscale. He developed the notion of Operating System level virtualization as a part of his Ph.D., which later came to be known in the industry as Containers. His work, published in OSDI 2002, demonstrated for the first time that enterprise applications can be virtualized and live-migrated.
Continuing his work on Containers, Dinesh drove the industry's first Container virtualization product for enterprise Linux applications at Meiosys, the company behind Linux Containers that IBM acquired in 2005.
Dinesh has authored over 35 patents and papers in the areas of virtualization, storage and operating systems, and holds a Ph.D. degree in computer science from Columbia University.
Presentation: Using HBase as a Foundation of a Distributed, Transactional RDBMS
Monte Zweben, Splice Machine
Transactions are critical in traditional RDBMSs because they ensure reliable updates across multiple rows and tables. Most operational applications require transactions, but even analytics systems use transactions to reliably update secondary indexes after a record insert or update.
In the Hadoop ecosystem, HBase is a key-value store with real-time updates, but it does not have multi-row, multi-table transactions, secondary indexes or a robust query language like SQL. Combining SQL with a full transactional model over HBase opens a whole new set of OLTP and OLAP use cases for Hadoop that were traditionally reserved for RDBMSs like MySQL or Oracle. Unlike those systems, a transactional HBase system has the advantage of scaling out with commodity servers, leading to a 10x price/performance improvement over traditional databases.
In this talk, Monte Zweben, co-founder and CEO of Splice Machine, will discuss how Splice Machine used HBase to build an ANSI-99 SQL database with parallelization of SQL execution plans, ACID transactions with snapshot isolation and consistent secondary indexing - all without modifying the core HBase source. He will discuss how Splice Machine serializes SQL execution plans over to regions so that computation is local to where the data is stored. Additionally, he will show how to simultaneously support both transactions and secondary indexing.
The talk will also cover how Splice Machine extended the work of Google Percolator, Yahoo Labs’ OMID, and the University of Waterloo on distributed snapshot isolation for transactions. Lastly, performance benchmarks will be provided, including full TPC-C and TPC-H results that show how Hadoop/HBase can be a replacement for traditional RDBMS solutions.
Bio: A technology industry veteran, Monte’s early career was spent with the NASA Ames Research Center as the Deputy Chief of the Artificial Intelligence Branch, where he won the prestigious Space Act Award for his work on the Space Shuttle program. Monte then founded and was the Chairman and CEO of Red Pepper Software, a leading supply chain optimization company, which merged in 1996 with PeopleSoft, where he was VP and General Manager, Manufacturing Business Unit.
In 1998, Monte was the founder and CEO of Blue Martini Software – the leader in e-commerce and multi-channel systems for retailers. Blue Martini went public on NASDAQ in one of the most successful IPOs of 2000, and is now part of JDA. Following Blue Martini, he was the chairman of SeeSaw Networks, a digital, place-based media company. Monte is also the co-author of Intelligent Scheduling and has published articles in the Harvard Business Review and various computer science journals and conference proceedings.
Zweben currently serves on the Board of Directors of Rocket Fuel Inc. as well as the Dean’s Advisory Board for Carnegie Mellon’s School of Computer Science.
