Bay Area Hadoop User Group (HUG) Monthly Meetup


Details
Agenda:
6:00 - 6:30 - Socialize over food and beer(s)
6:30 - 7:00 - Giraffa File System to Grow Hadoop Bigger
7:00 - 7:30 - Apache Drill for Interactive Analysis
7:30 - 8:00 - Elastic, Multi-tenant, Highly Available Hadoop on Demand
Session I: Giraffa File System to Grow Hadoop Bigger (6:30 - 7:00 PM)
HDFS scalability and availability are limited by the single namespace server design. Giraffa is an experimental file system that uses HBase to maintain the file system namespace in a distributed way and serves data directly from HDFS DataNodes. Giraffa is intended to provide higher scalability and availability and to maintain very large namespaces. The presentation will explain Giraffa's architecture and motivation, address its main challenges, and give an update on the status of the project.
Presenter: Konstantin Shvachko (PhD), Founder, AltoScale
Session II: Apache Drill for Interactive Analysis (7:00 - 7:30 PM)
Apache Drill is a new open source Apache Incubator project for interactive analysis of large-scale datasets, inspired by Google's Dremel. It enables users to query terabytes of data in seconds. Apache Drill supports a broad range of data formats, including Protocol Buffers, Avro and JSON, and leverages Hadoop and HBase as data sources. Drill's primary query language, DrQL, is compatible with Google BigQuery. In this talk we provide an overview of the Drill project, including its design goals and architecture.
Presenter: Jason Frantz, Software Architect, MapR Technologies
Session III: Elastic, Multi-tenant, Highly Available Hadoop on Demand (7:30 - 8:00 PM)
Serengeti is an open-source project, initiated by VMware, to enable the rapid deployment of Hadoop clusters in virtual environments. While Hadoop clusters are typically run on physical machines, Serengeti aims to bridge Hadoop and virtualization and bring the classic benefits of virtualization to the Hadoop user. By leveraging virtual machines, Serengeti-deployed clusters can be operated simply, configured for HA protection, and made elastic by decoupling the Hadoop compute and data layers. In this talk, we explore each of these aspects of running Hadoop on a virtual platform.
Presenter: Kevin Leong, Product Manager, VMware
Yahoo Campus Map:
Detail map (http://photos4.meetupstatic.com/photos/event/2/8/e/d/600_21370477.jpeg)
Location on Wikimapia:
http://www.wikimapia.org/#lat=37.4181633&lon=-122.0250607&z=18&l=0&m=b&search=yahoo
