47th Bay Area Hadoop User Group (HUG) Monthly Meetup

Hosted By
Yahoo! HUG O.
Details

Address: Classrooms 4/5 at Building C at Yahoo Sunnyvale campus

Detailed agenda and summaries to follow. General agenda:

6:00 - 6:30 - Socialize over food and beer(s)

6:30 - 7:00 - This ain't your Father's Search Engine

7:00 - 7:30 - Securing data in Hadoop using Apache Hive

7:30 - 8:00 - Comprehensive, Centralized Security for Hadoop

Session I (6:30 - 7:00 PM) – This ain't your Father's Search Engine

In just a few short years, search has evolved from being a small text box in the nether regions of a website to being front and center in our lives. Increasingly, however, the combination of search engine and Hadoop technology is also being used for practical, real-time recommendations, event processing, complex spatial functionality and time series analysis, capable of not only matching users' queries in text but also driving real-time decision making and analytics. In fact, open source Apache Lucene/Solr can do all of this and more by taking advantage of new data structures and algorithms as well as deeper integration with Hadoop and related projects. In this demo-driven talk, Lucene committer Grant Ingersoll will take a look at some of the new and exciting ways users are leveraging Lucene, Solr and big data to drive deeper insight into information needs that go beyond keywords in a text box.

Speaker: Grant Ingersoll, CTO and co-founder, LucidWorks

Bio:

Grant Ingersoll is the CTO and co-founder of LucidWorks as well as an active member of the Lucene community – a Lucene and Solr committer, co-founder of the Apache Mahout machine learning project and a long standing member of the Apache Software Foundation. Grant’s prior experience includes work at the Center for Natural Language Processing at Syracuse University in natural language processing and information retrieval. Grant earned his B.S. from Amherst College in Math and Computer Science and his M.S. in Computer Science from Syracuse University. Grant is also the co-author of “Taming Text” from Manning Publications.

Session II (7:00 - 7:30 PM) – Securing data in Hadoop using Apache Hive

Apache Hive 0.13 shipped with support for SQL standards-based authorization. It lets users manage access control using familiar SQL grant/revoke statements with users and roles. The model also facilitates the development of more complex access control patterns, such as the ability to restrict access to table data at the column or row level when used in conjunction with views.

This is the third authorization mode supported in Hive. In this talk, we will discuss how this compares with other available authorization modes. We will also discuss how this can be used in conjunction with Storage Based Authorization to address the different use cases of Hive.

Speaker: Thejas Nair, Software Engineer, Hortonworks

Bio:

Thejas Nair is a software engineer working on Apache Hive and Apache Pig at Hortonworks. He is a committer and PMC member of these Apache projects. His most recent work has focused on improving security features in Hive. Previously, he worked at Yahoo for 9 years, developing solutions for large-scale distributed data processing.

Speaker: Chris Drome, Technical Lead, Yahoo

Bio:

Chris Drome is the tech lead for Hive/HCat at Yahoo.

Session III (7:30 - 8:00 PM) – Comprehensive, Centralized Security for Hadoop

With the advent of YARN, enterprises can adopt a true data lake architecture using Hadoop, supporting multiple use cases and applications within the same platform. With the multi-tenant environment come the challenges of protecting sensitive data, controlling access and monitoring behavior across multiple user groups and different datasets. There is an increased focus on data privacy and compliance controls, and data security is now an important pillar in the enterprise Hadoop strategy. Enterprises are looking for enhanced support across authentication, authorization, auditing and data protection, with a centralized framework for managing security in one place. The open source community, along with Hortonworks, is committed to bringing comprehensive security across the Hadoop platform.

In this technical session, we’ll talk about the current work in enabling comprehensive security across the Hadoop platform, with centralized security administration, capabilities for fine-grained authorization, detailed auditing across HDFS, Hive and HBase, and data protection.

Speaker: Bosco Durai, Enterprise Security Architect, Hortonworks

Bio:

Bosco Durai is an Apache committer currently working at Hortonworks, focused on enabling enterprise-grade security within the Hadoop platform. Bosco brings years of experience building and managing enterprise data security products. Before Hortonworks, Bosco was the co-founder and Chief Security Architect of the big data security startup XA Secure, which was built from the ground up to address the unique security challenges that big data environments bring. XA Secure was acquired by Hortonworks in May 2014. Bosco was also a co-founder of Bharosa, a fraud detection startup that was acquired by Oracle in 2007.

Yahoo Campus Map:

Detail map (http://photos4.meetupstatic.com/photos/event/2/8/e/d/600_21370477.jpeg)

Location on Wikimapia:

http://www.wikimapia.org/#lat=37.4181633&lon=-122.0250607&z=18&l=0&m=b&search=yahoo
