2 Sessions: What is Lambda?; Securing Hadoop

Name: 2 Sessions: What is Lambda?; Securing Hadoop
Start: 2015-03-24T18:00:00-04:00
End: 2015-03-24T21:00:00-04:00
Location: Localytics NEW Office

Hosted by Michal K.

Boston Hadoop User Group

Details

Session 1: "What is Lambda?", John Hugg, Software Engineer, VoltDB

Session 2: "Securing Hadoop - What are your options?", Sudeep Venkatesh, VP of Solution Architecture, HP Security Voltage

SESSION 1 DETAILS: What is Lambda?

The Lambda Architecture is designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods. It proposes that the speed (streaming) and batch workloads be run in parallel on the same incoming data. The speed layer can achieve faster responses, while the batch layer can be more robust and serve as the system of record for historical data. Lambda also requires a serving layer to serve results.

What’s a real-world example? Ed Solovey of Twitter (formerly Crashlytics) has given several talks on the use of the Lambda Architecture for the Crashlytics service, including a 20-minute presentation given at the October Boston Facebook @Scale Conference ( http://youtu.be/56wy_mGEnzQ ).
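As a rough illustration of the speed, batch and serving layers described above, here is a minimal in-memory sketch that counts unique user ids per app. It is a toy under stated assumptions, not the Crashlytics implementation: the class name `LambdaUniqueCounter`, its methods and the sample ids are all hypothetical, and plain Python containers stand in for the real ingestion, processing and storage systems.

```python
from collections import defaultdict


class LambdaUniqueCounter:
    """Toy Lambda Architecture: unique user ids per app (hypothetical names)."""

    def __init__(self):
        self.master_log = []                # batch layer: append-only system of record
        self.batch_view = defaultdict(set)  # view rebuilt from the log by each batch run
        self.speed_view = defaultdict(set)  # incremental view of events since the last batch run

    def ingest(self, app_id, user_id):
        # Every incoming event is sent to both layers in parallel.
        self.master_log.append((app_id, user_id))
        self.speed_view[app_id].add(user_id)

    def run_batch(self):
        # Batch layer: recompute the whole view from the system of record.
        view = defaultdict(set)
        for app_id, user_id in self.master_log:
            view[app_id].add(user_id)
        self.batch_view = view
        # The speed layer only needs to cover what the batch view has not seen yet.
        self.speed_view = defaultdict(set)

    def unique_users(self, app_id):
        # Serving layer: merge the batch view with the speed view at query time.
        batch = self.batch_view.get(app_id, set())
        speed = self.speed_view.get(app_id, set())
        return len(batch | speed)


counter = LambdaUniqueCounter()
counter.ingest("app", "u1")
counter.ingest("app", "u2")
counter.run_batch()            # u1 and u2 are now in the batch view
counter.ingest("app", "u2")    # seen again, this time only by the speed layer
counter.ingest("app", "u3")    # new id, not yet covered by a batch run
print(counter.unique_users("app"))
```

The query merges both views, so the id that arrived after the last batch run is counted immediately, while the batch view remains the authoritative record that can be rebuilt from the log at any time.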

Crashlytics needed to identify how many times end-users access a mobile app, which means handling hundreds of thousands of unique ids per second. To solve this problem the company implemented the Lambda Architecture. In the speed layer, they enlisted Kafka for ingestion, Storm for processing, Cassandra for state and Zookeeper for distributed agreement. In the batch layer, tuples were loaded in batches into S3, then processed with Cascading and Amazon Elastic MapReduce.

The problem? It’s complicated. Sure, this approach can meet high-performance needs, but it comes at a tremendous cost. Focusing on just the speed layer, getting Zookeeper, Kafka, Storm and Cassandra combined into a reliable fast data engine is expensive in developer time, computing resources and ongoing operational support. Each system requires at least three nodes to run, meaning your speed layer is at least 12 nodes, and often larger. And once the speed layer is working, the batch layer is a second problem with its own complexity.

Even with reliable components, the odds that at least one component will have issues go up as the number of components rises. And when a component fails, how well trained is the operational support team when the app has such breadth? Isolating the faulty component is difficult when you must also consider inter-component interactions while hunting for problems.
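The node count and the reliability argument above can be made concrete with a little arithmetic. The sketch below assumes that nodes fail independently and that each has the same chance of having an issue in a given period; both are simplifying assumptions, and the 1% figure is an invented example value, not a measurement from the talk.

```python
def min_nodes(systems: int, nodes_per_system: int) -> int:
    """Smallest cluster size when every system needs its own nodes."""
    return systems * nodes_per_system


def p_any_issue(p_single: float, count: int) -> float:
    """Probability that at least one of `count` independent parts has an issue."""
    return 1.0 - (1.0 - p_single) ** count


# Zookeeper, Kafka, Storm and Cassandra at three nodes each.
speed_layer_nodes = min_nodes(systems=4, nodes_per_system=3)

# Hypothetical 1% chance of trouble per node in some period.
for count in (1, 3, speed_layer_nodes):
    print(count, round(p_any_issue(0.01, count), 4))
```

With these assumed numbers, a 1% per-node chance grows to roughly 11% across the 12 speed-layer nodes, which is the effect the paragraph describes.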

What’s a better answer? Simplify. Simplify. Simplify. Removing just a single component of a typical Rube-Goldbergian Lambda implementation can not only reduce complexity and cost, but also make it easier to change the application as business needs change. Look to replace all or part of your Lambda stack with a more integrated solution; this discussion will show you how with a clear example. See how thousands of lines of code become 30 when you replace disparate systems with one that integrates ingestion, state, agreement and processing.

SESSION 2 DETAILS:

Securing Hadoop – What are your options? / HP Security Voltage:

A key driver for getting Hadoop into production is enabling rapid time-to-insight for your company. Unfortunately, the realization that sensitive, regulated data, such as payment transactions and customer personally identifiable information (PII), will be flowing unprotected into the Hadoop “data lake” can present a big hurdle to implementation. Join us to understand architectural options for securing Hadoop data, illustrated by real-world use cases.

Securing Hadoop Data
Get the theory: learn how data-centric encryption and tokenization technologies enable successful Hadoop adoption, neutralize data breaches and answer privacy and regulatory concerns. Get clear on related standards, and understand how data-centric security fits with the latest authentication, authorization and audit controls in Hadoop.
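To give a feel for what tokenization means in this context, here is a minimal sketch of vault-based tokenization applied to one sensitive field before a record is loaded into a data store. It is an illustration only and does not represent HP Security Voltage's technology: the `TokenVault` class, the `tok_` prefix and the sample record are hypothetical, and a real deployment would need a secured, persistent vault or a vaultless scheme.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping sensitive values to random tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to the same token,
        # which keeps joins and counts on the protected field meaningful.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._value_by_token:  # regenerate on the unlikely collision
            token = "tok_" + secrets.token_hex(8)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with access to the vault can recover the original value.
        return self._value_by_token[token]


vault = TokenVault()
record = {"customer": "Jane Example", "card_number": "4111111111111111"}  # made-up data
protected = dict(record, card_number=vault.tokenize(record["card_number"]))
print(protected)
```

The protected record can flow into the cluster while the original card number stays in the vault, so a breach of the stored data alone does not expose it.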

Use Cases for Data-centric Security in Hadoop
How it works: find out how Hadoop deployments are rolled out with data-centric protection in place. This customer case-driven talk presents technical and business specifics around four to five recent Hadoop deployments in pharma, healthcare insurance, telecommunications, and retail. It covers what you need to know, how to get started, what the deployments look like, and options for integration with Hive, Sqoop, MapReduce and other Hadoop-specific interfaces in these multi-platform enterprise environments.

Bio:

Sudeep Venkatesh is a noted expert in data protection solutions, bringing over a decade of industry and technology experience in this area to HP Security Voltage. His expertise spans data protection in Hadoop and Big Data ecosystems, security infrastructures, cloud security, identity and access management, encryption, and the PCI standards for both the commercial and government sectors. He has worked on numerous security projects with Fortune 500 firms in the United States and globally.

At HP Security Voltage, Sudeep serves as Vice President of Solution Architecture, with responsibility for designing solutions for some of HP Security Voltage's largest customers across the end-to-end data protection portfolio. This includes email, file and document encryption, as well as the protection of sensitive data in databases, applications and payment systems. Prior to this, he was part of the Sales Engineering team at RSA Security, where he designed technical solutions for some of its largest customers.

Sudeep holds a B.E. (Hons.) degree in Electronics Engineering from Shivaji University, India.
